Are there linear regression algorithms written in the q language?

I am looking for linear regression and k-means algorithms for kdb+, but cannot find them.
I guess someone must have already implemented them before. I would appreciate it if someone could share them with me. Thanks a lot.

You can try using the lsq function (http://code.kx.com/wiki/Reference/lsq) to implement linear regression. It should work well if the number of variables in the problem you are trying to solve is not too large. For high-dimensional problems it is better to implement linear regression using gradient descent or some other function-minimization algorithm.
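For example, a minimal batch gradient descent for ordinary least squares in q might look like this (a rough sketch; gdlm and the parameter values below are only illustrative):

/ X is the n x m design matrix (rows are observations), y the target vector,
/ a the learning rate, k the number of iterations
gdlm:{[X;y;a;k]
    n:count y;
    step:{[X;y;a;n;b] b-(a%n)*flip[X] mmu (X mmu b)-y};   / one descent step on the coefficients b
    step[X;y;a;n;]/[k;count[first X]#0f]};                / iterate k times from zero coefficients

q)x:100?10f
q)y:3+2*x                                  / exact line, so the result should approach 3 2f
q)gdlm[flip(count[x]#1f;x);y;0.01;10000]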

As far as the k-means algorithm is concerned, I don't have an implementation in q, but I recently did one in Octave. I will translate and post it somewhere once I get some free time, probably during the New Year's break.

Thanks,
Pawel


Yes, there is one available from Andrey Zholos in qml:

/ linreg[y;X]: Performs linear regression of y (vector) on X (list of vectors).
/   This function computes the least squares estimates of parameters and
/   covariance matrix and then calls linregtests to compute test statistics.
/.
/   e.g. exec linreg[price;(1.;sign;quantity)] from trades  / 1. for constant
/.
/   Returns dictionary:
/     `X     = X (list of row vectors)
/     `y     = y (vector)
/     `S     = covariance matrix
/     `b     = parameter estimates
/     `e     = residuals
/     `n     = number of observations
/     `m     = number of parameters
/     `df    = degrees of freedom

linreg1:{[y;X]
    if[any[null y:"f"$y]|any{any null x}'[X:"f"$X];'`nulls];
    if[$[0=m:count X;1;m>n:count X:flip X];'`length];
    Z:.qml.minv[flip[X] mmu X];
    ZZ:X mmu Z mmu flip[X];
    e:y-yhat:X mmu beta:Z mmu flip[X] mmu y;
    linregtests1 ``X`y`S`beta`e`n`m`df`ZZ`Z`yhat!(::;X;y;Z*mmu[e;e]%n-m;beta;e;n;m;n-m;ZZ;Z;yhat)};

If you don't want to rely on qml then just replace .qml.minv with lsq.
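For example (a sketch only; minv0 is just an illustrative name): for a square invertible matrix A, the least-squares solution of R mmu A = I is the inverse of A, so an lsq-based inverse can be built by solving against an identity matrix:

/ matrix inverse via lsq: solve R mmu A = I for R, i.e. R is the inverse of A
minv0:{("f"${x=/:x}til count x) lsq x}

/ then in linreg1 use  Z:minv0 flip[X] mmu X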


The linregtests1 function calculates some diagnostic numbers. The implementation depends heavily on qml:

/ linregtests[R]: Perform linear regression tests on a set of estimation
/   results. This function is called automatically by linreg, but can be called
/   again, for example, if the covariance matrix is adjusted. None of the values
/   returned by linreg are recalculated, in particular, if b is adjusted, e
/   needs to be recalculated.
/.
/   Updates R dictionary with:
/     `se    = standard error of estimates vector
/     `tstat = vector of t-statistics
/     `tpval = vector of p-values for t-test
/     `rss   = sum of squared residuals
/     `tss   = total sum of squares
/     `r2    = R-squared statistic
/     `r2adj = adjusted R-squared
/     `fstat = f-statistic
/     `fpval = p-value for f-test

linregtests1:{[R]
    tstat:R[`beta]%se:sqrt R[`S]@'til count R`S;
    fstat:(R[`df]*rss-tss:{x mmu x}R[`y]-+/[R`y]%R`n)%(1-R`m)*rss:e mmu e:R`e;
    R,`se`tstat`tpval`rss`tss`r2`r2adj`fstat`fpval!(se;tstat;
        2*1-R[`df] .qml.stcdf/:abs tstat;rss;tss;1-rss%tss;
        1-(rss*-1+R`n)%tss*R`df;fstat;1-.qml.fcdf[-1+R`m;R`df;fstat])};

If you don't need these statistics, just drop the linregtests1 call from linreg1.
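Putting the two remarks together, a stripped-down variant with no qml dependency and no diagnostics could look roughly like this (a sketch only; linreg0 is just an illustrative name, the null/length checks are omitted, and X is a list of regressor vectors, with a vector of 1s for the constant):

linreg0:{[y;X]
    m:count X; y:"f"$y; X:flip "f"$X; n:count X;
    Z:("f"${x=/:x}til m) lsq flip[X] mmu X;      / (X'X)^-1 via lsq, no .qml.minv
    e:y-yhat:X mmu beta:Z mmu flip[X] mmu y;     / OLS coefficients and residuals
    `X`y`S`beta`e`yhat`n`m`df!(X;y;Z*mmu[e;e]%n-m;beta;e;yhat;n;m;n-m)};

q)x1:100?10f; x2:100?5f
q)y:1+(2*x1)-3*x2
q)linreg0[y;(100#1f;x1;x2)]`beta     / should come out close to 1 2 -3f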


Kim


I might have a k-means implementation I’ve been fooling around with if you’re still interested.
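In the meantime, a minimal Lloyd-iteration k-means sketch in q might look like this (not the implementation mentioned above; kmeans, near and step are only illustrative names, and it simply runs a fixed number of iterations):

/ pts is a list of float vectors (points), k the number of clusters,
/ it the number of iterations; returns the list of cluster centres
/ (a centre that loses all of its points is dropped)
kmeans:{[pts;k;it]
    near:{[c;p] d?min d:{sum x*x}each p-/:c};                     / index of nearest centre to p
    step:{[pts;near;c] g:group near[c;]each pts; value avg each pts@g}[pts;near];
    step/[it;neg[k]?pts]};                                        / start from k distinct random points

q)pts:{x+0.3*2?1f}each 300?(0 0f;5 5f;0 9f)   / noisy points around three centres
q)kmeans[pts;3;15]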

 

Fitting y = -3 - 2x + x^2 with lsq:

q)x:    1.0    2.0    3.0    4.0
q)y:   -4.0   -3.0    0.0    5.0
q)fit:{(enlist y) lsq x xexp/: til 1+z}
q)fit[x;y;2]
-3 -2 1

Read about lsq here: http://code.kx.com/wiki/Reference/lsq

Linear regression: https://github.com/krish240574/kaggle-deandecock - linear regression applied to Kaggle's House Prices competition (predicting housing prices, based on Dean De Cock's Ames housing dataset: https://www.kaggle.com/c/house-prices-advanced-regression-techniques).
k-means: https://github.com/jlas/ml.q/blob/master/ml.q - a bunch of useful algorithms implemented in q.

Kumar