Hi all,

I am trying to figure out how to delete the first record of each group. Need help.

I have a table “stk_data”, shown below, that contains OHLCV records for all stocks:

sym      date       open  high  low   close volume     openint
-----------------------------------------------------------------
SH600809 2012.03.19 71.6  71.6  68.71 69.27 1.517e+004 1.058e+008
SH600809 2012.03.16 69.46 71.58 69.31 71.02 1.3e+004   9.214e+007
SH600809 2012.03.15 68.01 69.69 68.01 69.35 9751       6.736e+007
…

I then transform it to include the return rate, which gives the “rt_tb” table below:

rt_tb:select date,close,ud:deltas close, rt:100*((deltas close)%(close-deltas close)) by sym from `sym xasc `date xasc stk_data

sym     | date ..
--------| ---------------------------------------------------------------..
SH000001| 2008.01.02 2008.01.03 2008.01.04 2008.01.07 2008.01.08 2008.01.09 2008.01.10 2008.01.11 2008.01.14 2008.01…
SH000002| 2008.01.02 2008.01.03 2008.01.04 2008.01.07 2008.01.08 2008.01.09 2008.01.10 2008.01.11 2008.01.14 2008.01…
SH000003| 2008.01.02 2008.01.03 2008.01.04 2008.01.07 2008.01.08 2008.01.09 2008.01.10 2008.01.11 2008.01.14 2008.01…

Obviously, the “rt” field of the first record in each group is meaningless (it is 0w because there is no previous close price). So how can I rewrite the “rt_tb” script to get rid of the meaningless record?

Thanks,
Halley
Try this:
rt:1_100*(… from `sym`date xasc stk_data
cheers
Patryk
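For anyone following along, here is a minimal sketch of the problem and of the `1_` fix on toy data. The column names come from the question; the values are made up, and applying `1_` to `date` as well is an assumption added here so that both columns keep the same length within each group.

```q
/ toy data: column names follow the question, the values are made up
stk_data:([] sym:`A`A`A`B`B; date:2012.03.15 2012.03.16 2012.03.19 2012.03.15 2012.03.16; close:10 11 12.1 20 22f)

/ original query: deltas keeps the first close unchanged, so close-deltas close
/ is 0 on the first row of every group and rt becomes 0w there
select date, rt:100*(deltas close)%close-deltas close by sym from `sym`date xasc stk_data

/ 1_ drops the first item of each grouped list; dropping it from date too
/ (an addition for this sketch) keeps date and rt aligned
select date:1_date, rt:1_100*(deltas close)%close-deltas close by sym from `sym`date xasc stk_data
```

With these values, every remaining `rt` entry should come out at roughly 10 (percent).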
Thanks,
next is a bad choice here:
it adds a null (0N) at the end of the list,
is about two times slower,
and uses twice as much space.
q)a:100000?234.
q)a
196.2136 42.35917 97.59831 63.39222 71.42954 3.930051 160.196 106.1094 189.64..
q)\ts do[10000;1_a]
9515 1048768j
q)\ts do[10000;next a]
19937 2097344j
cheers,
Patryk
2012/3/21 Ajay <rathore.ajay@gmail.com>
> Another way
> … rt: next 100*((deltas close)%(close- deltas close))…
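To make the difference between the two suggestions concrete, here is a small sketch on a made-up vector. It only illustrates the semantics of `1_` and `next`; it says nothing about timing.

```q
a:1 2 3 4f

1_a      / 2 3 4f: the first item is dropped, so the result is one item shorter
next a   / 2 3 4 0n: same length as a, shifted left and padded with a null at the end

count each (1_a;next a)   / 3 4
```

So with `next` the meaningless first value is gone, but a null shows up at the end of every group instead.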
It’s not surprising that a parallelized algo is faster; done properly, it can be even faster.
Please provide fair benchmarks as well.
Btw, the size reported for the parallel run is not accurate.
8core Xeon
q)\ts do[1000; 1!(select sym,date from tNew) ,' flip (enlist `rt)! enlist rRate peach exec close from select date,close by sym from `sym`date xasc t]
5509 971104j
q)\ts do[1000;select sym,date,rt:.Q.fc[{100*1_'d%x-d:deltas each x}]close from select date,close by sym from `sym`date xasc t]
3670 970864j
I’m pretty sure this can be improved further…
P
sent from droid
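As a rough illustration of what the `peach` variants above are doing, the sketch below applies a return-rate helper to one list of closes per sym. The helper mirrors the `rRate` used in this thread; the sample prices are made up. The only point is that `peach` is meant to give the same result as `each`, and it only runs in parallel when q was started with secondary threads (`-s`).

```q
/ return-rate helper in the spirit of rRate from this thread; sample closes are made up
rRate:{100*1_(deltas x)%x-deltas x}
c:(10 11 12.1;20 22 24.2)

/ each applies rRate to one list of closes at a time on the main thread
r1:rRate each c

/ peach does the same work but may spread the lists over secondary threads;
/ with no secondary threads it behaves like each
r2:rRate peach c

r1~r2   / 1b
```

Any speed-up then depends on how many groups there are and how large each one is, which is why the benchmarks in this thread vary so much between machines.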
Try this in k instead.
your most recent code took
3766
q)\t do[1000;select date,rt:1_rt by sym from select sym,date, rt:{100*y%x-y}[close;deltas close] from `sym`date xasc t]
2279
both best,
please don’t put that on gpu, pick Rohit challenge instead.
P like Patryk
sent from limbo ;-)
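A note on why the faster query above still gives correct numbers, as far as I can tell: `deltas` runs once over the whole sorted `close` column, so it crosses sym boundaries, but the only rows affected are the first row of each group, and those are exactly the rows that `1_rt` throws away. A minimal sketch on made-up data (the table values and the extra `date:1_date` are assumptions for illustration):

```q
/ made-up data, already sorted by sym then date
t:([] sym:`a`a`a`b`b; date:2012.03.15 2012.03.16 2012.03.19 2012.03.15 2012.03.16; close:10 11 12.1 20 22f)

/ ungrouped pass: the first row of sym b is computed against the last close of sym a,
/ and the very first row is 0w
u:select sym,date,rt:{100*y%x-y}[close;deltas close] from `sym`date xasc t

/ dropping the first item per group removes exactly the contaminated rows;
/ date:1_date is added here so that date and rt stay the same length
select date:1_date, rt:1_rt by sym from u
```

The design choice is to pay for one vectorised `deltas` over the full column instead of one call per group, and to do the grouping only for the cheap `1_` step.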
Wow, I learned a lot from you guys.

On Mar 22, 2012 2:47 PM, “Ajay” <rathore.a…> wrote:
> Definitely can be improved
>
> rRate:{100*1_'d%x-d:deltas each x}
>
> q)\ts do[1000; select sym,date,rt: raze rRate peach 100 cut close from select date,close by sym from `sym`date xasc t]
> 5780 839552j
>
> q)\ts do[1000;select sym,date,rt:.Q.fc[{100*1_'d%x-d:deltas each x}]close from select date,close by sym from `sym`date xasc t]
> 5859 839504j
>
> .Q.fc always cuts into two halves which might not be that efficient
>
> P
>
> sent from IPhone

On Mar 22, 2012 11:13 AM, “Ajay” <rathore.a…> wrote:
> I shouldn't have been so lazy in analysing; next does seem to use more space at the outset.
>
> Here is another approach using peach and slaves for computing the return rate for each group in parallel, which can further improve the performance. It looks memory efficient too (I am running with 2 slaves on a 2 core machine).
>
> q) t:([]sym:10000?`3;date:10000?.z.d;close:10000?200f)
> q) tNew: select date,close by sym from `sym`date xasc t
> q) rRate:{100*1_((deltas x)%(x-deltas x))}
>
> q)\ts do[1000; 1!(select sym,date from tNew) ,' flip (enlist `rt)! enlist rRate peach exec close from tNew]
> 6021 153568j
>
> q)\ts do[1000; select date, rt: 1 _ 100*((deltas close)%(close-deltas close)) by sym from `sym`date xasc t]
> 11989 794704j
>
> Since we are talking about performance and memory, it takes half the time and much less memory.
>
> Cheers-
> Ajay
The last one is a cheat, and it is actually 0.5 times slower.
{1_x} peach by…
Please check your code before you post…
I won’t post my next algo until you find it on your own.
Off topic, but…
I have to say that I also received some message with that peach call. See below. But when looking through groups.google interface that post is missing. And this is the one that is not working…
And you two should calm down! Who is going to write the same query in k at last?! ;)
ok, here’s one:
k) rates:{ +{`sym`date`rt !(,:!x),+.:+x } (y i j;100*d%z-d:-':z:z i j)@: 1_'=x i j:<x i:<y}.
\t do[1000;rates t `sym`date`close]
2217
slightly faster than old select which took
2679
It looks like there is no benefit from running deltas in parallel, maybe because it needs to keep track of previous values (if not implemented with two vectors). It is worth putting in more effort to utilize more cores.
Also not sure why it takes a little bit more space; maybe the second index…
Cheers,
Patryk
There was a typo in the previous one,
but here is a faster and shorter one:
k) rates:{+{`sym`date`rt!(!z;.:x z;.:(100*d%y-d:-':y)1_'z)}[y i j;z i j;=x i j:<x i:<y]}.
Running on old celeron :( , so can’t parallelize it… anyone?
Cheers,
Patryk