Hi everyone!I have a typical trade table:q)tradeprice amount moment-------------------------------------------150260 1 2011.10.24D19:00:00.113000000150250 1 2011.10.24D19:00:00.113000000…Trying to calculate a wavg for last N seconds for each trade.Solution I came up with involves global variables and doesn’t lookthat good to me:q)a:timestamp$();b:
float$();c:int$();d:
float$();q)func:{[row]moment:rowmoment;amount:row
amount;price:row`price;idx:where(moment-a)<5000000000j;a@::idx;a::a,moment;b@::idx;b,::price;c@::idx;c,::amount;d,::cwavg b;}q)func each tradeq)dIs there a better solution?
how about a wj to itself
On Oct 28, 11:35 am, Man Kei Lee <mankei…> wrote:> On Thu, Oct 27, 2011 at 6:32 PM, Serge <sergey.tychi…> wrote:>> > q)trade> > price amount moment> > -------------------------------------------> > 150260 1 2011.10.24D19:00:00.113000000> > 150250 1 2011.10.24D19:00:00.113000000>> > Trying to calculate a wavg for last N seconds for each trade.q)t:(price:100?100.;size:1+100?10;time:asc"p"$.z.P+1e9100?3600)q)tprice size time-------------------------------------------61.73515 4 2011.10.28D11:32:14.68876499290.11315 8 2011.10.28D11:32:50.68876499296.61672 7 2011.10.28D11:33:03.688764992..> how about a wj to itselfcouldn’t figure out a way to wedge dyadics into wj, soq)update a:size wavg’price from wj[flip t[`time]-:(60"n"$1e9,0);`time;t;(t;(::;`price);(::;`size))]q)delete price,size from update a:size wavg’price from wj[flipt[`time]-:(60*“n”$1e9,0);`time;t;(t;(::;`price);(::;`size))]time a--------------------------------------2011.10.28D11:32:14.688764992 61.735152011.10.28D11:32:50.688764992 80.653812011.10.28D11:33:03.688764992 86.53488..alternatively,q)60{[x;y]exec size wavg price from t where time within(y-1e9x,0)}'t`time61.73515 80.65381 86.53488 79.23071 27.2449 47.45604 32.42692 28.5903413.582..q)(update wap:60{[x;y]exec size wavg price from t where timewithin(y-1e9x,0)}'time from t)price size time wap----------------------------------------------------61.73515 4 2011.10.28D11:32:14.688764992 61.7351590.11315 8 2011.10.28D11:32:50.688764992 80.6538196.61672 7 2011.10.28D11:33:03.688764992 86.53488..q)update wap:{s;p;t;x;y.(s;p)@:where t within(y-1e9x,0)}[size;price;time;60]each time from tq)update wap:{s;p;t;x;y.(s;p)@:where t within(y-1e9x,0)}[size;price;time;60]each time from tprice size time wap----------------------------------------------------61.73515 4 2011.10.28D11:32:14.688764992 61.7351590.11315 8 2011.10.28D11:32:50.688764992 80.6538196.61672 7 2011.10.28D11:33:03.688764992 86.53488..</sergey.tychi…></mankei…>
one method I can think of
tbl:update?? wavgN:qty wavg price by time-int$(time mod 60000) from
time xdesc tbl
Prashant
q)t:(price:100?100.;size:1+100?10;time:asc"p"$.z.P+1e9*100?3600)
let’s make t bigger to be realistic
q)n:100000;t:(price:n?100.;size:1+n?10;time:asc n?24t)
q)60{[x;y]exec size wavg price from t where time within(y-1e9*x,
0)}'t`time
q)\t naive:{exec size wavg price from t where time within x-1 0t}each t`time
8500
And then there is this simple idea which never gets old
Easiest to see for sum: sum in an interval
can be efficiently calculated if we calculate the cumulative sum
then do a lookup at the two ends of the interval
and do a simple difference between the two cumulative values
easy to generalize to wavg, cov, var, etc.
In fact this is also how kdb+'s internal map-reduce does it across partitions
In this case adding cumulative volumes and values
q)\t update time,cum_size:sums size,cum_val:sums price*size from`t
1
Do the lookup
q)\t t:aj[`time;t;select time+1t,prev_cum_size:cum_size,prev_cum_val:cum_val from t]
3
Deltas, note the 0^ for the initial cases at the beginning
q)\t update vwap:(cum_val-0^prev_cum_val)%cum_size-0^prev_cum_size from`t
7
11ms in total vs 7977ms
q)5?update naive from t
price size time cum_size cum_val prev_cum_size prev_cum_val vwap naive
---------------------------------------------------------------------------------------------
69.3842 9 05:21:27.984 145632 7262266 122182 6093489 49.84122 49.84122
99.76577 2 22:14:11.057 532587 2.66232e+07 510123 2.549036e+07 50.42898 50.42898
24.93458 2 03:04:50.525 93220 4644538 70007 3499687 49.31939 49.31939
39.26955 10 14:59:53.124 368271 1.842645e+07 344602 1.724998e+07 49.70515 49.70515
34.80508 10 20:35:51.035 495555 2.476728e+07 472540 2.361316e+07 50.14639 50.14639
Note that these two methods will give different numbers in the following (relatively rare? and slightly confusing) cases:
- multiple trades at exactly the same time (easy to fix - preaggregate by time)
- there is a trade at both ends of the interval (the naive approach is inclusive at both sides)
Cheers,
Attila