Hello!
I’m sure this is a quite common problem, but I struggle to find efficient solution of realistic data sizes (~1mio orders and ~100k trades).
Problem: given a table with orders’ start and end times find whether there was at least one trade for the same sym.
E.g. for inputs like that
q)order:([]oid:1 2 3 4;sym:
ab
ab;st:
s#00:00:00 00:00:02 00:00:04 00:00:04;et:00:00:01 00:00:07 00:00:06 00:00:05)q)orderoid sym st et-------------------------1 a 00:00:00 00:00:012 b 00:00:02 00:00:073 a 00:00:04 00:00:064 b 00:00:04 00:00:05q)trade:(sym:a
b;t:s#00:00:03 00:00:06;qty:10 100)q)tradesym t qty----------------a 00:00:03 10b 00:00:06 100
the result should be
q)resultid sym st et hadTrade----------------------------------1 a 00:00:00 00:00:01 04 b 00:00:04 00:00:05 03 a 00:00:04 00:00:06 02 b 00:00:02 00:00:07 1
On real data sizes solution I got with aj0 takes around 10 minutes:
)delete qty,t from update hadTrade:not[null qty]&t>st from aj0[
symt;
t xasc update t:et from order;trade]oid sym st et hadTrade----------------------------------1 a 00:00:00 00:00:01 04 b 00:00:04 00:00:05 03 a 00:00:04 00:00:06 02 b 00:00:02 00:00:07 1`
Your order table already defines window to search in so you are well set up use a window join (wj1):
q)show order:([]oid:1 2 3 4;sym:
ab
ab;st:
s#00:00:00 00:00:02 00:00:04 00:00:04;et:00:00:01 00:00:07 00:00:06 00:00:05);oid sym st et-------------------------1 a 00:00:00 00:00:012 b 00:00:02 00:00:073 a 00:00:04 00:00:064 b 00:00:04 00:00:05q)show trade:(sym:a
b;t:s#00:00:03 00:00:06;qty:10 100);sym t qty----------------a 00:00:03 10b 00:00:06 100q)show w:value exec st, et from order;00:00:00 00:00:02 00:00:04 00:00:0400:00:01 00:00:07 00:00:06 00:00:05q)wj1[w;
symt;update t:st from order;(update hasTrade:0<qty from trade;(any;
hasTrade))]oid sym st et t hasTrade-------------------------------------------1 a 00:00:00 00:00:01 00:00:00 02 b 00:00:02 00:00:07 00:00:02 13 a 00:00:04 00:00:06 00:00:04 04 b 00:00:04 00:00:05 00:00:04 0`
NB: it is important that you use wj1 here and not wj, as the latter will count the last trade prior to each of your windows. See https://code.kx.com/q/ref/wj/
does this work?
update hasTrade:any each t within (st;et) from order lj select t by sym from trade
Thank you both!
I tried several options and within+grouping by sym proved to be most time and space efficient (put _ delimeters in numbers for convenience):
N:100000;T:
int$N%10;S:20?1;order:([]oid:1+til N;sym:N?S;st:asc 00:00:00+N?100);update et:st+1+N?10 from
order;trade:(t:asc 00:00:01+T?100;sym:T?S;qty:1+T?1000); w:value exec st,et from order; fn:{[d] d[hadTrade]:any each within[d
t;] each flip (dst;d
et); :delete t from d;}; \ts:10 st
oid xasc select sym,oid,st,et,hadTrade from update hadTrade:not[null qty]&t>=st from aj0[sym
t;t xasc update t:et from order;trade]/ 19_263j, 7_605_552j \ts:10
stoid xasc select sym,oid,st,et,hadTrade from update hadTrade:any each t within (st;et) from order lj select t by sym from trade/ 3_255j, 367_876_048j \ts:10
stoid xasc select sym,oid,st,et,hadTrade from wj1[w;
symt;update t:st from order;(update hadTrade:1 from trade;(any;
hadTrade))]/ 43_268j, 9_683_904j \ts:10 st
oid xasc select sym,oid,st,et,hadTrade from ungroup fn each (select oid,st,et by sym from order) lj (select t by sym from trade)/ 1_787j, 23_676_272j \ts:10 st
oid xasc select sym,oid,st,et,hadTrade from ungroup fn peach (select oid,st,et by sym from order) lj (select t by sym from trade)/ 1_4223j, 7_882_112j`
Your use of aj0 should be much faster than wj, but there is another way to gain significant speedup by directly using bin and binr
q)count order1000000q)count trade100000q)\t s1:delete qty,t from update hadTrade:not[null qty]&t>st from aj0[
symt;
t xasc update t:et from order;trade]147900q)q)\t lookup:exec t by sym from t xasc trade0q)\t s2:delete t from update hadTrade:{binr[x;y+1]<=x bin z}[lookup sym 0;st;et] by sym from order288q)(
oid xasc s1) ~ oid xasc s21b
Yay! That’s what I was looking for - binary search!
Thanks Ajay - that’s a clear winner (I removed +1 to get correct “edges included” results):
\ts:10
stoid xasc select sym,oid,st,et,hadTrade from update hadTrade:{binr[x;y]<=x bin z}[lookup sym 0;st;et]by sym from order
/ 275j, 7_490_080j
Happy To Help!!
I added +1 to match the result of your original query, make sense to remove it.
-Ajay