using binary sort finding in select

Hi,

I have a table that is parted by a symbol 

all entries to one symbol are stored sorted by date.

i want to select the value to a symbol at a specific date

last select from table where sym=`a,date<=2014.04.01

as date is sorted (but i cant set `s# attribute as the sorting

is only valid for one symbol) i could use binary search

bin[date;2014.04.01]  for fast finding the row. 

How can I use “bin” in a select template?

thank you! Markus

asof
Attila

great! thank you!

Hi,

selecting one row with asof works great, but how can i select multiple rows in a range?

select from table where sym=`a,date>2014.04.01,date<2014.07.01

this query works but does not take advantage of the sorted date column (sorted for sym=`a)

Can i somehow use binary search to have a faster lookup?

As far as I can see asof can not do this for me.

Markus

Hi Markus 

If you only have one record per sym and date then you can construct a table of the syms and dates that you want, then use that with asof.  However, be aware that it will be a <= match - so if you require exact match the it?s not what you want. 

In terms of your query then within will speed things up:

q)n:10000000                                                                                                            

q)t:update p#sym from sym xasc t:(sym:n?-100?`4; date:asc n?2015.01.01;price:n?100f)                                

q)\t:1000 select from t where sym=`adkk,date within 2010.01.01 2010.04.01                                               

120

q)\t:1000 select from t where sym=`adkk,date>2009.12.31, date<2010.04.02                                                

195

In terms of utilising the fact that date is sorted within each sym, the within operation is optimised for sorted lists but to flag the list as sorted probably adds to much overhead: 

q)\t:1000 select from t where sym=adkk,(s#date) within 2010.01.01 2010.04.01                                                                       

168

You can sometimes do tricky things with i and bin on sorted columns to speed things up, but I don?t think this technique can be utilised here because i is global to the table rather than for the sub-selected sample

q)t1:`date xasc t

q)\t:1000 a:select from t1 where date within 2010.01.01 2010.04.01                                                                                   

547

(I?m not sure if the edge cases are quite right here but you get the idea)

q)\t:1000 b:select from t1 where i within 1 0 + date bin 2009.12.31 2010.04.01                                                                       

225

q)a~b                                                                                                                                                

1b

Thanks 

Jonny

AquaQ Analytics

Hi Markus,

Here’s another solution that has quite a significant speedup versus the select query, but it requires the table be resorted by date and a g# applied to the sym column, and the table to be in-memory. The in-memory requirement is because an in-memory g# gives instant access to group[t`sym].

q)n:10000000

q)t:update p#sym from sym xasc t:(sym:n?-100?`4; date:asc n?2015.01.01;price:n?100f)

q)\t:1000 select from t where sym=`adkk,date within 2010.01.01 2010.04.01

489

q)update g#sym from date xasc `t;

q)f:{t i {x+til y-x}. 1+(i:group[tsym][x]) bin t[date] bin -1 0+y,z}

q)\t:1000 r:f[`adkk;2010.01.01;2010.04.01]

37

q)r~select from t where sym=`adkk, date within 2010.01.01 2010.04.01

1b

All the best,

Glen 
AquaQ Analytics

Hi thank you all for the great feedback. It really helped me a lot!
Markus

one can use the `p# version of the table

better on disk

q)T:t //before the resorting

q)update g#sym from date xasc `t;

q)f:{t i {x+til y-x}. 1+(i:group[tsym][x]) bin t[date] bin -1 0+y,z}

q)\t:10000 r:f[`adkk;2010.01.01;2010.04.01]

102

q)\t:10000 a:select from T where i within 1 0+(symdate#T)bin(sym:`adkk;date:-1 0+2010.01.01 2010.04.01)

36

q)a~r

1b

If we would need code which works on non-integer binary searched columns as well

q)\t:10000 a:select from T where i within(binr;bin)[;symdate#T]@'(sym:`adkk;date:2010.01.01 2010.04.01)

41

Cheers,

  Attila