Performance using a table to combine where phrases

Hi all,

Passing a table in to a where clause instead of the individual values significantly affects performance, even compared to using & to join all the where phrases into one. What is the query doing under the covers?

1#.eoh.f date SRC PRICE_TYPE INDEX1 INDEX2 CONTRACT1 PROFILE CONTRACT_TYPE -------------------------------------------------------------------------- 2023.06.28 market dev JPM 2023.10.M BLA // PUBLICATIONS is date-partitioned, and doesn’t have attributes on the columns we filter on \ts select from PUBLICATIONS where (date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f 1176 177568 \ts select from PUBLICATIONS where date=2023.06.28,src=market,PRICE_TYPE=dev,INDEX1=JPM,INDEX2=,CONTRACT1=2023.10.M,PROFILE=,CONTRACT_TYPE=BLA 5 2101760 \ts select from PUBLICATIONS where (date=2023.06.28)&(src=market)&(PRICE_TYPE=dev)&(INDEX1=JPM)&(INDEX2=)&(CONTRACT1=2023.10.M)&(PROFILE=)&CONTRACT_TYPE=BLA 531 179664 \ts select from PUBLICATIONS where (date in 2023.06.28)&(SRC in market)&(PRICE_TYPE in dev)&(INDEX1 in JPM)&(INDEX2 in )&(CONTRACT1 in 2023.10.M)&(PROFILE in )&CONTRACT_TYPE in `BLA 504 179664

Using kdb+ 3.5 2017.10.11

Thanks,

Eoghan

If you have a list of where clauses separated by a comma, it is executed sequentially, successively reducing the number of rows processed. On the other hand if you use the & operator, all operations will run on the full table.

I would presume optimizations such as left to right sub-clause filtering as mentioned here https://code.kx.com/q4m3/9_Queries_q-sql/#9333-multiple-where-subphrases are lost when filtering using a table/dict lookup.

i.e. we need to check all of the columns specified in the lookup simlataneouly, instead of passing each consective filter to the next sub-phrase.

To visualize differently:

q)show tab:(c1:ab`c;c2:1 2 3) c1 c2 ----- a 1 b 2 c 3 q)tab in 1#tab 100b q)select from tab where tab in 1#tab c1 c2 ----- a 1

 Also worth having a look at the parse trees:

 

q)parse "select from PUBLICATIONS where date=2023.06.28,src=market,PRICE_TYPE=dev,INDEX1=JPM,INDEX2=,CONTRACT1=2023.10.M,PROFILE=,CONTRACT_TYPE=BLA" ? PUBLICATIONS ,((=;date;2023.06.28);(=;src;,market);(=;PRICE_TYPE;,dev);(=;INDEX1;,JPM);(=;INDEX2;,);(=;CONTRACT1;,2023.10.M);(=;PROFILE;,);(=;CONTRACT_TYPE;,BLA)) 0b () q)parse "select from PUBLICATIONS where ([]date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f" ? PUBLICATIONS ,(in;(+:;(!;,dateSRCPRICE_TYPEINDEX1INDEX2CONTRACT1PROFILECONTRACT_TYPE;(enlist;date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE)));(#;1;`.eoh.f)) 0b ()