Bitcoin exchange look-ups

Hi Q Experts,

I’m currently capturing order and execution data from a bitcoin exchange.

I have a range of messages which I turn directly into tables, e.g. singleOrder, executionReport, Logon, etc.

My main issue here is deciding how to save the tags relating to ID (e.g. 11 and 109) which are non-repeating text.

I need to perform a large number of look-ups on these fields (for various reasons, including tracing order history). With these large quantities in mind, I want to keep the look-ups as fast as possible and the memory footprint as small as possible.

So what is the best practice for fields in this case? Should my memory footprint be the main concern, or should I save the values as nested lists and suffer a possible (probable) performance hit? Any further advice or ideas on this are, as always, very welcome!

Thanks as usual,

Regards,

Andie C

Are those like FIX tags? (in which case it’s not like a proper ID that could number in millions)

Char array look-up is obviously slower, but if you don’t want to blow up your sym file and still want sym-lookup-like behaviour, perhaps you can store one column holding that ID (for reference) and another column holding an int hash or a GUID (for lookup)? You might need to maintain a separate table of mappings.
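A minimal sketch of that idea, assuming a hypothetical orders table whose clOrdID column holds the raw ID as a char list. The table, the column names and the use of a plain long surrogate (standing in for the int hash or GUID) are illustrative assumptions, not anything from the original data:

```q
// Hypothetical order table: clOrdID holds the raw ID (e.g. FIX tag 11) as a char list
orders:([] clOrdID:("ORD-0001";"ORD-0002";"ORD-0001";"ORD-0003"); qty:100 250 100 75)

// Mapping table: one row per distinct ID, with a long surrogate key
ids:distinct orders`clOrdID
idMap:([] idKey:til count ids; clOrdID:ids)

// Add the surrogate key to the order table (position of each ID in ids)
orders:update idKey:ids?clOrdID from orders

// Look up by key: resolve the string once, then compare longs
k:first ids?enlist "ORD-0001"
select from orders where idKey=k
```

The string column stays available for reference, while the where clause only ever compares longs.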

Hi there,

As Manish noted, the tags are non-repeating, so saving them as symbols is not advisable.

Instead, it might be worth encoding each tag using .Q.j10 or .Q.j12, depending on requirements. This would allow you to save each ID as a long instead of a string, greatly improving lookup speeds. Each encoding is limited by character set and string length, so please ensure you select one suitable for your data. .Q.j10 works with strings of length up to 10 and the characters in .Q.b6. .Q.j12 works with strings of length up to 12 and the characters in .Q.nA.
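As a rough way to check which encoding fits before converting, something like the following could be used. The sample IDs and the helper names fitsJ10 and fitsJ12 are hypothetical, made up for illustration:

```q
// Hypothetical sample IDs; real exchange IDs may use other characters
ids:("ORD0001";"ORD0002";"XbT9")

// An ID fits .Q.j10 if it has at most 10 characters, all drawn from .Q.b6
fitsJ10:{(10>=count x) and all x in .Q.b6}
// An ID fits .Q.j12 if it has at most 12 characters, all drawn from .Q.nA
fitsJ12:{(12>=count x) and all x in .Q.nA}

fitsJ10 each ids   / 111b
fitsJ12 each ids   / 110b, lowercase "b" is not in .Q.nA

// Encode to longs; equality checks on enc are plain integer comparisons
enc:.Q.j10 each ids
```

IDs that fail both checks (longer than 12 characters, or containing characters such as "-") would need a different scheme, for example a separate mapping table.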

For more detailed information see https://code.kx.com/q/ref/dotq/#qj10-encode-binhex.

A word of warning: the decoding functions .Q.x10 and .Q.x12 left-pad their result to the full 10 or 12 characters, so it would be best to avoid them where possible.
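To make the padding concrete, here is a small sketch. The dec dictionary is a hypothetical helper that keeps the original strings next to their encoded values, so nothing ever has to be decoded:

```q
id:"GOOG"
.Q.x10 .Q.j10 id          / "AAAAAAGOOG", left-padded with "A" to 10 chars
.Q.x12 .Q.j12 id          / "00000000GOOG", left-padded with "0" to 12 chars

// One way to avoid decoding: map encoded long -> original string
ids:("GOOG";"IBM")
dec:(.Q.j10 each ids)!ids
dec .Q.j10 "IBM"          / "IBM", no padding
```

Note that the pad character is itself a valid character of the encoding, so stripping it after the fact would also strip a genuine leading "A" (or "0") from an ID.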

For example:

//Convert
q) quotes:update .Q.j10 each sym from quotes
q)3#quotes
time                          sym     src bid   ask   bsize asize
-----------------------------------------------------------------
2018.06.12D08:00:30.320000000 598158  L   35.47 35.48 7000  3500
2018.06.12D08:00:32.263000000 3219795 O   36.01 36.05 9000  2000
2018.06.12D08:00:50.733000000 598158  N   35.45 35.49 2500  2500

//Lookup
q)3#select from quotes where sym = .Q.j10["GOOG"]
time                          sym     src bid   ask   bsize asize
-----------------------------------------------------------------
2018.06.12D08:01:35.773000000 1631110 L   41.33 41.36 7500  1500
2018.06.12D08:04:59.757000000 1631110 O   41.35 41.38 9000  3000
2018.06.12D08:06:13.761000000 1631110 N   41.32 41.36 8500  5500

//Query by Symbol and Repopulate sym
qu:{m:(s:.Q.j10'[string x])!x,:();update m sym from select from quotes where sym in s}
5#qu`GOOG`IBM
time                          sym  src bid   ask   bsize asize
--------------------------------------------------------------
2018.06.12D08:01:35.773000000 GOOG L   41.33 41.36 7500  1500
2018.06.12D08:04:48.023000000 IBM  O   43.53 43.56 2500  9500
2018.06.12D08:04:59.757000000 GOOG O   41.35 41.38 9000  3000

Regards

George