Lower-memory version of w.q

This is an obvious modification of w.q, but I thought I'd share the changes I made for anyone else trying to run an RDB in a low-memory environment that is still capable of capturing tables of arbitrary size: https://gist.github.com/anonymous/843d7fea557ea1c60410fa35c793474e

Simon's w.q works well intraday, but the disksort loads individual columns into memory at the end of day. One of my tables has a large string column of a couple of GB, so that doesn't work. I don't really need the sorting of the sym column, so I sacrificed that attribute and am writing everything directly to the HDB while preserving the sorted time attribute. I'm now running 5-6 GB through kdb+/tick daily and the RDB's memory usage peaks at around 60 MB.

Any feedback or suggestions for improvement are welcome.

Rob
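For readers who want a concrete picture, a minimal sketch of the direct write-down idea, skipping the sym sort and keeping only the sorted time attribute, might look like this (function, table, and path names are illustrative, not Rob's actual gist):

```q
/ Illustrative sketch, not Rob's gist: splay an in-memory table straight
/ to the HDB date partition, keeping `s# on time and skipping the sym sort.
/ Assumes the table arrives from the tickerplant already in time order.
writedown:{[hdb;dt;tn]
  t:.Q.en[hdb] 0!value tn;              / enumerate syms against the HDB sym file
  t:update `s#time from t;              / keep the sorted attribute on time only
  (` sv hdb,(`$string dt),tn,`) set t;  / e.g. `:/hdb/2024.01.01/trade/
  ![`.;();0b;enlist tn] }               / free the in-memory copy

/ e.g. writedown[`:/hdb;.z.d;`trade]
```

Because nothing is ever re-sorted, peak memory is roughly one table's worth of data plus the enumeration overhead, rather than the full column being loaded for a disksort.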

Hi Rob

Another option to avoid the big sort is to write the data intraday partitioned by whatever field you want the p attribute on. At end of day you can then merge the data by reading it (in chunks if required) and upserting to the HDB partition. It should allow much lower memory usage, and can be faster for some datasets (e.g. high row count, few unique values).

We have a discussion on it here:

http://www.aquaq.co.uk/q/optional-write-down-method-added-to-wdb-process-in-torq-2-3/
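As a rough illustration of the idea (paths and names here are assumed, not TorQ's actual code): append each update intraday to a directory partitioned by the p-attribute field, then at end of day upsert one intraday partition at a time into the HDB date partition, so only a chunk is ever in memory:

```q
/ Rough illustration, not TorQ's code. Intraday: append each update to a
/ temp directory partitioned by the field that will carry the p attribute.
append:{[hdb;tmp;grp;t] (` sv tmp,grp,`trade,`) upsert .Q.en[hdb] t}

/ End of day: merge the intraday partitions into the HDB one at a time,
/ then apply `p# on sym (valid because the merge writes sym groups in order).
eod:{[tmp;hdb;dt]
  dest:` sv hdb,(`$string dt),`trade,`;
  {[d;p] d upsert get p}[dest] each {` sv x,y,`trade,`}[tmp] each key tmp;
  @[dest;`sym;`p#] }
```

Enumerating against the HDB sym file in `append` keeps the intraday and historical enumerations compatible, so the end-of-day merge is a plain upsert.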

We also modified the end-of-day recently to run the sorting or merging using parallel processes invoked by .z.pd (to reduce the time taken rather than memory usage). .z.pd made the change very easy!

http://www.aquaq.co.uk/q/end-of-day-parallel-sorting-in-torq/
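For anyone curious what the .z.pd hook looks like, a minimal sketch (ports and helper names are assumed; the real TorQ code differs): with q started with a negative -s value, peach distributes work over the handles .z.pd returns:

```q
/ Minimal sketch (ports illustrative; not the TorQ implementation).
/ Start this process with e.g. -s -2, with two workers already listening
/ on 5001 and 5002 and the HDB path accessible to them.
.z.pd:{`u#hopen each 5001 5002}   / unique handle list for peach to use

/ apply `p# to one table's sym column in a date partition, in a worker
sortpart:{[hdb;dt;tn] @[` sv hdb,(`$string dt),tn,`;`sym;`p#]; tn}

sortpart[`:/hdb;.z.d] peach `trade`quote   / one table per worker
```

Since each worker touches a different table directory, the partitions can be sorted concurrently without the main process holding any of the data.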

Thanks 

Jonny