I have a table I would like to export which is quite large, but can be held in memory. I would like to export it to a CSV file, but doing this with ‘save table.csv’ seems to use an extraordinary amount of additional memory. Is there an alternative procedure I can use to write the file in chunks? I’m not sure whether it’s duplicating the table in memory or doing something else.
Hi David,
Try this function:
/ Saves a table-x to a file-y in chunks, using less memory for large tables.
saveTab:{
    h:neg hopen @[hdel;y;y];                        / delete any existing file, then open a handle
    h "," sv string cols x;                         / write the header row
    applyToTab[{x each 1 _ "," 0: y;} h; x; 1000];  / append 1000-row chunks to the file
    hclose neg h;};
/ Apply a function-x to a table-y in chunks of size-z.
applyToTab:{{x $[.Q.qp y;.Q.ind[y;z];y z]}[x;y] each z cut til count y};
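To see how applyToTab splits the work, here is a minimal sketch on a small in-memory table. The table `tiny` and its columns are made up purely for illustration. `z cut til count y` breaks the row indices into groups of at most z, and the function is then applied to each sub-table in turn.

```q
/ Apply a function-x to a table-y in chunks of size-z (as defined above).
applyToTab:{{x $[.Q.qp y;.Q.ind[y;z];y z]}[x;y] each z cut til count y};

/ row indices 0..9 split into groups of at most 4
idx:4 cut til 10                  / (0 1 2 3;4 5 6 7;8 9)

tiny:([] a:til 10; b:10?1f)       / hypothetical 10-row in-memory table
sizes:applyToTab[count; tiny; 4]  / 4 4 2: row count of each chunk
```

Only one chunk is turned into CSV text at a time, which is where the memory saving comes from.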
An example of saving a table with 100,000 rows:
q)t
date       sym time         price    size  cond mySize
2013.06.30 A   09:30:17.997 57.81544 65600 C    65600
2013.06.30 A   09:30:23.144 54.58004 31200      31200
2013.06.30 A   09:30:24.157 57.77217 27600      27600
..
/ measure memory used for both save methods
q)\ts save `:t.csv
369 29398656
q)\ts saveTab[t;`:ta.csv]
807 5291536
q)29398656 % 5291536
5.555789
q) / Uses about 5.5x less memory (at roughly twice the run time), with the same result
q)\fc t.csv ta.csv
"Comparing files t.csv and TA.CSV"
"FC: no differences encountered"
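fc is a Windows command. On other platforms, a rough equivalent is to compare the two files from within q, as in the sketch below. It reads both files fully into memory, so it only suits a one-off check. The small table here is hypothetical, and the second file is written with plain 0: rather than saveTab to keep the sketch short.

```q
t:([] sym:`a`b`c; price:1.5 2.5 3.5)       / hypothetical small table
save `:t.csv                                / built-in CSV export of the global t
`:ta.csv 0: "," 0: t                        / same table written as prepared CSV text
same:(read0 `:t.csv) ~ read0 `:ta.csv       / 1b when the files match line for line
```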
Regards,
Ryan Hamilton
David,
Please take a look at http://code.kx.com/wsvn/code/contrib/ptryfon/util/csv.q. The .csv.chunkSave function there saves a table to a .csv file in chunks, so it uses memory more sparingly. Example usage:
t:([]sym:10000?`4;f:10000?10f)
.csv.chunkSave[`:t.csv;t;1000] //third argument is chunk size
HTH,
Pawel Tryfon
2014/1/10 David Fagnan <divadf@gmail.com>
I have a table I would like to export which is quite large, but can be held in memory. I would like to export it to a CSV file, but doing this with ‘save table.csv’ seems to use an extraordinary amount of additional memory. Is there an alternative procedure I can use to write the file in chunks? I’m not sure whether it’s duplicating the table in memory or doing something else.
You can use .Q.fsn, which streams a file through a function and lets you specify the size of the chunks that are read, in bytes.
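A minimal sketch of .Q.fsn, assuming a small CSV file already on disk (the table and file name are hypothetical and created here only so the example runs on its own). Note that it works on the reading side: the function is called once per chunk with a list of complete lines.

```q
t:([] sym:`a`b`c; price:1.5 2.5 3.5)        / hypothetical small table
`:t.csv 0: "," 0: t                          / write it as CSV so there is a file to stream
n:0                                          / running line count (global)
.Q.fsn[{n::n+count x}; `:t.csv; 1000000]     / x is a list of lines; chunk size is about 1 MB
n                                            / 4: header plus three data rows
```

The same pattern could parse each chunk with 0: and append it to a table or a splayed directory instead of counting lines.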
Thanks
Hari