at a quick glance I would say it is due to missing a trailing /
.Q.fs[{`:newfile/
/ indicates a splayed table on disk.
Hm, adding / gives me 'type error. I guess I have to create this splayed table first?
also need to enumerate the symbols
.Q.fs[{`:newfile/ upsert .Q.en[`:.;]flip `symbol`system`type`moment`id`action`price`volume`id_deal`price_deal`own`account!("SSSSSIFISFIS";",")0:x}]`:c:/work/orderlog.txt
.Q.fsn with a larger chunk size n might help too.
also consider writing to a different drive to that which you are reading from (unless you have ssd).
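To make the .Q.fsn suggestion concrete, here is a minimal sketch. It assumes a tiny made-up CSV (`demo.csv`), a hypothetical target directory `demotab`, and invented column names; none of these come from the original file. The third argument of .Q.fsn is the chunk size in bytes, and .Q.fs is the same call with a default of 131000, so a larger value means fewer calls to the loader.

```q
/ write a tiny hypothetical CSV so the sketch is self-contained
`:demo.csv 0: ("a,1,1.5";"b,2,2.5";"a,3,3.5")

/ loader (hypothetical name ld): parse one chunk of lines, enumerate
/ symbol columns against `:sym in the current directory, then append
/ to the splayed table `:demotab/
ld:{`:demotab/ upsert .Q.en[`:.] flip `s`n`p!("SIF";",")0:x}

/ stream the file in chunks of roughly 50 MB instead of the 131000-byte default
.Q.fsn[ld;`:demo.csv;50000000]
```

Running it a second time appends the same rows again, since the loader uses upsert.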
Thank you for answers.
Also I guess I better not read timestamps as string.
It took me 8 minutes to save a 1.09 GB file this way. I wonder whether that is an acceptable time for kdb. The same file took 3 minutes with HDF5.
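On the point about not reading timestamps as strings or symbols: the type characters passed to 0: can parse temporal values directly. The sketch below assumes the text is in q's own timestamp format, which may not match the real log, and uses invented sample lines and column names.

```q
/ two hypothetical lines; the moment field is assumed to be in q timestamp format
lines:("RTS,2013.05.01D10:00:00.000,101.5";"RTS,2013.05.01D10:00:01.000,101.7")

/ "P" parses the second field as a timestamp rather than a symbol ("S")
t:flip `symbol`moment`price!("SPF";",")0:lines

/ inspect the resulting column types
meta t
```

If the file uses a different layout, the field would need to be read as a string ("*") and converted in a separate step.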
You could naturally parallelize this if 8 min is not fast enough.
How fast can you query HDF5?
Did you save it down compressed? If your disks are slow, compression might actually help performance.
Cheers,
Attila
Yes, we haven't got to reading the data yet; so far we are only comparing the time needed to store it.
No compression was enabled; I don't understand how to do it yet.
The way you save it down now won't have any indices for fast access. As it is a log, I assume you could at least put a sorted attribute on time. It all depends on your queries.
Compression is as easy as setting .z.zd
Cheers,
Attila
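A minimal sketch of both suggestions, using a hypothetical table and directory (`demo`) rather than the real order log. It assumes the time column is already in ascending order and that zlib is available for gzip compression.

```q
/ hypothetical small table standing in for the order log
t:([]moment:09:30:00.000+1000*til 5;price:100+0.5*til 5)

/ compress subsequent file writes: 2 xexp 17 byte logical blocks,
/ algorithm 2 (gzip), compression level 6
.z.zd:17 2 6

/ save as a splayed table; there are no symbol columns, so no enumeration is needed
`:demo/ set t

/ apply the sorted attribute to the moment column on disk;
/ this signals 's-fail if the column is not in ascending order
@[`:demo/;`moment;`s#]

/ restore the default (uncompressed writes)
\x .z.zd
```

Whether the sorted attribute is the right choice depends on the queries; it mainly speeds up lookups and range selections on that column.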
I expect part of the slowdown is that each time .Q.en is called it locks `:sym, reads `:sym, updates sym in memory, saves to `:sym again, then unlocks `:sym. If you have only one process updating `:sym, this could be done much more efficiently by enumerating in memory and then writing `:sym once at the end. The larger you make n in .Q.fsn, the fewer calls to .Q.en.
i.e.
q)`:sym?`symbol$(); / ensure sym exists
q)sym:get`:sym / load it into memory
q)k).Q.en2:{f@:&11h=@:'x f:!+x;@[x;f;`sym?]} / define .Q.en2 to work with sym in mem
q)t:([]s:`a`b`c) / define a test table
q)0N!.Q.en2[t] / enumerate it
+(,`s)!,`sym$`a`b`c
s
-
a
b
c
q)sym / in mem sym has been updated
`a`b`c
q)`:sym set sym / save to disk
`:sym
hence your code then becomes
q)`:sym?`symbol$();sym:get`:sym
q)k).Q.en2:{f@:&11h=@:'x f:!+x;@[x;f;`sym?]}
q).Q.fs[{`:newfile/ upsert .Q.en2 flip `symbol`system`type`moment`id`action`price`volume`id_deal`price_deal`own`account!("SSSSSIFISFIS";",")0:x}]`:c:/work/orderlog.txt
q)`:sym set sym
n.b. this is safe if only one process is updating `:sym. If you have multiple processes, you’d need another mechanism.
on 32bit kdb+ you can easily run out of address space with file compression.
And if your disks can read faster than 300MB/s, using file compression will likely slow things down.
Thank you, I will try that.