At a quick glance, I would say it is due to a missing trailing /
.Q.fs[{`:newfile/
A trailing / indicates a splayed table on disk.
Hm, adding / gives me 'type error. I guess I have to create this splayed table first?
also need to enumerate the symbols
.Q.fs[{`:newfile/ upsert .Q.en[`:.;]flip `symbol`system`type`moment`id`action`price`volume`id_deal`price_deal`own`account!("SSSSSIFISFIS";",")0:x}]`:c:/work/orderlog.txt
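For anyone following along, here is a minimal sketch of the same pattern on a toy table. The table, its column names and the `:demo/` directory are made up for illustration. It suggests that the splayed directory does not have to exist beforehand: upsert to a path with a trailing / should create it on the first call, as long as the symbol columns have been enumerated with .Q.en.

```q
/ toy table with one symbol column (names are hypothetical)
t:([]s:`a`b`c;p:1.5 2.5 3.5)
/ .Q.en enumerates the symbol columns against the sym file in the current directory
`:demo/ upsert .Q.en[`:.;t]   / first call creates the splayed directory
`:demo/ upsert .Q.en[`:.;t]   / later calls append to it
count get`:demo/              / row count of the table on disk
```

Without the .Q.en step the symbol column is not enumerated, which is the likely source of the 'type error.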
.Q.fsn might have an impact too, with a bigger n (the chunk size).
also consider writing to a different drive to that which you are reading from (unless you have ssd).
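As a rough sketch of the .Q.fsn suggestion: .Q.fsn takes the chunk size in bytes as a third argument, whereas .Q.fs uses a fixed default of about 131000 bytes. The small CSV, the `:demo_fsn/` directory and the 1000000 byte chunk size below are assumptions chosen only to keep the example self-contained.

```q
/ hypothetical small CSV so the sketch runs on its own
`:demo.csv 0: ("a,1.5";"b,2.5";"c,3.5";"a,4.5")
/ same loader shape as in the thread, but with an explicit chunk size in bytes
.Q.fsn[{`:demo_fsn/ upsert .Q.en[`:.;]flip `s`p!("SF";",")0:x};`:demo.csv;1000000]
count get`:demo_fsn/   / number of rows loaded
```

For the file in this thread, the same lambda as before would be passed with `:c:/work/orderlog.txt and a larger third argument, so that fewer, bigger chunks are parsed and written.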
Thank you for answers.
Also I guess I better not read timestamps as string.
It took me 8 minutes to save a 1.09 GB file like this. I wonder if that is an acceptable time for kdb. The same file took 3 minutes for HDF5.
You could naturally parallelize this if 8 min is not fast enough.
How fast can you query HDF5?
Did you save it down compressed?? Maybe your disks are slow so it would actually help with performance.
Cheers,
Attila
Yes, we haven't come to reading data yet; so far we are only comparing the time needed to store it.
No compression was enabled, I don’t understand how to do it yet.
The way you save it down now won't have any indices for fast access. As it is a log, I assume you could at least put a sorted attribute on time. It all depends on your queries.
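A minimal sketch of putting the sorted attribute on a time column of a splayed table, using a made-up table in `:demo_sorted/`. The column name moment is borrowed from the thread, and the sketch assumes it holds a temporal type rather than a symbol.

```q
/ hypothetical splayed table whose time column is not yet in order
`:demo_sorted/ set ([]moment:09:30:02 09:30:00 09:30:01;price:3 1 2f)
`moment xasc `:demo_sorted/       / sort the rows on disk by moment
@[`:demo_sorted/;`moment;`s#]     / set the sorted attribute on the column file
attr (get`:demo_sorted/)`moment   / check which attribute is now set
```

If the log is already written in time order, the xasc step should be unnecessary and only the attribute needs to be applied after the load.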
Compression is as easy as changing .z.zd
Cheers,
Attila
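For reference, a sketch of what setting .z.zd might look like. The three values are the logical block size as a power of 2, the algorithm and the compression level; 17 2 6 (gzip, level 6) is only an illustrative choice, not a recommendation for this data.

```q
.z.zd:17 2 6   / block size 2^17, algorithm 2 (gzip), level 6
/ ... run the .Q.fs load here; files written without an extension use these settings ...
\x .z.zd       / clear the setting so later writes are uncompressed again
```

It is worth confirming on your own kdb+ version that the column files appended by upsert really end up compressed, for example by comparing file sizes on disk.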
I expect part of the slowdown is that each time .Q.en is called it locks `:sym, reads `:sym, updates sym in mem, then saves to `:sym again, unlocks `:sym. If you have only one process updating `:sym then this could be done much more efficiently by enumerating in mem once and then writing `:sym. The larger that you make n in .Q.fsn the fewer calls to .Q.en.
i.e.
q)`:sym?`symbol$(); / ensure sym exists
q)sym:get`:sym / load it into memory
q)k).Q.en2:{f@:&11h=@:'x f:!+x;@[x;f;`sym?]} / define .Q.en2 to work with sym in mem
q)t:([]s:`a`b`c) / define a test table
q)0N!.Q.en2[t] / enumerate it
+(,`s)!,`sym$`a`b`c
s
-
a
b
c
q)sym / in mem sym has been updated
`a`b`c
q)`:sym set sym / save to disk
`:sym
hence your code then becomes
q)`:sym?`symbol$();sym:get`:sym
q)k).Q.en2:{f@:&11h=@:'x f:!+x;@[x;f;`sym?]}
q).Q.fs[{`:newfile/ upsert .Q.en2 flip `symbol`system`type`moment`id`action`price`volume`id_deal`price_deal`own`account!("SSSSSIFISFIS";",")0:x}]`:c:/work/orderlog.txt
q)`:sym set sym
n.b. this is safe if only one process is updating `:sym. If you have multiple processes, you’d need another mechanism.
on 32bit kdb+ you can easily run out of address space with file compression.
And if your disks can read faster than 300MB/s, using file compression will likely slow things down.
Thank you, I will try that.