Simple Question about loading large csv file.

Dear All,

Finally, I found a way to load the large size csv file in this link (http://code.kx.com/wiki/Cookbook/LoadingFromLargeFiles).

.Q.fs[{`:newfile upsert flip `DateTime`Bid`Ask!("ZFF";",")0:x}]`:EURUSD.csv

I believe this one is the right one. I just have little trouble on datetime.

here is my EURUSD.csv format:

According the code above, I only read first three column datetime, bid and ask. I have trouble to split the datetime into the date and hh:mm:ss.xxx. The code above would update the table persisted on disk called `newfile directly and the table will only have three column. I prefer to have four column: date, time, bid, ask.

How could i modify the code above to convert the large csv data in to persisted table directly? The reason I would have code to do it DIRECTLY is he the EURUSD.csv is really large and I could not load the whole EURUSD into a in-memory table first and split the first column by updating the table, then writing it. Anyone could help me to split the first column?

Thank you very much and have a nice day

Wenhao SHE

Given the format of the csv you showed above (which already includes column names):

q)read0`test.csv

"Time,Ask,Bid,AskVolume,BidVolume"
"2007.04.21 21:00:34.086,1.3374,1.3371,9.5,20.7"
"2007.04.21 21:00:40.029,1.3375,1.337,19.9,20.7"
q).Q.fs[{`:newfile upsert `Date`Time xcols update Date:Time.date,Time:Time.time from ("ZFF";enlist ",")0:x}]`test.csv
127j
q)load `newfile
`newfile
q)newfile
Date Time Ask Bid
-------------------------------------
2007.04.21 21:00:34.086 1.3374 1.3371
2007.04.21 21:00:40.029 1.3375 1.337

 

Dear Tom Martin,

Thank you very much for your reply.

I fully understand your code and I think it is pretty charming to use it. But I have a little problem that my csv file is really big like 1.3GB. And I am using 32bit version and after executing “read0`mybigfile.csv”, the kdb+ terminated itself automatically.

Now I try the second line of code:

q).Q.fs[{:newfile upsert DateTime xcols update Date:Time.date,Time:Time.time from ("ZFF";enlist ",")0:x}]test.csv

it complaints:ERROR:`Time.

May you help me on this?

Thank you very much!

Dear Tom Martin,

I have a wild guess on the error of second line of code. It complaints the `Time. Your code is based on the fact that there is a title as first row. May be we could neglect the first row? Then how could we modify the code?

Thank you very much 

Wenhao SHE

Here I uploaded the file for your convenience.

If you’re going to define your own column names in the code, I’d remove the first row of the csv which contains the titles. Otherwise your first row will be blank, as seen below

q).Q.fs[{:newfile upsert delete DateTime from DateTime xcols update Time:DateTime.time,Date:DateTime.date from flip DateTimeAskBid!(“ZFF”;“,”)0:x}]`Book1.csv

1779j

q)load `newfile

`newfile

q)newfile

Date Time Ask Bid

--------------------------------------


2007.04.01 21:00:34.086 1.3374 1.3371

2007.04.01 21:00:40.029 1.3375 1.337

2007.04.01 21:01:36.029 1.3376 1.3373

2007.04.01 21:02:21.118 1.3376 1.3373

2007.04.01 21:02:23.084 1.3376 1.3373

2007.04.01 21:02:23.280 1.3376 1.3373

2007.04.01 21:02:25.060 1.33775 1.3373

2007.04.01 21:02:35.565 1.3376 1.3372

2007.04.01 21:02:38.767 1.33775 1.3373

2007.04.01 21:03:05.415 1.3376 1.3373

2007.04.01 21:03:05.503 1.3376 1.3373

2007.04.01 21:03:06.035 1.3376 1.3372

2007.04.01 21:03:07.038 1.3376 1.3373

2007.04.01 21:03:13.029 1.3376 1.3373

2007.04.01 21:03:18.917 1.3376 1.3372

2007.04.01 21:03:24.079 1.3376 1.3373

2007.04.01 21:03:24.409 1.3376 1.3373

2007.04.01 21:03:25.106 1.3376 1.3373

2007.04.01 21:03:27.075 1.3376 1.3373

Dear Tom Martin,

Your code works like a charm. Admiring your programming ability on this APL style language. You also answered my last question. It is a blessing for this group to have you. Gratitude for your time and care for newbie like me. Really hope you enjoy rest of your day.

Best Regards,

Wenhao SHE