Is the table partitioned/splayed or are you accessing it in memory? Are you using the 32-bit or 64-bit version of kdb+? The 32-bit version is more limited in how it handles large amounts of in-memory data: it can only address about 4GB of RAM.
It may be that you have to filter the data further or select fewer columns if possible. Have you tried using a different where clause? The like operator requires kdb+ to read all 50m records for that column into memory to do the search (50 million long strings might exceed the 4GB of RAM). This is less likely but worth a try.
Can you paste in the meta of the table? Otherwise I can send some sample code that reads the table in chunks and processes each chunk as it goes, to keep memory consumption low.
Hi Aidan, Thanks for the response. The table is partitioned and I am accessing it in memory. I am using the 64-bit version of kdb+. The table has 50+ columns and I need to select at least 30 of them. I am unable to share the metadata of the table. Could you please share the sample code, if possible, that applies the where clause while reading the table in chunks, so that memory consumption stays low. Thanks Ram
Try querying for fewer columns - if you don't specify the columns you are interested in, kdb+ will read all of them into memory, causing the wsfull error
select from mytable where name like "amzn" -> select name from mytable where name like "amzn"
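For example, something along these lines (table and column names here are placeholders, and I'm assuming a date-partitioned table - adjust to your schema): pick only the columns you need and constrain on the partition column first, so kdb+ only has to touch the relevant partitions.

  / time and price are hypothetical column names; date assumed to be the partition column
  select name, time, price from mytable where date within 2016.10.01 2016.10.03, name like "amzn"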
Hi Connor, Thanks for your suggestion. However, I have a requirement to load at least 30 of the columns into memory. Please let me know if you have an alternative solution that consumes less memory. Thanks Ram
Sorry to have you repeat yourself; I missed your earlier reply. What is your requirement exactly? Is this data needed in memory purely to speed up query time for another process? Reading in chunk by chunk is only an option if, when processing each chunk, you create a (smaller) subset of this data - not really an appropriate solution if your end goal is to have the entire 30-column dataset in memory. Thanks, Connor
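If a reduced subset per chunk would do, here is a rough sketch of what I mean (assuming a date-partitioned table and using placeholder column names - not tested against your schema):

  / process one partition at a time, keeping only the filtered columns/rows of each
  res:raze {select name, time, price from mytable where date=x, name like "amzn"} each date;

Each iteration only materialises the needed columns of a single partition, so peak memory stays close to the size of one partition's subset rather than the whole table.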
Why can you not share the metadata? It will just give the column names, types, attributes of the columns, etc. - nothing 'of importance' per se. A simple copy and paste of 'meta tab'?
What is your partition domain?
Can you post the banner - the first line of your q proc when you start?
Have you set a workspace limit in your process? Check the wmax entry of .Q.w[]
Is your q binary perhaps an alias to q with a workspace size (-w) set? Check with alias q or even cat /proc/{pid}/cmdline, etc. (if on Linux)
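For example, in the q session (my understanding is that wmax is reported in bytes and is 0 when no -w limit has been set):

  q).Q.w[]        / memory stats dictionary: used, heap, peak, wmax, ...
  q).Q.w[]`wmax   / just the workspace limit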
PS: 10 million rows is tiny :). You should have no issues loading/selecting that data into memory, provided your hardware and other settings allow it… but to find out what's going on you are going to have to give more info (and on the k4 forum)
Cheers,
Sean