Joining CSV files tens of GBs large

I regularly do data-analysis tasks where I just want to slurp in a few CSV files, join them together (the joins are usually inner and left/right joins), filter out some rows, and finally output the results.

Currently I use SAS for this.

Sounds fairly simple, until I tell you that the CSV files can be tens of GBs each, and there will be a minimum of 3 such tables that need to be loaded and joined.

The hardware usually has 8GB of free physical RAM (the OS and other programs consume the rest) and an almost idle quad-core CPU.

Can kdb+ carry out these joins without requiring me to use a DB like Postgres/ParAccel?

Hi Carson

Short answer is yes. From your description, it sounds like you probably wouldn't have enough memory available to do it all in memory (i.e. you would have to write to disk and analyse from disk, perhaps piece-by-piece).
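
For illustration, assuming two small made-up tables t1 and t2 that share an id column (the table names, columns and data below are all invented), the joins and filters you describe look roughly like this in q:

    / t1 and t2 are invented example data
    t1:([] id:1 2 3; sym:`a`b`c; px:10.0 20.0 30.0)
    t2:([] id:2 3 4; qty:200 300 400)

    / left join: keep every row of t1, pulling in qty where id matches (t2 keyed on id)
    t1 lj 1!t2

    / inner join: keep only rows whose id appears in both tables
    t1 ij 1!t2

    / filter rows after the join and write the result back out as CSV
    `:result.csv 0: csv 0: select from (t1 ij 1!t2) where qty>250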

There is a small tutorial here http://code.kx.com/wiki/Cookbook/LoadingFromLargeFiles which shows how to download and create some example large CSV files, load them in chunks, write them out to disk, and re-sort at the end. You could adapt some of this for your purposes.
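
For a rough idea of what that looks like, here is a minimal sketch along the same lines. The file name, database path, column names and types are placeholders, and the CSV is assumed to have no header row. .Q.fs reads the file in manageable chunks and keeps only one chunk in memory at a time, which is what keeps the working set well under your 8GB:

    / placeholder columns: id (long), sym (symbol), px (float)
    c:`id`sym`px

    / parse one chunk (a list of lines) into a table
    parseChunk:{flip c!("JSF";",") 0: x}

    / read trades.csv chunk by chunk, appending each chunk to the splayed
    / table /db/trades with symbol columns enumerated against /db/sym
    .Q.fs[{`:/db/trades/ upsert .Q.en[`:/db] parseChunk x}] `:trades.csv

    / afterwards the table can be memory-mapped and queried without
    / loading it all into RAM
    \l /db
    select from trades where px>100

Once everything is loaded you would re-sort on disk at the end, as the page describes, and then run the joins against the mapped tables.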

Thanks 

Jonny