Does anyone have an existing script to compress a whole HDB folder?

Hi all,

From http://code.kx.com/wiki/Cookbook/FileCompression, it looks like I need to compress an existing uncompressed HDB file by file (i.e. column by column). Is there a handy script that will compress the whole HDB folder, for all the tables under a segment?

Thanks!

Here's a broken-down function that will compress partitioned tables, given an HDB path (hdb) and the compression settings (cAlgo) as logicalBlockSize, algorithm and zipLevel, e.g. 17 1 0.

It will only compress columns that exist (in the case of incomplete dbs).

It will only compress a column if the column's compression stats differ from the requested settings.

It will compress to a temp file, and then move the temp file over the original.

It was drafted very quickly but works (tested on Windows and Linux).

It can/should be given extra logging and extra protected evaluation.

The last line contains a peach. Because it uses a system call it can't use threads, but if you hook into .z.pd you can parallelize it. If you remove the step of compressing to a temp file and then system"mv", and just compress to the same file, you'll be able to parallelize over threads.

compressPartTabs:{[hdb;cAlgo]
  // load hdb to populate .Q.pf and .Q.pt
  system"l ",1_ string hdb;
  // ensure cAlgo is ints: (logicalBlockSize;algorithm;zipLevel)
  cAlgo:"i"$cAlgo;
  // create potential paths for each partitioned table: (partitions;table;columns)
  pot:{(`$string get .Q.pf;x;cols[x] except .Q.pf)} each .Q.pt;
  // create paths from potentials
  paths:raze {` sv/:hsym[x] cross/(y;z)} .' pot;
  // test existence, remove non-existent
  paths:paths where not {()~key x} each paths;
  // test compression, if same as cAlgo then remove from todo list
  paths:paths where not {y~(-21!x)`logicalBlockSize`algorithm`zipLevel}[;cAlgo] each paths;
  // compress the remainder to col_compressed (for Windows, and in case of mid-write corruption)
  res:{-19!(x;hsym `$ string[x],"_compressed"),y}[;cAlgo] each paths;
  // move each _compressed file over the original
  // on Windows use move /y and change forward slashes to backslashes
  cmds:{" " sv enlist["mv"],1_' string (x;y)}'[res;paths];
  if[.z.o like "w*";cmds:{"move /y",2_ssr[x;"/";"\\"]} each cmds];
  {@[system;x;{"ERROR with: ",x}]} peach cmds
  };
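For a single column file, the check and compress steps used inside the function can be sketched on their own. The helper name needsCompress and the example path are illustrative assumptions, not part of the function above, and the path is assumed to exist relative to the current directory.

```q
// hypothetical helper: 1b when file p is not already compressed with settings c
// c is (logicalBlockSize;algorithm;zipLevel), e.g. 17 1 0
needsCompress:{[p;c] not ("i"$c)~(-21!p)`logicalBlockSize`algorithm`zipLevel}

// example path (assumption), as used in the session further down
p:`:./2018.01.01/t/x
// compress to a sibling temp file; -19! returns the target file handle
if[needsCompress[p;17 1 0]; show -19!(p;`$string[p],"_compressed";17;1;0)]
```

The temp file still has to be moved over the original afterwards, which is what the mv / move /y commands in the function do.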

q)\l compressPartTabs.q

q)set\:[{` sv `:./db,(`$string x),`}each (2018.01.01+til 2) cross `t`t2;([]x:10?10)]

`:./db/2018.01.01/t/`:./db/2018.01.01/t2/`:./db/2018.01.02/t/`:./db/2018.01.0..

q)\l db

q)-21!`:./2018.01.01/t/x

q)-21!`:./2018.01.01/t2/x

q)compressPartTabs[`:.;17 1 0];

q)-21!`:./2018.01.01/t/x

compressedLength  | 152
uncompressedLength| 96
algorithm         | 1i
logicalBlockSize  | 17i
zipLevel          | 0i

q)-21!`:./2018.01.01/t2/x

compressedLength  | 152
uncompressedLength| 96
algorithm         | 1i
logicalBlockSize  | 17i
zipLevel          | 0i
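Rather than checking columns one at a time, a rough way to confirm the whole HDB is to rebuild the column paths the same way the function does and test them all at once. This is only a sketch: the name allCols is an assumption, the settings 17 1 0 are taken from the call above, and it assumes it is run after \l db so that .Q.pf and .Q.pt are populated.

```q
// rebuild the column paths the same way compressPartTabs does
allCols:raze {` sv/:hsym[`$string get .Q.pf] cross/(x;cols[x] except .Q.pf)} each .Q.pt;
// keep only the column files that exist
allCols:allCols where not {()~key x} each allCols;
// 1b only if every existing column file reports the requested settings
all {17 1 0i~(-21!x)`logicalBlockSize`algorithm`zipLevel} each allCols
```

If it returns 0b, applying the same lambda with each and where to allCols will show which files still differ.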

HTH,

Sean

Great, thanks a lot. This is very helpful.