I am using the below function to pivot multiple columns. For ~100k rows it takes me ~700ms to do this.
Are there any obvious optimizations possible?
normMult: {[data; horcol; vertcol; valCol]
ahs: raze raze flip ?[0!data; (); 0b; (enlist vertcol)!enlist (distinct; vertcol)];
bhs: raze raze flip ?[0!data; (); 0b; (enlist horcol)!enlist (distinct; horcol)];
chs: flip (vertcol, horcol)!flip ahs cross bhs;
chs: chs lj data;
bts: ?[0!chs; (); (enlist horcol)!enlist horcol; (vertcol, valCol)!(vertcol, valCol)];
colnames: $raze each string valCol cross
$(“_”,/:string[(0!bts)horcol]);
dataRow: raze (0!bts)valCol;
vertRow: first (0!bts)vertcol;
ans: flip colnames!dataRow;
ans: vertcol xcols ![ans; (); 0b; (enlist vertcol)!enlist vertRow];
ans
};
\t normMult[ans; `sym; `time; `qty`high`low]
nick11
2
Hi Naveen,
I’m only testing on a 32bit trial version so can’t test your soln but the below is a faster implementation of what I think you’re after.
/ create tab
n:100000;
ans:(sym:n?a
bc
d`e;time:asc n?.z.t;qty:n?1000;high:100+n?10;low:90+n?10);
/ create rename fn
rename_cols:{(1#x),`$(string[y],“_”),/:string 1_x};
/ run standard pvt fn from http://code.kx.com/wiki/Pivot but unkey and rename col at the end
pvt:{[t;horzCol;vertCol;val] P:asc ?[t;();();(distinct;horzCol)];r:?[t;();{x!x}(),vertCol;(#;`P;(!;horzCol;val))];rename_cols[cols r;val]xcol 0!r};
/ blend the results for multiple cols
multi_piv:{[ans;horzCol;vertCol;valCol] {{0!x lj 1!y}/[first x;1_x]}pvt[ans;horzCol;vertCol;]each valCol};
\t multi_piv[ans;`sym;`time;`qty`high`low]
381
This runs in 381ms on a fairly low powered laptop with 32bit q, so should be quite a bit faster on a higher powered 64 bit machine.
Cheers,
Nick