Pivot table with calculated columns based on grouping variable

Hi,

I am working on a pivot table where new columns need to be created using the value of a grouping column. In my original dataset I have more grouping variables, but I have used a simple example below to illustrate the task.

If I have a table, 

t:(col1:1 1 2 2;col2:10 10 20 20; col3:5.0 2.0 2.3 2.4; grp:aabc)

col1 col2 col3  grp

1    10   5.0     a

1    10   2.0     a

2    20   2.3     b

2    20   2.4     c

I need to find the sum of col3 grouped by col1 and col2 and add columns for each distinct value in the column called “grp” (the last column above)

My current solution ----

select sumgrpa:sum col3[where grp=a], sumgrpb:sum col3[where grp=b], sumgrpc:sum col3[where grp=`c] by col1, col2 from t

The final result, based on the above would look like,

col1 col2  sumgrpa   sumgrpb   sumgrpc

1    10         7.0         0         0

2    20          0       2.3       2.4

While I do have a working solution as shown above, it’s a bit cumbersome having to create columns manually with distinct values.

Further, since my dataset is fairly large (~ 500 million rows), I am looking for a less computationally expensive and faster solution. I don’t know if it can be any simpler, but there is likely a more elegant solution which can be produced. I have seen the example of Pivot in the Cookbooks, but not sure how I can apply it in a situation like this.

Thanks in advance,

  • Raj.

Hi,

a bit more flexible solution (based on cookbook pivot):

G:asc distinct exec grp from t
exec G#grp!s by col1,col2 from exec s:sum col3 by col1,col2,grp from t

since this is slover than your solution, I was a bit reluctant to post it… at least no manually created cols ;-)
didn’t had time but maybe we could reuse first grouping, this should improve this…

cheers
Pat