Hi,
I am working on a pivot table where new columns need to be created using the value of a grouping column. In my original dataset I have more grouping variables, but I have used a simple example below to illustrate the task.
If I have a table,
t:(col1:1 1 2 2;col2:10 10 20 20; col3:5.0 2.0 2.3 2.4; grp:a
ab
c)
col1 col2 col3 grp
1 10 5.0 a
1 10 2.0 a
2 20 2.3 b
2 20 2.4 c
I need to find the sum of col3 grouped by col1 and col2 and add columns for each distinct value in the column called “grp” (the last column above)
My current solution ----
select sumgrpa:sum col3[where grp=a], sumgrpb:sum col3[where grp=
b], sumgrpc:sum col3[where grp=`c] by col1, col2 from t
The final result, based on the above would look like,
col1 col2 sumgrpa sumgrpb sumgrpc
1 10 7.0 0 0
2 20 0 2.3 2.4
While I do have a working solution as shown above, it’s a bit cumbersome having to create columns manually with distinct values.
Further, since my dataset is fairly large (~ 500 million rows), I am looking for a less computationally expensive and faster solution. I don’t know if it can be any simpler, but there is likely a more elegant solution which can be produced. I have seen the example of Pivot in the Cookbooks, but not sure how I can apply it in a situation like this.
Thanks in advance,
- Raj.