Key-dependent aggregation

Say you have these two keyed tables: t: ([s:(,/)3#'abc;d:(,/)3#enlist 1 2 3]v:til 9) r: ([s:abc]c:.25 .5 .75)and that you want to sum the v’s grouped by s but weightedby the appropriate coefficient c from r, i.e., something like select {[c;v] c*sum v} by s from t"(yes, in this example I could multiply after the fact but thereal-world problem is more complex and I need c upfront).I have a couple of solutions that involve joining t and r orderivatives thereof first but they look wasteful, data-wise,and worse the code looks like something I won’t understandanymore in 2 weeks, if not sooner. This, on other hand, isnot too bad: select {r[x;`e]*sum y}.(first s;v) by s from tbut, again, the real problem is more complex and I’d like tosimplify even more. Is there a really elegant piece of syntaxI have missed?Thanks

Hello,Perhaps a foreign-key chase might do the trick if I understand youcorrectly:select sum v*(r(s:s))[c] by s from tRegards,Fintan.On Nov 30, 12:52?pm, olivier <lefev...> wrote:&gt; Say you have these two keyed tables:&gt;&gt; ? t: ([s:(,/)3#'abc;d:(,/)3#enlist 1 2 3]v:til 9)> ? r: ([s:abc]c:.25 .5 .75)&gt;&gt; and that you want to sum the v's grouped by s but weighted&gt; by the appropriate coefficient c from r, i.e., something like&gt;&gt; ? select {[c;v] c*sum v} <what to put here> by s from t"&gt;&gt; (yes, in this example I could multiply after the fact but the&gt; real-world problem is more complex and I need c upfront).&gt; I have a couple of solutions that involve joining t and r or&gt; derivatives thereof first but they look wasteful, data-wise,&gt; and worse the code looks like something I won't understand&gt; anymore in 2 weeks, if not sooner. This, on other hand, is&gt; not too bad:&gt;&gt; ? select {r[x;e]*sum y}.(first s;v) by s from t>> but, again, the real problem is more complex and I’d like to> simplify even more. Is there a really elegant piece of syntax> I have missed?>> Thanks</lefev…>

Fintan is right, you are looking for this> select sum v*(r(s:s))[c] by s from tit is usually written as this:select sum v*r[([]s);c]by s from tbecause of not having the projection it is a bit fasterq)\t do[10000;select sum v*r[(s)]c by s from t]109q)\t do[10000;select sum v*r[([]s);c] by s from t]95And btw the join is not that bad itselfq)\t do[10000;select sum v*c by s from t lj r]160Regards, AttilaOn 30 Nov 2009, at 18:55, fintanq wrote:> Hello,> > Perhaps a foreign-key chase might do the trick if I understand you> correctly:> > select sum v*(r(s:s))[c] by s from t\> \> Regards,\> \> Fintan.\> \> On Nov 30, 12:52 pm, olivier <lefev...> wrote:&gt;&gt; Say you have these two keyed tables:&gt;&gt; &gt;&gt; t: ([s:(,/)3#'abc;d:(,/)3#enlist 1 2 3]v:til 9)>> r: ([s:abc]c:.25 .5 .75)&gt;&gt; &gt;&gt; and that you want to sum the v's grouped by s but weighted&gt;&gt; by the appropriate coefficient c from r, i.e., something like&gt;&gt; &gt;&gt; select {[c;v] c*sum v} <what to put here> by s from t"&gt;&gt; &gt;&gt; (yes, in this example I could multiply after the fact but the&gt;&gt; real-world problem is more complex and I need c upfront).&gt;&gt; I have a couple of solutions that involve joining t and r or&gt;&gt; derivatives thereof first but they look wasteful, data-wise,&gt;&gt; and worse the code looks like something I won't understand&gt;&gt; anymore in 2 weeks, if not sooner. This, on other hand, is&gt;&gt; not too bad:&gt;&gt; &gt;&gt; select {r[x;e]*sum y}.(first s;v) by s from t>> >> but, again, the real problem is more complex and I’d like to>> simplify even more. Is there a really elegant piece of syntax>> I have missed?>> >> Thanks> > –> >

Submitted via Google Groups</lefev…>

Thanks to both of you but, as I said, in the real problem theexpressionis much complicated and thus I really need to pass c as an argumenttothe aggregating function. Like my {[c;v] c*sum v} dummy the real-worldverb takes a scalar argument and a vector one but it is much toocomplicated to be inlined as in your solution.About the join: it may be fast but it must cause terrible memory blow-up:in the real problem there are many v columns in t, each one with amatchinge column in r, and there are thousands of d’s for each s value.

Also, in your solution it seems to me that you aremultiplying every v by its c instead of the whole sum,i.e., you are using a vector c, albeit a constant one.In this example this works because (sum c*v) (c*sum v) but in the real problem c really functionsas a scalar and the aggregating function will bombif passed a vector of c’s.

sure, you can have 1#sOn 30 Nov 2009, at 20:35, olivier wrote:> Also, in your solution it seems to me that you are> multiplying every v by its c instead of the whole sum,> i.e., you are using a vector c, albeit a constant one.> In this example this works because (sum c*v) => (c*sum v) but in the real problem c really functions> as a scalar and the aggregating function will bomb> if passed a vector of c’s.> > –> > Submitted via Google Groups

OK, so this works, too: select {x*sum y}.(first r[(s);c];v) by s from tbut is functionally equivalent to my first posted solution and Ifind it harder to read: the r[([]s);c] syntax is really quite magicalif you haven’t seen it before.