Sort row by column

WooiKent_Lee · May 23, 2013, 9:44am

Hi all,

Just wondering whether it is possible to sort rows by columns?

q)a:(num:6 2 3 1 4 5;tt:xxxyyy;pp:1.1 2.2 3.3 4.4 5.5 6.6)

q)a

num tt pp

6 x 1.1

2 x 2.2

3 x 3.3

1 y 4.4

4 y 5.5

5 y 6.6

I want to sort the num column by tt, so the desire result is this:

num tt pp

6 x 1.1

3 x 3.3

2 x 2.2

5 y 6.6

4 y 5.5

1 y 4.4

I can only think of these solutions:

q)(num xdesc select from a where tt=x),num xdesc select from a where tt=y

q)(select[>num] from a where tt=x),select[\>num] from a where tt=y

Any more efficient ways to do this?

Attila · May 23, 2013, 10:09am

Hi,

kdb+'s sort is stable by design, so

tt xascnum xdesc a

Cheers,

? Attila

WooiKent_Lee · May 23, 2013, 10:49am

Awesome, just what I needed (nice and efficient!)

What if I want to put conditions on sorting, say if tt=x then sort desc, if tt=y then sort asc:

q)(select[>num] from a where tt=x),select[\<num] from a where tt=y

num tt pp

6 x 1.1

3 x 3.3

2 x 2.2

1 y 4.4

4 y 5.5

5 y 6.6

charlie1 · May 23, 2013, 12:50pm

other options

q)update neg num from ttnum xasc update neg num from a
q){x i iasc x[y]i:idesc x z}[a;tt;num]

tt xascnum xdesc a

seems a little wasteful sorting the whole table twice

and modified for your latest question

q)update abs num from ttnum xasc update neg num from a where tt=`x

Attila · May 23, 2013, 1:15pm

thanks, sure these are very interesting

however i would argue that

elegance, conceptual simplicity and generality matter too

and they matter more (especially as the first cut) than pure performance

and that is not so easy to predict to begin with

q)a:10000000#a

i can explain xasc/xdesc easily to anyone

works with no matter of column-types

and one could even have multiple columns in all combinations

(or even more than a pair of relations but that is probably rare)

q)\ts tt xascnum xdesc a

515 805307360

this trick only works with numbers, but also not that much faster on this toy example

q)\ts update neg num from ttnum xasc update neg num from a

490 603981232

this works with all kind of columns and faster too, but again harder to explain

and of course only works with two columns (could be modified easily with # though)

q)\ts {x i iasc x[y]i:idesc x z}[a;tt;num]

425 603980416

of course performance penalty for the general solution with a real table with more columns would be more pronounced

Cheers,

Attila

vikas_agarwal1 · May 13, 2014, 11:14pm

Thanks Charles for you solution. I was needing something similar for n number of columns

Please let me know if following implementation is correct or not. Or I can improve the performance more.

Function:

;GenericSort:{[tbl;colNames;colOrders]

;len:-1 + count colNames

;indexes:til count tbl

;while[len>=0;$[`ds=colOrders len;indexes:indexes idesc tbl[colNames len] indexes;indexes:indexes iasc tbl[colNames len] indexes];len-:1]

;tbl indexes

}

Examples

;a:(num:1 2 1 2 1 2;tt:xxxyyy;pp:1 1 2 4 4 5)

;colNames:numtt`pp

;colOrders:asds`as

;GenericSort[a;colNames;colOrders]

;colNames:numtt`pp

;colOrders:asds`ds

;GenericSort[a;colNames;colOrders]

charlie1 · May 13, 2014, 11:36pm

For simplicity and readability, just use the pattern that Attilla advised, i.e.

tt xascnum xdesc a

but for fun here’s another way

gsort:{x value{(::;i value[y]i:value x)}/[flip((),y;x(),z)]}

q)gsort[a;(iasc;idesc;iasc);`pp`tt`num]

num tt pp

1 y 4

1 x 1

1 x 2

2 y 4

2 y 5

2 x 1

charlie1 · May 13, 2014, 11:54pm

oops, don’t want value y.

gsort:{x value{(::;i y[0]y[1]i:value x)}/[flip((),y;x(),z)]}

charlie1 · May 14, 2014, 10:18am

a tip - if you find you’re reaching out for a while loop, it’s probably a mistake. It’s more idiomatic in kdb+ to use adverbs, but if for whatever reason that’s unsuitable, then try to rework as a do loop - so it is certain to complete. e.g. your fn reworked as

q)GenericSort:{[t;colNames;colOrders]n:count colNames;i:til count t;do[m:n;m-:1;i:i((iasc;idesc)`ds=colOrders m)t[colNames m]i];t i}

adverbs remove a lot of the ‘-1’ errors in loops.

vikas_agarwal1 · May 14, 2014, 10:54pm

Thanks Charles for your suggestion. I will use it for next code changes.
I tried the solution provided by you. And i am gaining performance benefit. Though still i have to understand what is happening in that code :)

I cannot use the solution provided by Attila because even for less milliseconds makes a difference in overall performance.

Attila · May 15, 2014, 9:06am

{x{x y z x}/[::;(),y;x(),z]}

is a bit even faster

as to explanation you can read Stevan’s article

http://archive.vector.org.uk/art10500340

(point 6, look for msort)

but for reasonably long and not too wide tables

the naive approach is just fine

Attila

vikas_agarwal1 · May 27, 2014, 2:16pm

Thanks Attila for the link. It is helpful.

Topic		Replies	Views
Loading multiple csv file record and maintain order by one of the column Community Support kdb-and-q	1	3	October 7, 2010
Re: [personal kdb+] How can I create a conditional sort? Community Support imported , kdb-and-q	1	6	May 13, 2016
Code snippet from example. Community Support kdb-and-q	1	12	June 1, 2012
Sorting without combining Community Support kdb-and-q	0	1	May 13, 2015
Sorting on disk Community Support kdb-and-q	3	22	November 17, 2013

Sort row by column

Related topics