Cloning all rows of a table for each set of selected columns from

vivek_shende · March 14, 2019, 7:30pm

Hello,

I want to clone all rows from a Kdb table for each set of selected columns from the same table. I couldn’t find the best way to describe what I want to achieve. The example below should clarify what I want to do:

q)tab:([]a:JonSamwellDavosMelisandre;x1:11 12 13 14;x2:21 22 23 24;x3:31 32 33 34)q)taba x1 x2 x3-----------------------Jon 11 21 31Samwell 12 22 32Davos 13 23 33Melisandre 14 24 34q)tab2: (select a,x:x1 from tab),(select a,x:x2 from tab),(select a,x:x3 from tab)q)tab2a x---------------Jon 11Samwell 12Davos 13Melisandre 14Jon 21Samwell 22Davos 23Melisandre 24Jon 31Samwell 32Davos 33Melisandre 34

I need a shortcut function to convert tab to tab2. I have used joins to achieve this; but this solution is not scalable as I might get more columns x4, x5 etc.that I would I need to add to ‘x’ and querying table again and again in each join clause will also affect performance.

thomas_smyth · March 14, 2019, 10:00pm

The operation you are describing sounds like unpivoting a table, the following function will achieve that for your example. To make it more generic you will need to make any columns that need to be ungrouped the values in your keyed table.

q)ungroup{enlist[`x]!enlist value x}each 1!tab

a x

-------------

Jon 11

Jon 21

Jon 31

Samwell 12

Samwell 22

Samwell 32

Davos 13

Davos 23

Davos 33

Melisandre 14

Melisandre 24

Melisandre 34

cmccarthy11 · March 14, 2019, 10:19pm

Hi Vivek,

One way you could go about doing it that should be reasonably more scalable is through the use of a functional select. For the sake of clarity I’ve increased both the number of rows and columns in order to show the difference in performance between the methods. There may be faster ways than this to achieve what you’re looking for but this is a cleaner/more scalable version of the code to begin with,

q)N:100000

q)5#tab:(a:N?`1;x1:N?5;x2:N?10;x3:N?5;x4:N?10;x5:N?10;x6:N?5;x7:N?10)

a x1 x2 x3 x4 x5 x6 x7

-----------------------------

l 0 0 2 7 1 0 4

l 2 4 3 6 2 1 5

d 3 7 2 2 1 3 1

b 2 4 0 5 1 3 9

o 4 6 2 3 2 2 5

q)\t t1:raze{?[x;();0b;(y;x)!(y;z)]}[tab;a;]each x1x2x3x4x5x6`x7

101

q)\t t2:(select a,x:x1 from tab),(select a,x:x2 from tab),(select a,x:x3 from tab),(select a,x:x4 from tab),(select a,x:x5 from tab),(select a,x:x6 from tab),(select a,x:x7 from tab)

279

q)t1~t2

1b

The documentation for functional select statements in q can be found at https://code.kx.com/v2/basics/funsql/

I should note here that Thomas’ answer will also work (all required rows will be in the table) but it’s formatted differently and less performant on larger datasets. i.e for the same example

q)\t ungroup{enlist[`x]!enlist value x}each 1!tab

2795

I’d be interested to see any faster implementations. Hope this helps somewhat.

Kind regards,

Conor

Mochalin_Igor1 · March 14, 2019, 10:22pm

ungroup - looks good , but as an additional solution you can you this (if you need to save an order)

a:raze {tab `a} each 1_cols tab;

x: raze {tab x} each 1_cols tab;

(a;x)

??, 15 ???. 2019 ?. ? 01:00, <thomas.smyth@aquaq.co.uk>:

The operation you are describing sounds like unpivoting a table, the following function will achieve that for your example. To make it more generic you will need to make any columns that need to be ungrouped the values in your keyed table.

q)ungroup{enlist[`x]!enlist value x}each 1!tab

a x

-------------

Jon 11

Jon 21

Jon 31

Samwell 12

Samwell 22

Samwell 32

Davos 13

Davos 23

Davos 33

Melisandre 14

Melisandre 24

Melisandre 34

vivek_shende · March 15, 2019, 7:44am

Thank you Thomas! It worked for me. It is an uphill task to get your way around in Kdb/q if you are from a procedural/OOP programming background!

Flying1 · March 15, 2019, 11:50am

One easier-to-understand way for achieving this (without all the cryptic functional forms) is to view table columns as lists themselves:

q)ungroup select a,x:flip(x1;x2;x3)from taba x-------------Jon 11Jon 21Jon 31Samwell 12Samwell 22Samwell 32Davos 13Davos 23Davos 33Melisandre 14Melisandre 24Melisandre 34

vivek_shende · March 15, 2019, 12:23pm

Thanks Igor for the alternate approach!

vivek_shende · March 15, 2019, 12:40pm

Hello Conor,

This solution seems more performant! I am have modified Thomas’s solution to have an indicator of the column a particular row comes from, like below:



q)tab:(a:JonSamwellDavosMelisandre;x1:11 12 13 14;x2:21 22 23 24;x3:31 32 33 34)
q)tab
a            x1  x2  x3
-----------------------
Jon          11  21  31
Samwell      12  22  32
Davos        13  23  33
Melisandre   14  24  34

q)tab2:ungroup {xidx!(value x;til 3)}each 1!tab
q)tab2
a      x   idx
---------------------

Jon      11   0
Jon      21   1
Jon      31   2
Samwell      12   0
Samwell      22   1
Samwell      32   2
Davos      13   0
Davos      23   1
Davos      33   2
Melisandre   14   0
Melisandre   24   1
Melisandre   34   2

It basically tells me that (11 12 13 14) came from a different column than (21 22 23 24). Is this something I can achieve by modifying your solution?

vivek_shende · March 15, 2019, 12:49pm

Thanks!

This indeed is easier to understand!

Similar to what I replied to Conor, can I modify this solution to add some sort of indicator of the column a row corresponds to? like below:

q)tab:([]a:JonSamwellDavosMelisandre;x1:11 12 13 14;x2:21 22 23 24;x3:31 32 33 34)q)taba x1 x2 x3-----------------------Jon 11 21 31Samwell 12 22 32Davos 13 23 33Melisandre 14 24 34q)tab2:ungroup {xidx!(value x;til 3)}each 1!tabq)tab2a x idx---------------------Jon 11 0Jon 21 1Jon 31 2Samwell 12 0Samwell 22 1Samwell 32 2Davos 13 0Davos 23 1Davos 33 2Melisandre 14 0Melisandre 24 1Melisandre 34 2

cmccarthy11 · March 15, 2019, 12:57pm

Yes that’s a reasonable modification to make although the solution will appear less readable

q)raze{[x;y;z;k]?[x;();0b;(y;x;idx)!(y;z;k)]}[tab;`a;;]'[n;til count n:1_cols tab]

q)t1

a x idx

-----------------

Jon 11 0

Samwell 12 0

Davos 13 0

Melisandre 14 0

Jon 21 1

Samwell 22 1

Davos 23 1

Melisandre 24 1

Jon 31 2

Samwell 32 2

Davos 33 2

Melisandre 34 2

This is using the fact that we operate right to left (to name the columns you want to apply to xx1 etc. as n) and the each-both operator to project the column names (1_cols tab) and the unique indices you wanted into the variables z and k.

Hopefully this helps

TerryLynch1 · March 15, 2019, 1:04pm

Another performant approach using take (#) which retains the order so you don’t require an indicator column:

raze{ax xcol y#x}[tab]each(`a,'1_cols tab)

Terry

Nick10 · March 16, 2019, 12:30am

this can be performed with a generalized unpivot function:

unpivot:{ungroup (x!(key;value)@:) each y}

you supply the desired key and value column names as the x parameter and a table who’s key defines the columns to repeat as the y parameter. the values in the key column retain the column names from which they came.

q)unpivot[`k`v] 1!tab
a k v

Jon        x1 11
Jon        x2 21
Jon        x3 31
Samwell    x1 12
Samwell    x2 22
Samwell    x3 32
Davos      x1 13
Davos      x2 23
Davos      x3 33
Melisandre x1 14
Melisandre x2 24
Melisandre x3 34

sannysanoff1 · March 16, 2019, 10:09am

> q)tab:(a:JonSamwellDavosMelisandre;x1:11 12 13 14;x2:21 22 23 24;x3:31 32 33 34)

I have shortest and most performant solution ;)

k)(a:(#v)#t`a;x:v:,/t’1_!+t:tab)

also, it uses fact that # function replicates list until required length.

or, equivalent: q)(a:(count v)#t`a;x:v:raze t 1_cols t:tab)

Code golf, anyone?

vivek_shende · March 17, 2019, 10:54am

Thank you Nick! This seems shorter. Can I replace column names with index? i.e. 0 1 2 in place of x1, x2, x3?

vivek_shende · March 17, 2019, 11:17am

Hello Sanny,

I checked this and found it easier to understand. Thank you!

Can I add indicator for the column to which a record corresponds to, like I posted earlier? Thank you!

vivek_shende · March 17, 2019, 11:45am

Thanks! It indeed helped. I realized later that I would need multiple key columns (like column a) instead of just one. I did below modification to the query to make it work:

q)tab:([]a:JonSamwellDavosMelisandre;b:SnowTarlySeaworthUnknown;x1:11 12 13 14;x2:21 22 23 24;x3:31 32 33 34)q)taba b x1 x2 x3
------------------------------------

Jon Snow 11 21 31

Samwell Tarly 12 22 32

Davos Seaworth 13 23 33

Melisandre Unknown 14 24 34

q)tab2:raze{[x;y;z;k] ?[x;();0b;(y;x;idx)!(y;z;k)]}[tab;`a;;]'[n;til count n:1_cols tab]
q)tab2

a x idx

-----------------------------

Jon Snow 0

Samwell Tarly 0

Davos Seaworth 0

Melisandre Unknown 0

Jon 11 1

Samwell 12 1

Davos 13 1

Melisandre 14 1

Jon 21 2

Samwell 22 2

Davos 23 2

Melisandre 24 2

Jon 31 3

Samwell 32 3

Davos 33 3

Melisandre 34 3

q)tab3:raze{[x;y;z;k] ?[x;();0b;(y,(`x;`idx))!(y,(z;k))]}[tab;`a`b;;]'[n;til count n:2_cols tab]
q)tab3

a b x idx

-----------------------------------

Jon Snow 11 0

Samwell Tarly 12 0

Davos Seaworth 13 0

Melisandre Unknown 14 0

Jon Snow 21 1

Samwell Tarly 22 1

Davos Seaworth 23 1

Melisandre Unknown 24 1

Jon Snow 31 2

Samwell Tarly 32 2

Davos Seaworth 33 2

Melisandre Unknown 34 2

Nick10 · March 17, 2019, 6:04pm

unpivot works here too, you just specify a 2 column primary key. and you can map the columns to integers afterwards.

q)update (2_cols tab)?idx from unpivot[`idx`x] 2!tab
a b idx x

Jon        Snow     0   11
Jon        Snow     1   21
Jon        Snow     2   31
Samwell    Tarly    0   12
Samwell    Tarly    1   22
Samwell    Tarly    2   32
Davos      Seaworth 0   13
Davos      Seaworth 1   23
Davos      Seaworth 2   33
Melisandre Unknown 0   14
Melisandre Unknown 1   24
Melisandre Unknown 2   34

Topic		Replies	Views
functional select Community Support kdb-and-q	5	0	November 30, 2015
merge/copy partitioned tables efficiency Community Support kdb-and-q	2	1	November 27, 2023
Re: Columns to rows in q Community Support kdb-and-q	2	0	July 3, 2013
Apply same function to many columns of the same table Community Support kdb-and-q	6	0	September 25, 2014
Select statement with additional columns Community Support kdb-and-q	3	0	February 10, 2015

Cloning all rows of a table for each set of selected columns from

q)unpivot[kv] 1!tab a k v

q)update (2_cols tab)?idx from unpivot[idxx] 2!tab a b idx x

Related topics

q)unpivot[`k`v] 1!tab
a k v

q)update (2_cols tab)?idx from unpivot[`idx`x] 2!tab
a b idx x