intersting "function" question

VA1 · January 31, 2017, 12:15pm

Good day Q-Gods,

I have an interesting question which I have been struggling with for a while now. I have two tables (t1, t2)

t1:

time px qty

1 10 100

2 20 200

2 20.1 200.1

4 40 400

4 40.1 400.1

5 50 500

t2 is a “derivative” of t1. Please note that px are not unique, as well as time steps; they repeat sometimes.

t2 has the px’s copied and sorted in ascending order. px column in t2 is a fixed column. The following columns is time, with each row reserved for each distinctive price.

t2:

px time1 time2 time3

10

20

20.1

40

40.1

50

SO what I want to do is to have some way to scan through t1 take qty from t1 and record it into t2 at the correct time(x), i.e. time1 or time2 or time3 at the correct price level, i.e. 10 or 40 or 40.1

I have tried working with each left and each right: “=/::” and then using “where” to get indices and then working with these indices. But it doesn’t work 100% right at times, and sometimes I get a list of indices not an integer alone and I have no idea how to proceed from then on, i.e. how do split this list into individual integers and then write them into t2 etc etc.

So what I would like to have is:

px time1 time2 time3 time4 time5

10 100

20 200

20.1 200.1

40 400

40.1 400.1

50 500

If anyone would be so kind and help me with this, it would be vert much appreciated. I have been trying to work this out myself for a few weeks now.

rahul_asati041 · January 31, 2017, 1:02pm

One option:

key[a]! (uj/) (0!a:select v:enlist ($"time",/:string time)!qty by px from t)v

Jonathon_McMurr · January 31, 2017, 1:10pm

Hi<o:p></o:p>

What youre after would be a pivot table - http://code.kx.com/wiki/Pivot<o:p></o:p>

An example using your t1 table:

q)t1:(time:1 2 2 4 4 5;px:10 20 20.1 40 40.1 50;qty:100 200 200.1 400 400.1 500)<o:p></o:p>

q)exec($"time",/:string asc exec distinct time fromt1)#(($“time”,/:string time)!qty) by px:px from t1<o:p></o:p>

px | time1 time2 time4 time5<o:p></o:p>

----|-----------------------<o:p></o:p>

10 | 100<o:p></o:p>

20 | 200<o:p></o:p>

20.1| 200.1<o:p></o:p>

40 | 400<o:p></o:p>

40.1| 400.1<o:p></o:p>

50 | 500

The main difference here from the example on thekx wiki is in getting the column names column names need to be symbols, sothe integers in t1 have to be converted to strings and then cast to symbols. Inthis case you are joining time to each of the times, so would have to do thisanyway.

In this version, times that are not present in t1 (e.g. time3) are absent fromthe result. The example you posted had it present. To do this, you can dosomething like this:<o:p></o:p>

q)exec($"time",/:string 1+til exec max time fromt1)#(($“time”,/:string time)!qty) by px:px from t1<o:p></o:p>

px | time1 time2 time3 time4 time5<o:p></o:p>

----|-----------------------------<o:p></o:p>

10 | 100<o:p></o:p>

20 | 200<o:p></o:p>

20.1| 200.1<o:p></o:p>

40 | 400<o:p></o:p>

40.1| 400.1<o:p></o:p>

50 | 500

Here instead of getting the distinct times fromt1, you simply get the maximum and generate the numbers from 1 up to thatnumber, and use these as the column names in the result.<o:p></o:p>

<o:p> </o:p>

Hope this helps<o:p></o:p>

Jonathon<o:p></o:p>

On Tuesday, January 31, 2017 at 12:17:45 PM UTC, VA wrote:

Good day Q-Gods,

I have an interesting question which I have been struggling with for a while now. I have two tables (t1, t2)

t1:

time px qty

1 10 100

2 20 200

2 20.1 200.1

4 40 400

4 40.1 400.1

5 50 500

t2 is a “derivative” of t1. Please note that px are not unique, as well as time steps; they repeat sometimes.

t2 has the px’s copied and sorted in ascending order. px column in t2 is a fixed column. The following columns is time, with each row reserved for each distinctive price.

t2:

px time1 time2 time3

10

20

20.1

40

40.1

50

SO what I want to do is to have some way to scan through t1 take qty from t1 and record it into t2 at the correct time(x), i.e. time1 or time2 or time3 at the correct price level, i.e. 10 or 40 or 40.1

I have tried working with each left and each right: “=/::” and then using “where” to get indices and then working with these indices. But it doesn’t work 100% right at times, and sometimes I get a list of indices not an integer alone and I have no idea how to proceed from then on, i.e. how do split this list into individual integers and then write them into t2 etc etc.

So what I would like to have is:

px time1 time2 time3 time4 time5

10 100

20 200

20.1 200.1

40 400

40.1 400.1

50 500

If anyone would be so kind and help me with this, it would be vert much appreciated. I have been trying to work this out myself for a few weeks now.

VA1 · January 31, 2017, 10:49pm

Thank you all for your replies. I will look into this tomorrow morning and will read up in the pivot tables. Can’t believe that I’ve spent weeks to get basically a one line soluition… very frustrating, but totally worth it - hopefully.

VA1 · March 7, 2017, 5:58pm

Thanks for the advice, but is there any way this could be improved, performance wise? I am running out of memory every time I run this function.. Currently I have a table with count around 9m records. At the moment I can use

q) exec ($"time",/:string 1+til exec max time from t1)#(($“time”,/:string time)!qty) by px:px from t

just on the 300000#t, but no more than that. Maybe there is a way to break this function into several somehow and use less memory?

Regards,

Vadim A.

On Tuesday, January 31, 2017 at 1:13:07 PM UTC, Jonathon McMurray wrote:

Hi

What youre after would be a pivot table - http://code.kx.com/wiki/Pivot

An example using your t1 table:

q)t1:(time:1 2 2 4 4 5;px:10 20 20.1 40 40.1 50;qty:100 200 200.1 400 400.1 500)

q)exec($"time",/:string asc exec distinct time fromt1)#(($“time”,/:string time)!qty) by px:px from t1

px | time1 time2 time4 time5

----|-----------------------

10 | 100

20 |       200

20.1|      200.1

40 |             400

40.1|            400.1

50 |                  500

The main difference here from the example on thekx wiki is in getting the column names column names need to be symbols, sothe integers in t1 have to be converted to strings and then cast to symbols. Inthis case you are joining time to each of the times, so would have to do thisanyway.

In this version, times that are not present in t1 (e.g. time3) are absent fromthe result. The example you posted had it present. To do this, you can dosomething like this:

q)exec($"time",/:string 1+til exec max time fromt1)#(($“time”,/:string time)!qty) by px:px from t1

px | time1 time2 time3 time4 time5

----|-----------------------------

10 | 100

20 |       200

20.1|      200.1

40 |                  400

40.1|                  400.1

50 |                        500

Here instead of getting the distinct times fromt1, you simply get the maximum and generate the numbers from 1 up to thatnumber, and use these as the column names in the result.

Hope this helps

Jonathon

On Tuesday, January 31, 2017 at 12:17:45 PM UTC, VA wrote:

Good day Q-Gods,

I have an interesting question which I have been struggling with for a while now. I have two tables (t1, t2)

t1:

time px qty

1 10 100

2 20 200

2 20.1 200.1

4 40 400

4 40.1 400.1

5 50 500

t2 is a “derivative” of t1. Please note that px are not unique, as well as time steps; they repeat sometimes.

t2 has the px’s copied and sorted in ascending order. px column in t2 is a fixed column. The following columns is time, with each row reserved for each distinctive price.

t2:

px time1 time2 time3

10

20

20.1

40

40.1

50

SO what I want to do is to have some way to scan through t1 take qty from t1 and record it into t2 at the correct time(x), i.e. time1 or time2 or time3 at the correct price level, i.e. 10 or 40 or 40.1

I have tried working with each left and each right: “=/::” and then using “where” to get indices and then working with these indices. But it doesn’t work 100% right at times, and sometimes I get a list of indices not an integer alone and I have no idea how to proceed from then on, i.e. how do split this list into individual integers and then write them into t2 etc etc.

So what I would like to have is:

px time1 time2 time3 time4 time5

10 100

20 200

20.1 200.1

40 400

40.1 400.1

50 500

If anyone would be so kind and help me with this, it would be vert much appreciated. I have been trying to work this out myself for a few weeks now.

Flying1 · March 9, 2017, 2:53am

Hi VA,

You can guesstimate’ the amount of memory you need in order to accommodate the resultset you are trying to build.

How many distinct values do you have with exec max time from t1? How many unique px do you have in t? With these two parameters, you can already estimate whether it is realistic for you to hold such dataset in memory in the first place (not to mention the address space required for the original data in t, as well as some extra memory required during data processing).

Cheers,

Flying

Topic		Replies	Views
Transforming table: values into column names Community Support kdb-and-q	18	16	January 1, 2010
How to combine two tables Community Support kdb-and-q	7	5	July 9, 2009
Pivot Community Support kdb-and-q	3	4	December 2, 2009
SQL Subset? Community Support kdb-and-q	17	6	May 16, 2008
What is the equivalent of rownum Community Support kdb-and-q	6	5	August 11, 2009

intersting &quot;function&quot; question

Related topics

intersting "function" question