xbar & timestamp manipulation

https://learninghub.kx.com/forums/topic/xbar-timestamp-manipulation

Hi,

I have a minute <DT(timestamp),O,H,L,C> table , from which I need to aggregate/process data for different intervals. In my case, as I need the equivalent of a (eg. hourly data from 10:00 to 11:00 must show on the output DT line yyyy.mm.ddD11:00:00), I use an offset value in the statement.

No problem for any interval <= daily and weekly, however I don’t understand what is going on for a ‘non-standard’ 2-day interval. Below is some basic data around 2013 new-year (01/01/13 is a Tuesday).

/Daily interval... OK 
10#select first O,last C by DT:(1 xbar DT.date)+0D16:00 from tt 
DT                           | O       C 
-----------------------------| --------------- 
2012.12.31D16:00:00.000000000| 1405.22 1425.69 
2013.01.02D16:00:00.000000000| 1426.19 1461.36 
2013.01.03D16:00:00.000000000| 1462.42 1459.07 
2013.01.04D16:00:00.000000000| 1459.37 1466.1 
2013.01.07D16:00:00.000000000| 1466.47 1461.77 
2013.01.08D16:00:00.000000000| 1461.89 1457.05 
2013.01.09D16:00:00.000000000| 1457.15 1461.04 
2013.01.10D16:00:00.000000000| 1461.02 1471.99 
2013.01.11D16:00:00.000000000| 1472.12 1471.71 
2013.01.14D16:00:00.000000000| 1472.05 1470.79 

/2-Day interval... odd output 
5#select first O, last C by DT:(2 xbar DT.date)+1D16:00 from tt 
DT                           | O       C 
-----------------------------| --------------- 
2012.12.31D16:00:00.000000000| 1405.22 1425.69 /ok 
2013.01.02D16:00:00.000000000| 1426.19 1459.07 /close?... should be 1461.36 (from 01.02) 
2013.01.04D16:00:00.000000000| 1459.37 1466.1 /open? ... should be 1462.42 (from 01.03) 
2013.01.06D16:00:00.000000000| 1466.47 1461.77 /Sunday data? taken from following Monday? 
2013.01.08D16:00:00.000000000| 1461.89 1461.04 /open? ... should be 1466.37 (from 01.07)

Questions:
- Isn’t expected to use ONLY the available DT.dates to build its bins, like it does for minute/hourly/daily/weekly intervals?

- How should the be built to yield, for either one of the following cases:

a) if using 2-day buckets of calendar days, including when market is closed (no data): 
DT                           | O       C 
-----------------------------| --------------- 
2012.12.31D16:00:00.000000000| 1405.22 1425.69 /from 12.31 to 01.01 (nothing on 01.01, so shows on 12.31) 
2013.01.03D16:00:00.000000000| 1426.19 1459.07 /from 01.02 to 01.03 
2013.01.04D16:00:00.000000000| 1459.37 1466.1 /from 01.04 to 01.05 (nothing on 01.05, so shows on 01.04) 
2013.01.07D16:00:00.000000000| 1466.47 1461.77 /from 01.06 to 01.07 (nothing on 01.06) 
2013.01.09D16:00:00.000000000| 1461.89 1461.04 /from 01.08 to 01.09 NOTE: In the above example, 2012.12.31 is 'day 0'; if 2012.12.31 is 'day 1', obviously different results, but same concept would apply 

b) OR... what I would prefer, if using 2-day buckets of available data: 
DT                           | O       C 
-----------------------------| --------------- 
2013.01.02D16:00:00.000000000| 1405.22 1461.36 /from 12.31 to 01.02 (nothing on 01.01) 
2013.01.04D16:00:00.000000000| 1462.42 1466.1 /from 01.03 to 01.04 
2013.01.08D16:00:00.000000000| 1466.47 1457.05 /from 01.07 to 01.08 (01.05 and 06 is a week-end) 
2013.01.10D16:00:00.000000000| 1457.15 1471.99 /from 01.09 to 01.10 
2013.01.14D16:00:00.000000000| 1472.12 1470.79 /from 01.11 to 01.14 (01.12 and 13 is a week-end)

Unless there is a relatively standard way to present data for ‘n-day’ intervals that I am unaware of, either one would be fine with me, as long as I understand how it works(!)

Thx
JP

 

I’m not sure what your complete data set looks like - but using a best guess, I get an expected result.

q)tt:([]DT:`timestamp$2012.12.31 2013.01.02 2013.01.03 2013.01.04 2013.01.07 2013.01.08 2013.01.09 2013.01.10 2013.01.11 2013.01.14;O:1405.22 1426.19 1462.42 1459.37 1466.47 1461.89 1457.15 1461.02 1472.12 1472.05;C:1425.69 1461.36 1459.07 1466.1 1461.77 1457.05 1461.04 1471.99 1471.71 1470.79) 
q)select first O,last C by DT:(2 xbar DT.date)+1D16:00 from tt 
DT                           | O       C 
-----------------------------| --------------- 
2013.01.01D16:00:00.000000000| 1405.22 1425.69 // open 31st, close 31st 
2013.01.03D16:00:00.000000000| 1426.19 1459.07 // open 2nd, close 3rd 
2013.01.05D16:00:00.000000000| 1459.37 1466.1 // open 4th, close 4th 
2013.01.07D16:00:00.000000000| 1466.47 1461.77 // open 7th, close 7th 
2013.01.09D16:00:00.000000000| 1461.89 1461.04 // open 8th, close 9th 
2013.01.11D16:00:00.000000000| 1461.02 1471.71 // open 10th, close 11th 
2013.01.15D16:00:00.000000000| 1472.05 1470.79 // open 14th, close 14th

I also don’t see how you can have 2012.12.31D16:00:00 in both results (1 day/2 day buckets), as it shouldn’t appear in the 2-day bucket query (unless you have some other row dated 2012.12.30 in your table you’re not showing us?)

q)(2 xbar 2012.12.31)+0D16:00 1D16:00 
2012.12.31D16:00:00.000000000 2013.01.01D16:00:00.000000000

This looks like some data quality issue somewhere rather than issue with xbar - if you can provide a small stand alone sample of your data that replicates the problem, I can take a look using that.

Feeling embarrassed in here… This is entirely my mistake; no data missing, but faulty statement transcription in the forum’s editor, while playing with the offset. The 2-day interval statement select does indeed yield the correct output.

Thank you for your time.