Optimal way of doing X-minute bars over a multi-day span

guerrero_diego · November 12, 2013, 9:00pm

Hello!

I’m new to this KDB world, so I’m looking for a little help trying to optimize/replace a query.

I’m developing a charting application that allows traders to look at data from several days or months, so I’m trying to hit the database in just one query to extract the data that I need.

For example, if I wanted to extract 60-minute bars for the whole month of October from the database I would use the following query:

{[start;end] dateminute xasc raze {dateminute xkey update date:x from select first price,max price,min price,last price,sum volume by 60 xbar time.minute from TABLE where date=x,ex=EXCHANGE,sym=SYMBOL} peach (start + til (end-start) + 1)}[2013.10.01;2013.10.31]

What are your thoughts on that query? Any tip for improving it?

Thanks,

Diego

qhuyvu · November 12, 2013, 11:52pm

It depends on your purpose but you might want to use 0D60 xbar time to maintain the time information.

manni_patel · November 13, 2013, 6:53am

Swap ex and sym in the order of the where clause. If you already use -s on the HDB, there’s no need for the peach - a "date in " does the same thing.

charlie1 · November 13, 2013, 7:35am

do you have parted attribute set on sym? If so that should probably be your first constraint after date. Also can use within.

{[start;end] select first price,max price,min price,last price,sum volume by date,60 xbar time.minute from TABLE where date within(start;end),sym=SYMBOL,ex=EXCHANGE}[2013.10.01;2013.10.31]

guerrero_diego · November 14, 2013, 7:23pm

Hey guys, thanks a lot for the help, I’m using Charles’ suggestion and it seems to be faster than my original query.

About your other suggestions, since I am still kind of new about KDB, I have some questions for you:

What does “0D60 xbar time” do?
What does the option -s do? I ask the guy in charge of KDB in the office and he didn’t know.
What does “parted attribute” stand for?

Thank you very much,

Diego

charlie1 · November 14, 2013, 8:50pm

>1. What does “0D60 xbar time” do?

0D60 is shorthand for a timespan of 60 hours - better not to rely on exceeding 23 hours in that field. Huy probably meant 0D01, but you’d also have to cast the rhs to timespan for that to work. It floors to multiples of that interval. Used for bucketing.

q)0D01 xbar `timespan$05:02:01.123

0D05:00:00.000000000

>2. What does the option -s do?

It’s a command line option that specifies how many slave threads kdb+ should use for “parallel each” (peach) operations. kdb+ can use multithreading to speed up queries in hdbs in this mode.

>3. What does “parted attribute” stand for?

Attributes can be supg - sorted, unique, parted, grouped; each one can allow optimizations within certain operations. Parted means the unique elements are grouped together in the vector.

e.g. q)`p#1 1 5 5 3 3

Usually hdb data is sorted by sym, then time. This lends itself to having the `p attribute on sym.

code.kx.com is the best and most accurate reference for this info, along with Arthur’s original documents

http://kx.com/q/d/q.htm

http://kx.com/q/d/kdb+.htm

And Dennis Shasha’s

http://kx.com/q/d/primer.htm

Topic		Replies	Views
Query Performance Community Support kdb-and-q	3	27	August 22, 2017
Question about the performance difference of two queries on HDB Community Support kdb-and-q	7	10	January 28, 2016
Calculation of bars according to the time range Community Support kdb-and-q	2	13	March 25, 2015
Optimal Database Structure Community Support kdb-and-q	8	11	January 7, 2015
kdb+ intro question Community Support kdb-and-q	2	4	June 28, 2014

Optimal way of doing X-minute bars over a multi-day span

Related topics