slow performance of win32 version of KDB

Hi, I like to test some simple thing on windows version of 32bit KDB, I inputted a table with 99722 row, with date, sym and other 15 columns of numeric data. However, if I just take a simple average of that 15 column, it take more than 3 minutes to return, will anyone have any clue about why it is that slow? How can I start to analysis this performance issue?

post your code plz. i can run mine 1k times in 1.7 sec

n:99722;

tab:(

  date:n#.z.D;

  sym:n#`AAA;

  f1:1.0*til n; f2:1.0*til n; f3:1.0*til n; f4:1.0*til n; f5:1.0*til n; 

  f6:1.0*til n; f7:1.0*til n; f8:1.0*til n; f9:1.0*til n; f10:1.0*til n; 

  f11:1.0*til n; f12:1.0*til n; f13:1.0*til n; f14:1.0*til n; f15:1.0*til n);

\ts:1000 select

avg f1,avg f2,avg f3,avg f4,avg f5,

avg f6,avg f7,avg f8,avg f9,avg f10,

avg f11,avg f12,avg f13,avg f14,avg f15

from tab

q) \ts:1000 select

avg f1,avg f2,avg f3,avg f4,avg f5,

avg f6,avg f7,avg f8,avg f9,avg f10,

avg f11,avg f12,avg f13,avg f14,avg f15

from tab

1754j, 4304j

Sorry, just see your reply, I just check with this simple query

select avg c1, avg c2, avg c3, avg c4 , …  by sym from tab where date in dates, sym in syms

dates is a list of date in the partition table, I put only 10 days of data , with total 99722 rows.

syms in a list of symbol in the table, which is about 5000 symbols 

What does “meta tab” show?

Hi Yan, it look like this

| c | t | f | a |
| date | d | | |
| sym | s | | p |
| c1 | f | | |
| c2 | f | | |
| c3 | f | | |
| c4 | f | | |
| c5 | j | | |
| c6 | i | | |
| c7 | e | | |
| c7 | e | | |
| c9 | j | | |
| c10 | f | | |
| c11 | f | | |
| c12 | f | | |
| c13 | f | | |
| c14 | j | | |
| c15 | f | | |
| c16 | f | | |
| c17 | f | | |
| | | | |

On Monday, September 14, 2015 at 4:30:17 PM UTC+8, Yan Yan wrote:

What does “meta tab” show?

Hi Carfield,

p# - partition attribute should be used for tables with tens of millions of rows. I see that your table probably has just ~ 2rows per each sym, p# performance will be awful in this case.

I’d suggest to switch to `g# - group attribute. Or use sort attribute if appropriate.

WBR, Andrey Kozyrev.

???, 15 ??? 2015 ?., 10:54:07 UTC+3 ??? Carfield Yim ???:

Hi Yan, it look like this

| c | t | f | a |
| date | d | | |
| sym | s | | p |
| c1 | f | | |
| c2 | f | | |
| c3 | f | | |
| c4 | f | | |
| c5 | j | | |
| c6 | i | | |
| c7 | e | | |
| c7 | e | | |
| c9 | j | | |
| c10 | f | | |
| c11 | f | | |
| c12 | f | | |
| c13 | f | | |
| c14 | j | | |
| c15 | f | | |
| c16 | f | | |
| c17 | f | | |
| | | | |

On Monday, September 14, 2015 at 4:30:17 PM UTC+8, Yan Yan wrote:

What does “meta tab” show?

I have changed the 2nd c7 to c8. The avg query took 870 ms only. Carfield asked for less than 100k rows.

/ simple avg of partitioned db


dbDir:`:/tmp/hdb;

system "cd ",1_string dbDir;


genDay:{

data:(

date:x#y;

sym:{`$“sym”,“0”^-4$string x} each til x;

c1:“f”$til x;

c2:“f”$til x;

c3:“f”$til x;

c4:“f”$til x;

c5:“j”$til x;

c6:“i”$til x;

c7:“e”$til x;

c8:“e”$til x;

c9:“j”$til x;

c10:“f”$til x;

c11:“f”$til x;

c12:“f”$til x;

c13:“f”$til x;

c14:“j”$til x;

c15:“f”$til x;

c16:“f”$til x;

c17:“f”$til x);

update p#sym from sym xasc data};


writeDay:{

x:update p#sym from sym xasc x;

tabDir: sv dbDir, ($string x[`date][0]), `tab, `;

show tabDir;

tabDir set .Q.en[dbDir] x;};


writeDay each genDay[10000;] each 2015.01.01+til 10;

system "l ",1_string dbDir;


\ts select

avg c1, avg c2, avg c3, avg c4, avg c5,

avg c6, avg c7, avg c8, avg c9, avg c10,

avg c11, avg c12, avg c13, avg c14, avg c15,

avg c16, avg c17

by sym

from tab

where (date in 2015.01.01+til 10),(sym in 5000#sym)


/ 870j, 21835264j

An O(n^2) algorithm on 100k rows will probably take 3 minutes.