Sorting half of a list

In C++ I can do a partial_sort(ptr1, ptr2, ptr3). After calling partial_sort. The data in [ptr1, ptr2) is sorted and smaller than those in [ptr2, ptr3). The data in [ptr2, ptr3) is left in undefined order.

http://www.cplusplus.com/reference/algorithm/partial\_sort/

http://www.cplusplus.com/reference/algorithm/partial\_sort\_copy/

How can I do this in kdb/q? Thx.

I don’t believe there is something like that already in q. 

Here is a quick function I just wrote:

q){(asc x[til m+1]),x[m+til (count x)-m:x?x[floor (count x)%2]]} (64 49 82 40 88 77 30 17 23 12 66)

40 49 64 77 82 88 77 30 17 23 12 66

I am sure it can be immensely improved. I will check back later when I have a better version!

Just noticed 77 is repeated. Here is a quick fix:

q){(asc x[til m+1]),x[(m+1)+til ((count x)-1)-m:x?x[floor (count x)%2]]} (64 49 82 40 88 77 30 17 23 12 66)

40 49 64 77 82 88 30 17 23 12 66

you can split the data above and below your partition, and then only sort the data below.

q)psort:{(asc y where x;y where not x>:y)}
q)show x:20?10
8 8 2 3 3 2 4 5 1 0 6 1 7 7 2 1 3 2 8 9

the function returns two lists so you can obtain the sorted data if you want

q)psort[5] x
`s#0 1 1 1 2 2 2 2 3 3 3 4
8 8 5 6 7 7 8 9

or raze the result to get a single list

q)raze psort[5] x
0 1 1 1 2 2 2 2 3 3 3 4 8 8 5 6 7 7 8 9

Nick: Where does the number 5 come from? In partial_sort(ptr1, ptr2, ptr3), the pivot value is NOT given. The sizes of the left partition and the right partition may be deduced from the 3 pointers. To get the pivot value, you will need something like nth_element(ptr1, ptr2, ptr3).

http://www.cplusplus.com/reference/algorithm/nth\_element/

using a c++ algorithm to solve kdb+ problems isn’t always the best solution. can you describe your problem, giving example inputs and outputs. q does not use iterators so the approaches will differ. 

q can only sort a whole list - so no performance benefit from only returning the first n sorted elements.

since there are no pointers in q, you’ll need to pass in 3 indices and the data to the function.

psort:{f;m;l;x#asc x f+til l-f}

given the following list:
q)x

4 8 7 7 7 6 5 3 2 6 4 2 4 9 5 5 2 5 1 4

we can ask for the next 5 ascending elements starting from index 5 using data through index 20:
?

q)psort[5;10;20] x
1 2 2 2 3

asking for 20 elements logically returns the sorted list:

q)psort[0;20;20] x
`s#1 2 2 2 3 4 4 4 4 5 5 5 5 6 6 7 7 7 8 9

I think selecting the rows of the first 10% of symbols is faster in the average case. The worst case is still O(n log n) when the quotes table has only 1 symbol. I can also iterate thru all the quotes in 10 batches incrementally.

quote: genSampleQuoteMb[800];

update g#sym from quote; / O(n) incremental index construction

symList: `sym xasc select distinct sym from quote; / distinct symbol list

symList: `sym xasc select from symList where i<0.1*count symList; / first 10% of symbols

selectedQuote: symtime xasc select from quote where sym in symList[`sym]; / quotes of the first 10% of symbols, not necessarily 10% of row count but OK

symtime xasc `quote; / O(n log n) batch sorting

btw Is applying the grouped attribute really O(n)? Is it building a hash table? It seems to be faster than sorting the whole table.

On Thursday, February 5, 2015 at 10:50:37 AM UTC-5, Himanshu Gupta wrote:

I don’t believe there is something like that already in q. 

Something like this is very easy in PyQ:

>>> q()  #  switch to q mode

q)x:-10?10   / shuffle 0-9

q)x

1 8 5 7 0 3 6 4 2 9 

q)\

>>> import numpy

>>> numpy.asarray(q.x)[:5].sort()

>>> q.x

k(‘0 1 5 7 8 3 6 4 2 9’)

>>> q()  # just to be sure we didn’t just sort a copy

q)x

0 1 5 7 8 3 6 4 2 9

Does the embedded kdb in PyQ listen on any network port?

On Monday, February 9, 2015 at 1:04:41 AM UTC-5, Yan Yan wrote:

Does the embedded kdb in PyQ listen on any network port?

PyQ embeds Python in kdb+, so there is no special “embedded kdb.” You get the standard kdb+ environment plus the Python interpreter.  The tricky part is not to make kdb+ listen on the network port, but to make non-blocking calls to Python from within kdb+ event loop.

Here is an example that shows how to call python function from a kdb+ timer while serving kdb+ IPC on port 1111:

$ cat x.q

p)from pyq import q

p)def f():

      print “hello from python”

      return q(‘::’)

p)q.f = f

.z.ts:{f()}

\t 2000

$ q/m32/q x.q -p 1111 -q

hello from python

hello from python

hello from python

Now, while embedded Python prints a greeting every 2 seconds, you can interact with the same kdb+ process on localhost:1111.

This kind of usage is still considered experimental and you would typically want to use a Python web server like tornado with PyQ rather than the kdb+ built-in.