Interaction between peach and other optimisations

erichards · November 17, 2023, 12:00am

https://learninghub.kx.com/forums/topic/interaction-between-peach-and-other-optimisations

I understand there are various parallel optimisations that happen under the hood when running with some number of secondary threads, e.g. summing across multiple partitions. How do these interact with peach?

For example:

disk0/hdb/par.txt → disk1/hdb/partitions, disk2/hdb/partitions
disk1/hdb/partitions/1-3-5
disk2/hdb/partitions/2-4-6

If I ran a query such as

select sum price by sym where int within (1;4)

and I had two secondary threads available, thread #1 would retrieve data from partitions 1, 3 on disk 1, and thread #2 would retrieve data from partitions 2, 4 on disk 2 to maximise I/O throughput.

But if my queries were wrapped in peach, would this still be possible, given peach would be using all available threads, e.g.

{x[]} peach ( {select sum price by sym where int within (1;4)}; {select sum price by sym where int within (5;6)} )

So are there situations when using peach can reduce performance? Thank you

rocuinneagain · November 17, 2023, 12:00am

The parallelism can only go one layer deep.

i.e. these two statements end up executing the same path. In the first one, the inner peach can only run like an each, as it is already in a thread:

data:8#enlist til 1000000
\ts {{neg x} peach x} peach data
553 1968
\ts {{neg x} each x} peach data
551 1936

For queries, map-reduce is still used to reduce the memory load of your nested queries when they run inside a `peach`, even though the sub-parts are not run in parallel.

https://code.kx.com/q4m3/14_Introduction_to_Kdb%2B/#1437-map-reduce
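As a rough picture of the map-reduce idea (a conceptual sketch only, not how kdb+ implements it internally; `parts`, `s`, `c` and `mr` are made-up names), an average over partitioned data can be built from per-partition partial results, so only small partials need to be combined at the end:

```q
/ hypothetical data: three in-memory "partitions" of prices
parts:(1 2 3f;4 5f;6 7 8 9f)
/ map step: one partial sum and one partial count per partition
s:sum peach parts
c:count peach parts
/ reduce step: combine the small partial results
mr:(sum s)%sum c
/ compare with the average computed over all the data at once
mr~avg raze parts
```

With no secondary threads, or when already inside a thread, the `peach` here simply behaves like `each`; the partial results and the final answer are the same either way.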

Where you choose to put your peach can be important and change the performance of your execution.

My example actually runs better without peach, because the overhead of passing data around outweighs the cost of neg, which is a simple operation:

\ts {{neg x} each x} each data
348 91498576

.Q.fc exists to help in these cases:

\ts {.Q.fc[{neg x};x]} each data
19 67110432

https://code.kx.com/q/ref/dotq/#fc-parallel-on-cut
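A minimal sketch of what .Q.fc is doing, assuming a session started with secondary threads (e.g. `q -s 4`); the names `f`, `v`, `n` and `manual` are invented for this illustration. It cuts the vector into one chunk per secondary thread, applies the function to each chunk in parallel, and joins the results, so it should agree with a hand-rolled peach over chunks:

```q
/ hypothetical names: f, v, n, manual
f:{x*x}
v:til 1000000
/ number of chunks: the secondary thread count, but at least 1
n:1|system"s"
/ hand-rolled equivalent: cut into n chunks, apply f per chunk in parallel, join
manual:raze f peach (n;0N)#v
/ both should match applying f to the whole vector
manual~f v
.Q.fc[f;v]~f v
```

This only pays off when the function works on a whole chunk at once; the function must also return a result of the same length as its input for the joined output to line up with the original vector.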

In fact, since `neg` has native multithreading and operates on vectors and vectors of vectors, it is best left on its own:

\ts neg each data
5 67109216
\ts neg data
5 67109104
neg data

This example is of course extreme, but it does show that each use case deserves some thought about where to iterate and where to place `peach`.

erichards · November 17, 2023, 12:00am

I guess a more succinct version of my question is “what happens to native parallelisations when running queries inside an instance of peach?”

erichards · November 20, 2023, 12:00am

Many thanks for the reply and examples.

"in fact since `neg` has native multithreading and operates on vectors and vectors of vectors it is best of off left on it's own"

This is what I was keen to understand, and it's useful to know that there are cases when you may be better off without peach.

rocuinneagain · February 23, 2024, 12:00am

kdb+ 4.1 has been released with some interesting improvements for peach, which change some of my answers, as nesting is now supported:

https://code.kx.com/q//releases/ChangesIn4.1/#peachparallel-processing-enhancements
