KDB instances using shared memory

Hi,

I tried asking this question on Stack Overflow. I am interested in having two (or more) KDB instances use a read-only table loaded into shared memory. My requirement is to load the data (table) once and cache it there. If I could put the table into read-only (immutable) mode, multiple readers could use the data without loading it from disk again.

Can this be done?

thanks

if you can store your data as a non-compressed splayed table on disk, e.g.

`:t/ set .Q.en[`:.;t]

and load it into each process with

sym:get`:sym

t:get`:t

the pages should be shared between the processes.

Thank you for your reply. If I understood correctly, your solution relies on OS page caching?

Is there true shared memory space where I can instruct KDB instances that table X is laid out at the offset Y
and should be treated as effectively read-only?

yes, it uses the page cache, and allows the same physical pages to be shared between processes. What are your concerns in using this method?

Regarding read-only - you can set kdb+ to be read-only using the -b command line option

http://code.kx.com/wiki/Reference/Cmdlineb

or implement an access control layer

http://code.kx.com/wiki/Cookbook/AuthenticationAndAccessControl

or revert the data (remap) if changed (monitor with .z.vs)

http://code.kx.com/wiki/Reference/dotzdotvs

hth,

Charlie

The OS page cache is implicit, and pages can be evicted by the OS depending on system utilization. I wanted to build an in-memory caching layer where multiple KDB "compute" instances share table(s) and reduce disk I/O because data is pre-loaded for them. I want full control of caching and I/O.

Charles, could we discuss my use case offline, if you're interested at all?

Thank you

Hi,

you should read this article: http://varnish-cache.org/docs/trunk/phk/notes.html

I think it might be interesting for you.

regards, Markus

you can do it, but you'll have to dust off your C and OS skills.

serialization/deserialization: http://code.kx.com/wiki/Cookbook/InterfacingWithC#Serialization.2FDeserialization

shared mem:

http://man7.org/linux/man-pages/man7/shm_overview.7.html

once you have the name of the shared segment and the data is mapped at the right address, pass it to the other process by name.

It looks like d9(b9(-1,x)) is a (or the only?) null-safe way of doing a deep copy of types 0, KS, XT, XD using the C API? (It is a neat way, albeit expensive.)

Speaking of the C API, there is no mention of an API for retrieving the number of bytes a K object takes up. TorQ's objsize function is useful from q and can be called via k(), but it would be more convenient to have a C function.

>retrieving number of bytes a K object takes up.

there would be a small recursive answer in q using count, type and a map of types to sizes. The only problem is that where lists share references to other objects, you will find that the total size of all objects sums to greater than the process memory usage. Once the q function is known, it can be translated to C - or just call k(0,"{my small size function}",r1(x)) if you're in a shared object.

but if you want to know how much memory an object uses under q's memory allocation scheme (http://code.kx.com/wiki/DotQ/DotQDotgc), each list's size will have to be rounded up to the next larger memory block.

As Jay mentioned, that function is already implemented here: 

https://github.com/AquaQAnalytics/TorQ/blob/master/code/common/memusage.q

It gets a bit involved and has to make some assumptions when dealing with complex objects and large tables containing nested lists. There's a write-up of it here: http://www.aquaq.co.uk/q/adventure-in-retrieving-memory-size-of-kdb-object/

It seems to work quite well - I've used it in production setups to measure object size.

Cheers

Ryan
