Reading KDB serialized data from file?

https://learninghub.kx.com/forums/topic/reading-kdb-serialized-data-from-file

Hi,

I'm working with an existing codebase which uses KDB serialization for multiple purposes.

This existing code is written in a language which is not officially supported by KDB/KX.

It seems that a number of years ago, someone implemented KDB serialization and deserialization in a library. Since then, this library has been used to serialize data for a number of different systems, one of which is a KDB+ server.

This library is also used to serialize data to files on disk, for example.

Unfortunately, the performance isn't very good. This appears to be a language runtime problem and not a KDB serialization format problem.

I think that changing from this language to something like C++/C may result in improved performance.

I believe there is a C/C++ library for KDB which can be used to write KDB clients using C/C++.

However, I don't know if this library can be used to read and write data from disk.

I had a look at this reference, and didn't see anything which looked like it might be relevant for reading and writing files from disk containing data which has been serialized in KDB format.

- https://code.kx.com/q/interfaces/c-client-for-q/

Does anyone know if this is possible?

I am trying to avoid having to re-write the existing library in C++ myself. I could do this, but if there is a faster route, it doesn't make much sense to re-invent the wheel.

What is your use case for serialising to disk? What is intended to read this data, and must it be in kdb serialised format? I assume the serialisation currently used by this process is to send/receive data over IPC?

I believe re-using the same serialization format as is used for KDB was just for convenience. A function existed to serialize data. It was probably simply reused for serializing to disk.

The C API has b9/d9 to serialise/deserialise (e.g. https://code.kx.com/q/interfaces/capiref/#b9-serialize ) if you wish to turn anything into bytes. Since the result is just bytes, it can be written to and read from disk in the normal manner if needed. I hope that helps in some way. I'm not clear enough on the use case to know what it's required for, or how the application uses whatever serialised data it or something else generated.
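To make the "it is just bytes" point concrete, here is a minimal sketch in plain C that inspects the 8-byte header at the start of an IPC-serialized message, which is the form `b9` produces and `d9` consumes. It deliberately does not link against the KX C library, so `ipc_header` and `parse_ipc_header` are hypothetical names of my own, not part of `k.h`. It assumes the usual header layout (byte 0 endianness, byte 1 message type, byte 2 compression flag, bytes 4-7 total length including the header) and ignores any extended-length use of byte 3 for very large messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the 8-byte header that precedes an IPC-serialized message.
   Hypothetical struct for illustration; not part of the KX C API. */
typedef struct {
    int little_endian;   /* byte 0: 1 = little-endian, 0 = big-endian */
    int msg_type;        /* byte 1: 0 = async, 1 = sync, 2 = response */
    int compressed;      /* byte 2: non-zero if the payload is compressed */
    uint32_t total_len;  /* bytes 4-7: message length including the header */
} ipc_header;

/* Parse the header from buf (n bytes available).
   Returns 0 on success, -1 if the buffer is too short or looks invalid. */
int parse_ipc_header(const unsigned char *buf, size_t n, ipc_header *out)
{
    assert(buf != NULL && out != NULL);
    if (n < 8 || buf[0] > 1)
        return -1;

    out->little_endian = buf[0];
    out->msg_type = buf[1];
    out->compressed = buf[2];

    /* The length field is stored in the byte order named by byte 0. */
    if (buf[0] == 1)
        out->total_len = (uint32_t)buf[4] | ((uint32_t)buf[5] << 8) |
                         ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);
    else
        out->total_len = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                         ((uint32_t)buf[6] << 8) | (uint32_t)buf[7];

    /* A valid message is at least as long as its own header. */
    return out->total_len >= 8 ? 0 : -1;
}
```

For example, the 13 bytes `01 00 00 00 0d 00 00 00 fa 01 00 00 00` (what q's `-8!1i` yields for an int atom) parse as little-endian with a total length of 13. With the real library you would take the byte vector returned by `b9` and write its contents to a file with ordinary file I/O.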

I think those are for IPC serialization, which is completely different from the on-disk format. Some types are represented on disk with multiple files - how would that information come back from the serialization function or be passed into the deserialization function if they only return/accept one byte array?

Yes, they are completely different. That was the reasoning behind my questions about why and what it was being used for, and what you are trying to replicate from the original use case, which I'm not clear on. Was it some reverse engineering of the on-disk format, or was it something like an in-house use for recording IPC messages, etc.?

I think this is the right function to use. I guess in this context IPC, meaning InterProcess Communication, means communication via network sockets. In that case, this function appears to be an equivalent to the one which someone has manually implemented in the codebase I am looking at. That function is used to talk to KDB instances via a network socket. The same logic has been repurposed for writing data to disk.
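If the files written by the existing library are simply IPC-format messages stored back to back (an assumption; the thread does not confirm how the files are laid out), then reading them in C needs nothing beyond the length field in each 8-byte header. The sketch below shows that idea with standard file I/O. The names `append_message` and `read_message` are hypothetical, and the buffers returned would still need to be passed to something like `d9` to become usable objects.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Total message length from bytes 4-7 of the header,
   honouring the endianness flag in byte 0. */
static uint32_t header_length(const unsigned char h[8])
{
    if (h[0] == 1)
        return (uint32_t)h[4] | ((uint32_t)h[5] << 8) |
               ((uint32_t)h[6] << 16) | ((uint32_t)h[7] << 24);
    return ((uint32_t)h[4] << 24) | ((uint32_t)h[5] << 16) |
           ((uint32_t)h[6] << 8) | (uint32_t)h[7];
}

/* Write one already-serialized message (header included) to fp.
   Returns 0 on success, -1 on a short write. */
int append_message(FILE *fp, const unsigned char *msg, size_t n)
{
    assert(fp != NULL && msg != NULL);
    return fwrite(msg, 1, n, fp) == n ? 0 : -1;
}

/* Read the next message from fp into a freshly allocated buffer.
   On success returns the buffer (caller frees) and stores its size in *n.
   Returns NULL at end of file, on a truncated message, or on allocation failure. */
unsigned char *read_message(FILE *fp, size_t *n)
{
    unsigned char hdr[8];
    assert(fp != NULL && n != NULL);

    if (fread(hdr, 1, sizeof hdr, fp) != sizeof hdr)
        return NULL;

    uint32_t len = header_length(hdr);
    if (len < sizeof hdr)
        return NULL;

    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return NULL;

    /* Keep the header in the buffer so the result is a complete message. */
    memcpy(buf, hdr, sizeof hdr);
    if (fread(buf + sizeof hdr, 1, len - sizeof hdr, fp) != len - sizeof hdr) {
        free(buf);
        return NULL;
    }
    *n = len;
    return buf;
}
```

Calling `read_message` in a loop until it returns NULL walks the file one message at a time. This only applies to IPC-format bytes; it says nothing about q's native on-disk formats discussed earlier in the thread.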