Hi,
This is a slightly complex question about KDB+ IPC Serialization.
I am working on updating some legacy code which implements the KDB+ wire protocol for IPC/network communication.
A programming language typically has access to OS System Calls to do things such as read and write files, read and write data to sockets, etc.
In the case of Linux systems, the OS function to receive data from a socket is recv.
This function reads data from a socket and writes it to a buffer in memory. It returns the number of bytes read from the socket, 0 if the peer has closed the connection, or -1 in the case of error.
It takes a parameter which specifies the maximum number of bytes to be read. Let us call this n.
However, there is no guarantee that n bytes of data will be read by a single call. recv returns whatever data is currently available, up to n bytes, and when calling it we do not know how much data has arrived or what it contains. (It could be junk.)
In addition, the data may be split across many packets on its way to the reader. I think I am correct in saying that in principle there are no restrictions on how finely the data can be split, all the way down to the minimum size, which I assume is one byte of client data per packet. (Obviously packets have TCP, IP and Ethernet headers, so the actual minimum packet size is much larger than a single byte, but the client data contained within can presumably be very small.)
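To make that concrete, this is roughly the loop I assume is needed to read a known number of bytes from a stream socket. It is only a sketch in Python (the legacy code is not Python), and recv_exact is a hypothetical helper name of my own, not something from the KDB+ documentation.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Keep calling recv until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        # recv may return fewer bytes than requested, even just one.
        chunk = sock.recv(n - len(buf))
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError(
                f"socket closed after {len(buf)} of {n} bytes"
            )
        buf.extend(chunk)
    return bytes(buf)
```

The loop only works if n is known in advance, which is exactly the part I am unsure about.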
On to the main part of my question: I am unsure as to how I should implement the logic relating to recv.
I looked at the specification for the KDB+ IPC protocol. I don’t understand how the serialization specification lets the reader know whether there is more data to be read from a socket or whether the message is complete.
In particular, KDB+ can apply compression to a block of data before sending. Presumably, in order to read a compressed block of data from a socket, we would need to know how many bytes of data need to be read before attempting to decompress the block of data.
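For reference, here is a sketch of how I currently imagine the reader working. It rests on my reading of the format, which may be wrong: I am assuming every message starts with a fixed 8-byte header, where byte 0 is the endianness (1 for little-endian), byte 1 is the message type, byte 2 is a compression flag, and bytes 4 to 7 hold the total message length including the header. I am also assuming that for a compressed message this length is the compressed size on the wire. The names parse_header and read_message are hypothetical, and byte 3 is ignored.

```python
import socket
import struct

HEADER_SIZE = 8  # assumption: every message starts with a fixed 8-byte header


def parse_header(header: bytes):
    """Return (little_endian, msg_type, compressed, total_length).

    Assumed layout: byte 0 endianness, byte 1 message type,
    byte 2 compression flag, bytes 4-7 total length including the header.
    """
    little_endian = header[0] == 1
    msg_type = header[1]
    compressed = header[2] == 1
    fmt = "<I" if little_endian else ">I"
    (total_length,) = struct.unpack(fmt, header[4:8])
    return little_endian, msg_type, compressed, total_length


def read_message(sock: socket.socket):
    """Read one whole message; return (header, body, compressed)."""

    def read_n(n: int) -> bytes:
        # Loop because a single recv may return fewer than n bytes.
        buf = bytearray()
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("socket closed mid-message")
            buf.extend(chunk)
        return bytes(buf)

    header = read_n(HEADER_SIZE)
    _, _, compressed, total_length = parse_header(header)
    if total_length < HEADER_SIZE:
        raise ValueError("length field is smaller than the header itself")
    # Only after the header is parsed do we know how much more to read.
    body = read_n(total_length - HEADER_SIZE)
    return header, body, compressed
```

If those assumptions hold, the body would only be handed to the decompressor once read_message has returned, so decompression would never see a partial block.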
Does this question make sense? Basically, I am asking how to determine how many bytes need to be read by 1 or more calls to recv before the reader knows the full message has been received.