Interacting with C++

see
http://www.cplusplus.com/reference/string/string/c_str/

char buf1 = 'a';

is not a null-terminated string. So for that you would use sn(&buf1,1) to create a symbol, or kpn(&buf1,1) to create a char vector.
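For example, a rough sketch against the C API in k.h (assumes KXVER 3 and c.o/c.dll from Kx; reference counting shown only for objects that aren't passed on):

#define KXVER 3
#include "k.h"

void example(void)
{
    char buf1 = 'a';          /* a single char, not a C string */

    S s   = sn(&buf1, 1);     /* intern exactly one character as a symbol */
    K sym = ks(s);            /* symbol atom usable as a K object */
    K str = kpn(&buf1, 1);    /* char vector of length 1 */

    /* ... pass sym/str on to k() etc., or release them: */
    r0(sym); r0(str);
}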

If you just want to insert all the fields into a table, it’s perhaps easier to use a list rather than a dictionary. That way you can just pass a table name and a table row
to a function and then call insert in kdb+.

To take an example from the Kx Wiki:

/ Build a table row as a list of K objects
K row = knk(3, ks((S)"ibm"), kf(93.5), ki(300));

/ Call the function with the table name and the row data.
/ The upd function will append the row onto the trade table
k(-kdbSocketHandle, "func", ks((S)"trade"), row, (K)0);

Then in kdb+ you would have your function that could be defined as:

func:{[t;x] t insert x; }
or
func:insert

I meant to say thank you Mark and Charles. For a variable var1 in C++ of type string, to access this as a symbol in kdb I needed to do ks(strdup(var1.c_str())). Furthermore, when I used this to create a dictionary value (e.g. the 13th value):

kK(val)[12] = ks(strdup(var1.c_str()));

the assignment from ks was not accepted. To get around this problem, I used what Mark mentioned on how to create lists and created a one-element list:

kK(val)[12] = knk(1, ks(strdup(var1.c_str())));

and it worked. Maybe not the best thing to do, but it worked.
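For reference, that workaround looks roughly like this (a sketch only: val and var1 follow the post, everything else is illustrative; if I recall correctly ks() interns its argument, so the strdup() is probably not strictly needed):

#define KXVER 3
#include <string>
#include "k.h"

void fill_slots(K val)                       /* val: mixed list of 13 value slots */
{
    std::string var1 = "IBM.N";              /* illustrative value */

    /* direct atom assignment: */
    /* kK(val)[12] = ks((S)var1.c_str()); */

    /* the workaround from the post: wrap the atom in a one-element list */
    kK(val)[12] = knk(1, ks((S)var1.c_str()));
}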

Hi,

Coming back to the question of transferring data from C++ to kdb real time.

The process I have for doing this: first the feed handler receives the data, parses it, and publishes it to a port using zmq.

I listen to the data in my C++ application, and for each message update from the port I create a K dictionary of the variables I want (approx. 13 variables), then I send this dictionary across to kdb as follows:

K result = k(handle, "myfunc", dict, (K)0);

where handle takes me to a kdb session on the localhost on a specific port. The function myfunc takes the dictionary dict, extracts a couple of longs and converts them to datetimes, then inserts all the values at the end of a pre-defined table.
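For reference, the dictionary is built and sent roughly like this (a simplified sketch with made-up field names and only 3 fields; the real code has about 13):

#define KXVER 3
#include "k.h"

K send_update(I handle)
{
    K keys = ktn(KS, 3);                     /* symbol keys */
    kS(keys)[0] = ss((S)"sym");
    kS(keys)[1] = ss((S)"price");
    kS(keys)[2] = ss((S)"recvTime");

    /* mixed values: symbol, float, and a long the q side converts to a datetime */
    K vals = knk(3, ks((S)"IBM.N"), kf(93.5), kj(1400000000000000000LL));

    K dict = xD(keys, vals);
    return k(handle, (S)"myfunc", dict, (K)0);   /* sync call, as above */
}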

The problem I have is that this process is too slow to consume the data being published to the port from the zmq, which means my application is missing lots of packets. Actually it's missing 70% of packets, it is that slow. In a run I did earlier for 5 mins, the zmq published approx. 500,000 messages and my kdb application only recorded roughly 150,000, which equates to only 500 messages per second on average. When I remove the above line from the C++ application and just print the count of messages received to screen, there is no loss of packets.

Therefore the bottleneck must be the line above, i.e. sending the dictionary of values to kdb from C++ across the handle.

I would like to know: is sending a (small) dictionary across to kdb slow, and is this therefore a pretty inefficient way of transferring the data in real time?

An alternative way of doing it may be to append the dictionaries together in C++ (i.e. batching the data) and then, every 5 seconds say, send the batch across to kdb, therefore vastly reducing the number of times I call the above line?

Another problem is, if kdb crashes, then when I reopen it I will want to replay all the data into the table. So the current method is actually: have another application receive the data from the port and write it to an md file (binary), and have my application read and decode the md file, then send to kdb as mentioned before. This means that if kdb crashes, I can restart the application and all the data is replayed from the md file and written to kdb. The problem is that reading and decoding the binary adds an extra level of slowness.

Maybe my method here is generally bad. Another solution I thought of was to have a separate application write the 13 variables of each message update to a csv file and, every so often, have kdb read the csv file in using read0 or something. But I'm not keen on something like this because it seems more restrictive, plus I don't know how a realtime version of it would work, i.e. I can't keep reading in a csv every so often, as the csv will obviously grow very large by the end of the day.

The goal is: listen to the port, write the data using C++ to a file, then have C++ read this file constantly and send any updates to kdb. At the end of the day, save the table to disk, and repeat for the next day. I don't care about a few seconds or more of latency doing this.

Someone I spoke to who's familiar with MongoDB said threading would help here, I guess with multiple threads doing the transfer of data to kdb. Any thoughts?

I wouldn't have thought capturing 1,000 messages per second on average and sending them to kdb would be a problem?

best,

John.

 

Are you sending async (negative handle) or sync (positive handle)? Feeds should always send async.
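i.e. with the call from your post, something like this sketch:

#define KXVER 3
#include "k.h"

void publish_async(I handle, K dict)      /* dict built as before */
{
    /* positive handle: synchronous - blocks until kdb+ has processed the
       message and replied.
       negative handle: asynchronous - just queues the message and returns,
       which is what a feed should do.                                      */
    k(-handle, (S)"myfunc", dict, (K)0);
}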

In general, try to bulk up data into fewer messages - it's more efficient and means less CPU load.
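For example, accumulate the updates in column vectors on the C++ side and send one message per batch - a rough sketch only, reusing the func/trade names from the example earlier in the thread:

#define KXVER 3
#include "k.h"

void publish_batch(I handle, int n)       /* n updates buffered so far */
{
    K syms   = ktn(KS, n);                /* symbol column */
    K prices = ktn(KF, n);                /* float column  */

    for (int i = 0; i < n; i++) {         /* fill from your buffered updates */
        kS(syms)[i]   = ss((S)"IBM.N");
        kF(prices)[i] = 93.5;
    }

    /* one async message carrying n rows as a list of columns; on the q side
       `trade insert (syms;prices) appends all n rows in one go              */
    k(-handle, (S)"func", ks((S)"trade"), knk(2, syms, prices), (K)0);
}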

Ensure your socket buffer sizes (an OS setting) on your machines are sufficient for your payload and for any backlog if the receiver is temporarily busy with something else.

>Another problem is, if kdb crashes,…

That should be an exceptional case; tickerplants should never crash, and they allow for recovery of downstream components through re-subscription.

Usually feedhandlers push into the tickerplant, which does the logging, and the tp handles pub/sub for subscribers. The tp can also bulk up data and publish later. It sounds like you may be reinventing the wheel. See

http://code.kx.com/wiki/Startingkdbplus/tick