kdb+ v3.5 has now reached production status, and the 32-bit version is available on kx.com. Licensed customers should obtain the 64-bit version from their company representatives.
kdb+ v3.5 Highlights
Enhanced Debugger
In kdb+ v3.5, the debugger has been extended to include the backtrace of the q call stack, including the current line being executed, the filename, line and character offset of code, with a visual indicator (caret) pointing to the operator which failed. The operator and arguments may be captured programmatically for further propagation in error reporting. Backtraces may also be printed at any point by inserting the .Q.bt command in your code. Please see http://code.kx.com/qref/reference/debug for further details.
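As a rough illustration of capturing an error and its backtrace programmatically, the sketch below uses .Q.trp and .Q.sbt, which are assumed here to be available in your v3.5 build. The names f, handler and r are hypothetical and exist only for this example.

```q
/ f fails with a type error: a symbol cannot be added to a long
f:{`hello+x}

/ the handler receives the error string (err) and a backtrace object (bt);
/ .Q.sbt formats the backtrace as a string, which is written to stderr here
handler:{[err;bt] -2 "error: ",err,"\nbacktrace:\n",.Q.sbt bt; `failed}

/ .Q.trp[function;argument;handler] behaves like trap, but also passes the backtrace
r:.Q.trp[f;2;handler]
```

Here r holds whatever the handler returns (the symbol `failed in this sketch), so the caller can log the backtrace and continue rather than drop into the debugger.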
Concurrent Memory Allocator
kdb+ v3.5 has an improved memory allocator which allows memory to be used across threads without the overhead of serialization. The use cases for (multithreaded) peach therefore now expand to include large result sets.
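A minimal sketch of the kind of workload this targets: each task returns a large vector to the main thread. The sizes are arbitrary, the names sizes and r are hypothetical, and the process is assumed to have been started with secondary threads (e.g. q -s 4). Without them, peach simply behaves like each.

```q
/ four tasks, each returning a vector of one million floats
sizes:4#1000000

/ with secondary threads enabled, the tasks are distributed across threads
r:{x?1f} peach sizes

/ one result vector per task
count each r
```

Timing the peach line with \t under v3.4 and v3.5 on your own hardware is the simplest way to see whether this change matters for your workload.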
Socket Sharding
kdb+ v3.5 introduces a new feature that enables the use of the SO_REUSEPORT socket option, which is available in newer versions of many operating systems, including Linux (kernel version 3.9 and later). This socket option allows multiple sockets (kdb+ processes) to listen on the same IP address and port combination. The kernel then load-balances incoming connections across the processes.
When the SO_REUSEPORT option is not enabled, a single kdb+ process receives incoming connections on the socket.
With the SO_REUSEPORT option enabled, there can be multiple processes listening on an IP address and port combination. The kernel determines which available socket listener (and by implication, which process) gets the connection. This can reduce lock contention between processes accepting new connections, and improve performance on multicore systems. However, it can also mean that when a process is stalled by a blocking operation, the block affects not only connections that the process has already accepted, but also connection requests that the kernel has assigned to the process since it became blocked.
To enable the SO_REUSEPORT socket option, add the new reuseport parameter (rp) to the listen directive of the \p command, or to the -p command-line argument, e.g.
q)\p rp,5000
Use cases include coarse load-balancing and HA/failover.
N.B. when using socket sharding (e.g. -p rp,5000) the unix domain socket (uds) is not active; this is deliberate and not expected to change.
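One way to observe the kernel's load-balancing is to start several kdb+ processes with \p rp,5000 on the same host and then open a series of short-lived connections from a client, asking each time which process answered. The sketch below is illustrative only: whoServed is a hypothetical helper, and port 5000 is taken from the example above.

```q
/ open n short-lived connections to the given port and
/ return the process id (.z.i) of whichever listener accepted each one
whoServed:{[port;n] {[p;i] h:hopen p; pid:h".z.i"; hclose h; pid}[port] each til n}

/ example use, once the listeners are running:
/ count each group whoServed[5000;20]
```

The distribution across process ids is decided by the kernel, so an even split should not be assumed.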
Improved sort performance
kdb+ uses a hybrid sort, selecting the algorithm it deems best for the data type, size and domain of the input. With kdb+ v3.5, this has been tweaked to significantly improve the sort performance of certain distributions, typically those including a null. e.g.
q)a:@[10000001?100000;0;:;0N];system"t iasc a" / 5x faster than v3.4
Improved search performance
kdb+ v3.5 significantly improves the performance of bin, find, distinct and various joins for large inputs, particularly for multi-column input. The larger the data set, the better the performance improvement compared to previous versions. e.g.
q)nn:166*n:60000;v1:50?v2:neg[100]?`2;t1:`c1`c2`c3#n?t2:([]c1:`g#nn?v1;c2:nn?v1;c3:nn?v2;val:nn?100);system"ts t1 lj 3!t2" / 100x faster than v3.4
q)a:-1234567890 123456789,100000?10;b:1000?a;system each("ts:100 distinct a";"ts:1000 a?b") / 30% faster than v3.4
NUCs - Not Upwardly Compatible
Although we have tried to make the process of upgrading seamless, please pay attention to the following NUCs to consider whether they impact your particular installation:
added ujf (new keyword) which mimics the behaviour of uj from v2.x, i.e. that it fills from lhs. e.g.
q)([a:1 2 3]b:2 3 7;c:10 20 30;d:"WEC")~([a:1 2]b:2 3;c:5 7;d:"WE")ujf([a:1 2 3]b:2 3 7;c:10 20 30;d:"  C")
constants limit in lambdas reduced from 96 to 95; could cause existing user code to throw 'constants error. e.g.
q)value raze"{",(string[10+til 96],\:";"),"}"
now uses abstract namespace for unix domain sockets on linux to avoid file permission issues in /tmp.
N.B. hence 3.5 cannot connect to 3.4 using uds. e.g.
q)hopen`:unix://5000
comments no longer stripped from the function text by the tokenizer (-4!x); they can be stripped explicitly from the -4! result with
q){x where not(1<count each x)&x[;0]in" /\t\n"} -4!"/a comment\n{2+ 3; /another comment\n3\n\t/yet another\n \n} /and one more"
the structure of the result of value lambda, e.g. value {x+y}, is:
(bytecode;parameters;locals;(namespace,globals);constants[0];…;constants[n];m;n;f;l;s)
where
m: bytecode to source position map, -1 if position unknown
n: fully qualified (with namespace) function name as a string, set on first global assignment, with @ appended for inner lambdas. () if n/a
f: full path to the file where the function originated from, "" if n/a
l: line number in said file, -1 if n/a
s: source code
this structure is subject to change.
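As a small sketch of how the lambda structure described above might be inspected, the example below pulls out the parameters, locals and the trailing five items. The names f, v and info are hypothetical, and because the layout is subject to change, code that depends on these positions should be treated as fragile.

```q
f:{[a;b]c:a+b;c*2}
v:value f

v 1        / parameters: `a`b
v 2        / locals: ,`c

/ the last five items are m, n, f, l and s, in that order
info:`m`n`f`l`s!-5#v
info`s     / source text: "{[a;b]c:a+b;c*2}"
```

Taking the last five items with -5# avoids having to know how many constants the lambda contains.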
Suggested upgrade process
Even though we have run a wide range of tests on kdb+ v3.5, and various customers have been kind enough to repeatedly run their own tests during the last few months of development, users who wish to upgrade to v3.5 should run their own tests on their own data and code/queries before promoting it to production use.