Hi Guys,
I have a simple function that uses zlib to compress a string. It works fine for strings of length, say, 1 million chars, but fails with strings that are larger than around 8.35 million chars.
“Why am I doing this?” I hear you ask… I want to take the output from .q.csv and compress it before passing it to a (web) client via something like .h.hn. This works great for small tables (around 8 MB of CSV) but fails for larger ones.
compress.c
#include <string.h> // for memcpy
#include <zlib.h>   // for compressBound, compress
#include "k.h"

// takes a string, compresses and returns byte array
K1(k_compress)
{
    if (xt != KC)
        return krr((S)"type");
    uLong l = xn;                          // length of input
    uLong cl = compressBound(l);           // upper bound on compressed length
    unsigned char tmp[cl];                 // temporary array (on the stack)
    if (0 == compress(tmp, &cl, kC(x), l)) // compress successful?
    {
        K y = ktn(KG, cl);                 // byte array of real compressed length
        memcpy(kG(y), &tmp, cl);           // copy compressed data into K object
        R y;                               // return byte array
    }
    R krr((S)"compress");                  // return error to client
}
Makefile
CCOPTS = -fno-builtin -Wall -g -fno-omit-frame-pointer -lz -std=c99 -shared -fPIC -DKXVER=3 -O3 -I../include
all:
	mkdir -p l32
	mkdir -p l64
	gcc $(CCOPTS) zlib.c -o l32/zlib.so -m32
	gcc $(CCOPTS) zlib.c -o l64/zlib.so
compress.q
/ note I have a CHOME environment variable defined
.zlib.compress:(`$getenv[`CHOME],"/lib/zlib/",string[.z.o],"/zlib")2:(`k_compress;1);
a:.zlib.compress 10000#"hello";
-1 string count a;
b:.zlib.compress 100000#"hello";
-1 string count b;
c:.zlib.compress 1000000#"hello";
-1 string count c;
d:.zlib.compress 10000000#"hello"; / crashes out
-1 string count d;
Output
44
175
1483
rlwrap: warning: taskset crashed, killed by SIGSEGV.
rlwrap itself has not crashed, but for transparency,
it will now kill itself (without dumping core) with the same signal
I’m using q/kdb+ 3.3; the segfault occurs with both the 32- and 64-bit binaries.
Note: the reason for the intermediate tmp variable is that otherwise we get a K byte array with a bunch of nulls at the end (cl is updated with the true compressed length as a side effect of compress()).
Any help or advice would be greatly appreciated; so far I’m completely stumped!
/Mark