loadfile:{[file] cmd:" " sv (“./ORC-1.5.3-Linux/bin/orc-contents”;1_string file); / returns 1 JSON document per line tmp:.j.k each system cmd; tmp:columnnames xcol tmp; item:item,tmp; }
loadfile each filelist;
I wanted to ask 2 things:
1/ Is there a canonical way to preserve the schema for each json document? My simplistic , doesn’t seem to work that way.
2/ I understand using named pipes [2] instead could improve performance. I fail however to use .Q.fps in combination with .j.k. .. all the samples I saw so far are about using CSV and they are using the (“”!“,”) syntax. I fail to understand how I could use .j.k instead ..
I’m not sure what you mean by preserving the schema, if each document has a different schema how do you plan to store them if they don’t conform to a common schema?
On your point about .Q.fps, the easiest way to figure out what to do with the stream of data is to extract out a chunk and inspect it, like so:
.Q.fps[{$[not global in key .;global set x;()]};:/path/to/myFIFO]
Then you can inspect the global to see what a chunk of the stream looks like.
It turns out though that .j.k only returns C or f types ..
And thus I am not able to upsert into my `item definition.
What is the best way to align the types of z and my target table?
Cheers, Mathias
On Thursday, December 20, 2018 at 11:04:09 PM UTC+9, TerryLynch wrote:
I’m not sure what you mean by preserving the schema, if each document has a different schema how do you plan to store them if they don’t conform to a common schema?
On your point about .Q.fps, the easiest way to figure out what to do with the stream of data is to extract out a chunk and inspect it, like so:
.Q.fps[{$[not global in key .;global set x;()]};:/path/to/myFIFO]
Then you can inspect the global to see what a chunk of the stream looks like.
In theory the “J”$ should work. Are you 100% certain that every value in the “string_in_json_but_actually_a_long” column is a string of type 10h? Using meta doesn’t tell the whole story, it only reports the types of the first row. You would need to run “distinct type each z`string_in_json_but_actually_a_long” to be certain.
Values that are “null” in json will actually get parsed as null floats (0n) even if the column is supposed to be a string column. I suspect your column might have a mixture of strings and null floats and the null floats are erroring your attempt to cast using “J”$