Read JSON file output in KDB

https://learninghub.kx.com/forums/topic/read-json-file-output-in-kdb

Hello Team,

I have attached sample JSON file.

I have created a python code which reads json file as an input using pandas dataframe and returns dataframe contains information about all the employees.

requirement is i need to load output(dataFrame) of python script as an input to KDB function. i have used .p.get function to run the python function in KDB but the output i am getting is “dynamicLoad”.

Reading JSON file can be done directly in KDB but I wanted it to keep the reading json in python and accessing data in KDB to reduce workload.

 

Sample Code: python code: import pandas as pd; import json def readJsonFile(fp): with open(fp) as data_file: data = json.load(data_file) res = pd.json_normalize(data); return res

 

KDB code

readJSONFile:.p.get[``readJsonFile;<];

out:readJSONFile[fp];

Tried as like below

 

Can someone please hhelp me.

 

Thanks in advance..!!!

PyKX under q is replacing embedPy and is recommended if you are starting a new project (*Except for on Windows as support is still in development for this)

https://code.kx.com/pykx/2.2/pykx-under-q/intro.html

Install it with:

pip install pykx 
python -c "import pykx;pykx.install_into_QHOME()"

Then you can load in your python code (I placed it in a file called functions.py):

q)\l pykx.q 
q)loadPy:{.pykx.pyexec"exec(open('",(1_ string x),"').read())"} 
q)loadPy`:functions.py 
q)readJSONFile:.pykx.get[`readJsonFile;<] 
q)readJSONFile`sampleJSONFile.json 
Employees 
------------------------------------------------------------------------------- 
+`userId`jobTitleName`firstName`lastName`preferredFullName`employeeCode`regio.. 

q)(readJSONFile`sampleJSONFile.json)[0;`Employees] 

userId jobTitleName firstName lastName preferredFullName employeeCode region phoneNumber emailAddress 
--------------------------------------------------------------------------------------------------------------------- 
rirani Developer    Romin     Irani    Romin Irani       E1           CA     408-1234567 romin.k.irani@gmail.com 
nirani Developer    Neil      Irani    Neil Irani        E2           CA     408-1111111 neilrirani@gmail.com 
thanks Program Directory Tom  Hanks    Tom Hanks         E3           CA     408-2222222 tomhanks@gmail.com

 

One issue to watch for here though is all the data being brought back as symbols to q. For the lifetime of a q process all unique symbols are interned in memory and cannot be garage collected. You would need to update your code to convert columns to suitable datatypes. Text data should mostly be passed to q as bytes so it is converted to type C in q.

meta (readJSONFile`sampleJSONFile.json)[0;`Employees] 
c                | t f a 
-----------------| ----- 
userId           | s 
jobTitleName     | s 
firstName        | s 
lastName         | s 
preferredFullName| s 
employeeCode     | s 
region           | s
phoneNumber      | s 
emailAddress     | s

 

Using q’s inbuilt JSON parser is simple and default textual data to type C which prevents any issues with unique symbols.

q).j.k raze read0`:sampleJSONFile.json

Employees| +userIdjobTitleNamefirstNamelastNamepreferredFullNameemploye..

q)(.j.k raze read0:sampleJSONFile.json)Employees

userId jobTitleName firstName lastName preferredFullName employeeCode region phoneNumber emailAddress 
----------------------------------------------------------------------------------------------------------------------------- 
"rirani" "Developer" "Romin" "Irani" "Romin Irani" "E1" "CA" "408-1234567" "romin.k.irani@gmail.com" 
"nirani" "Developer" "Neil" "Irani" "Neil Irani" "E2" "CA" "408-1111111" "neilrirani@gmail.com" 
"thanks" "Program Directory" "Tom" "Hanks" "Tom Hanks" "E3" "CA" "408-2222222" "tomhanks@gmail.com" 

meta (.j.k raze read0`:sampleJSONFile.json)`Employees 
c                | t f a 
-----------------| ----- 
userId           | C 
jobTitleName     | C 
firstName        | C 
lastName         | C 
preferredFullName| C 
employeeCode     | C 
region           | C 
phoneNumber      | C 
emailAddress     | C


 

If you want to complete this with embedPy you will need the df2tab utility function

ml/util/utilities.q at master KxSystems/ml (github.com)

 

You can install the library following:

https://github.com/KxSystems/ml/tree/master?tab=readme-ov-file#installation

 

Or the function can also be defined stand alone if you prefer:

https://stackoverflow.com/a/51612291/4256419

 

 

 

//You cannot pass < as there is no automatic way to convert a dataframe to q readJSONFile:.p.get[`readJsonFile];

//Pass the embedPy foreign to df2tab to do the conversion out:df2tab[readJSONFile[fp]];