Read JSON file output in KDB

Hello Team,

I have attached sample JSON file. 

I have created a python code which reads json file as an input using pandas dataframe and returns dataframe contains information about all the employees. 

requirement is  i need to load output(dataFrame) of python script as an input to KDB function.  i have used  .p.get function to run the python function in KDB but the output i am getting is  " dynamicLoad". 

Reading JSON file can be done directly in KDB but I wanted it to keep the reading json in python and accessing data in KDB to reduce workload.

 

Sample Code:
python code:
import pandas as pd;
import json
def readJsonFile(fp):
   with open(fp) as data_file:
         data = json.load(data_file)
   res = pd.json_normalize(data);
   return res 

KDB code

readJSONFile:.p.get[``readJsonFile;<];

out:readJSONFile[fp];

Tried as like below

Can someone please hhelp me.

 

Thanks in advance..!!!

PyKX under q is replacing embedPy and is recommended if you are starting a new project (*Except for on Windows as support is still in development for this)

https://code.kx.com/pykx/2.2/pykx-under-q/intro.html

Install it with:

pip install pykx python -c “import pykx;pykx.install_into_QHOME()”

Then you can load in your python code (I placed it in a file called functions.py):

q)\l pykx.q q)loadPy:{.pykx.pyexec"exec(open(‘“,(1_ string x),”’).read())"} q)loadPy:functions.py q)readJSONFile:.pykx.get[readJsonFile;<] q)readJSONFilesampleJSONFile.json Employees ------------------------------------------------------------------------------- +userIdjobTitleNamefirstNamelastNamepreferredFullNameemployeeCoderegio.. q)(readJSONFilesampleJSONFile.json)[0;Employees] userId jobTitleName firstName lastName preferredFullName employeeCode region phoneNumber emailAddress --------------------------------------------------------------------------------------------------------------------- rirani Developer Romin Irani Romin Irani E1 CA 408-1234567 romin.k.irani@gmail.com nirani Developer Neil Irani Neil Irani E2 CA 408-1111111 neilrirani@gmail.com thanks Program Directory Tom Hanks Tom Hanks E3 CA 408-2222222 tomhanks@gmail.com

 

One  issue to watch for here though is all the data being brought back as symbols to q. For the lifetime of a q process all unique symbols are interned in memory and cannot be garage collected. You would need to update your code to convert columns to suitable datatypes.  Text data should mostly be passed to q as bytes so it is converted to type C in q.

meta (readJSONFilesampleJSONFile.json)[0;Employees] c | t f a -----------------| ----- userId | s jobTitleName | s firstName | s lastName | s preferredFullName| s employeeCode | s region | s phoneNumber | s emailAddress | s

 

Using q’s inbuilt JSON parser is simple and default textual data to type C which prevents any issues with unique symbols. 

q).j.k raze read0:sampleJSONFile.json Employees| +userIdjobTitleNamefirstNamelastNamepreferredFullNameemploye.. q)(.j.k raze read0:sampleJSONFile.json)Employees userId jobTitleName firstName lastName preferredFullName employeeCode region phoneNumber emailAddress ----------------------------------------------------------------------------------------------------------------------------- "rirani" "Developer" "Romin" "Irani" "Romin Irani" "E1" "CA" "408-1234567" "romin.k.irani@gmail.com" "nirani" "Developer" "Neil" "Irani" "Neil Irani" "E2" "CA" "408-1111111" "neilrirani@gmail.com" "thanks" "Program Directory" "Tom" "Hanks" "Tom Hanks" "E3" "CA" "408-2222222" "tomhanks@gmail.com" meta (.j.k raze read0:sampleJSONFile.json)`Employees c | t f a -----------------| ----- userId | C jobTitleName | C firstName | C lastName | C preferredFullName| C employeeCode | C region | C phoneNumber | C emailAddress | C

 

 

If you want to complete this with embedPy you will need the df2tab utility function

ml/util/utilities.q at master · KxSystems/ml (github.com)

 

You can install the library following:

https://github.com/KxSystems/ml/tree/master?tab=readme-ov-file#installation

 

Or the function can also be defined stand alone if you prefer:

https://stackoverflow.com/a/51612291/4256419

 

 

 

//You cannot pass < as there is no automatic way to convert a dataframe to q readJSONFile:.p.get[`readJsonFile]; //Pass the embedPy foreign to df2tab to do the conversion out:df2tab[readJSONFile[fp]];

 

 

 

Can you flip your interpreter to Python? The code then look more straight-forward.