I have written a Python function that reads a JSON file into a pandas DataFrame and returns a DataFrame containing information about all the employees.
The requirement is to load the output (DataFrame) of the Python script as an input to a KDB function. I used the .p.get function to run the Python function in KDB, but the output I get is " dynamicLoad".
Reading the JSON file can be done directly in KDB, but I wanted to keep the JSON reading in Python and only access the data in KDB, to reduce workload.
Sample Code:
python code:
    import pandas as pd
    import json

    def readJsonFile(fp):
        with open(fp) as data_file:
            data = json.load(data_file)
        res = pd.json_normalize(data)
        return res
PyKX under q is replacing embedPy and is recommended if you are starting a new project (except on Windows, where support is still in development).
Then you can load in your Python code (I placed it in a file called functions.py):
    q)\l pykx.q
    q)loadPy:{.pykx.pyexec"exec(open('",(1_ string x),"').read())"}
    q)loadPy`:functions.py
    q)readJSONFile:.pykx.get[`readJsonFile;<]
    q)readJSONFile`sampleJSONFile.json
    Employees
    -------------------------------------------------------------------------------
    +`userId`jobTitleName`firstName`lastName`preferredFullName`employeeCode`regio..
    q)(readJSONFile`sampleJSONFile.json)[0;`Employees]
    userId jobTitleName      firstName lastName preferredFullName employeeCode region phoneNumber emailAddress
    ---------------------------------------------------------------------------------------------------------------------
    rirani Developer         Romin     Irani    Romin Irani       E1           CA     408-1234567 romin.k.irani@gmail.com
    nirani Developer         Neil      Irani    Neil Irani        E2           CA     408-1111111 neilrirani@gmail.com
    thanks Program Directory Tom       Hanks    Tom Hanks         E3           CA     408-2222222 tomhanks@gmail.com
One issue to watch for here, though, is that all the data is brought back to q as symbols. For the lifetime of a q process, all unique symbols are interned in memory and cannot be garbage collected. You would need to update your code to convert columns to suitable datatypes. Text data should mostly be passed to q as bytes so that it is converted to type C in q.
    q)meta (readJSONFile`sampleJSONFile.json)[0;`Employees]
    c                | t f a
    -----------------| -----
    userId           | s
    jobTitleName     | s
    firstName        | s
    lastName         | s
    preferredFullName| s
    employeeCode     | s
    region           | s
    phoneNumber      | s
    emailAddress     | s
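As a rough illustration of the "pass text as bytes" suggestion, the sketch below encodes every string value in the parsed JSON to bytes before it is handed back to q. The helper name encode_strings and the inline sample record are hypothetical and only for illustration; how you wire this into readJsonFile (for example, calling it on data before pd.json_normalize) is an assumption you would need to verify against your own data.

```python
import json


def encode_strings(value):
    """Recursively replace str values with UTF-8 bytes in parsed JSON data.

    Hypothetical helper: not part of PyKX, embedPy or pandas.
    """
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, list):
        return [encode_strings(v) for v in value]
    if isinstance(value, dict):
        # Keys are left as str so they can still be used as column names.
        return {k: encode_strings(v) for k, v in value.items()}
    # Numbers, booleans and None pass through unchanged.
    return value


# Small inline sample standing in for the contents of the JSON file.
sample = json.loads('{"Employees": [{"userId": "rirani", "region": "CA"}]}')
encoded = encode_strings(sample)
print(encoded)
```

After a change like this, it is worth re-running meta on the q side to confirm that the text columns now arrive as type C rather than s.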
Using q's inbuilt JSON parser is simple, and it defaults textual data to type C, which prevents any issues with unique symbols.
    q).j.k raze read0`:sampleJSONFile.json
    Employees| +`userId`jobTitleName`firstName`lastName`preferredFullName`employe..
    q)(.j.k raze read0`:sampleJSONFile.json)`Employees
    userId   jobTitleName        firstName lastName preferredFullName employeeCode region phoneNumber   emailAddress
    -----------------------------------------------------------------------------------------------------------------------------
    "rirani" "Developer"         "Romin"   "Irani"  "Romin Irani"     "E1"         "CA"   "408-1234567" "romin.k.irani@gmail.com"
    "nirani" "Developer"         "Neil"    "Irani"  "Neil Irani"      "E2"         "CA"   "408-1111111" "neilrirani@gmail.com"
    "thanks" "Program Directory" "Tom"     "Hanks"  "Tom Hanks"       "E3"         "CA"   "408-2222222" "tomhanks@gmail.com"
    q)meta (.j.k raze read0`:sampleJSONFile.json)`Employees
    c                | t f a
    -----------------| -----
    userId           | C
    jobTitleName     | C
    firstName        | C
    lastName         | C
    preferredFullName| C
    employeeCode     | C
    region           | C
    phoneNumber      | C
    emailAddress     | C
    //You cannot pass < as there is no automatic way to convert a dataframe to q
    readJSONFile:.p.get[`readJsonFile];
    //Pass the embedPy foreign to df2tab to do the conversion
    out:df2tab[readJSONFile[fp]];