Hi all, as far as I aware, there is no command to delete some records in HDB, I have to load all records to memory, modify it, and save them back, is that correct?
Or I just miss some tools that already available?
Hi all, as far as I aware, there is no command to delete some records in HDB, I have to load all records to memory, modify it, and save them back, is that correct?
Or I just miss some tools that already available?
correct
http://code.kx.com/wiki/Contrib/dbmaintdotq
Use at your own risk ;-)
You can load all records into memory or modify the splayed column files directly. It’s a bit fiddly as for this you need to know the row index of the data you’re no longer interested in. Here’s an example:
// some random datan:1000;mytbl:([]id:n?0Ng;name:n?
foobar
baz;sz:n?50;px:n?100.0);.Q.dpft[:db;2015.07.25;
name;mytbl];mytbl:([]id:n?0Ng;name:n?
foobar
baz;sz:n?50;px:n?100.0);.Q.dpft[:db;2015.07.26;
name;mytbl];// load hdb\l db// select data from mytbl tableshow select count i, t:wsum[sz;px] by date, name from mytbl// get rid of name=
bar in 2015.07.25 partitionidx:1+(select i from mytbl where date = 2015.07.25, name = bar)
x// drop from the list (name) and enumerate sym:./2015.07.25/mytbl/name set
sym?last idx _ get :./2015.07.25/mytbl/name;// drop indexes from other columns
:./2015.07.25/mytbl/id set last idx _ get :./2015.07.25/mytbl/id;
:./2015.07.25/mytbl/px set last idx _ get :./2015.07.25/mytbl/px;
:./2015.07.25/mytbl/sz set last idx _ get :./2015.07.25/mytbl/sz;// bar data has been removed from hdbshow select count i, t:wsum[sz;px] by date, name from mytbl
You can delete records without loading the whole table into memory, but it’s a non-atomic operation so if you do choose to do it I’d recommend copying the partition into a staging directory, doing the delete and then copying it back into your main hdb directory.
q)t:(sym:a
bc
d;a:1 2 3 4;b:10 20 30 40)
q).Q.dpft[`:.;.z.D;`sym;`t]
`t
q)\l .
q)t
date sym a b
-------------------
2015.07.26 a 1 10
2015.07.26 b 2 20
2015.07.26 c 3 30
2015.07.26 d 4 40
//Let’s say we want to delete where sym=`b. Get the index of the rows you want to remove
q)inds:exec x from select i from t where sym=`b
q)inds
,1
//Then figure out the indices of the rows that you don’t want to remove
q)keep:(exec til first x from select count i from t where date=2015.07.26) except inds
q)keep
0 2 3
//Now iterate over each column in the parition, keeping the relevant indices and discarding the ones we want to remove
q).[;();@;keep] each :./2015.07.26/t/sym
:./2015.07.26/t/a`:./2015.07.26/t/b
:./2015.07.26/t/sym
:./2015.07.26/t/a`:./2015.07.26/t/b
//Reload the HDB and the sym=`b row is gone.
q)\l .
q)t
date sym a b
-------------------
2015.07.26 a 1 10
2015.07.26 c 3 30
2015.07.26 d 4 40
Thanks
Tom
Thanks a lot for the information!