About HDB maintenance

Hi all, as far as I aware, there is no command to delete some records in HDB, I have to load all records to memory, modify it, and save them back, is that correct?

Or I just miss some tools that already available?

correct

http://code.kx.com/wiki/Contrib/dbmaintdotq

Use at your own risk ;-)

You can load all records into memory or modify the splayed column files directly. It’s a bit fiddly as for this you need to know the row index of the data you’re no longer interested in. Here’s an example:

// some random datan:1000;mytbl:([]id:n?0Ng;name:n?foobarbaz;sz:n?50;px:n?100.0);.Q.dpft[:db;2015.07.25;name;mytbl];mytbl:([]id:n?0Ng;name:n?foobarbaz;sz:n?50;px:n?100.0);.Q.dpft[:db;2015.07.26;name;mytbl];// load hdb\l db// select data from mytbl tableshow select count i, t:wsum[sz;px] by date, name from mytbl// get rid of name=bar in 2015.07.25 partitionidx:1+(select i from mytbl where date = 2015.07.25, name = bar)x// drop from the list (name) and enumerate sym:./2015.07.25/mytbl/name set sym?last idx _ get :./2015.07.25/mytbl/name;// drop indexes from other columns:./2015.07.25/mytbl/id set last idx _ get :./2015.07.25/mytbl/id;:./2015.07.25/mytbl/px set last idx _ get :./2015.07.25/mytbl/px;:./2015.07.25/mytbl/sz set last idx _ get :./2015.07.25/mytbl/sz;// bar data has been removed from hdbshow select count i, t:wsum[sz;px] by date, name from mytbl

You can delete records without loading the whole table into memory, but it’s a non-atomic operation so if you do choose to do it I’d recommend copying the partition into a staging directory, doing the delete and then copying it back into your main hdb directory.

q)t:(sym:abcd;a:1 2 3 4;b:10 20 30 40)

q).Q.dpft[`:.;.z.D;`sym;`t]

`t

q)\l .

q)t

date sym a b

-------------------

2015.07.26 a 1 10

2015.07.26 b 2 20

2015.07.26 c 3 30

2015.07.26 d 4 40


//Let’s say we want to delete where sym=`b. Get the index of the rows you want to remove

q)inds:exec x from select i from t where sym=`b

q)inds

,1


//Then figure out the indices of the rows that you don’t want to remove

q)keep:(exec til first x from select count i from t where date=2015.07.26) except inds

q)keep

0 2 3


//Now iterate over each column in the parition, keeping the relevant indices and discarding the ones we want to remove

q).[;();@;keep] each :./2015.07.26/t/sym:./2015.07.26/t/a`:./2015.07.26/t/b

:./2015.07.26/t/sym:./2015.07.26/t/a`:./2015.07.26/t/b


//Reload the HDB and the sym=`b row is gone.

q)\l .

q)t

date sym a b

-------------------

2015.07.26 a 1 10

2015.07.26 c 3 30

2015.07.26 d 4 40

Thanks

Tom

Thanks a lot for the information!