Appending data

Hey all,

I plan on having a table that is appended daily with new data. Some of the columns are derived from other columns in the table and I am wondering how I should go about calculating/populating these columns.

Here is a simple example:

q) tbl:([date:til 3] a:3?3;b:3?3;c:3?3)

date| a b c

----| -----

0   | 2 1 0

1   | 2 2 0

2   | 1 1 2

Now let’s say I add a derived column a+b

q) update d:a+b from `tbl

q) tbl

date| a b c d

----| -------

0   | 2 1 0 3

1   | 2 2 0 4

2   | 1 1 2 2

Now say that on day “3”, I append some new “raw” data:

q) tbl upsert dateab`c!(3;3;1;0)

q) tbl

date| a b c d

----| -------

0   | 2 1 0 3

1   | 2 2 0 4

2   | 1 1 2 2

3   | 3 1 0  

My question is - what is the best way to fill in the value for d on date 3 here. Should I simply run another update call? Or should I create a view and use this view for my future queries? Or is there something else that I am missing?

Any help/advice would be greatly appreciated!

Thanks in advance,

Rob

this isn’t really the right place for database design questions.  but if it were me, i would keep raw data in one table, and derived data in another (if it takes a long time to compute). i would compute simple derived data on-the-fly.

Hey Nick,

Thanks for your input - my bad if this is the wrong place for this type of question.

Cheers,

Rob

It’s hard to answer this without knowing the original purpose. For example, if you are going to be working with a small enough table, you can just calculate the derived columns just once when you need it. If you are working with a huge table and you don’t want to be calculating the entire d column every time, then you can call an update on d as soon as you get a new row entry. This is commonly done for real time statistics. As soon as new tick data comes in, you generate stats on that in real time. 

Again, if table is huge, call update statement each time…if table is not as big, just run the update once in the end.