(TL;DR) No improvement in performance despite using a unique-keyed dictionary, i.e. (`u#())!(), while a similar algorithm in Python gave results much faster. Wondering if I took a wrong approach to solving the problem.
Your solution will get slower and slower the more elements it has, because it holds the entire list produced so far in memory, then iterates over it and appends to it for each new element, all the way to the final output.
This particular problem is not very kdb-friendly, and it is difficult to apply kdb's vectorisation usefully here.
So your function nextramb always retains x, and that list keeps getting bigger and bigger.
Initially your x is
13 0 10 12 1 5
13 0 10 12 1 5 0
13 0 10 12 1 5 0 5
13 0 10 12 1 5 0 5 …
Which slows down the function massively.
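A minimal Python sketch of that pattern (hypothetical names, not the original nextramb) shows why it degrades: every step keeps the full sequence and rescans the ever-growing prefix for the previous occurrence, so the total work is quadratic in the sequence length.

```python
def van_eck_naive(start, n):
    """Slow sketch: retain the whole sequence and rescan it each step."""
    seq = list(start)
    while len(seq) < n:
        last = seq[-1]
        nxt = 0
        # rescan the entire prefix for the previous occurrence of `last`
        for j in range(len(seq) - 2, -1, -1):
            if seq[j] == last:
                nxt = len(seq) - 1 - j  # gap between the two occurrences
                break
        seq.append(nxt)
    return seq[-1]
```

With the well-known starting numbers 0 3 6, `van_eck_naive([0, 3, 6], 2020)` returns 436, but pushing n towards 30m is infeasible with this shape.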
My solution applies iteration logic similar to the Python code: it saves the unique values and the indices at which they appear, then steps through the sequence one element at a time rather than retaining the whole list.
My solution is as follows:
________________
I:2 0 1 9 5 19;.s.d:(`u#I)!`u#2#'til[count I];.s.l:last I;
\ts {$[all (x-1)=.s.d[.s.l];
[.s.d,:enlist[0]!enlist 2#(.s.d[0;1],x) except 0N;.s.l:0];
[.s.l:last deltas .s.d[.s.l];.s.d,:enlist[.s.l]!enlist 2#(.s.d[.s.l;1],x) except 0N]];
:x+1}/[30000000-count I;count I]
________________
I have the initial input as a global, then my dictionary (.s.d) of the unique values that appear in the sequence, and .s.l, the last item, which we iterate over.
The unique attribute remains on the keys because I only append to the dictionary when a new value is encountered, so all the operations work on an attributed dictionary with the fast indexing the unique attribute provides.
The function iterates through all the indices, starting from the last input index and running up to the amount specified, either 30m or 2020.
If the last element is new, it sets .s.l to 0 and updates the indices associated with 0 in the dictionary.
If the last element is not new, it looks up the relevant indices, takes the step between the last occurrence and the current occurrence as the next value, and updates the dictionary entry for that value.
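The same per-step logic can be sketched in Python (hypothetical names; this is a simplified equivalent, not the original code). One simplification: it stores only the most recent index per value and computes the gap before overwriting, whereas the q dictionary keeps the last two indices per key.

```python
def van_eck_fast(start, n):
    """Dictionary sketch: map each value to its most recent index,
    mirroring the roles of .s.d (lookup) and .s.l (last value)."""
    # indices of every starting value except the last one
    last_seen = {v: i for i, v in enumerate(start[:-1])}
    last = start[-1]
    for i in range(len(start) - 1, n - 1):
        if last in last_seen:
            # step between the last occurrence and the current one
            nxt = i - last_seen[last]
        else:
            # value is new, so the next element is 0
            nxt = 0
        last_seen[last] = i
        last = nxt
    return last
```

Memory stays proportional to the number of distinct values seen, and each step is a constant-time lookup instead of a scan.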
Then finally, typing .s.l gives you the answer for what the last element was.
Performance-wise it runs at roughly 2 seconds per 1m elements, so roughly 60 seconds for 30m.