Hi,
i am trying to get the first index of the smallest element in a list.
As I think there is no predefined function to do this i tried some different functions:
v:10000000?10000000 /testvector of random ints
/first one: give first index of sorted index list; not optimal as the whole vector gets sorted..
imin:*<:
\t imin v
1361
imin:{x?&/x} /second approach: find min value then get the index of the min value; not optimal the vector has to be processed twice
\t imin v
34
imin:{i:m:0;do[-1+#x; $[x[m]>x[i+:1];m:i;]];m} /third approach: iterate through vector and compare each value to the min value before
\t imin v
7129 /so slow?? why this?
imin:{i:mi:0;m:x[0]; do[-1+#x; $[m>x[i+:1];m:x[mi:i]; ]] ;mi } /optimized third approach, still bad..
\t imin v
5939
now i dont understand why the iteration is so slow, imo it should be the best. Do you think my second approach is the best solution or is there something better I can do?
Markus
what is wrong with: v?min v
v?min v is the same as my second apporach: k)v?&/v
it scans the vector v twice (first calculate min value, then search for the min value)
but it looks like the best method
but my question is: why is the manual iteration (third approach) so slow.
and is there a way to get the result in the same time min v or k)&/v takes?
Markus
As you probably know, working with vectors is far faster than working with vectors in kdb+. Although I don’t know what’s happening under the hood, it does seem that there is an operational overhead when working with atoms (like creating a new object), hence the inner part of your iteration will always be slower.
explicit looping is slower than vector ops due to the larger number of instructions being executed - there’s a lot more work done for scalars. There is a built-in iterate but this won’t help much here (but included for demo purposes)
q)v:10000000?10000000
q)\ts last (1+)[min[v]<>v@;0] / compare each scalar
1297 208629616
q)\ts last (1+)[min[v]<>v;0] / vector compare, then iterate through to first 1b
472 225406624
q)\ts first where v=min v
15 16777584
q)\ts v?min v
10 464
q)\ts min v / min here is 80% of the cost
8 320
/ go for worst-case scenario
q)v,:0
q)\ts min v
8 320
q)\ts v?min v
12 464
On Fri, 28 Jun 2013 11:15:08 +0200 Charles Skelton wrote:
>
> explicit looping is slower than vector ops due to the larger number of
> instructions being executed - there’s a lot more work done for
> scalars. There is a built-in iterate but this won’t help much here (but
> included for demo purposes)
>
> q)v:10000000?10000000
> q) s last (1+)[min[v]<>v@;0] / compare each scalar
> 1297 208629616
> q) s last (1+)[min[v]<>v;0] / vector compare, then iterate through to
> first 1b
> 472 225406624
> q) s first where v=min v
> 15 16777584
> q) s v?min v
> 10 464
> q) s min v / min here is 80% of the cost
> 8 320
>
> / go for worst-case scenario
> q)v,:0
Given v:10000000?10000000 v can contain 0 anywhere so appending
a 0 won’t change things much!
So I tried a slightly different test:
q)v:100000000#0 / 100 million zeroes
q) s v?min v
100 432
q)v,:-1
q) s v?min v
185 432
This difference seems fairly consistent.
I did check for 0 first ;-) My minimum was 3, so my test was safe
q)where 0=v
,10000000
also note that 32bit builds do not have sse code in them.
your test on 64bit 3.1
q)v:100000000#0
q)\ts v?min v
80 464
q)v,:-1
q)\ts v?min v
124 464