index of smallest element

Hi,

i am trying to get the first index of the smallest element in a list. 

As I think there is no predefined function to do this i tried some different functions:

v:10000000?10000000  /testvector of random ints

/first one:  give first index of sorted index list; not optimal as the whole vector gets sorted..

imin:*<:

\t imin v

1361

imin:{x?&/x} /second approach: find min value then get the index of the min value;  not optimal the vector has to be processed twice

\t imin v

34

imin:{i:m:0;do[-1+#x; $[x[m]>x[i+:1];m:i;]];m}  /third approach: iterate through vector and compare each value to the min value before

\t imin v

7129   /so slow?? why this?

imin:{i:mi:0;m:x[0]; do[-1+#x; $[m>x[i+:1];m:x[mi:i]; ]] ;mi  }    /optimized third approach, still bad..

\t imin v

5939

now i dont understand why the iteration is so slow, imo it should be the best. Do you think my second approach is the best solution or is there something better I can do?

Markus

what is wrong with:  v?min v

v?min v is the same as my second apporach:  k)v?&/v

it scans the vector v twice (first calculate min value, then search for the min value)

but it looks like the best method

but my question is: why is the manual iteration (third approach) so slow.

and is there a way to get the result in the same time min v   or k)&/v takes?

Markus

As you probably know, working with vectors is far faster than working with vectors in kdb+. Although I don’t know what’s happening under the hood, it does seem that there is an operational overhead when working with atoms (like creating a new object), hence the inner part of your iteration will always be slower.

explicit looping is slower than vector ops due to the larger number of instructions being executed - there’s a lot more work done for scalars. There is a built-in iterate but this won’t help much here (but included for demo purposes)

q)v:10000000?10000000

q)\ts last (1+)[min[v]<>v@;0] / compare each scalar

1297 208629616

q)\ts last (1+)[min[v]<>v;0] / vector compare, then iterate through to first 1b

472 225406624

q)\ts first where v=min v

15 16777584

q)\ts v?min v

10 464

q)\ts min v / min here is 80% of the cost

8 320

/ go for worst-case scenario

q)v,:0

q)\ts min v

8 320

q)\ts v?min v

12 464

On Fri, 28 Jun 2013 11:15:08 +0200 Charles Skelton wrote:
>
> explicit looping is slower than vector ops due to the larger number of
> instructions being executed - there’s a lot more work done for
> scalars. There is a built-in iterate but this won’t help much here (but
> included for demo purposes)
>
> q)v:10000000?10000000
> q) s last (1+)[min[v]<>v@;0] / compare each scalar
> 1297 208629616
> q) s last (1+)[min[v]<>v;0] / vector compare, then iterate through to
> first 1b
> 472 225406624
> q) s first where v=min v
> 15 16777584
> q) s v?min v
> 10 464
> q) s min v / min here is 80% of the cost
> 8 320
>
> / go for worst-case scenario
> q)v,:0

Given v:10000000?10000000 v can contain 0 anywhere so appending
a 0 won’t change things much!

So I tried a slightly different test:

q)v:100000000#0 / 100 million zeroes
q) s v?min v
100 432
q)v,:-1
q) s v?min v
185 432

This difference seems fairly consistent.

I did check for 0 first ;-) My minimum was 3, so my test was safe

q)where 0=v

,10000000

also note that 32bit builds do not have sse code in them.

your test on 64bit 3.1

q)v:100000000#0

q)\ts v?min v

80 464

q)v,:-1

q)\ts v?min v

124 464