and the results:
/ shortest
hkidcheck:(529#.Q.n,"A")sum(2_til 10)*.Q.nA?-8$
/ fastest
hkidcheck:{x sum 2 3 4 5 6 7 8 9*y?-8$z}[529#.Q.n,"A";.Q.nA]
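A quick sanity check of the lookup length, as a sketch only. It assumes ids of one or two letters followed by six digits, as in the examples in this thread, so after -8$ padding the largest weighted sum comes from a blank (index 36 in .Q.nA, because it is not found), a "Z" (35) and six 9s.

```q
/ the "fastest" form from above, with plain quotes
hkidcheck:{x sum 2 3 4 5 6 7 8 9*y?-8$z}[529#.Q.n,"A";.Q.nA]

/ largest weighted sum under the stated assumption: blank, Z, then six 9s
sum 2 3 4 5 6 7 8 9*36 35 9 9 9 9 9 9     / 528, so indices 0..528 are needed

/ " B123456" maps to 36 11 1 2 3 4 5 6, weighted sum 259, 259 mod 11 = 6
hkidcheck "B123456"                       / "6"
```

Anything outside that assumed format can index past the end of the 529-character string, which q returns as a blank rather than an error.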
next step - use these results to compute a SEDOL’s check digit:
very nice
inlining makes it somewhat faster still, though it is ugly
a:{"0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0"sum 2 3 4 5 6 7 8 9*“0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ”?-8$x}
Cheers,
Attila
another thing to consider is vectorising
but strictly speaking that won't abide by the check defined on the meetup page
q)ids:("B123456";"A182361";"CA182361";"AB123456";"ZA182361";"AZ182361";"XZ182361";"ZX182361";"XX182361";"XX182460")
q)\t:100000 hkidcheck each ids
906
q)\t:100000 a each ids
783
q)b:{f:“0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ”?-8$x;“0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0123456789A0”$[type x;sum 2 3 4 5 6 7 8 9*f;(sum’)2 3 4 5 6 7 8 9*/:f]}
q)\t:100000 b ids
593
cheers,
Attila
-8$x / I overlooked this string padding, which would give indices in the range 0-36
"[",x / I went down the alphabet 27 path, which results in numbers in the ASCII range, like 48 and 65. I then compensated for it with (2231#"B") in the lookup string
((2231#"B"),1000#"0", .Q.nA 10-til 10) / yes, inlining this is ugly, avoid it in production
Do index elision and inlining occupy the same amount of stack and L1 cache? Since I am looking up only 1 byte from the lookup string, is only a single cache line (64 bytes) occupied in the L1 cache?
From a functional programming perspective, my lookup string can be considered as a composition of mod[; 11], 11 minus, and int2char. The function has a domain large enough to cover the range of hkid weighted sums. The following are lookup strings too: .Q.nA and 529#.Q.n,"A"
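To make the "lookup string as a function" view concrete, here is a small sketch using the 529-character string from earlier in the thread. With the weights 2..9 the "11 minus" step is already folded into the weights, so only mod[;11] and the int-to-char step remain. The names lookup and f are made up for illustration.

```q
lookup:529#.Q.n,"A"            / the table, as data
f:{(.Q.n,"A") x mod 11}        / the same mapping, computed instead of looked up

/ both agree on every index the table covers
(lookup til 529)~f til 529     / 1b
```

Indexing the table and applying f are interchangeable on the domain 0..528. The table simply trades memory for skipping the mod.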
sedols:(
 "710889";"B0YBKJ";"406566";"B0YBLH";
 "228276";"B0YBKL";"557910";"B0YBKR";
 "585284";"B0YBKT";"B00030");
sedolcheck:{x sum 1 3 1 7 3 9*.Q.nA?y}[1000#.Q.n mod[;10] 10-til 10];
"97325975270"~sedolcheck each sedols;
\ts:100000 sedolcheck each sedols / 1145j, 560j
/ SEDOLs with validation
sedols:(
 "710889";"B0YBKJ";"406566";"B0YBLH";
 "228276";"B0YBKL";"557910";"B0YBKR";
 "585284";"B0YBKT";"B00030";"WRONG1");
sedolValid:{31>|/[x?y]}[.Q.nA except "AEIOU"];
sedolValidCheck:{
 $[sedolValid[y];
  x sum 1 3 1 7 3 9*.Q.nA?y;
  " "]
 }[1000#.Q.n mod[;10] 10-til 10];
"97325975270 "~sedolValidCheck each sedols;
\ts:100000 sedolValidCheck each sedols / 1987j, 608j
kudos to Yan who realized that even though inlining makes the function uglier, you can still shorten it:
hkidcheck:eval parse"{\"",(529#.Q.n,"A"),"\"sum 2 3 4 5 6 7 8 9*\"",.Q.nA,"\"?-8$x}"
The code that combines -[10;] with the weights is shorter than the code that combines the -[10;] with the lookup string.
sedolcheck:{x sum 9 7 9 3 7 1*.Q.nA?y}[1261#.Q.n];
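A short sketch of why the two forms agree, assuming the usual SEDOL rule that the weighted sum plus the check digit is a multiple of 10. The check digit is then (-S) mod 10, and since -w is congruent to 10-w mod 10, replacing each weight w by 10-w gives the check digit directly as the sum mod 10. The largest possible sum with weights 9 7 9 3 7 1 is 35*36=1260, hence the 1261-character lookup. The names sedolcheck1 and sedolcheck2 are only there to keep both versions side by side.

```q
/ original weights, "10 minus" done in the lookup string
sedolcheck1:{x sum 1 3 1 7 3 9*.Q.nA?y}[1000#.Q.n mod[;10] 10-til 10]
/ "10 minus" folded into the weights, plain cyclic lookup
sedolcheck2:{x sum 9 7 9 3 7 1*.Q.nA?y}[1261#.Q.n]

sedols:("710889";"B0YBKJ";"406566")
sedolcheck1 each sedols                             / "973"
(sedolcheck1 each sedols)~sedolcheck2 each sedols   / 1b
```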