Here’s what I came up with. It’s not too elegant, and it assumes both matrices are square and that the convolution matrix is 3x3 (which is why we pad the target matrix with one row and column of zeros on each side):
q)zpad:{0,/:flip 0,/:(flip x,:0),:0}   / Pad mat all around w/ 1 row & col of 0s on each side
q)m4:zpad 4 4#"f"$til 16               / Mat to convolute
q)shape:{(count x),count x[0]}         / Shape of regular rectangular array
q)convo:{[x;y] sum raze x*y}           / 1D convolution
q)n0:count m0                          / m0 is the 3x3 convolution kernel (not shown here)
q)n1:count m4
q)n2:(count m4)-n0-1
q)sd1:raze (n1*til n0)+\:til n0
q)idxm4:((n2*n2),n0*n0)#raze (raze flip sd1+\:n1*til n2)+/:til n2
q)flip (n2,n2)#((raze m4) idxm4)convo\: raze m0
The last line performs the convolution, giving the expected answer of
-10 -9 -6 9
9   0  0  24
21  0  0  36
66  51 54 85
This works by building the indexes into m4 of each 3x3 sub-window as the rows of a 16x9 matrix, then convolving each row against raze m0. Any suggestions for improvement are welcome.
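For comparison, the same multiply-and-sum over explicit sub-windows can be sketched without the precomputed index matrix. This is an untested sketch that assumes a square kernel and an already zero-padded input:

win:{[m;n;i;j] n#'j _'n#i _ m}                    / n x n sub-window of m starting at row i, col j
conv2:{[m;k]                                      / m: zero-padded matrix, k: square kernel
  n:count k;                                      / kernel side length
  r:1+(count m)-n;                                / output side length
  (r;r)#{[m;k;n;ij] sum raze k*win[m;n;ij 0;ij 1]}[m;k;n] each (til r) cross til r}
q)conv2[m4;m0]                                    / should reproduce the 4x4 result above

This trades the big intermediate index matrix for a drop/take per window; whether that is actually faster in q is something I would have to measure.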
Convolutions are expensive operations, O(n^2), so it’s going to be hard to be much faster.
If you are working with larger convolutions, I would instead consider implementing the convolution as a multiplication in the Fourier domain (i.e., FFT each matrix, multiply pointwise, and inverse-transform back), which can drop the complexity to O(n log n). Implementing this in kdb+, borrowing from the Kx whitepaper on signal processing, would require padding your matrices to a power of 2, so you need to weigh the benefits.
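As a rough sketch of the shape that takes in q (not working code: fft2, ifft2 and cmul below are hypothetical stand-ins for a 2D FFT, its inverse and a complex pointwise multiply, which you would build on top of a 1D FFT like the one in the whitepaper, and both inputs are assumed zero-padded to the same power-of-2 dimensions):

/ Hypothetical sketch only: fft2, ifft2 and cmul are NOT built-ins
convFFT:{[m;k] ifft2 cmul[fft2 m;fft2 k]}   / transform both, multiply pointwise, invert

Whether that pays off depends on the kernel size and on the cost of the power-of-2 padding mentioned above.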
Thanks for the help. I may look at the Fourier transform if what I have is too slow, but I want to complete the process all the way to the end before I worry about optimizing it. Does anyone have something for reading JPGs or PNGs into q? I have a kluge for now, but it would be more seamless if I could read images directly.