This is an old thread, but FYI: if you downsample the matrix first and then do your thresholding (jit.matrix, then jit.op), you save a bit of CPU, because jit.op then has fewer cells to process.
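To see why the order matters, here is a rough sketch in plain Python rather than a Max patch. It is not how Jitter works internally; the `downsample` and `threshold` helpers, the 8x8 test frame, and the threshold level of 100 are all made up for illustration. It assumes the downsampling simply drops cells (nearest-neighbour style) and counts how many cells the threshold step has to touch in each order.

```python
def downsample(matrix, factor):
    """Keep every `factor`-th cell in both dimensions (hypothetical helper)."""
    return [row[::factor] for row in matrix[::factor]]


def threshold(matrix, level):
    """Return a 0/1 matrix plus the number of cells that were compared."""
    out = [[1 if cell > level else 0 for cell in row] for row in matrix]
    ops = sum(len(row) for row in matrix)
    return out, ops


# Made-up 8x8 single-plane frame containing a simple gradient.
frame = [[(x + y) * 16 for x in range(8)] for y in range(8)]

# Order A: threshold at full size, then downsample.
full_thresh, ops_a = threshold(frame, 100)
result_a = downsample(full_thresh, 2)

# Order B: downsample first, then threshold the smaller matrix.
result_b, ops_b = threshold(downsample(frame, 2), 100)

print(ops_a, ops_b)          # 64 16
print(result_a == result_b)  # True
```

With a factor of 2 in each dimension the threshold step touches a quarter of the cells, and the result is identical here because the downsampling only drops cells. If the downsampling interpolates or averages, the two orders can give slightly different output, so check that the result still looks right for your patch.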