
I have a function that takes a four-dimensional NumPy array M and, for each value i of its second index, takes all of M without the i-th "column" (slice along the second axis), evaluates the product over the remaining columns, and stacks the results so that the returned array has the same shape as M.

In other words, what I want to evaluate is:

$$R_{ijkl} = \prod_{j' \ne j} M_{ij'kl}$$

My main concern is performance: M's shape is usually something similar to (16,8,1000,255) and this function call takes up the great majority of my program's execution time.

import numpy as np

def sliceAndMultiply(M):
    # index masks that skip the i-th "column" of the second axis
    masks = [list(range(i)) + list(range(i + 1, M.shape[1]))
             for i in range(M.shape[1])]
    # take the product over each mask and stack the results along axis 1
    return np.stack([np.prod(M[:, m, :, :], axis=1) for m in masks], axis=1)

M = np.random.rand(16, 8, 1000, 255)
R = sliceAndMultiply(M)

Another variant I tried is:

def sliceAndMultiply(M):
    return np.stack([np.prod(np.delete(M, j, axis=1), axis=1)
                     for j in range(M.shape[1])], axis=1)

but the performance of these two functions is basically the same.
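
For reference, a minimal timeit sketch of the comparison (the names by_mask and by_delete and the smaller test shape are illustrative only, not part of my actual code):

import timeit
import numpy as np

def by_mask(M):
    masks = [list(range(i)) + list(range(i + 1, M.shape[1]))
             for i in range(M.shape[1])]
    return np.stack([np.prod(M[:, m, :, :], axis=1) for m in masks], axis=1)

def by_delete(M):
    return np.stack([np.prod(np.delete(M, j, axis=1), axis=1)
                     for j in range(M.shape[1])], axis=1)

# a smaller array than the real workload keeps a single check quick
M_small = np.random.rand(16, 8, 100, 64)
for fn in (by_mask, by_delete):
    print(fn.__name__, timeit.timeit(lambda: fn(M_small), number=10))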


    1 Answer

    A one-liner for your problem:

    def slice_and_multiply(arr):
        return np.prod(arr, axis=1, keepdims=True) / arr

    This multiplies all of the "columns" first, then divides by each "column" to get the product of all the others. Depending on your data, that may be problematic because of potential overflow or precision loss, and it breaks completely if there are any zeros in the input array.
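
    To make the failure mode concrete, here is a tiny 1-D sketch (made-up values, not from the original post) of what a zero does to the division trick:

    import numpy as np

    a = np.array([2.0, 0.0, 5.0])
    print(np.prod(a) / a)   # roughly [0., nan, 0.] plus a runtime warning: the 0/0 entry becomes nan
    # the correct "product of all the others" here would be [0., 10., 0.]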

    If you can get rid of the division (this is a typical interview question, by the way), then all of those problems disappear. And you can do it with relative ease, by cleverly storing accumulated products from both ends of the array and then multiplying them together:

    def slice_and_multiply_bis(arr):
        ret = np.ones_like(arr)
        # prefix products: ret[:, j] = arr[:, 0] * ... * arr[:, j-1]
        np.cumprod(arr[:, :-1], axis=1, out=ret[:, 1:])
        # suffix products: multiply ret[:, j] by arr[:, j+1] * ... * arr[:, -1]
        ret[:, :-1] *= np.cumprod(arr[:, :0:-1], axis=1)[:, ::-1]
        return ret

    And of course:

    >>> M = np.random.rand(16, 8, 1000, 255)
    >>> np.allclose(slice_and_multiply(M), slice_and_multiply_bis(M))
    True
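
    For completeness, the same kind of check can be run against the question's original sliceAndMultiply (assuming it is still defined in the session); it should also report True:

    >>> np.allclose(sliceAndMultiply(M), slice_and_multiply_bis(M))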

    The code in slice_and_multiply_bis is a little too clever for its own good, so it should be heavily commented.

    By my timings, for your target size, slice_and_multiply is 5x faster than your implementation, and slice_and_multiply_bis about 2-3x faster.

      For posterity: the three statements in slice_and_multiply_bis work by multiplying cumulative products of arr in both directions (from the beginning of each row to the end, and from the end to the beginning) in a smart way. If arr = array([a, b, c]), the contents of ret after each statement would be ret == [1, 1, 1], then ret == [1, a, ab], and finally ret == [1, a, ab] * [cb, c, 1] == [cb, ac, ab].
      – blue, May 6, 2016 at 10:47
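
    To make that comment concrete, here is a small 1-D sketch of the same prefix/suffix trick (a toy example, not from the answer itself):

    import numpy as np

    arr = np.array([2.0, 3.0, 5.0])           # plays the role of [a, b, c]

    ret = np.ones_like(arr)                   # [1, 1, 1]
    np.cumprod(arr[:-1], out=ret[1:])         # prefix products: [1, a, a*b]
    ret[:-1] *= np.cumprod(arr[:0:-1])[::-1]  # times suffix products: [c*b, a*c, a*b]

    print(ret)                                # [15. 10.  6.]
    print(np.prod(arr) / arr)                 # the division trick gives the same result here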
