$\begingroup$

I am trying to get this repo of Xu's Donut algorithm running, but I am getting an error I do not quite understand. The README says I should load the raw data as follows:

    timestamp, values, labels = ...
    # If there is no label, simply use all zeros.
    labels = np.zeros_like(values, dtype=np.int32)

    # Complete the timestamp, and obtain the missing point indicators.
    timestamp, missing, (values, labels) = \
        complete_timestamp(timestamp, (values, labels))

However, when I do so, I get this error:

    ValueError: The shape of ``arrays[0]`` does not agree with the shape of `timestamp` ((109577, 11) vs (109577,))

This does not make sense to me, as I can't think of a reason why `timestamp` would need to be an 11-column array. When I pass the values as the `timestamp` argument instead, I get "`timestamp` must be a 1-D array".

I am very confused; hopefully someone can shed some light on this.

Here are the checks in the code:

    if len(timestamp.shape) != 1:
        raise ValueError('`timestamp` must be a 1-D array')

    has_arrays = arrays is not None
    arrays = [np.asarray(array) for array in (arrays or ())]
    for i, array in enumerate(arrays):
        if array.shape != timestamp.shape:
            raise ValueError('The shape of ``arrays[{}]`` does not agree with '
                             'the shape of `timestamp` ({} vs {})'.
                             format(i, array.shape, timestamp.shape))

And here is the repo itself: https://github.com/haowen-xu/donut

$\endgroup$

    1 Answer
    $\begingroup$

    What does your timestamp look like? Apparently one of the arrays has too many dimensions.

    When using pandas DataFrames, you could pass the .index (as long as it is not a MultiIndex) or just np.arange(len(<your_data>)) as the timestamps.
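
To make the shape requirement concrete, here is a minimal sketch using only NumPy. It assumes an 11-column `values` array like the one in the question (filled with random stand-in data here) and splits it into 1-D series whose shape matches `timestamp`. The helper name `split_columns` is hypothetical and not part of the donut package; each column it returns could then be passed to `complete_timestamp` together with its own 1-D labels.

```python
import numpy as np


def split_columns(values):
    """Split a 2-D (n_samples, n_features) array into a list of 1-D columns.

    Hypothetical helper for illustration; not part of the donut package.
    """
    values = np.asarray(values)
    if values.ndim == 1:
        return [values]
    if values.ndim != 2:
        raise ValueError('expected a 1-D or 2-D array')
    return [values[:, j] for j in range(values.shape[1])]


# Stand-in data with the shapes reported in the question (assumed).
n_samples, n_features = 109577, 11
timestamp = np.arange(n_samples)  # 1-D, e.g. a simple row counter
values = np.random.default_rng(0).random((n_samples, n_features))

columns = split_columns(values)
for column in columns:
    # Same condition as the shape check quoted in the question.
    assert column.shape == timestamp.shape
```

Each entry of `columns` is a single univariate series, which is the only kind of input that passes the quoted shape check.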

    $\endgroup$
    • (109577,), just a NumPy array obtained from df.values. I understand that the dimensions differ; what I don't understand is why the official implementation of the paper wants the timestamp to have the same dimensions as the rest of the X data.
      Commented Jul 18, 2018 at 17:34
    • It's more that the data has to have the same dimensionality as the timestamp, namely one. As far as I understand, it will not easily work with multivariate data. You can introduce a simple merging criterion for this, though, e.g. merging the predictions of multiple models by taking the minimum at each timestamp (or the maximum if they are already binarized).
      – Axl
      Commented Jul 18, 2018 at 17:55
    • My timestamp is 1-D; do you mean my values?
      Commented Jul 19, 2018 at 8:41
    • Yes, Donut can only work on univariate data. Hence, it cannot detect multivariate anomalies.
      – Axl
      Commented Jul 19, 2018 at 8:50
    • Gotcha. Any idea on the best way to merge 11 dimensions for this algorithm, Axl? Or is it best to run it only on single primary features?
      Commented Jul 19, 2018 at 8:55
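
The merging criterion suggested in the comments could be sketched roughly as follows. This assumes one model has been trained per feature and that each model yields a 1-D array of scores over the same timestamps, with lower scores meaning "more anomalous" (an assumption here). The function names `merge_scores` and `merge_flags` and the sample numbers are hypothetical, not part of the donut package.

```python
import numpy as np


def merge_scores(per_feature_scores):
    """Element-wise minimum over equally long 1-D score arrays.

    Assumes lower scores mean "more anomalous", so each timestamp
    gets the score of its most anomalous feature.
    """
    stacked = np.stack([np.asarray(s, dtype=float) for s in per_feature_scores])
    return stacked.min(axis=0)


def merge_flags(per_feature_flags):
    """Element-wise maximum over binarized (0/1) anomaly flags.

    A timestamp is flagged if any single feature flags it.
    """
    stacked = np.stack([np.asarray(f, dtype=np.int32) for f in per_feature_flags])
    return stacked.max(axis=0)


# Made-up scores for two features over three timestamps.
scores_a = [-1.0, -5.0, -0.5]
scores_b = [-2.0, -1.0, -0.2]
merged_scores = merge_scores([scores_a, scores_b])

# Made-up binarized flags for the same two features.
flags_a = [0, 1, 0]
flags_b = [1, 0, 0]
merged_flags = merge_flags([flags_a, flags_b])
```

Note that this only combines per-feature results; it does not detect anomalies that show up solely in the joint behaviour of several features.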
