Given two NumPy arrays, i.e.:

images.shape: (60000, 784)  # An array containing 60000 images
labels.shape: (60000, 10)   # An array of labels for each image

Each row of labels contains a 1 at a particular index to indicate the class of the related example in images. (So [0 0 1 0 0 0 0 0 0 0] would indicate that the example belongs to Class 2, assuming our class indexing starts from 0.)

I am trying to efficiently separate images so that I can manipulate all images belonging to a particular class at once. The most obvious solution would be to use a for loop (as follows). However, I'm not sure how to filter images such that only those with the appropriate labels are returned.

for i in range(0, labels.shape[1]):
    class_images = ...  # (?) Array containing all images that belong to class i

As an aside, I'm also wondering if there are even more efficient approaches that would eliminate the use of the for loop.

    2 Answers

    One way would be to convert your label array to bool and use it for indexing:

    classes = []
    blabels = labels.astype(bool)  # boolean mask: True where the one-hot label is 1
    for i in range(10):
        classes.append(images[blabels[:, i], :])

    Or as a one-liner using list comprehension:

    classes = [images[l.astype(bool), :] for l in labels.T] 
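A small self-contained check of the boolean-mask approach, as a sketch only. The toy arrays here (6 samples, 4 features, 3 classes) are made up for illustration and stand in for the real `images` and `labels`:

```python
import numpy as np

# Hypothetical toy data standing in for the real arrays
images = np.arange(24).reshape(6, 4)           # 6 "images" with 4 features each
class_ids = np.array([0, 2, 1, 2, 0, 2])       # assumed class of each sample
labels = np.eye(3, dtype=int)[class_ids]       # one-hot rows, shape (6, 3)

# Column i of the boolean mask selects the rows belonging to class i
blabels = labels.astype(bool)
classes = [images[blabels[:, i]] for i in range(labels.shape[1])]

for i, class_images in enumerate(classes):
    print(i, class_images.shape)
```

Each entry of `classes` is a separate array, so the per-class arrays may have different numbers of rows.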

      # Assumes each row of labels is a plain Python list (ndarray rows have no .index method)
      _classes = [[] for x in range(10)]
      for image_index, element in enumerate(labels):
          _classes[element.index(1)].append(image_index)

      For example, _classes[0] will contain the indices of the images classified as class 0.
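If the per-class index groups are wanted without a Python-level loop over the samples, one possible sketch is to sort the sample indices by class and split them at the class boundaries. This is a different technique from the answer above, and the toy `labels` array and the names `ids`, `order`, and `groups` are assumptions made for illustration:

```python
import numpy as np

# Hypothetical toy one-hot labels: 6 samples, 3 classes
class_ids = np.array([0, 2, 1, 2, 0, 2])
labels = np.eye(3, dtype=int)[class_ids]

# argmax returns the position of the single 1 in each one-hot row
ids = labels.argmax(axis=1)

# A stable sort keeps the original sample order within each class
order = np.argsort(ids, kind="stable")

# Class sizes give the split points between consecutive classes
counts = np.bincount(ids, minlength=labels.shape[1])
groups = np.split(order, np.cumsum(counts)[:-1])

for c, idx in enumerate(groups):
    print(c, idx)
```

Indexing with one of these groups, e.g. `images[groups[0]]`, would then return all images of that class.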

      • If you are using NumPy, you can use numpy.nonzero(element == 1)[0][0] instead of element.index(1).
        – Pouyan
        Commented Mar 10, 2017 at 1:00
