Filtering a numpy array using another array of labels

Question

Given two numpy arrays, i.e.:

images.shape: (60000, 784)  # An array containing 60000 images
labels.shape: (60000, 10)   # An array of labels for each image

Each row of labels contains a 1 at a particular index to indicate the class of the related example in images. (So [0 0 1 0 0 0 0 0 0 0] would indicate that the example belongs to Class 2, assuming our class indexing starts from 0.)

I am trying to efficiently separate images so that I can manipulate all images belonging to a particular class at once. The most obvious solution would be to use a for loop (as follows). However, I'm not sure how to filter images such that only those with the appropriate labels are returned.

for i in range(0, labels.shape[1]):
    class_images = # (?) Array containing all images that belong to class i

As an aside, I'm also wondering if there are even more efficient approaches that would eliminate the use of the for loop.

Paul Panzer · Accepted Answer · 2017-03-10 01:05:43Z

One way would be to convert your label array to bool and use it for indexing:

classes = []
blabels = labels.astype(bool)
for i in range(10):
    classes.append(images[blabels[:, i], :])

Or as a one-liner using a list comprehension:

classes = [images[l.astype(bool), :] for l in labels.T]
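To make the boolean-mask approach above easy to try, here is a minimal, self-contained sketch on a small synthetic dataset. The array sizes, the random seed, and the names `class_ids` and `mask` are assumptions made for illustration; only `images`, `labels`, and `classes` come from the question and answer.

```python
import numpy as np

# Small synthetic stand-in for the arrays in the question (sizes are assumed).
rng = np.random.default_rng(0)
n_images, n_pixels, n_classes = 12, 4, 3
images = rng.random((n_images, n_pixels))

# Build one-hot label rows from randomly drawn class ids.
class_ids = rng.integers(0, n_classes, size=n_images)
labels = np.eye(n_classes, dtype=int)[class_ids]

# Each column of labels, cast to bool, is a row mask selecting one class.
classes = [images[mask.astype(bool), :] for mask in labels.T]

for i, class_images in enumerate(classes):
    print(f"class {i}: {class_images.shape[0]} images")
```

Each entry of `classes` is a new array (boolean indexing copies), so modifying `classes[i]` does not change `images`.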

Pouyan · Accepted Answer · 2017-03-10 00:51:14Z

_classes = [[] for x in range(10)]
for image_index, element in enumerate(labels):
    # element.index(1) requires element to be a Python list (e.g. a row of labels.tolist())
    _classes[element.index(1)].append(image_index)

For example, _classes[0] will contain the indices of the images classified as class 0.

If you are using numpy, you can use np.nonzero(element == 1)[0][0] instead of element.index(1). (Pouyan, commented Mar 10, 2017 at 1:00)
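On the question's aside about avoiding the for loop: the index-grouping idea can also be written without a Python-level loop. The sketch below is not taken from either answer. It assumes every row of `labels` is one-hot, and the helper name `group_indices_by_class` is hypothetical.

```python
import numpy as np

def group_indices_by_class(labels):
    """Return a list of index arrays, one per class (hypothetical helper).

    Assumes every row of labels is one-hot.
    """
    class_ids = np.argmax(labels, axis=1)         # class id of each image
    order = np.argsort(class_ids, kind="stable")  # image indices sorted by class
    counts = np.bincount(class_ids, minlength=labels.shape[1])
    # Cut the sorted indices at the class boundaries.
    return np.split(order, np.cumsum(counts)[:-1])

# Usage on a tiny example: five images, three classes.
labels = np.eye(3, dtype=int)[[2, 0, 1, 0, 2]]
groups = group_indices_by_class(labels)
```

With index arrays like these, `images[groups[i]]` selects the images of class `i`. A class with no examples yields an empty index array.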
