Map elements to a list of unique indexes
Let’s say I have a list of elements:
my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']
I want to convert this list into a list of indexes, one per unique element, so that CatA is assigned index 0, CatB index 1, and CatC index 2.
The result I want is:
result = [0, 1, 2, 0, 0, 2]
Currently, I'm building the final indexed list by creating a dictionary that assigns a unique id to each element and then using a list comprehension:
import numpy as np

unique_classes = np.unique(my_list)
conversion_dict = dict(zip(unique_classes, range(len(unique_classes))))
result = [conversion_dict[i] for i in my_list]
My question is: is there an easier and more straightforward way to do this?
I expect to have a large list of categories, so the solution needs to be efficient, but I'd like to avoid manually building the unique list, the dictionary, and the list comprehension.
Solution
Here is one way to do it in a single pass over the list:
my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']
first_occurances = dict()  # value -> index of its first appearance
result = []
for i, v in enumerate(my_list):
    try:
        index = first_occurances[v]
    except KeyError:
        # First time we see v: remember its position
        index = i
        first_occurances[v] = i
    result.append(index)
The complexity is O(n) on average, because each dictionary lookup and insertion takes O(1) on average.
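Basically, the dict stores the index at which each value first appears. If first_occurances does not yet contain the value v, we save the current index i for it; otherwise we reuse the stored index. Note that these are first-occurrence positions, so they match the desired 0, 1, 2 numbering here only because the first three elements of my_list are all distinct.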
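If consecutive indexes (0, 1, 2, ...) are needed regardless of where each value first appears, one possible variation is to store the number of distinct values seen so far instead of the position in the list. A minimal sketch, where the function name index_by_first_appearance and the dict name ids are made up for illustration:

```python
def index_by_first_appearance(items):
    """Map each item to a consecutive id (0, 1, 2, ...) in order of first appearance.

    Hypothetical helper for illustration; not part of any library.
    """
    ids = {}  # value -> consecutive id
    result = []
    for v in items:
        # setdefault stores len(ids) only when v is new, so ids stay consecutive
        result.append(ids.setdefault(v, len(ids)))
    return result, ids


my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']
result, ids = index_by_first_appearance(my_list)
print(result)  # [0, 1, 2, 0, 0, 2]

# A repeated leading element still yields consecutive ids
print(index_by_first_appearance(['CatA', 'CatA', 'CatB'])[0])  # [0, 0, 1]
```

This is still a single pass, so it stays O(n) on average. Ids are assigned in order of first appearance, whereas np.unique returns the unique values sorted, so the two numberings can differ when the input is not already in sorted order.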