Python – Sort and filter data from csv.DictReader


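import csv

# newline='' lets the csv module handle line endings itself.
with open('data_airbnb.csv', newline='') as f:
    reader = csv.DictReader(f, delimiter=',')
    # Read every row into a list of dicts; all values are strings.
    data_list = list(reader)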

Here is one example row from data_list:

[OrderedDict([('room_id', '3179080'), ('survey_id', '1280'), ('host_id', '15295886'), ('room_type', 'Shared room'), ('country', ''), ('city', 'Singapore'), ('borough', ' '), ('neighborhood', 'TS17'), ('reviews', '15'), ('overall_satisfaction', '5.0'), ('accommodates', '12'), ('bedrooms', '1.0'), ('bathrooms', ''), ('price', '77.0'), ('minstay', ''), ('last_modified', '2017-05-17 09:10:24.216548'), ('latitude', '1.310862'), ('longitude', '103.858828'), ('location', '0101000020E6100000E738B709F7F659403F1BB96E4AF9F43F')])]

Dear friends, I am trying to find the 10 most expensive rooms (by price) in a data_list with thousands of rows and put their room_id values into a list. The example above shows one row of it.

I've tried this with a simple list before, but I keep getting an error when accessing this value and don't know what to do.

Please advise. Thanks.

Solution

One way is to sort the list of dictionaries by price and take the first 10 elements. You can do this with sorted and a custom key function:

res = sorted(data_list, key=lambda x: float(x['price']), reverse=True)[:10]
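Since the question asks for room_id values, the sorted rows can then be reduced to a list of ids. The following is only a minimal sketch: it uses a small hand-made data_list in place of the real file (the rows and prices are made up for illustration), takes the top 3 instead of the top 10, and the name top_room_ids is just an illustrative choice.

```python
# Hypothetical stand-in for the rows read from data_airbnb.csv;
# csv.DictReader yields every value as a string.
data_list = [
    {'room_id': '1', 'price': '77.0'},
    {'room_id': '2', 'price': '250.0'},
    {'room_id': '3', 'price': '9.0'},
    {'room_id': '4', 'price': '100.0'},
]

# Sort by numeric price, highest first, and keep the top 3 here
# (use [:10] for the top 10 on the real data).
res = sorted(data_list, key=lambda x: float(x['price']), reverse=True)[:3]

# Keep only the room_id of each selected row.
top_room_ids = [row['room_id'] for row in res]
print(top_room_ids)  # ['2', '4', '1']
```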

Explanation

  • lambda defines an anonymous function; you can also use an explicit named function with the same logic.
  • The float conversion is necessary to avoid comparing strings, which is how prices are currently stored in the OrderedDict objects.
  • reverse=True ensures that we sort by the highest price first.
  • Since sorted returns a list, you can extract the first 10 elements using regular list slicing with [:10].
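The points above can be illustrated with a short sketch that compares sorting by the raw strings with sorting by a named key function. The sample row in the question has some blank fields, and float('') raises a ValueError, so this sketch also guards against a blank price. The helper name price_or_zero, the made-up rows, and the choice to treat a blank price as 0.0 are all assumptions for illustration, not part of the original answer.

```python
def price_or_zero(row):
    """Named key function: numeric price, or 0.0 when the field is blank.

    Treating a blank price as 0.0 is an assumption made for this sketch.
    """
    text = row['price'].strip()
    return float(text) if text else 0.0

# Made-up rows; as with csv.DictReader, every value is a string.
rows = [
    {'room_id': 'a', 'price': '9.0'},
    {'room_id': 'b', 'price': '100.0'},
    {'room_id': 'c', 'price': ''},
    {'room_id': 'd', 'price': '77.0'},
]

# Comparing strings sorts character by character, so '9.0' ranks above '100.0'.
by_string = sorted(rows, key=lambda x: x['price'], reverse=True)

# Comparing numbers gives the intended order.
by_number = sorted(rows, key=price_or_zero, reverse=True)

print([r['room_id'] for r in by_string])  # ['a', 'd', 'b', 'c']
print([r['room_id'] for r in by_number])  # ['b', 'd', 'a', 'c']
```

Depending on the data, it may be preferable to drop rows with a blank price before sorting rather than rank them last.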
