Python – Complex sorting, easy to do using cmp functions, but how do I plan for Python 3?


I want to sort the columns returned from a database query for display.
In the results, I want them ordered as follows:

  1. Key fields come first, sorted by their position in the query results (as this usually reflects the unique index on the backend).

  2. Then the remaining, non-key fields, sorted alphabetically; their position only reflects the physical field order of the table, which is not of interest.

Note: This is not something I want to do at the database level; this is a Python sorting question.

I can do this in Python 2.7 as follows (see code below), but want to prepare for Python 3.

In the past I've written sorts based on operator.attrgetter/itemgetter, including successive passes, where you sort first by one key function and then by another. But I don't see how Python 3's key-function system is going to handle branching.

#test data, mangled on purpose
data = [
    dict(fieldname="anotherkey2", pos=1, key=True),
    dict(fieldname="somekey1", pos=0, key=True),
    dict(fieldname="bfield3", pos=2, key=False),
    dict(fieldname="afield", pos=3, key=False),
    dict(fieldname="cfield", pos=4, key=False),
]

#exp keys, first, by position, then non-keys, alphabetic order
exp = ["somekey1","anotherkey2","afield","bfield3","cfield"]

def cmp2(field1, field2):

    key1, key2 = field1.get("key"), field2.get("key")

    #if both are keys, go by position in cursor results
    if key1 and key2:
        return cmp(field1["pos"], field2["pos"])

    #if neither are keys, order alphabetically
    if not (key1 or key2):
        return cmp(field1["fieldname"], field2["fieldname"])

    #otherwise, keys go first
    return cmp(key2, key1)

for func in [cmp2]:
    test_data = data[:]
    test_data.sort(cmp=func)
    got = [field["fieldname"] for field in test_data]
    try:
        msg = "fail with function:%s exp:%s:<>:%s:got" % (func.__name__, exp, got)
        assert exp == got, msg
        print ("success with %s: %s" % (func.__name__, got))
    except AssertionError as e:
        print(e)

Output:

success with cmp2: ['somekey1', 'anotherkey2', 'afield', 'bfield3', 'cfield']

Also, the cmp_to_key recipe in the Sorting HOWTO looks scary and unpythonic, with a lot of duplicate code for each magic method. And I'm not sure how functools.cmp_to_key is relevant.

I guess what I could do is pre-decorate the field dictionaries with an extra entry that defines how to sort, something like a sortby = (not key, pos if key else 0, fieldname) tuple, but I am hoping for a cleaner approach.

This works, but… Is there anything better?

from operator import itemgetter

def pre_compute(data):
    #decorate each row in place with a tuple that encodes the sort order
    for row in data:
        key, pos, fieldname = row["key"], row["pos"], row["fieldname"]
        sortby = (not key, (pos if key else 0), fieldname)
        row["sortby"] = sortby

for func in [pre_compute]:
    test_data = data[:]
    func(test_data)
    test_data.sort(key=itemgetter('sortby'))
    got = [field["fieldname"] for field in test_data]
    try:
        msg = "fail with function:%s exp:%s:<>:%s:got" % (func.__name__, exp, got)
        assert exp == got, msg
        print ("success with %s: %s" % (func.__name__, got))
    except AssertionError as e:
        print(e)

Solution

cmp_to_key() (either the stand-alone recipe or the version built into the functools module) converts any function usable with sort's cmp= parameter into one usable with the newer key= parameter. This would be the most straightforward solution to your problem (although, as some commenters have noted, it might be better to let the database do it for you).
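
As a rough sketch of how this could look under Python 3, the block below reuses the question's data and cmp2 and wraps cmp2 with functools.cmp_to_key. Python 3 has no built-in cmp, so the small cmp helper here is a stand-in added for this example. The sort_key function is also an illustration rather than part of the answer: it returns the question's sortby tuple directly from a key function, so the rows do not have to be modified first.

```python
from functools import cmp_to_key

# Test data and expected order, copied from the question.
data = [
    dict(fieldname="anotherkey2", pos=1, key=True),
    dict(fieldname="somekey1", pos=0, key=True),
    dict(fieldname="bfield3", pos=2, key=False),
    dict(fieldname="afield", pos=3, key=False),
    dict(fieldname="cfield", pos=4, key=False),
]
exp = ["somekey1", "anotherkey2", "afield", "bfield3", "cfield"]


def cmp(a, b):
    # Stand-in for the cmp() built-in that Python 3 removed (assumption of this sketch).
    return (a > b) - (a < b)


def cmp2(field1, field2):
    # Same comparison logic as in the question.
    key1, key2 = field1.get("key"), field2.get("key")
    if key1 and key2:
        # both are keys: go by position in the cursor results
        return cmp(field1["pos"], field2["pos"])
    if not (key1 or key2):
        # neither is a key: order alphabetically
        return cmp(field1["fieldname"], field2["fieldname"])
    # otherwise, keys go first
    return cmp(key2, key1)


def sort_key(field):
    # Hypothetical alternative: build the question's sortby tuple on the fly.
    return (not field["key"], field["pos"] if field["key"] else 0, field["fieldname"])


# Option 1: keep the cmp-style function and adapt it for key=.
got_cmp = [f["fieldname"] for f in sorted(data, key=cmp_to_key(cmp2))]

# Option 2: a plain key function, no comparison function needed.
got_key = [f["fieldname"] for f in sorted(data, key=sort_key)]

print(got_cmp)
print(got_key)
```

Both variants should give the order listed in exp. The cmp_to_key route keeps the existing branching logic untouched, while the tuple key puts the branching inside the key itself and leaves the input rows unchanged.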
