Python – The sorting algorithm used by pandas' sort_values when the kind parameter is not applied


In pandas' sort_values method, the kind parameter is applied only when sorting on a single column or label. Why is this? Which sorting algorithm is used in the cases where the kind parameter is not applied? Is it a stable sort?

(For details, see the sort_values documentation.)


This is the docstring of get_group_index_sorter(group_index, ngroups) in the pandas source:

algos.groupsort_indexer implements `counting sort` and it is at least
O(ngroups), where
    ngroups = prod(shape)
    shape = map(len, keys)
that is, linear in the number of combinations (cartesian product) of unique
values of groupby keys. This can be huge when doing multi-key groupby.
np.argsort(kind='mergesort') is O(count x log(count)) where count is the
length of the data-frame;

Both algorithms are `stable` sort and that is necessary for correctness of
groupby operations. e.g. consider:
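
As a rough illustration of the two strategies the docstring compares, here is a minimal sketch of a stable counting-sort indexer next to np.argsort(kind='mergesort'). The helper counting_sort_indexer is hypothetical and written only for this example; it is not the pandas implementation of algos.groupsort_indexer, and it assumes the labels are integers in the range [0, ngroups).

```python
import numpy as np


def counting_sort_indexer(group_index, ngroups):
    """Hypothetical helper: stable argsort of integer labels in [0, ngroups)."""
    group_index = np.asarray(group_index)
    # Count how many rows fall into each group: O(count + ngroups).
    counts = np.bincount(group_index, minlength=ngroups)
    # starts[g] is the first output slot reserved for group g.
    starts = np.concatenate(([0], np.cumsum(counts)[:-1]))
    next_slot = starts.copy()
    indexer = np.empty(len(group_index), dtype=np.intp)
    # Walking the rows in their original order keeps ties in that order,
    # which is what makes the sort stable.
    for pos, g in enumerate(group_index):
        indexer[next_slot[g]] = pos
        next_slot[g] += 1
    return indexer


labels = np.array([2, 0, 1, 0, 2, 1, 0])
by_counting = counting_sort_indexer(labels, ngroups=3)
by_mergesort = np.argsort(labels, kind="mergesort")

print(by_counting)   # positions grouped by label, ties in original order
print(by_mergesort)  # same result from the stable comparison sort
```

The counting approach does work proportional to ngroups even when most groups are empty, which is why the docstring warns that it can be expensive for multi-key groupby, while the mergesort cost depends only on the number of rows.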

P.S. Here is the call chain:

pandas.core.frame.DataFrame.sort_values() -> \
  pandas.core.sorting.lexsort_indexer() ->  \
    pandas.core.sorting.indexer_from_factorized() -> \
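
To see the practical consequence of this path, the following small check sorts a toy DataFrame on two columns and looks at the order of tied rows. The column names and data are made up for illustration, and the result is an observation about behaviour rather than a statement of what the documentation guarantees.

```python
import pandas as pd

# Rows r0/r2 tie on (a, b), and so do r1/r3.
df = pd.DataFrame({
    "a": [1, 0, 1, 0, 1, 0],
    "b": [1, 1, 1, 1, 0, 0],
    "tag": ["r0", "r1", "r2", "r3", "r4", "r5"],
})

# Multi-column sort: the kind parameter is not applied on this path.
multi = df.sort_values(["a", "b"])

# Single-column sort: kind is honoured, so request a stable sort explicitly.
single = df.sort_values("a", kind="mergesort")

print(multi["tag"].tolist())
print(single["tag"].tolist())
```

If the multi-column path is stable, as the docstring above suggests, tied rows keep their original relative order: r1 stays ahead of r3 and r0 stays ahead of r2.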
