Python – Use regular expressions in Django to get a sorted set of queries by a specified field

Use regular expressions in Django to get a sorted set of queries by a specified field… here is a solution to the problem.

Use regular expressions in Django to get a sorted set of queries by a specified field

I have the following in the manager:

class NullIf(Func):
template = "NULLIF(%(expressions)s, '')"

class MySiteManager(models. Manager):

def get_queryset(self):
    qws = MySiteQuerySet(self.model, using=self._db).filter(
        some_id=settings. BASE_SOME_ID).annotate(
            # This is made for sorting by short labels as by numeric values
            short_label_numeric=Cast(
                NullIf(Func(
                    F('short_label'),
                    Value('^(\D+)|( \w+)'),
                    Value(''),
                    Value('g'),
                    function='regexp_replace')),
                models. BigIntegerField())
            ).order_by('short_label_numeric', 'short_label')

for q in qws:
        print(q.short_label, end='\n')

return qws

The output of the printed value is as follows:

1
10
100
101
102
103
104
105
106
107
108
109
11
110
111
112
113
114
115
116
117
118
119
12
120
121
122
123
124
125
126
127
128
129
13
130
131
132
133
134
135
136
137
138
139
14
140
141
142
143
144
145
146
147
148
149
15
150
151
152
153
154
155
156
157
158
159
16
17
18
19
20
200 degrees Celsius
21
22
23
24
25
26
260
261
262
263
264
265fs
266fs
267c
268c
269c
27
273c
274c
275c
276c
28
29
2c
30
302
31
52%
53%
34
55%
36
37
38
3c
4 pcs
5 pcs
501
502
503
504
505
506
507
508
509
510
511
512
513
514
31s
516
517
518
519
520
521
522
524
6 pcs
601
602
603
20°
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
7
701
702
703
30°
705
706
707
708
709
710
8 pcs
801
802
803
804
805
806
807
808
809
810
9
901
902
S1
S10
S11
S12
S13
S14
S15
S16
S17
S18
S19
S2
S20
S3
S4
S5
S6
S7
S8
S9

And my question:
How to build a query set using the output, e.g. 1 2 3 3c 4 5 6 6c … 264 265fs 266fs 267c 268c 269c … S1 S2 S3 S4 ??? Does anyone have any assumptions?

Solution

The main idea is to sort by number and then by the character part of the label, I can’t reproduce and test, but the solution might look like this:

The first is sql:

SELECT 
  (regexp_matches(short_label, '^\d+')) [1]::numeric AS ln,
  regexp_matches(short_label, '^\D+') as ls,
  short_label
FROM YOUR_APP_TABLENAME ORDER BY 1, 2, 3;

Notes in orm:

For the first SQL condition I created custom func

In [1]: from myapp.models import *

In [2]: from django.db.models import F, Func, Value
   ...: 
   ...: class StartNumeric(Func):
   ...:     function = 'REGEXP_MATCHES'
   ...:     template = "(%(function)s(%(expressions)s, '^\d+'))[1]::int"
   ...: 
   ...: qs = Ingredient.objects.annotate(
   ...:     ln=StartNumeric('short_label'),
   ...:     ls=Func('short_label', Value('^\D+'), function='regexp_matches'),
   ...:     ).values('ln').order_by('ln', 'ls', 'short_label')
   ...: 
   ...:     

In [3]: print(qs.query)
SELECT (REGEXP_MATCHES("myapp_ingredient"." short_label", '^\d+'))[1]::int AS "ln" FROM "myapp_ingredient" ORDER BY "ln" ASC, regexp_matches("myapp_ingredient"." short_label", ^\D+) ASC, "myapp_ingredient"." short_label" ASC

In [4]: data = qs.values_list('short_label', flat=True)
   ...: print(list(data))
   ...: 
   ...: 
['1', '2c', '3c', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32c', '33c', '34', '35c', '36', '37', '38', '100', '101', '102', '103', '104', '105', '106', '107', '108', '109', '110', '111', '112', '113', '114', '115', '116', '117', '118', '119', '120', '121', '122', '123', ' 124', '125', '126', '127', '128', '129', '130', '131', '132', '133', '134', '135', '136', '137', '138', '139', '140', '141', '142', '143', '144', '145', '146', '147', '148', '149', '150', '151' , '152', '153', '154', '155', '156', '157', '158', '159', '200c', '260', '261', '262', '263', '264', '265fs', '266fs', '267c', '268c', '269c', '273c', '274c', '275c', '276c', '302', '501', ' 502', '503', '504', '505', '506', '507', '508', '509', '510', '511', '512', '513', '514', '515', '516', '517', '518', '519', '520', '521', '522', '524', '601', '602', '603', '604', '605', '606' , '607', '608', '609', '610', '611', '612', '613', '614', '615', '616', '617', '618', '619', '620', '621', '622', '623', '701', '702', '703', '703', '704', '705', '706', '707', '708', '709', ' 710', '801', '802', '803', '804', '805', '806', '807', '808', '809', '810', '901', '902', 'aaaa', 'ddd', 'ddeee', 'rrrrr', 'S1', 'S10', 'S11', 'S12', 'S13', 'S14', 'S15', 'S16', 'S17', 'S18', ' S19', 'S2', 'S20', 'S3', 'S4', 'S5', 'S6', 'S7', 'S8', 'S9', 'vvvv', 'zzzz']

Hope that helps

Related Problems and Solutions