Python – Regardless of the data source, what is the correct process for validating and saving data using the Django/Django Rest Framework?

I have a specific model that I want to perform custom validation on. I want to guarantee that at least one identifier field is always present when a new instance is created, so that an instance can never be created without one of these fields, although no individual field is required on its own.

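from django.db import models

class Security(models.Model):
    # max_length is required for CharField on most backends; 32 is a placeholder
    symbol = models.CharField(max_length=32, unique=True, blank=True)
    sedol = models.CharField(max_length=32, unique=True, blank=True)
    tradingitemid = models.CharField(max_length=32, unique=True, blank=True)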

I want a clean, reliable way to do this regardless of where the raw data comes from (for example, an API POST or an internal function that pulls the data from another source such as a .csv file).

I know I can override the model’s .save() method and perform validation there, but the best-practice notes here suggest that raising a validation error in .save() is a bad idea, because a view will simply return a 500 response instead of returning a validation error to the POST request.

I know I can define a custom serializer with a validator in Django Rest Framework to validate data for the model (this is a great solution for objects created through a ModelViewSet, where I can guarantee the serializer is used every time). But this data integrity guarantee only holds on that API endpoint, and beyond it is only as good as developers remembering to use that serializer every time an object is created elsewhere in the codebase (objects can be created throughout the codebase from sources other than the web API).

I’m also familiar with Django’s .clean() and .full_clean(). These seem like the perfect solution, except that they again rely on the developer always remembering to call them, a guarantee that is only as good as the developer’s memory. I know these methods are called automatically when using a ModelForm, but again, in my use case models can also be created from a .csv download. I need a general-purpose guarantee that follows best practice. I could call .clean() inside the model’s .save() method, but this answer (and the related comments and links in the post) suggests that approach is controversial and possibly an anti-pattern.

Is there a clean, straightforward way to guarantee that this model can never be saved without at least one of the three fields, one that 1. does not raise a 500 error through a view, 2. does not rely on developers explicitly using the correct serializer throughout the codebase when creating objects, and 3. does not rely on hacking a call to .clean() into the model’s .save() method (seemingly an anti-pattern)? I feel there must be a clean solution here that is not a hodgepodge of some validation in a serializer, some in a .clean() method, and a hacked .save() method that calls .clean() (which would then run twice on saves from ModelForms), and so on.

Solution

One could certainly imagine a design where save() did double duty and handled validation for you. For various reasons (partially summarized in the link here), Django decided to make this a two-step process. So I agree with the consensus you found that trying to shoehorn validation into Model.save() is an anti-pattern. It runs counter to Django’s design and will probably cause problems down the road.

You’ve already found the “perfect solution”, which is to validate with Model.full_clean(). I disagree with you that remembering this will be a burden for developers. Remembering to do anything right can be hard, especially in a large and powerful framework, but this particular thing is simple, well documented, and fundamental to Django’s ORM design.

This is especially true when you consider what is actually difficult for developers, which is the error handling itself. It’s not as if developers could just call model.validate_and_save(). Instead, they would have to write:

from django.core.exceptions import ValidationError

try:
    model.validate_and_save()  # hypothetical combined method
except ValidationError:
    pass  # handle error - this is the hard part

Whereas Django’s idiom is:

try:
    model.full_clean()
except ValidationError:
    pass  # handle error - this is the hard part
else:
    model.save()

I don’t think the Django version is any harder. (That said, there’s nothing stopping you from writing your own validate_and_save convenience method.)

Finally, I would also suggest adding a database constraint for your requirement. This is what Django itself does when you add a constraint that it knows how to enforce at the database level. For example, when you use unique=True on a field, Django both creates a database constraint and adds Python code to validate that requirement. If you want a constraint that Django doesn’t know about, you can do the same thing yourself: write a migration that creates the appropriate database constraint, in addition to writing your own Python version in clean(). That way, if there is a bug in your code and validation is skipped, you end up with an uncaught exception (IntegrityError) rather than corrupted data.
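To illustrate what the database-side safety net looks like, here is a minimal sketch using only the standard-library sqlite3 module rather than Django. The table layout and the constraint name at_least_one_identifier are assumptions for the example. In a Django project you would more likely declare an equivalent rule on the model (for instance with models.CheckConstraint in Meta.constraints) and let a migration create it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table mirroring the Security model, with a CHECK constraint
# requiring at least one non-empty identifier.
conn.execute(
    """
    CREATE TABLE security (
        id INTEGER PRIMARY KEY,
        symbol TEXT NOT NULL DEFAULT '',
        sedol TEXT NOT NULL DEFAULT '',
        tradingitemid TEXT NOT NULL DEFAULT '',
        CONSTRAINT at_least_one_identifier
            CHECK (symbol <> '' OR sedol <> '' OR tradingitemid <> '')
    )
    """
)

conn.execute("INSERT INTO security (symbol) VALUES ('AAPL')")  # accepted

try:
    # Simulates code that skipped validation: every identifier is empty.
    conn.execute("INSERT INTO security DEFAULT VALUES")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True  # the database refused the row

row_count = conn.execute("SELECT COUNT(*) FROM security").fetchone()[0]
```

The bad row never reaches the table, so a missed full_clean() call shows up as an error instead of as silently invalid data.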
