Serializing and deserializing objects of user-defined classes
Let’s say I have a class hierarchy like this:
class SerializableWidget(object):
    pass  # some code

class WidgetA(SerializableWidget):
    pass  # some code

class WidgetB(SerializableWidget):
    pass  # some code
I want to be able to serialize instances of WidgetA and WidgetB (and possibly other widgets) into JSON text files. Then I want to be able to deserialize them without knowing their exact class in advance:
some_widget = deserialize_from_file(file_path)  # pseudocode, doesn't have to be exactly a method like this
And some_widget needs to be an instance of the exact subclass of SerializableWidget that was serialized. How should I go about this? Which methods exactly do I need to override or implement in each class of the hierarchy?
Suppose all the fields of the classes above are primitive types. How do I override some __to_json__ and __from_json__ methods, or something along those lines?
Solution
There are many ways to solve this problem. One is to use the object_hook and default parameters of json.load and json.dump, respectively.
You just need to store the class name along with the serialized version of the object, and then, when loading, use a mapping from those names to the corresponding classes.
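As a minimal sketch of that idea, before any decorator machinery is introduced: the Point class, the CLASSES registry, and the encode and decode helpers below are hypothetical names chosen for illustration, and the "__class__" key is just one possible convention. The sketch assumes every field is a primitive type and that the field names match the constructor's parameter names.

```python
import json


class Point:
    """Hypothetical example class whose fields are all primitive types."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Hypothetical registry mapping the stored class name back to the class.
CLASSES = {"Point": Point}


def encode(obj):
    # json.dumps calls this only for objects it cannot serialize natively.
    name = type(obj).__name__
    if name in CLASSES:
        d = dict(vars(obj))      # copy the instance attributes
        d["__class__"] = name    # store the class name next to the data
        return d
    raise TypeError(f"Object of type {name} is not JSON serializable")


def decode(d):
    # json.loads calls this for every decoded JSON object (a dict).
    name = d.pop("__class__", None)
    if name is None:
        return d                 # an ordinary dict, leave it untouched
    return CLASSES[name](**d)    # rebuild an instance of the stored class


text = json.dumps(Point(1, 2), default=encode)
restored = json.loads(text, object_hook=decode)
print(type(restored).__name__, restored.x, restored.y)  # prints: Point 1 2
```

Maintaining the registry by hand gets tedious as the hierarchy grows, which is the part a class decorator can automate.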
The following example uses a dispatcher class decorator to register each class under its name, so that the name can be stored at serialization time and looked up again when deserializing. All you need is an _as_dict method on each class to convert its data to a dictionary:
import json

@dispatcher
class Parent(object):
    def __init__(self, name):
        self.name = name

    def _as_dict(self):
        return {'name': self.name}

@dispatcher
class Child1(Parent):
    def __init__(self, name, n=0):
        super().__init__(name)
        self.n = n

    def _as_dict(self):
        d = super()._as_dict()
        d['n'] = self.n
        return d

@dispatcher
class Child2(Parent):
    def __init__(self, name, k='ok'):
        super().__init__(name)
        self.k = k

    def _as_dict(self):
        d = super()._as_dict()
        d['k'] = self.k
        return d
Now to test it. First, let’s create a list with three objects of different types:
>>> obj = [Parent('foo'), Child1('bar', 15), Child2('baz', 'works')]
Serializing it produces data with the class name stored in each object (the output is wrapped here for readability):
>>> s = json.dumps(obj, default=dispatcher.encoder_default)
>>> print(s)
[
    {"__class__": "Parent", "name": "foo"},
    {"__class__": "Child1", "name": "bar", "n": 15},
    {"__class__": "Child2", "name": "baz", "k": "works"}
]
Loading it back generates objects of the correct classes:
>>> obj2 = json.loads(s, object_hook=dispatcher.decoder_hook)
>>> print(obj2)
[
    <__main__.Parent object at 0x7fb6cd561cf8>,
    <__main__.Child1 object at 0x7fb6cd561d68>,
    <__main__.Child2 object at 0x7fb6cd561e10>
]
Finally, this is the dispatcher's implementation (it has to be defined before the decorated classes above):
class _Dispatcher:
    def __init__(self, classname_key='__class__'):
        self._key = classname_key
        self._classes = {}  # to keep a reference to the classes used

    def __call__(self, class_):  # decorate a class
        self._classes[class_.__name__] = class_
        return class_

    def decoder_hook(self, d):
        # called by json.loads for every decoded JSON object
        classname = d.pop(self._key, None)
        if classname:
            return self._classes[classname](**d)
        return d

    def encoder_default(self, obj):
        # called by json.dumps for objects it cannot serialize itself
        d = obj._as_dict()
        d[self._key] = type(obj).__name__
        return d

dispatcher = _Dispatcher()
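Since the question asks about text files, here is a sketch of how the same dispatcher could back a pair of file helpers. The names serialize_to_file and deserialize_from_file, and the widgets.json file name, are assumptions made for illustration; the dispatcher and classes are repeated in shortened form only so that the snippet runs on its own.

```python
import json
import os
import tempfile


class _Dispatcher:
    def __init__(self, classname_key='__class__'):
        self._key = classname_key
        self._classes = {}

    def __call__(self, class_):
        self._classes[class_.__name__] = class_
        return class_

    def decoder_hook(self, d):
        classname = d.pop(self._key, None)
        if classname:
            return self._classes[classname](**d)
        return d

    def encoder_default(self, obj):
        d = obj._as_dict()
        d[self._key] = type(obj).__name__
        return d


dispatcher = _Dispatcher()


@dispatcher
class Parent:
    def __init__(self, name):
        self.name = name

    def _as_dict(self):
        return {'name': self.name}


@dispatcher
class Child1(Parent):
    def __init__(self, name, n=0):
        super().__init__(name)
        self.n = n

    def _as_dict(self):
        d = super()._as_dict()
        d['n'] = self.n
        return d


def serialize_to_file(obj, file_path):
    # Hypothetical helper: json.dump writes directly to an open text file.
    with open(file_path, 'w', encoding='utf-8') as f:
        json.dump(obj, f, default=dispatcher.encoder_default)


def deserialize_from_file(file_path):
    # Hypothetical helper: json.load rebuilds each object via the hook.
    with open(file_path, encoding='utf-8') as f:
        return json.load(f, object_hook=dispatcher.decoder_hook)


# Round trip through a temporary file so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'widgets.json')
    serialize_to_file([Parent('foo'), Child1('bar', 15)], path)
    restored = deserialize_from_file(path)

print([type(o).__name__ for o in restored])  # prints: ['Parent', 'Child1']
```

Note that decoder_hook passes the stored dictionary to the class as keyword arguments, so the keys returned by _as_dict have to match the parameter names of __init__. A class name that was never registered with the decorator raises a KeyError on loading.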