If you call dumps on an object and look at the module pickle.py (https://github.com/python/cpython/blob/3.9/Lib/pickle.py#L107), you'll see that pickle converts an object to a series of opcodes (and recursively stored data). This is essentially what is written to disk when you use dump. I authored the part of hickle that stores arbitrary objects: it first uses dill.dumps to generate a string of opcodes and data, then uses HDF to store the string. If you turn on tracing in dill, you can see how the opcodes and data are stored in the string.
>>> x = dict(a=[1,2,3], b=set((4,5,6)))
>>> import dill
>>> dill.detect.trace(True)
>>> dill.dumps(x)
D2: <dict object at 0x11023c870>
T1: <class 'set'>
F2: <function _load_type at 0x11070f2f0>
# F2
# T1
# D2
b'\x80\x03}q\x00(X\x01\x00\x00\x00aq\x01]q\x02(K\x01K\x02K\x03eX\x01\x00\x00\x00bq\x03cdill._dill\n_load_type\nq\x04X\x03\x00\x00\x00setq\x05\x85q\x06Rq\x07]q\x08(K\x04K\x05K\x06e\x85q\tRq\nu.'
It creates a dict, which stores a list of ints (no special function needed), then stores a special function (_load_type) to help reconstitute the set, and finally stores the set of ints. Opcodes at the beginning signify the version and protocol.
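To read the opcodes without decoding the byte string by eye, the standard library's pickletools module can disassemble a pickle. The sketch below is only an illustration: it uses the stdlib pickle rather than dill (an assumption made so it runs without extra packages), so the function used to rebuild the set differs from the _load_type shown in the trace, and protocol 3 is chosen to match the \x80\x03 header above.

```python
import io
import pickle
import pickletools

x = dict(a=[1, 2, 3], b={4, 5, 6})

# Serialize in memory; protocol 3 matches the b'\x80\x03...' header in the trace.
payload = pickle.dumps(x, protocol=3)

# Walk the opcode stream and collect the opcode names in order.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]

# Produce a human-readable listing of opcodes and their arguments.
out = io.StringIO()
pickletools.dis(payload, out=out)
listing = out.getvalue()
print(listing)
```

The first opcode should be PROTO (the protocol marker) and the last should be STOP; the opcodes in between build the dict, the list, and the call that reconstructs the set.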
So, yes, you can access the state (in serialized form) before it is dumped to file.
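As a minimal sketch of that idea, the serialized state can be held as a bytes object, inspected, and only then written out. This again assumes the stdlib pickle in place of dill, and the file name x.pkl is hypothetical.

```python
import os
import pickle
import tempfile

x = dict(a=[1, 2, 3], b={4, 5, 6})

# The serialized state lives in memory as bytes; nothing is on disk yet.
state = pickle.dumps(x)
size = len(state)  # the state can be inspected or measured at this point

# Write the already-serialized bytes to a file (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), "x.pkl")
with open(path, "wb") as f:
    f.write(state)

# Reading the file back reconstructs an equal object.
with open(path, "rb") as f:
    restored = pickle.load(f)
```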