Section 9: Set/Get Checkpoints
In section 6 we reviewed the life cycle of BigChunkLoaders and SmallChunkCollectors.
At regular intervals:
- The scheduler selects a patient from the dataset.
- A BigChunkLoader extracts a big chunk from the patient’s records. The BigChunkLoader has access to the patient by self.patient.
- The extracted big chunk is passed to a SmallChunkCollector. The SmallChunkCollector has access to the patient by self.patient.
The SmallChunkCollectors and BigChunkLoaders can be rescheduled at any time.
Therefore, they may need to save some information as a “checkpoint”. For this purpose, PyDmed provides two functions:
set_checkpoint and get_checkpoint.
For instance, here is how a SmallChunkCollector can set/get the number of patches that have been extracted for its patient.
```python
class SampleSmallchunkCollector(SmallChunkCollector):
    def extract_smallchunk(self, call_count, bigchunk, last_message_fromroot):
        checkpoint = self.get_checkpoint()
        if checkpoint is None:
            # If this is the very first `SmallChunk` to be extracted from
            # the patient, the checkpoint is None.
            num_extracted_smallchunks = 0
        else:
            num_extracted_smallchunks = checkpoint["num_extracted_smallchunks"]
        '''
        .
        .
        .
        same as before
        .
        .
        .
        '''
        # Store the updated count so it survives rescheduling.
        self.set_checkpoint(
            {"num_extracted_smallchunks": num_extracted_smallchunks + 1}
        )
        return smallchunk
```
Please note that each “checkpoint” is associated with a unique patient, rather than with a particular SmallChunkCollector or BigChunkLoader.
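To make the per-patient association concrete, here is a minimal toy sketch. It is not PyDmed's implementation: `ToyCheckpointStore` and `ToyCollector` are hypothetical stand-ins, and the assumption that checkpoints behave like entries in a dictionary keyed by patient is only an illustration of the statement above. It shows a counter surviving when a new collector object takes over the same patient, while a different patient starts from no checkpoint.

```python
# Toy model only: these classes are hypothetical and NOT part of PyDmed.
class ToyCheckpointStore:
    """Keeps the latest checkpoint of each patient (assumed dict-like storage)."""

    def __init__(self):
        self._by_patient = {}

    def get(self, patient):
        # Returns None if no checkpoint was ever set for this patient.
        return self._by_patient.get(patient)

    def set(self, patient, checkpoint):
        self._by_patient[patient] = checkpoint


class ToyCollector:
    """Stand-in for a SmallChunkCollector that works on a single patient."""

    def __init__(self, patient, store):
        self.patient = patient
        self._store = store

    def get_checkpoint(self):
        return self._store.get(self.patient)

    def set_checkpoint(self, checkpoint):
        self._store.set(self.patient, checkpoint)

    def extract_smallchunk(self):
        # Same counting pattern as in the sample collector above.
        checkpoint = self.get_checkpoint()
        if checkpoint is None:
            count = 0
        else:
            count = checkpoint["num_extracted_smallchunks"]
        self.set_checkpoint({"num_extracted_smallchunks": count + 1})
        return count + 1


store = ToyCheckpointStore()

first = ToyCollector("patient_A", store)
first.extract_smallchunk()
first.extract_smallchunk()

# "Rescheduling": a new collector object takes over the same patient.
second = ToyCollector("patient_A", store)
count_after_reschedule = second.extract_smallchunk()

# A different patient has its own, initially empty, checkpoint.
other = ToyCollector("patient_B", store)
count_other = other.extract_smallchunk()

print(count_after_reschedule, count_other)  # prints: 3 1
```

In this sketch the count for patient_A continues from 2 to 3 even though the collector object changed, because the checkpoint is looked up by patient rather than by collector.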

