Section 9: Set/Get Checkpoints
In section 6 we reviewed the life cycle of `BigChunkLoader`s and `SmallChunkCollector`s.
At regular intervals:
- The scheduler selects a patient from the dataset.
- A `BigChunkLoader` extracts a big chunk from the patient’s records. The `BigChunkLoader` has access to the patient via `self.patient`.
- The extracted big chunk is passed to a `SmallChunkCollector`. The `SmallChunkCollector` has access to the patient via `self.patient`.
The `SmallChunkCollector`s and `BigChunkLoader`s can be rescheduled at any time.
Therefore, they may need to memorize some information, i.e. keep a “checkpoint”. For this purpose, PyDmed provides two functions: `set_checkpoint` and `get_checkpoint`.
For instance, here is how a `SmallChunkCollector` can set/get the number of patches that have been extracted for its patient.
class SampleSmallchunkCollector(SmallChunkCollector):
    def extract_smallchunk(self, call_count, bigchunk, last_message_fromroot):
        # Retrieve the checkpoint stored for this collector's patient.
        checkpoint = self.get_checkpoint()
        if checkpoint is None:
            # If it is the very first `SmallChunk` to be extracted from
            # the patient, checkpoint is None.
            num_extracted_smallchunks = 0
        else:
            num_extracted_smallchunks = checkpoint["num_extracted_smallchunks"]
        '''
        .
        .
        .
        same as before
        .
        .
        .
        '''
        # Store the updated count so that it survives rescheduling.
        self.set_checkpoint({"num_extracted_smallchunks": num_extracted_smallchunks + 1})
        return smallchunk
Please note that each “checkpoint” is associated with a unique patient, rather than with a particular `SmallChunkCollector` or `BigChunkLoader`.
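The per-patient association can be pictured with the following sketch. It is a toy model under stated assumptions, not PyDmed code: `ToyCollector`, `checkpoint_store`, and the patient names are hypothetical, and the store is assumed to be a plain dictionary keyed by patient. Each collector object is used once and thrown away, yet the counts accumulate separately for each patient.

```python
# Toy model only: the names below are hypothetical and not part of PyDmed.
class ToyCollector:
    """Short-lived worker that reads and writes its patient's checkpoint."""

    def __init__(self, patient, checkpoint_store):
        self.patient = patient
        self._store = checkpoint_store  # assumed: one entry per patient

    def get_checkpoint(self):
        return self._store.get(self.patient)  # None before the first set

    def set_checkpoint(self, checkpoint):
        self._store[self.patient] = checkpoint

    def extract_smallchunk(self):
        checkpoint = self.get_checkpoint()
        if checkpoint is None:
            num_extracted = 0
        else:
            num_extracted = checkpoint["num_extracted_smallchunks"]
        self.set_checkpoint({"num_extracted_smallchunks": num_extracted + 1})


checkpoint_store = {}
schedule = ["patient_1", "patient_2", "patient_1", "patient_1", "patient_2"]
for patient in schedule:
    # A fresh collector is created for every scheduling step.
    ToyCollector(patient, checkpoint_store).extract_smallchunk()

print(checkpoint_store)
```

In this toy schedule `patient_1` is visited three times and `patient_2` twice, so the two stored counts differ even though no collector object survives from one step to the next.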