When building ML models, model serde is probably the hardest problem (in my experience) that ML engineers have to deal with. Different model training libraries expose different interfaces for serde, which means that our SerializationStrategy interface ought to be as flexible as possible. Currently, SerializationStrategy is set up so that people subclass it and implement the serialize and deserialize abstract methods. This would be fine except that these methods require a file object handle instead of a file path, which is not conducive to APIs like Keras that require you to pass a file path instead. However, Max correctly pointed out that some serde could also go through a non-file buffer, so buffer-based methods still have to be supported. As a result, you would have to do something like this to make it work:

    from keras.models import load_model

    class KerasModelSerializationStrategy(SerializationStrategy):
        def __init__(self):
            super(KerasModelSerializationStrategy, self).__init__(
                'keras_strategy', read_mode='r', write_mode='w'
            )

        # The file-object methods must exist but cannot do anything useful,
        # because Keras saves to and loads from a path.
        def serialize(self, value, write_file_obj):
            pass

        def deserialize(self, read_file_obj):
            pass

        # The real work happens in the path-based overrides.
        def serialize_to_file(self, value, write_path):
            value.save(write_path)
            return write_path

        def deserialize_from_file(self, read_path):
            return load_model(read_path)

This is not ideal and is emblematic of a design with artificial coupling.
To solve this, I implemented the following SerializationStrategy design, which strives for more explicit decoupling of files and buffers. The con is that you will have to write more with open(...) as blah code, but the pro is more readable and hopefully easier-to-maintain code (because there is less indirection going on). The KerasModelSerializationStrategy code above then simplifies to:

    class KerasModelSerializationStrategy(FileBasedSerializationStrategy):
        def serialize_to_file(self, value, write_path):
            # Validate the path argument before handing it to Keras.
            check.str_param(write_path, 'write_path')
            value.save(write_path)

        def deserialize_from_file(self, read_path):
            return load_model(read_path)

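To make the file/buffer split concrete, here is a minimal, self-contained sketch of how the two kinds of strategy could sit side by side. It is not the code in this PR: the base class bodies are assumptions, and BufferBasedSerializationStrategy, PickleFileStrategy, PickleBufferStrategy and round_trip_via_file are hypothetical names used only for illustration. Pickle stands in for Keras so the example runs without extra dependencies.

```python
import io
import os
import pickle
import tempfile
from abc import ABC, abstractmethod


class FileBasedSerializationStrategy(ABC):
    """Assumed shape: subclasses only ever see file paths."""

    @abstractmethod
    def serialize_to_file(self, value, write_path):
        pass

    @abstractmethod
    def deserialize_from_file(self, read_path):
        pass


class BufferBasedSerializationStrategy(ABC):
    """Hypothetical counterpart: subclasses only ever see file-like buffers."""

    @abstractmethod
    def serialize(self, value, write_buffer):
        pass

    @abstractmethod
    def deserialize(self, read_buffer):
        pass


class PickleFileStrategy(FileBasedSerializationStrategy):
    # The strategy opens the file itself; this is the extra
    # "with open(...)" code that the path-based design asks for.
    def serialize_to_file(self, value, write_path):
        with open(write_path, 'wb') as write_file:
            pickle.dump(value, write_file)

    def deserialize_from_file(self, read_path):
        with open(read_path, 'rb') as read_file:
            return pickle.load(read_file)


class PickleBufferStrategy(BufferBasedSerializationStrategy):
    # No path is involved; any binary file-like object works.
    def serialize(self, value, write_buffer):
        pickle.dump(value, write_buffer)

    def deserialize(self, read_buffer):
        return pickle.load(read_buffer)


def round_trip_via_file(strategy, value):
    """Write value to a temporary path and read it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'value.bin')
        strategy.serialize_to_file(value, path)
        return strategy.deserialize_from_file(path)


# Usage: the file strategy needs a path, the buffer strategy does not.
restored_from_file = round_trip_via_file(PickleFileStrategy(), {'weights': [1, 2, 3]})

buffer = io.BytesIO()
buffer_strategy = PickleBufferStrategy()
buffer_strategy.serialize({'weights': [1, 2, 3]}, buffer)
buffer.seek(0)  # rewind before reading back
restored_from_buffer = buffer_strategy.deserialize(buffer)
```

With the two interfaces kept apart, a path-only library such as Keras never has to stub out buffer methods, and a buffer-only serde never has to invent a path.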