mlrun.package.packagers.default_packager.DefaultPackager

class mlrun.package.packagers.default_packager.DefaultPackager

Bases: Packager

A default packager that handles all types and packs them as pickle files.

The default packager implements all the required methods and has default logic that should satisfy most use cases. To work with this class, you shouldn’t override the abstract class methods; instead, follow the guidelines below:

  • The class variable PACKABLE_OBJECT_TYPE: The type of object this packager can pack and unpack (used in the is_packable method).

  • The class variable PACK_SUBCLASSES: A flag that indicates whether to pack all subclasses of the PACKABLE_OBJECT_TYPE (used in the is_packable method). Default is False.

  • The class variable DEFAULT_PACKING_ARTIFACT_TYPE: The default artifact type to pack as. It is returned by the method get_default_packing_artifact_type.

  • The class variable DEFAULT_UNPACKING_ARTIFACT_TYPE: The default artifact type to unpack from. It is returned by the method get_default_unpacking_artifact_type.

  • The abstract class method pack(): The method is implemented to get the object and send it to the relevant packing method according to the given artifact type, using the naming “pack_<artifact_type>” (if no artifact type is provided, the default one is used). For example: if the artifact type is x, then the class method pack_x must be implemented. The signature of each pack class method must be:

    @classmethod
    def pack_x(cls, obj: Any, key: str, ...) -> Union[Tuple[Artifact, dict], dict]:
        pass
    

    Where ‘x’ is the artifact type, ‘obj’ is the object to pack, ‘key’ is the key to name the artifact, and ‘...’ stands for additional custom log hint configurations. The returned values are the packed artifact and the instructions for unpacking it or, in the case of a result, a dictionary of the result with its key and value. The log hint configurations are sent by the user and shouldn’t be mandatory, meaning they should have a default value (otherwise, the user will have to add them to every log hint).

  • The abstract class method unpack(): The method is implemented to get a DataItem and send it to the relevant unpacking method according to the artifact type, using the naming “unpack_<artifact_type>” (if no artifact type is provided, the default one is used). For example: if the artifact type stored within the DataItem is x, then the class method unpack_x must be implemented. The signature of each unpack class method must be:

    @classmethod
    def unpack_x(cls, data_item: mlrun.DataItem, ...) -> Any:
        pass
    

    Where ‘x’ is the artifact type, ‘data_item’ is the artifact’s data item to unpack, and ‘...’ stands for the instructions that were originally returned from pack_x. Each instruction must be optional (have a default value) to support objects of this type that were not packaged but logged manually. The returned value is the unpacked object.

  • The abstract class method is_packable(): The method is implemented to validate the object type and artifact type automatically by the following rules:

    • Object type validation: Checking whether the given object type matches the variable PACKABLE_OBJECT_TYPE with respect to the PACK_SUBCLASSES class variable.

    • Artifact type validation: Checking if the artifact type given is in the list returned from get_supported_artifact_types.

  • The abstract class method is_unpackable(): The method is left as implemented in Packager.

  • The abstract class method get_supported_artifact_types(): The method is implemented to look for all pack + unpack class methods implemented to collect the supported artifact types. If PackagerX has pack_y, unpack_y and pack_z, unpack_z that means the artifact types supported are y and z.

  • The abstract class method get_default_packing_artifact_type(): The method is implemented to return the new class variable DEFAULT_PACKING_ARTIFACT_TYPE. You may still override the method if the default artifact type you need may change according to the object that’s about to be packed.

  • The abstract class method get_default_unpacking_artifact_type(): The method is implemented to return the new class variable DEFAULT_UNPACKING_ARTIFACT_TYPE. You may still override the method if the default artifact type you need may change according to the data item that’s about to be unpacked.
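The naming convention in the guidelines above can be illustrated with a minimal stand-in that does not import MLRun. SketchPackager and TextPackager are hypothetical names, plain dictionaries stand in for Artifact and DataItem, and the sketch assumes an artifact type counts as supported only when both pack_<type> and unpack_<type> exist. It mimics the dispatch idea only and is not MLRun’s actual implementation:

```python
from typing import Any, List, Optional


class SketchPackager:
    """Hypothetical stand-in that mimics the naming-convention dispatch (not MLRun code)."""

    DEFAULT_PACKING_ARTIFACT_TYPE = "object"
    DEFAULT_UNPACKING_ARTIFACT_TYPE = "object"

    @classmethod
    def get_supported_artifact_types(cls) -> List[str]:
        # Assumption: a type is supported when both pack_<type> and unpack_<type> exist.
        names = dir(cls)
        return sorted(
            name[len("pack_"):]
            for name in names
            if name.startswith("pack_") and f"un{name}" in names
        )

    @classmethod
    def pack(cls, obj: Any, key: Optional[str] = None,
             artifact_type: Optional[str] = None,
             configurations: Optional[dict] = None):
        # Fall back to the default artifact type, then dispatch by method name.
        artifact_type = artifact_type or cls.DEFAULT_PACKING_ARTIFACT_TYPE
        pack_method = getattr(cls, f"pack_{artifact_type}")
        return pack_method(obj, key, **(configurations or {}))

    @classmethod
    def unpack(cls, data_item: Any, artifact_type: Optional[str] = None,
               instructions: Optional[dict] = None) -> Any:
        artifact_type = artifact_type or cls.DEFAULT_UNPACKING_ARTIFACT_TYPE
        unpack_method = getattr(cls, f"unpack_{artifact_type}")
        return unpack_method(data_item, **(instructions or {}))


class TextPackager(SketchPackager):
    DEFAULT_PACKING_ARTIFACT_TYPE = "text"
    DEFAULT_UNPACKING_ARTIFACT_TYPE = "text"

    @classmethod
    def pack_text(cls, obj: str, key: str, upper: bool = False):
        # The log hint configuration "upper" has a default, so it is optional.
        body = obj.upper() if upper else obj
        return {"key": key, "body": body}, {"upper": upper}

    @classmethod
    def unpack_text(cls, data_item: dict, upper: bool = False) -> str:
        # Instructions are optional too, so manually logged items can be unpacked.
        return data_item["body"]


packed, instructions = TextPackager.pack(
    "hi", key="greeting", configurations={"upper": True}
)
restored = TextPackager.unpack(packed, instructions=instructions)
```

Here the log hint configuration upper reaches pack_text as a keyword argument, and the returned instructions are later forwarded to unpack_text in the same way.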

Important to remember

From the Packager docstring:

  • Linking artifacts (“extra data”): To link packages (using the extra data or metrics spec attributes of an artifact), use the key as if it already exists and set its value to Ellipsis (...). The manager links all packages once it is done packing.

    For example, given extra data keys in the log hint as extra_data, setting them to an artifact should be:

    artifact = Artifact(key="my_artifact")
    artifact.spec.extra_data = {key: ... for key in extra_data}
    
  • Clearing outputs: Some packagers may produce files and temporary directories that should be deleted once the artifact is logged. The packager can mark paths of files and directories to delete after logging using the class method add_future_clearing_path.

    For example, in the following packager’s pack method we can write a text file, create an Artifact and then mark the text file to be deleted once the artifact is logged:

    with open("./some_file.txt", "w") as file:
        file.write("Pack me")
    artifact = Artifact(key="my_artifact")
    cls.add_future_clearing_path(path="./some_file.txt")
    return artifact, None
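
    Returning to the linking convention described above, the Ellipsis placeholders can be pictured as slots that are filled in after all packages are packed. The following is a minimal sketch of that idea only: link_extra_data is a hypothetical helper, the packed values are made-up placeholder strings, and none of this is the manager’s actual code:

```python
from typing import Any, Dict


def link_extra_data(extra_data: Dict[str, Any], packed: Dict[str, Any]) -> Dict[str, Any]:
    """Replace Ellipsis placeholders with the packed item stored under the same key."""
    return {
        # Keys without a packed counterpart keep their placeholder untouched.
        key: packed[key] if value is ... and key in packed else value
        for key, value in extra_data.items()
    }


# Placeholders are set while packing, before the linked items exist.
extra_data = {key: ... for key in ["plot", "report"]}

# Hypothetical packed items, keyed the same way (placeholder strings only).
packed = {"plot": "<plot artifact>", "report": "<report artifact>"}

linked = link_extra_data(extra_data, packed)
```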
    

Packager Summary

Packing Type: Any type

Packing Sub-Classes: False

Priority: Default priority (5)

Default Artifact Types:

  • Packing: object

  • Unpacking: object

Artifact Types:

  • object - Pack a Python object by pickling it into a pkl file and storing it in an artifact.

    • pickle_module_name - The pickle module name to use for serializing the object.

  • result - Pack an object as a result.

Attributes

DEFAULT_PACKING_ARTIFACT_TYPE

The default artifact type to pack as.

DEFAULT_UNPACKING_ARTIFACT_TYPE

The default artifact type to unpack from.

PACKABLE_OBJECT_TYPE

The type of object this packager can pack and unpack.

PACK_SUBCLASSES

A flag for indicating whether to pack all subclasses of the PACKABLE_OBJECT_TYPE as well.

PRIORITY

The priority of this packager in the packagers collection of the manager (lower is better).

DEFAULT_PACKING_ARTIFACT_TYPE = 'object'

The default artifact type to pack as.

DEFAULT_UNPACKING_ARTIFACT_TYPE = 'object'

The default artifact type to unpack from.

PACKABLE_OBJECT_TYPE: Type = Ellipsis

The type of object this packager can pack and unpack.

PACK_SUBCLASSES = False

A flag for indicating whether to pack all subclasses of the PACKABLE_OBJECT_TYPE as well.

PRIORITY: int = Ellipsis

The priority of this packager in the packagers collection of the manager (lower is better).

Methods

__init__()

add_future_clearing_path(path[, ...])

Mark a path to be cleared by this packager's manager post logging the packaged artifacts.

get_default_packing_artifact_type(obj)

Get the default artifact type for packing an object of this packager.

get_default_unpacking_artifact_type(data_item)

Get the default artifact type used for unpacking a data item holding an object of this packager.

get_future_clearing_path_list()

Get the packager's future clearing path list.

get_supported_artifact_types()

Get all the supported artifact types on this packager.

is_packable(obj[, artifact_type, configurations])

Check if this packager can pack an object of the provided type as the provided artifact type.

is_unpackable(data_item, type_hint[, ...])

Check if this packager can unpack an input according to the user given type hint and the provided artifact type.

pack(obj[, key, artifact_type, configurations])

Pack an object as the given artifact type using the provided configurations.

pack_object(obj, key[, pickle_module_name])

Pack a Python object by pickling it into a pkl file and storing it in an artifact.

pack_result(obj, key)

Pack an object as a result.

unpack(data_item[, artifact_type, instructions])

Unpack the data item's artifact by the provided type using the given instructions.

unpack_object(data_item[, ...])

Unpack the data item's object, unpickle it using the instructions, and return it.

__init__()

classmethod add_future_clearing_path(path: Union[str, Path], add_temp_paths_only: bool = True)

Mark a path to be cleared by this packager’s manager post logging the packaged artifacts.

Parameters:
  • path – The path to clear.

  • add_temp_paths_only – Whether to add only temporary files. When running locally on local files, DataItem.local() returns the given local path, which should not be deleted. This flag helps to avoid deleting files in that scenario.

classmethod get_default_packing_artifact_type(obj: Any) -> str

Get the default artifact type for packing an object of this packager.

Parameters:

obj – The object that is about to be packed.

Returns:

The default artifact type.

classmethod get_default_unpacking_artifact_type(data_item: DataItem) -> str

Get the default artifact type used for unpacking a data item holding an object of this packager. The method is used when a data item sent for unpacking is not a package but a simple URL or an old or manually logged artifact.

Parameters:

data_item – The data item that is about to be unpacked.

Returns:

The default artifact type.

classmethod get_future_clearing_path_list() -> List[str]

Get the packager’s future clearing path list.

Returns:

The clearing path list.

classmethod get_supported_artifact_types() -> List[str]

Get all the supported artifact types on this packager.

Returns:

A list of all the supported artifact types.

classmethod is_packable(obj: Any, artifact_type: Optional[str] = None, configurations: Optional[dict] = None) -> bool

Check if this packager can pack an object of the provided type as the provided artifact type.

The method is implemented to validate the object’s type and artifact type by checking whether the given object type matches the variable PACKABLE_OBJECT_TYPE with respect to the PACK_SUBCLASSES class variable. If it does, it checks whether the given artifact type is in the list returned from get_supported_artifact_types.

Parameters:
  • obj – The object to pack.

  • artifact_type – The artifact type to log the object as.

  • configurations – The log hint configurations passed by the user.

Returns:

True if packable and False otherwise.
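
The two validation steps can be sketched as a standalone function. This is a hedged illustration rather than MLRun’s code: is_packable_sketch and its parameters are hypothetical, and it assumes that an Ellipsis packable type means "any type" and that a None artifact type (meaning "use the default") skips the artifact type check:

```python
from typing import Any, List, Optional


def is_packable_sketch(
    obj: Any,
    packable_type: Any,
    pack_subclasses: bool,
    supported_artifact_types: List[str],
    artifact_type: Optional[str] = None,
) -> bool:
    # Object type validation.
    if packable_type is ...:
        # Assumption: Ellipsis stands for "any type", as in the default packager.
        type_ok = True
    elif pack_subclasses:
        # Subclasses are accepted as well.
        type_ok = isinstance(obj, packable_type)
    else:
        # Only the exact type is accepted.
        type_ok = type(obj) is packable_type
    if not type_ok:
        return False

    # Artifact type validation (assumption: None defers to the default type).
    return artifact_type is None or artifact_type in supported_artifact_types
```

With pack_subclasses set to False, a bool is rejected by a packager whose packable type is int, because the exact type differs even though bool is a subclass of int.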

classmethod is_unpackable(data_item: DataItem, type_hint: Type, artifact_type: Optional[str] = None) -> bool

Check if this packager can unpack an input according to the user given type hint and the provided artifact type.

The default implementation tries to match the packable object type of this packager to the given type hint. If it matches, it looks for the artifact type in the list returned from get_supported_artifact_types.

Parameters:
  • data_item – The input data item to check if unpackable.

  • type_hint – The type hint of the input to unpack (the object type to be unpacked).

  • artifact_type – The artifact type to unpack the object as.

Returns:

True if unpackable and False otherwise.

classmethod pack(obj: Any, key: Optional[str] = None, artifact_type: Optional[str] = None, configurations: Optional[dict] = None) -> Union[Tuple[Artifact, dict], dict]

Pack an object as the given artifact type using the provided configurations.

Parameters:
  • obj – The object to pack.

  • key – The key of the artifact.

  • artifact_type – Artifact type to log to MLRun. If passing None, the default artifact type will be used.

  • configurations – Log hints configurations to pass to the packing method.

Returns:

If the packed object is an artifact, a tuple of the packed artifact and unpacking instructions dictionary. If the packed object is a result, a dictionary containing the result key and value.

classmethod pack_object(obj: Any, key: str, pickle_module_name: str = 'cloudpickle') -> Tuple[Artifact, dict]

Pack a Python object by pickling it into a pkl file and storing it in an artifact.

Parameters:
  • obj – The object to pack and log.

  • key – The artifact’s key.

  • pickle_module_name – The pickle module name to use for serializing the object.

Returns:

The artifact and its pickling instructions.
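
As a rough illustration of the idea behind this method, the sketch below pickles an object into a pkl file with a module selected by name and returns instructions for unpickling. It is a simplified stand-in under stated assumptions: pack_object_sketch is a hypothetical function, it returns the file path instead of an Artifact, it defaults to the standard-library pickle module so it runs without extra dependencies, and the instruction keys are borrowed from the unpack_object parameters:

```python
import importlib
import os
import sys
import tempfile
from typing import Any, Tuple


def pack_object_sketch(
    obj: Any, key: str, pickle_module_name: str = "pickle"
) -> Tuple[str, dict]:
    # Resolve the pickle module by name (assumed to expose dump like the stdlib pickle).
    pickle_module = importlib.import_module(pickle_module_name)

    # Write the pickled object into a temporary directory, named after the key.
    pkl_path = os.path.join(tempfile.mkdtemp(), f"{key}.pkl")
    with open(pkl_path, "wb") as pkl_file:
        pickle_module.dump(obj, pkl_file)

    # Record what the unpacking side needs in order to read the file back.
    instructions = {
        "pickle_module_name": pickle_module_name,
        "python_version": ".".join(map(str, sys.version_info[:3])),
    }
    return pkl_path, instructions


path, instructions = pack_object_sketch({"a": 1}, key="my_object")
```

In a real packager, the temporary file would be a natural candidate for add_future_clearing_path so that it is removed after the artifact is logged.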

classmethod pack_result(obj: Any, key: str) -> dict

Pack an object as a result.

Parameters:
  • obj – The object to pack and log.

  • key – The result’s key.

Returns:

The result dictionary.

classmethod unpack(data_item: DataItem, artifact_type: Optional[str] = None, instructions: Optional[dict] = None) -> Any

Unpack the data item’s artifact by the provided type using the given instructions.

Parameters:
  • data_item – The data input to unpack.

  • artifact_type – The artifact type to unpack the data item as. If passing None, the default artifact type will be used.

  • instructions – Additional instructions noted in the package to pass to the unpacking method.

Returns:

The unpacked data item’s object.

Raises:

MLRunPackageUnpackingError – In case the packager could not unpack the data item.

classmethod unpack_object(data_item: DataItem, pickle_module_name: str = 'cloudpickle', object_module_name: Optional[str] = None, python_version: Optional[str] = None, pickle_module_version: Optional[str] = None, object_module_version: Optional[str] = None) -> Any

Unpack the data item’s object, unpickle it using the instructions, and return it.

Warnings about mismatching Python and module versions between the original pickling interpreter and this one may be issued.

Parameters:
  • data_item – The data item holding the pkl file.

  • pickle_module_name – Module to use for unpickling the object.

  • object_module_name – The original object’s module. Used to verify that the object module version in the current interpreter matches the version the object was pickled with, before unpickling the object.

  • python_version – The Python version in which the original object was pickled. Used to verify that the current interpreter’s Python version matches it before unpickling the object.

  • pickle_module_version – The pickle module version. Used to verify that the current interpreter’s pickle module version matches the one that pickled the object, before unpickling it.

  • object_module_version – The original object’s module version to match to the interpreter’s module version.

Returns:

The un-pickled python object.
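
The version checks described above can be sketched as follows. This is a simplified illustration, not MLRun’s implementation: unpack_object_sketch is a hypothetical function that reads a local file path instead of a DataItem, checks only the Python version, and uses the standard-library pickle module by default:

```python
import importlib
import os
import pickle
import sys
import tempfile
import warnings
from typing import Any, Optional


def unpack_object_sketch(
    pkl_path: str,
    pickle_module_name: str = "pickle",
    python_version: Optional[str] = None,
) -> Any:
    # Warn (rather than fail) when the pickling interpreter differs from this one.
    current = ".".join(map(str, sys.version_info[:3]))
    if python_version is not None and python_version != current:
        warnings.warn(
            f"Object was pickled with Python {python_version}, "
            f"but the current interpreter is {current}."
        )

    # Resolve the pickle module by name and load the object.
    pickle_module = importlib.import_module(pickle_module_name)
    with open(pkl_path, "rb") as pkl_file:
        return pickle_module.load(pkl_file)


# Prepare a pkl file to read back.
pkl_path = os.path.join(tempfile.mkdtemp(), "obj.pkl")
with open(pkl_path, "wb") as pkl_file:
    pickle.dump([1, 2, 3], pkl_file)

# A deliberately mismatching version triggers the warning but still unpickles.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    restored = unpack_object_sketch(pkl_path, python_version="0.0.0")
```

Because every instruction has a default value, the same function also works for files that were logged manually, without any instructions.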