mlrun.package.packager.Packager

class mlrun.package.packager.Packager

Bases: ABC

The abstract base class for a packager. Packager has two main duties:

  1. Packing - get an object that was returned from a function and log it to MLRun. The user can specify packing configurations to the packager using log hints. The packed object can be an artifact or a result.

  2. Unpacking - get an mlrun.DataItem (an input to an MLRun function) and parse it to the desired hinted type. The packager uses the instructions it noted for itself when originally packing the object.

Custom Implementation (Inherit Packager)

The Packager has one class variable and five class methods that must be implemented, plus an optional PRIORITY class variable:

  • PACKABLE_OBJECT_TYPE - A class variable to specify the object type this packager handles. Used for the is_packable and repr methods. An ellipsis (...) means any type.

  • PRIORITY - The priority of this packager among the rest of the packagers. Valid values are integers from 1 to 10, where 1 is the highest priority and 10 is the lowest. If not set, a default priority of 5 is set for MLRun builtin packagers and 3 for user custom packagers.

  • get_default_packing_artifact_type() - A class method to get the default artifact type for packing an object when it is not provided by the user.

  • get_default_unpacking_artifact_type() - A class method to get the default artifact type for unpacking a data item when it does not represent a package, but a simple URL or an old / manually logged artifact.

  • get_supported_artifact_types() - A class method to get the supported artifact types this packager can pack an object as. Used for the is_packable and repr methods.

  • pack() - A class method to pack a returned object using the provided log hint configurations while noting for itself instructions on how to unpack it once needed (only relevant for packed artifacts, since results do not need unpacking).

  • unpack() - A class method to unpack an MLRun DataItem, parsing it to its desired hinted type using the instructions noted while originally packing it.

The class methods is_packable and is_unpackable are implemented with the following basic logic:

  • is_packable() - a class method to know whether to use this packager to pack an object by its type and artifact type. It compares the object's type with the PACKABLE_OBJECT_TYPE and checks that the artifact type is in the returned supported artifacts list from get_supported_artifact_types.

  • is_unpackable() - a class method to know whether to use this packager to unpack a data item, given the user-noted type hint and, optionally, the artifact type stored in the data item (in case it was packaged before). It matches the PACKABLE_OBJECT_TYPE to the given type hint (the same logic an IDE uses for type matching, meaning subclasses are considered unpackable) and checks that the artifact type is in the list returned from get_supported_artifact_types.

Preferably, each packager should handle a single type of object.
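
The matching rules above can be sketched with a small stand-in class. This is not MLRun's implementation: SketchPackager and IntPackager are hypothetical names, the class does not inherit from Packager, and treating a missing artifact type as "use the default" is an assumption. It only illustrates the difference between the exact type comparison used for packing and the subclass-aware comparison used for unpacking.

```python
from typing import Any, List, Optional


class SketchPackager:
    """Hypothetical stand-in that mimics the matching rules described above."""

    PACKABLE_OBJECT_TYPE: Any = ...  # an ellipsis means "any type"

    @classmethod
    def get_supported_artifact_types(cls) -> List[str]:
        return ["result"]

    @classmethod
    def _artifact_type_ok(cls, artifact_type: Optional[str]) -> bool:
        # Assumption: no artifact type means the default one will be used.
        return artifact_type is None or artifact_type in cls.get_supported_artifact_types()

    @classmethod
    def is_packable(cls, obj: Any, artifact_type: Optional[str] = None) -> bool:
        # Packing compares the object's exact type with PACKABLE_OBJECT_TYPE.
        if cls.PACKABLE_OBJECT_TYPE is not ... and type(obj) is not cls.PACKABLE_OBJECT_TYPE:
            return False
        return cls._artifact_type_ok(artifact_type)

    @classmethod
    def is_unpackable(cls, type_hint: Any, artifact_type: Optional[str] = None) -> bool:
        # Unpacking also accepts subclasses of PACKABLE_OBJECT_TYPE.
        if cls.PACKABLE_OBJECT_TYPE is not ...:
            if not isinstance(type_hint, type):
                return False  # this sketch only handles plain classes as hints
            if not issubclass(type_hint, cls.PACKABLE_OBJECT_TYPE):
                return False
        return cls._artifact_type_ok(artifact_type)


class IntPackager(SketchPackager):
    PACKABLE_OBJECT_TYPE = int


packs_int = IntPackager.is_packable(5, "result")
packs_bool = IntPackager.is_packable(True, "result")  # bool is not exactly int
unpacks_bool = IntPackager.is_unpackable(bool, "result")  # bool subclasses int
```

In this sketch a bool is rejected for packing but accepted as an unpacking type hint, which is the asymmetry the two bullet points describe.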

Linking Artifacts (extra data)

To link between packages (using the extra data or metrics spec attributes of an artifact), use the key as if it exists and set its value to an ellipsis (...). The manager links all packages once it is done packing.

For example, given the extra data keys from the log hint in a variable named extra_data, setting them on an artifact looks like this:

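from mlrun.artifacts import Artifact

# Every key gets an ellipsis placeholder; the manager links it after packing.
artifact = Artifact(key="my_artifact")
artifact.spec.extra_data = {key: ... for key in extra_data}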
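
To picture what "linking once packing is done" could mean, here is a minimal sketch of resolving ellipsis placeholders. The function link_extra_data and the sample keys and values are hypothetical; the way the MLRun manager actually performs the linking is not described here, so treat this only as an illustration of the placeholder idea.

```python
from typing import Any, Dict


def link_extra_data(extra_data: Dict[str, Any], logged: Dict[str, Any]) -> Dict[str, Any]:
    """Replace ellipsis placeholders with the objects logged under the same key."""
    linked = {}
    for key, value in extra_data.items():
        if value is ... and key in logged:
            linked[key] = logged[key]  # placeholder resolved
        else:
            linked[key] = value  # explicit value, or nothing logged under this key
    return linked


# Placeholders written while packing (same pattern as the snippet above).
pending = {key: ... for key in ["plot", "stats"]}

# Hypothetical packages that were logged by the end of the run.
logged_packages = {"plot": "plot-artifact"}

linked = link_extra_data(pending, logged_packages)
```

In this sketch a key with no matching package keeps its placeholder, so an unresolved link stays visible instead of being silently dropped.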

Clearing Outputs

Some of the packagers may produce files and temporary directories that should be deleted once the artifacts are logged. The packager can mark paths of files and directories to delete after logging using the class method add_future_clearing_path().

For example, in the following packager's pack method, you can write a text file, create an Artifact, and then mark the text file to be deleted once the artifact is logged:

with open("./some_file.txt", "w") as file:
    file.write("Pack me")
artifact = Artifact(key="my_artifact")
self.add_future_clearing_path(path="./some_file.txt")
return artifact, None
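
The bookkeeping behind this can be pictured with a small stand-in. ClearingSketch and its clear() method are hypothetical and are not part of MLRun; the sketch only assumes that marked paths are collected in a list and removed later, after logging.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import List, Union


class ClearingSketch:
    """Hypothetical stand-in for the 'mark now, delete later' pattern."""

    def __init__(self) -> None:
        self.future_clearing_path_list: List[str] = []

    def add_future_clearing_path(self, path: Union[str, Path]) -> None:
        # Only record the path; nothing is deleted at this point.
        self.future_clearing_path_list.append(str(path))

    def clear(self) -> None:
        # Assumed to run after logging: remove every marked file or directory.
        for path in self.future_clearing_path_list:
            if os.path.isdir(path):
                shutil.rmtree(path)
            elif os.path.exists(path):
                os.remove(path)
        self.future_clearing_path_list = []


work_dir = tempfile.mkdtemp()
text_file = os.path.join(work_dir, "some_file.txt")
with open(text_file, "w") as file:
    file.write("Pack me")

sketch = ClearingSketch()
sketch.add_future_clearing_path(path=text_file)
sketch.add_future_clearing_path(path=work_dir)

existed_before = os.path.exists(text_file)  # still on disk while "logging"
sketch.clear()
```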

Attributes

PACKABLE_OBJECT_TYPE

The type of object this packager can pack and unpack.

PRIORITY

The priority of this packager in the packagers collection of the manager (lower is better).

future_clearing_path_list

Get the packager's future clearing path list.

priority

Get the packager's priority.

PACKABLE_OBJECT_TYPE: type = Ellipsis

The type of object this packager can pack and unpack.

PRIORITY: int = Ellipsis

The priority of this packager in the packagers collection of the manager (lower is better).

future_clearing_path_list

Get the packager's future clearing path list.

Returns:

The clearing path list.

priority

Get the packager's priority.

Returns:

The packager's priority.
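
The priority rules stated for PRIORITY (values from 1 to 10, lower is better, defaults of 5 for builtin and 3 for custom packagers when not set) can be sketched as follows. The function effective_priority, the two constants, and the sample packager names are hypothetical; how the manager applies these rules internally is an assumption.

```python
from typing import Any, List, Tuple

# Defaults taken from the PRIORITY description.
BUILTIN_DEFAULT_PRIORITY = 5
CUSTOM_DEFAULT_PRIORITY = 3


def effective_priority(priority: Any, is_builtin: bool) -> int:
    """Resolve an unset (Ellipsis) priority to its default and validate the range."""
    if priority is ...:
        return BUILTIN_DEFAULT_PRIORITY if is_builtin else CUSTOM_DEFAULT_PRIORITY
    if not 1 <= priority <= 10:
        raise ValueError("priority must be an integer from 1 to 10")
    return priority


# (name, PRIORITY, is_builtin) for a few hypothetical packagers.
packagers: List[Tuple[str, Any, bool]] = [
    ("builtin_default", ..., True),
    ("custom_default", ..., False),
    ("custom_urgent", 1, False),
    ("builtin_late", 10, True),
]

# Lower numbers come first, so they would be tried first.
ordered = [
    name
    for name, priority, is_builtin in sorted(
        packagers, key=lambda item: effective_priority(item[1], item[2])
    )
]
```

With these defaults, a custom packager that does not set PRIORITY is ordered ahead of a builtin packager that does not set it either.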

Methods

__init__()

add_future_clearing_path(path)

Mark a path to be cleared by this packager's manager after logging the packaged artifacts.

get_data_item_local_path(data_item[, ...])

Get the local path to the item handled by the data item provided.

get_default_packing_artifact_type(obj)

Get the default artifact type used for packing.

get_default_unpacking_artifact_type(data_item)

Get the default artifact type used for unpacking a data item holding an object of this packager.

get_supported_artifact_types()

Get all the supported artifact types on this packager.

is_packable(obj[, artifact_type, configurations])

Check if this packager can pack an object of the provided type as the provided artifact type.

is_unpackable(data_item, type_hint[, ...])

Check if this packager can unpack an input according to the user-given type hint and the provided artifact type.

pack(obj[, key, artifact_type, configurations])

Pack an object as the given artifact type using the provided configurations.

unpack(data_item[, artifact_type, instructions])

Unpack the data item's artifact by the provided type using the given instructions.

__init__()

add_future_clearing_path(path: str | Path)

Mark a path to be cleared by this packager's manager after logging the packaged artifacts.

Parameters:

path -- The path to clear post logging the artifacts.

get_data_item_local_path(data_item: DataItem, add_to_future_clearing_path: bool | None = None) -> str

Get the local path to the item handled by the data item provided. If the data item points to a local path, that path is returned as is. Otherwise, the item is downloaded to a temporary directory and the newly created temporary local path is returned.

Parameters:
  • data_item -- The data item whose local path is requested.

  • add_to_future_clearing_path -- Whether to add the local path to the future clearing paths list. If None, the path is added only if the data item is not of kind 'file'. A data item of kind 'file' represents a local file, which should not be deleted automatically after the run. Any other kind is downloaded to a temporary directory, and that temporary local path is the one to delete.

Returns:

The data item local path.
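
The default behavior of add_to_future_clearing_path reduces to a small decision, sketched below. The function should_clear_local_path is hypothetical and takes the data item kind as a plain string; the kind "s3" is only used as an example of a non-local kind.

```python
from typing import Optional


def should_clear_local_path(kind: str, add_to_future_clearing_path: Optional[bool] = None) -> bool:
    """Decide whether a data item's local path should be marked for clearing."""
    if add_to_future_clearing_path is None:
        # Default: only downloaded (temporary) copies are cleared, so a data
        # item of kind 'file' keeps its local file.
        return kind != "file"
    # An explicit True or False always wins over the default.
    return add_to_future_clearing_path


keep_local_file = should_clear_local_path("file")
clear_downloaded_copy = should_clear_local_path("s3")
```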

abstract get_default_packing_artifact_type(obj: Any) -> str

Get the default artifact type used for packing. The method is used when an object is sent for packing without an artifact type noted by the user.

Parameters:

obj -- The object about to be packed.

Returns:

The default artifact type.

abstract get_default_unpacking_artifact_type(data_item: DataItem) -> str

Get the default artifact type used for unpacking a data item holding an object of this packager. The method is used when a data item sent for unpacking is not a package, but a simple URL or an old / manually logged artifact.

Parameters:

data_item -- The data item about to be unpacked.

Returns:

The default artifact type.

abstract get_supported_artifact_types() -> list[str]

Get all the supported artifact types on this packager.

Returns:

A list of all the supported artifact types.

is_packable(obj: Any, artifact_type: str | None = None, configurations: dict | None = None) -> bool

Check if this packager can pack an object of the provided type as the provided artifact type.

The default implementation checks if the packable object type of this packager is equal to the given object's type. If it matches, it looks for the artifact type in the list returned from get_supported_artifact_types.

Parameters:
  • obj -- The object to pack.

  • artifact_type -- The artifact type to log the object as.

  • configurations -- The log hint configurations passed by the user.

Returns:

True if packable and False otherwise.

is_unpackable(data_item: DataItem, type_hint: type, artifact_type: str | None = None) -> bool

Check if this packager can unpack an input according to the user-given type hint and the provided artifact type.

The default implementation tries to match the packable object type of this packager to the given type hint. If it matches, it looks for the artifact type in the list returned from get_supported_artifact_types.

Parameters:
  • data_item -- The input data item to check if unpackable.

  • type_hint -- The type hint of the input to unpack (the object type to be unpacked).

  • artifact_type -- The artifact type to unpack the object as.

Returns:

True if unpackable and False otherwise.

abstract pack(obj: Any, key: str | None = None, artifact_type: str | None = None, configurations: dict | None = None) -> tuple[mlrun.artifacts.base.Artifact, dict] | dict

Pack an object as the given artifact type using the provided configurations.

Parameters:
  • obj -- The object to pack.

  • key -- The key of the artifact.

  • artifact_type -- Artifact type to log to MLRun.

  • configurations -- Log hints configurations to pass to the packing method.

Returns:

If the packed object is an artifact, a tuple of the packed artifact and unpacking instructions dictionary. If the packed object is a result, a dictionary containing the result key and value.
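
A caller can tell the two return shapes apart by checking for a tuple, as in the minimal sketch below. The function split_pack_output is hypothetical, the artifact is represented by a plain string so the example stays self-contained, and normalizing missing instructions to an empty dictionary is an assumption.

```python
from typing import Any, Dict, Optional, Tuple, Union

PackOutput = Union[Tuple[Any, Optional[dict]], Dict[str, Any]]


def split_pack_output(packed: PackOutput) -> Tuple[str, Any, dict]:
    """Classify a pack() return value as an artifact package or a result."""
    if isinstance(packed, tuple):
        # Artifact: the packed artifact plus its unpacking instructions.
        artifact, instructions = packed
        return "artifact", artifact, instructions or {}
    # Result: a dictionary of result key and value, with nothing to unpack later.
    return "result", packed, {}


as_artifact = split_pack_output(("my_artifact_object", None))
as_result = split_pack_output({"accuracy": 0.9})
```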

abstract unpack(data_item: DataItem, artifact_type: str | None = None, instructions: dict | None = None) -> Any

Unpack the data item's artifact by the provided type using the given instructions.

Parameters:
  • data_item -- The data input to unpack.

  • artifact_type -- The artifact type to unpack the data item as.

  • instructions -- Additional instructions noted in the package to pass to the unpacking method.

Returns:

The unpacked data item's object.