mlrun.package.packager.Packager
- class mlrun.package.packager.Packager
Bases: ABC
The abstract base class for a packager. Packager has two main duties:
Packing - get an object that was returned from a function and log it to MLRun. The user can specify packing configurations to the packager using log hints. The packed object can be an artifact or a result.
Unpacking - get an mlrun.DataItem (an input to an MLRun function) and parse it to the desired hinted type. The packager uses the instructions it noted for itself when originally packing the object.
Custom Implementation (Inherit Packager)
The Packager has the following class variables and class methods to implement:
- PACKABLE_OBJECT_TYPE - A class variable to specify the object type this packager handles. Used for the is_packable and repr methods. An ellipsis (...) means any type.
- PRIORITY - The priority of this packager among the rest of the packagers. Valid values are integers from 1 to 10, where 1 is the highest priority and 10 is the lowest. If not set, a default priority of 5 is set for MLRun builtin packagers and 3 for user custom packagers.
- get_default_packing_artifact_type() - A class method to get the default artifact type for packing an object when it is not provided by the user.
- get_default_unpacking_artifact_type() - A class method to get the default artifact type for unpacking a data item when it does not represent a package, but a simple url or an old / manually logged artifact.
- get_supported_artifact_types() - A class method to get the supported artifact types this packager can pack an object as. Used for the is_packable and repr methods. To declare different artifact types for packing and unpacking, override the following two methods:
  - get_supported_packing_artifact_types() - Returns the artifact types available for packing. Defaults to delegating to get_supported_artifact_types. Override to declare pack-only artifact types.
  - get_supported_unpacking_artifact_types() - Returns the artifact types available for unpacking. Defaults to delegating to get_supported_artifact_types. Override to declare unpack-only artifact types.
- pack() - A class method to pack a returned object using the provided log hint configurations while noting for itself instructions for how to unpack it once needed (only relevant for packed artifacts, since results do not need unpacking).
- unpack() - A class method to unpack an MLRun DataItem, parsing it to its desired hinted type using the instructions noted while originally packing it.
- can_bundle() - A class method to check if the packager can be used as a bundle (a collection of packages) of the required type hint with the provided collection type.
- can_unbundle() - A class method to check if the packager can be used to unbundle itself into a collection of objects for packing each one of them separately.
The class methods is_packable and is_unpackable are implemented with the following basic logic:
- is_packable() - A class method to know whether to use this packager to pack an object by its type and artifact type. It compares the object's type with the PACKABLE_OBJECT_TYPE and checks that the artifact type is in the list of supported artifact types returned from get_supported_packing_artifact_types.
- is_unpackable() - A class method to know whether to use this packager to unpack a data item by the user-noted type hint and, optionally, the artifact type stored in the data item (in case it was packaged before). It matches the PACKABLE_OBJECT_TYPE to the given type hint (same logic as IDE matchups, meaning subclasses are considered unpackable) and checks that the artifact type is in the list of supported artifact types returned from get_supported_unpacking_artifact_types.
Preferably, each packager should handle a single type of object.
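The default is_packable logic described above can be sketched without MLRun installed. The block below is a minimal stand-in, not MLRun's source: SketchPackager and StrPackager are hypothetical names, the methods are written as plain instance methods, and accepting a missing artifact type is an assumption made only for this sketch.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class SketchPackager(ABC):
    """Hypothetical stand-in for the documented contract (not the MLRun class)."""

    # An ellipsis means "any type", mirroring the documented convention.
    PACKABLE_OBJECT_TYPE: Any = ...

    @abstractmethod
    def get_supported_artifact_types(self) -> list:
        """Return every artifact type this packager supports."""

    def get_supported_packing_artifact_types(self) -> list:
        # Documented default: delegate to get_supported_artifact_types.
        return self.get_supported_artifact_types()

    def is_packable(self, obj: Any, artifact_type: Optional[str] = None) -> bool:
        # Step 1: the object's type must equal PACKABLE_OBJECT_TYPE, unless it is `...`.
        if self.PACKABLE_OBJECT_TYPE is not ... and type(obj) is not self.PACKABLE_OBJECT_TYPE:
            return False
        # Assumption of this sketch: no requested artifact type is acceptable,
        # because a default artifact type can be chosen later.
        if artifact_type is None:
            return True
        # Step 2: the requested artifact type must be supported for packing.
        return artifact_type in self.get_supported_packing_artifact_types()


class StrPackager(SketchPackager):
    """Hypothetical packager that handles exactly one type: str."""

    PACKABLE_OBJECT_TYPE = str

    def get_supported_artifact_types(self) -> list:
        return ["result", "file"]


str_packager = StrPackager()
```

With this sketch, `str_packager.is_packable("hello", "file")` is accepted, while an `int` object or an unsupported artifact type is rejected.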
Bundles
A bundle means the type of object handled by this packager can be used to hold a collection of other objects - like a list or a dict of packages. A bundle can be sent as a list or dict in a function's run input so the packager manager will receive a list or dictionary of data items. A packager that supports bundles can initialize an object that will hold the unpacked data items later on, based on the type hint the user required. For example:

    def my_function(data: ExampleDict[str, MyType]):
        ...

    >>> mlrun_function = mlrun.code_to_function("my_code.py", kind="job")
    >>> run_object = mlrun_function.run(
    ...     handler="my_function",
    ...     inputs={
    ...         "data": {
    ...             "a": "store://my_item1_Artifact",
    ...             "b": "store://my_item2_Artifact",
    ...         }
    ...     },
    ... )

In the example above, after the packager manager uses the packager of MyType to unpack all sent items, it chooses the packager of ExampleDict and initializes it on the dictionary of unpacked items to create a bundle.
A packager can be a bundle if it implements the mandatory methods can_bundle(), can_unbundle(), and the methods:
- bundle() - Initialize a bundled object from a collection of objects.
- unbundle() - Unbundle a bundled object into a simple collection of objects (either a list or a dict) for packing them separately later.
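The four bundle-related methods can be illustrated with a small stand-in. This is a sketch under stated assumptions: ItemsDict and ItemsDictPackagerSketch are hypothetical names, the class does not inherit from the MLRun Packager, and the subscript part of a type hint (such as `[str, MyType]`) is ignored.

```python
from typing import Any, Union


class ItemsDict(dict):
    """Hypothetical dict-like type that a user could use as a bundle type hint."""


class ItemsDictPackagerSketch:
    """Hypothetical stand-in showing only the bundle-related methods."""

    PACKABLE_OBJECT_TYPE = ItemsDict

    def can_bundle(self, bundle_hint: type, collection_type: type) -> bool:
        # This sketch can only build its bundle from a dict of unpacked items.
        return bundle_hint is ItemsDict and collection_type is dict

    def can_unbundle(self, bundled_object: Any) -> bool:
        return isinstance(bundled_object, ItemsDict)

    def bundle(self, collection: Union[dict, list]) -> ItemsDict:
        if not isinstance(collection, dict):
            raise TypeError("This sketch bundles dictionaries only")
        return ItemsDict(collection)

    def unbundle(self, bundled_object: ItemsDict) -> dict:
        # Return a plain dict so each value can be packed separately.
        return dict(bundled_object)


bundle_packager = ItemsDictPackagerSketch()
bundled = bundle_packager.bundle({"a": 1, "b": 2})
unbundled = bundle_packager.unbundle(bundled)
```

Here `bundled` plays the role of the object handed to the user's function, and `unbundled` is the plain collection whose items would be packed one by one.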
Linking Artifacts (extra data and metrics (for models))
To link between packages (using the extra data or metrics spec attributes of an artifact), use the key as if it already exists and set its value to an ellipsis (...). The manager links all packages once it is done packing.
For example, given extra data keys in the log hint as extra_data, setting them to an artifact would be:

    artifact = Artifact(key="my_artifact")
    artifact.spec.extra_data = {key: ... for key in extra_data}
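The placeholder idea can be illustrated in isolation. How the manager performs the linking internally is not described here, so the following is only a sketch of the concept: link_placeholders is a hypothetical helper, and plain strings stand in for packed artifacts.

```python
def link_placeholders(extra_data_by_artifact: dict, packed: dict) -> dict:
    """Replace `...` placeholders with the packed object that has the same key.

    Hypothetical helper: keys without a packed counterpart keep their
    placeholder, and non-placeholder values are left untouched.
    """
    linked = {}
    for artifact_key, extra_data in extra_data_by_artifact.items():
        linked[artifact_key] = {
            key: (packed[key] if value is ... and key in packed else value)
            for key, value in extra_data.items()
        }
    return linked


# Strings stand in for the packages produced during packing.
packed_objects = {"plot": "store://plot", "report": "store://report"}
pending = {"model": {"plot": ..., "note": "keep", "missing": ...}}
linked = link_placeholders(pending, packed_objects)
```

After the call, `linked["model"]["plot"]` refers to the packed plot, while the unmatched `"missing"` key still holds the ellipsis.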
Clearing Outputs
Some of the packagers may produce files and temporary directories that should be deleted once the artifacts are logged. The packager can mark paths of files and directories to delete after logging using the class method add_future_clearing_path().
For example, in the following packager's pack method, you can write a text file, create an Artifact, and then mark the text file to be deleted once the artifact is logged:

    with open("./some_file.txt", "w") as file:
        file.write("Pack me")
    artifact = Artifact(key="my_artifact")
    self.add_future_clearing_path(path="./some_file.txt")
    return artifact, None
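The mark-then-clear flow can be tried end to end with the standard library alone. In this sketch ClearingSketch and clear_paths are hypothetical stand-ins: the first models only the future clearing path list, and the second is an assumed version of the manager-side deletion step, not MLRun's implementation.

```python
import os
import shutil
import tempfile


class ClearingSketch:
    """Hypothetical stand-in that models only the future clearing path list."""

    def __init__(self):
        self._future_clearing_path_list = []

    def add_future_clearing_path(self, path):
        # Store the path as a string so str and pathlib.Path are both accepted.
        self._future_clearing_path_list.append(str(path))

    @property
    def future_clearing_path_list(self):
        return self._future_clearing_path_list


def clear_paths(paths):
    """Assumed manager-side step: delete every marked file or directory."""
    for path in paths:
        if os.path.isdir(path):
            shutil.rmtree(path)
        elif os.path.exists(path):
            os.remove(path)


packager = ClearingSketch()
work_dir = tempfile.mkdtemp()
file_path = os.path.join(work_dir, "some_file.txt")
with open(file_path, "w") as file:
    file.write("Pack me")

# Mark both the file and its temporary directory for deletion.
packager.add_future_clearing_path(file_path)
packager.add_future_clearing_path(work_dir)

# Logging of the artifact would happen here, before anything is deleted.
clear_paths(packager.future_clearing_path_list)
```

Deleting only after logging matters because the artifact's content is read from the marked file while it is being logged.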
Attributes
PACKABLE_OBJECT_TYPE - The type of object this packager can pack and unpack.
PRIORITY - The priority of this packager in the packagers collection of the manager (lower is better).
future_clearing_path_list - Get the packager's future clearing path list.
priority - Get the packager's priority.
- PACKABLE_OBJECT_TYPE: type = Ellipsis
The type of object this packager can pack and unpack.
- PRIORITY: int = Ellipsis
The priority of this packager in the packagers collection of the manager (lower is better).
- future_clearing_path_list
Get the packager's future clearing path list.
- Returns:
The clearing path list.
- priority
Get the packager's priority.
- Returns:
The packager's priority.
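The priority rules stated on this page (integers from 1 to 10, lower is better, defaults of 5 for builtin packagers and 3 for custom ones) can be sketched as follows. resolve_priority and the candidate tuples are hypothetical; how the manager actually orders its packagers is not shown in this reference.

```python
def resolve_priority(declared, is_builtin: bool) -> int:
    """Return the effective priority for a packager (hypothetical helper)."""
    if declared is ...:
        # Documented defaults when PRIORITY is not set.
        return 5 if is_builtin else 3
    if not 1 <= declared <= 10:
        raise ValueError("Priority must be an integer from 1 to 10")
    return declared


# (name, declared PRIORITY, is an MLRun builtin packager)
candidates = [
    ("builtin_a", ..., True),
    ("custom_b", ..., False),
    ("custom_c", 1, False),
]

# Lower numbers come first, so sorting ascending puts the best match first.
ordered = sorted(candidates, key=lambda item: resolve_priority(item[1], item[2]))
ordered_names = [name for name, _, _ in ordered]
```

Because custom packagers default to 3, they are tried before builtin ones that keep the default of 5, unless a priority is set explicitly.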
Methods
__init__()
add_future_clearing_path(path) - Mark a path to be cleared by this packager's manager after logging the packaged artifacts.
bundle(collection) - Initialize a bundle object with the collection given using this packager.
can_bundle(bundle_hint, collection_type) - Check if the packager can be used to initialize a bundle (a collection of packages) of the required type hint with the provided collection type.
can_unbundle(bundled_object) - Check if the packager can unbundle a bundled object to a collection of the required type hint.
get_data_item_local_path(data_item[, ...]) - Get the local path to the item handled by the data item provided.
get_default_packing_artifact_type(obj) - Get the default artifact type used for packing.
get_default_unpacking_artifact_type(data_item) - Get the default artifact type used for unpacking a data item holding an object of this packager.
get_supported_artifact_types() - Get all the supported artifact types on this packager.
get_supported_packing_artifact_types() - Get the supported artifact types for packing on this packager.
get_supported_unpacking_artifact_types() - Get the supported artifact types for unpacking on this packager.
is_packable(obj[, artifact_type, configurations]) - Check if this packager can pack an object of the provided type as the provided artifact type.
is_unpackable(data_item, type_hint[, ...]) - Check if this packager can unpack an input according to the user-given type hint and the provided artifact type.
pack(obj[, key, artifact_type, configurations]) - Pack an object as the given artifact type using the provided configurations.
unbundle(bundled_object) - Unbundle the given object into a collection of objects (for packing each of them separately later).
unpack(data_item[, artifact_type, instructions]) - Unpack the data item's artifact by the provided type using the given instructions.
- add_future_clearing_path(path: str | Path)
Mark a path to be cleared by this packager's manager after logging the packaged artifacts.
- Parameters:
path -- The path to clear post logging the artifacts.
- bundle(collection: dict | list) -> Any
Initialize a bundle object with the collection given using this packager.
- Parameters:
collection -- The collection of objects to bundle.
- Returns:
The bundled object.
- Raises:
NotImplementedError -- In case the packager does not support bundling.
- abstractmethod can_bundle(bundle_hint: type, collection_type: type[dict] | type[list]) -> bool
Check if the packager can be used to initialize a bundle (a collection of packages) of the required type hint with the provided collection type.
- Parameters:
bundle_hint -- The bundle type hint to match against the PACKABLE_OBJECT_TYPE.
collection_type -- The collection type that will be used in the type hint's constructor.
- Returns:
True if it can be used as a bundle and False otherwise.
- abstractmethod can_unbundle(bundled_object: Any) -> bool
Check if the packager can unbundle a bundled object to a collection of the required type hint.
- Parameters:
bundled_object -- The bundled object to check if the packager can unbundle it.
- Returns:
True if it can be unbundled and False otherwise.
- get_data_item_local_path(data_item: DataItem, add_to_future_clearing_path: bool | None = None) -> str
Get the local path to the item handled by the data item provided. If the data item points to a local path, that path is returned as is. Otherwise, the item is downloaded to a temporary directory and the newly created temporary local path is returned.
- Parameters:
data_item -- The data item whose local path is requested.
add_to_future_clearing_path -- Whether to add the local path to the future clearing paths list. If None, the path is added only if the data item is not of kind 'file'. A data item of kind 'file' represents a local file that should not be deleted automatically after the run, whereas any other kind is downloaded to a temporary directory, and that temporary path should be deleted.
- Returns:
The data item local path.
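The rule for the add_to_future_clearing_path default can be written as a small decision function. should_clear_local_path is a hypothetical name used only for this sketch; it restates the parameter description above and is not taken from MLRun's source.

```python
from typing import Optional


def should_clear_local_path(kind: str, add_to_future_clearing_path: Optional[bool] = None) -> bool:
    """Decide whether a data item's local path should be marked for clearing."""
    if add_to_future_clearing_path is None:
        # Default rule: a 'file' data item is the user's own local file, so keep it.
        # Any other kind was downloaded to a temporary location, so clear it.
        return kind != "file"
    # An explicit True or False from the caller always wins.
    return add_to_future_clearing_path


keep_local = should_clear_local_path("file")
clear_download = should_clear_local_path("s3")
```

The kind value `"s3"` is only an example of a non-local kind chosen for this sketch.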
- abstractmethod get_default_packing_artifact_type(obj: Any) -> str
Get the default artifact type used for packing. The method is used when an object is sent for packing without an artifact type noted by the user.
- Parameters:
obj -- The object about to be packed.
- Returns:
The default artifact type.
- abstractmethod get_default_unpacking_artifact_type(data_item: DataItem) -> str
Get the default artifact type used for unpacking a data item holding an object of this packager. The method is used when a data item is sent for unpacking without it being a package, but is a simple url or an old / manually logged artifact.
- Parameters:
data_item -- The data item about to be unpacked.
- Returns:
The default artifact type.
- abstractmethod get_supported_artifact_types() -> list[str]
Get all the supported artifact types on this packager.
- Returns:
A list of all the supported artifact types.
- get_supported_packing_artifact_types() -> list[str]
Get the supported artifact types for packing on this packager.
By default, delegates to get_supported_artifact_types. Override this method to declare artifact types that are only available for packing (not unpacking).
- Returns:
A list of artifact types this packager can pack objects as.
- get_supported_unpacking_artifact_types() -> list[str]
Get the supported artifact types for unpacking on this packager.
By default, delegates to get_supported_artifact_types. Override this method to declare artifact types that are only available for unpacking (not packing).
- Returns:
A list of artifact types this packager can unpack data items as.
- is_packable(obj: Any, artifact_type: str | None = None, configurations: dict | None = None) -> bool
Check if this packager can pack an object of the provided type as the provided artifact type.
The default implementation checks if the packable object type of this packager is equal to the given object's type. If it matches, it looks for the artifact type in the list returned from get_supported_packing_artifact_types.
- Parameters:
obj -- The object to pack.
artifact_type -- The artifact type to log the object as.
configurations -- The log hint configurations passed by the user.
- Returns:
True if packable and False otherwise.
- is_unpackable(data_item: DataItem, type_hint: type, artifact_type: str | None = None) -> bool
Check if this packager can unpack an input according to the user-given type hint and the provided artifact type.
The default implementation tries to match the packable object type of this packager to the given type hint. If it matches, it looks for the artifact type in the list returned from get_supported_unpacking_artifact_types.
- Parameters:
data_item -- The input data item to check if unpackable.
type_hint -- The type hint of the input to unpack (the object type to be unpacked).
artifact_type -- The artifact type to unpack the object as.
- Returns:
True if unpackable and False otherwise.
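The difference from is_packable is that unpacking matches by type hint and accepts subclasses. The sketch below illustrates one reading of that rule and is not MLRun's implementation. It assumes the match direction is "the packager's type is the hinted type or a subclass of it" (as an IDE would accept a Derived value for a Base hint), it handles plain classes only, and it accepts a missing artifact type. All names are hypothetical.

```python
from typing import Any, Optional


class Base:
    """Hypothetical hinted type."""


class Derived(Base):
    """Hypothetical subclass handled by a packager."""


def is_unpackable_sketch(
    packable_type: Any,
    supported_types: list,
    type_hint: Any,
    artifact_type: Optional[str] = None,
) -> bool:
    # `...` means the packager handles any type, so no type check is needed.
    if packable_type is not ...:
        # Assumed direction: packable_type must be type_hint or one of its subclasses.
        if not (isinstance(type_hint, type) and issubclass(packable_type, type_hint)):
            return False
    # Assumption of this sketch: no stored artifact type is acceptable.
    if artifact_type is None:
        return True
    return artifact_type in supported_types


hint_matches_subclass = is_unpackable_sketch(Derived, ["object"], Base, "object")
```

Under these assumptions, a packager of Derived can serve an input hinted as Base, but a packager of Base is not chosen for an input hinted as Derived.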
- abstractmethod pack(obj: Any, key: str | None = None, artifact_type: str | None = None, configurations: dict | None = None) -> tuple[Artifact, dict] | dict
Pack an object as the given artifact type using the provided configurations.
- Parameters:
obj -- The object to pack.
key -- The key of the artifact.
artifact_type -- Artifact type to log to MLRun.
configurations -- Log hints configurations to pass to the packing method.
- Returns:
If the packed object is an artifact, a tuple of the packed artifact and unpacking instructions dictionary. If the packed object is a result, a dictionary containing the result key and value.
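The two return shapes of pack can be shown with a stand-in. This is a sketch only: ArtifactSketch replaces the MLRun Artifact class, pack_sketch is a hypothetical function rather than a Packager method, and the content of the instructions dictionary is invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ArtifactSketch:
    """Hypothetical stand-in for an MLRun artifact."""

    key: Optional[str]
    body: Any = None


def pack_sketch(obj: Any, key: Optional[str] = None, artifact_type: Optional[str] = None):
    if artifact_type == "result":
        # A result is returned as a dictionary of the result key and value.
        return {key: obj}
    # Anything else is returned as (artifact, unpacking instructions).
    artifact = ArtifactSketch(key=key, body=repr(obj))
    instructions = {"original_type": type(obj).__name__}  # illustrative content only
    return artifact, instructions


packed_result = pack_sketch(0.93, key="accuracy", artifact_type="result")
packed_artifact, unpack_instructions = pack_sketch([1, 2, 3], key="numbers", artifact_type="object")

# A caller can tell the two shapes apart by checking for a dictionary.
result_is_dict = isinstance(packed_result, dict)
```

Only the artifact shape carries instructions, which matches the note that results do not need unpacking.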
- unbundle(bundled_object: Any) -> dict | list
Unbundle the given object into a collection of objects (for packing each of them separately later).
- Returns:
The unbundled collection of objects - a list or dict.
- Raises:
NotImplementedError -- In case the packager does not support bundling.
- abstractmethod unpack(data_item: DataItem, artifact_type: str | None = None, instructions: dict | None = None) -> Any
Unpack the data item's artifact by the provided type using the given instructions.
- Parameters:
data_item -- The data input to unpack.
artifact_type -- The artifact type to unpack the data item as.
instructions -- Additional instructions noted in the package to pass to the unpacking method.
- Returns:
The unpacked data item's object.