- mlrun.frameworks.sklearn.apply_mlrun(model: sklearn.base.BaseEstimator | sklearn.base.BiclusterMixin | sklearn.base.ClassifierMixin | sklearn.base.ClusterMixin | sklearn.base.DensityMixin | sklearn.base.RegressorMixin | sklearn.base.TransformerMixin = None, model_name: str = 'model', tag: str = '', model_path: str = None, modules_map: dict[str, Union[NoneType, str, list[str]]] | str = None, custom_objects_map: dict[str, Union[str, list[str]]] | str = None, custom_objects_directory: str = None, context: MLClientCtx = None, artifacts: list[mlrun.frameworks._ml_common.plan.MLPlan] | list[str] | dict[str, dict] = None, metrics: list[mlrun.frameworks.sklearn.metric.Metric] | list[Union[tuple[Union[Callable, str], dict], Callable, str]] | dict[str, Union[tuple[Union[Callable, str], dict], Callable, str]] = None, x_test: list | tuple | dict | ndarray | DataFrame | Series | scipy.sparse.base.spmatrix = None, y_test: list | tuple | dict | ndarray | DataFrame | Series | scipy.sparse.base.spmatrix = None, sample_set: list | tuple | dict | ndarray | DataFrame | Series | scipy.sparse.base.spmatrix | DataItem | str = None, y_columns: list[str] | list[int] = None, feature_vector: str = None, feature_weights: list[float] = None, labels: dict[str, Union[str, int, float]] = None, parameters: dict[str, Union[str, int, float]] = None, extra_data: dict[str, Union[str, bytes, mlrun.artifacts.base.Artifact, mlrun.datastore.base.DataItem]] = None, auto_log: bool = True, **kwargs) -> SKLearnModelHandler
Wrap the given model with MLRun's interface, providing it with MLRun's additional features.
- Parameters:
model -- The model to wrap. Can be loaded from the model path given as well.
model_name -- The model name to use for storing the model artifact. Default: "model".
tag -- The model's tag to log with.
model_path -- The model's store object path. Mandatory for evaluation (to know which model to update). If model is not provided, it will be loaded from this path.
modules_map --
A dictionary of all the modules required for loading the model. Each key is a path to a module and its value is the object name to import from it. All the modules will be imported globally. If multiple objects need to be imported from the same module, a list can be given. The map can also be passed as a path to a json file. For example:
{
    "module1": None,  # import module1
    "module2": ["func1", "func2"],  # from module2 import func1, func2
    "module3.sub_module": "func3",  # from module3.sub_module import func3
}
If the model path given is of a store object, the modules map will be read from the logged modules map artifact of the model.
custom_objects_map --
A dictionary of all the custom objects required for loading the model. Each key is a path to a python file and its value is the custom object name to import from it. If multiple objects need to be imported from the same py file, a list can be given. The map can also be passed as a path to a json file. For example:
{
    "/.../": "MyModel",
    "/.../": ["object1", "object2"],
}
All the paths will be accessed from the given 'custom_objects_directory', meaning each py file will be read from 'custom_objects_directory/<MAP KEY>'. If the model path given is of a store object, the custom objects map will be read from the logged custom object map artifact of the model. Notice: The custom objects will be imported in the order they appear in this dictionary (or json). If a custom object depends on another, make sure to put it below the one it relies on.
custom_objects_directory -- Path to the directory with all the python files required for the custom objects. Can be passed as a zip file as well (will be extracted during the run before loading the model). If the model path given is of a store object, the custom objects files will be read from the logged custom object artifact of the model.
context -- MLRun context to work with. If no context is given, it will be retrieved via 'mlrun.get_or_create_ctx(None)'.
artifacts -- A list of artifact plans to produce during the run.
metrics -- A list of metrics to calculate during the run.
x_test -- The validation data for producing and calculating artifacts and metrics post training. Without this, validation will not be performed.
y_test -- The test data ground truth for producing and calculating artifacts and metrics post training or post predict / predict_proba.
sample_set -- A sample set of inputs for the model, used to log its statistics alongside the model for model monitoring. If not given, 'x_train' will be used by default.
y_columns -- List of the names of all the columns in the ground truth labels in case it is a pd.DataFrame, or a list of integer indices in case the dataset is a np.ndarray. If not given but 'y_train' is given, the labels / indices in it will be used by default.
feature_vector -- Feature store feature vector uri (store://feature-vectors/<project>/<name>[:tag])
feature_weights -- List of feature weights, one per input column.
labels -- Labels to log with the model.
parameters -- Parameters to log with the model.
extra_data -- Extra data to log with the model.
auto_log -- Whether to apply MLRun's auto logging on the model. Auto logging will add the default artifacts and metrics to the lists of artifacts and metrics. Default: True.
- Returns:
The model handler initialized with the provided model.
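The modules_map semantics described above (None imports the module itself, a string imports one object, a list imports several, and the map may be given as a path to a json file) can be illustrated with a small standard-library sketch. This is not MLRun's implementation: import_modules_map is a hypothetical helper written for this example, the modules in the sample map (json, math, os.path) are stand-ins for whatever a model actually needs, and the names are bound into a plain dictionary rather than imported globally.

```python
import importlib
import json
import tempfile
from pathlib import Path


def import_modules_map(modules_map, namespace):
    """Hypothetical helper (not part of MLRun): bind the names a modules map asks for."""
    # The map may be given as a path to a json file instead of a dictionary.
    if isinstance(modules_map, str):
        modules_map = json.loads(Path(modules_map).read_text())

    for module_path, names in modules_map.items():
        module = importlib.import_module(module_path)
        if names is None:
            # None means "import <module_path>", which binds the top-level package name.
            top_level = module_path.split(".")[0]
            namespace[top_level] = importlib.import_module(top_level)
            continue
        if isinstance(names, str):
            # A single string means "from <module_path> import <name>".
            names = [names]
        for name in names:
            namespace[name] = getattr(module, name)
    return namespace


# Stand-in map that mirrors the three forms shown in the parameter description.
modules_map = {
    "json": None,               # import json
    "math": ["sqrt", "floor"],  # from math import sqrt, floor
    "os.path": "join",          # from os.path import join
}

namespace = import_modules_map(modules_map, {})

# The same map stored as a json file: None is written as null and read back as None.
with tempfile.TemporaryDirectory() as tmp_dir:
    map_file = Path(tmp_dir) / "modules_map.json"
    map_file.write_text(json.dumps(modules_map))
    from_file = import_modules_map(str(map_file), {})
```

Passing globals() as the namespace would be the closest analogue to the global import the description mentions; a separate dictionary is used here only to keep the sketch side-effect free.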