mlrun.frameworks.lgbm

mlrun.frameworks.lgbm.apply_mlrun(model: Union[lightgbm.LGBMModel, lightgbm.Booster] = None, model_name: str = 'model', tag: str = '', model_path: str = None, modules_map: Union[Dict[str, Union[None, str, List[str]]], str] = None, custom_objects_map: Union[Dict[str, Union[str, List[str]]], str] = None, custom_objects_directory: str = None, context: mlrun.execution.MLClientCtx = None, model_format: str = 'pkl', artifacts: Union[List[mlrun.frameworks._ml_common.plan.MLPlan], List[str], Dict[str, dict]] = None, metrics: Union[List[mlrun.frameworks.sklearn.metric.Metric], List[Union[Tuple[Union[Callable, str], dict], Callable, str]], Dict[str, Union[Tuple[Union[Callable, str], dict], Callable, str]]] = None, x_test: Union[list, tuple, dict, numpy.ndarray, pandas.core.frame.DataFrame, pandas.core.series.Series, scipy.sparse.base.spmatrix, lightgbm.Dataset] = None, y_test: Union[list, tuple, dict, numpy.ndarray, pandas.core.frame.DataFrame, pandas.core.series.Series, scipy.sparse.base.spmatrix, lightgbm.Dataset] = None, sample_set: Union[list, tuple, dict, numpy.ndarray, pandas.core.frame.DataFrame, pandas.core.series.Series, scipy.sparse.base.spmatrix, lightgbm.Dataset, mlrun.datastore.base.DataItem, str] = None, y_columns: Union[List[str], List[int]] = None, feature_vector: str = None, feature_weights: List[float] = None, labels: Dict[str, Union[str, int, float]] = None, parameters: Dict[str, Union[str, int, float]] = None, extra_data: Dict[str, Union[str, bytes, mlrun.artifacts.base.Artifact, mlrun.datastore.base.DataItem]] = None, auto_log: bool = True, mlrun_logging_callback_kwargs: Dict[str, Any] = None, **kwargs) → Optional[mlrun.frameworks.lgbm.model_handler.LGBMModelHandler]

Apply MLRun’s interface on top of LightGBM by wrapping either the module itself or the given model, providing both with MLRun’s quality-of-life features.
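
For example, a minimal training sketch, assuming a scikit-learn style flow (the dataset, split, and names below are illustrative assumptions, not part of the API):

    import lightgbm as lgb
    import mlrun
    from mlrun.frameworks.lgbm import apply_mlrun
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    # Illustrative data preparation:
    x, y = load_breast_cancer(return_X_y=True, as_frame=True)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

    model = lgb.LGBMClassifier()

    # Wrap the model with MLRun's interface; from here on, fit / predict calls
    # are auto-logged (model artifact, default artifacts and metrics) to the context:
    apply_mlrun(
        model=model,
        model_name="my_model",
        context=mlrun.get_or_create_ctx("train"),
        x_test=x_test,
        y_test=y_test,
    )

    model.fit(x_train, y_train)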

Parameters
  • model – The model to wrap. It can also be loaded from the given model path.

  • model_name – The model name to use for storing the model artifact. Default: “model”.

  • tag – The model’s tag to log with.

  • model_path – The model’s store object path. Mandatory for evaluation (to know which model to update). If model is not provided, it will be loaded from this path.

  • modules_map

    A dictionary of all the modules required for loading the model. Each key is a path to a module and its value is the object name to import from it. All the modules will be imported globally. If multiple objects need to be imported from the same module, a list can be given. The map can also be passed as a path to a JSON file. For example:

    {
        "module1": None,  # import module1
        "module2": ["func1", "func2"],  # from module2 import func1, func2
        "module3.sub_module": "func3",  # from module3.sub_module import func3
    }
    

    If the model path given is of a store object, the modules map will be read from the logged modules map artifact of the model.

  • custom_objects_map

    A dictionary of all the custom objects required for loading the model. Each key is a path to a python file and its value is the custom object name to import from it. If multiple objects need to be imported from the same py file, a list can be given. The map can also be passed as a path to a JSON file. For example:

    {
        "/.../custom_model.py": "MyModel",
        "/.../custom_objects.py": ["object1", "object2"]
    }
    

    All the paths will be accessed from the given ‘custom_objects_directory’, meaning each py file will be read from ‘custom_objects_directory/<MAP KEY>’. If the model path given is of a store object, the custom objects map will be read from the logged custom objects map artifact of the model. Note: the custom objects will be imported in the order they appear in this dictionary (or JSON). If a custom object depends on another, make sure to put it below the one it relies on. A combined usage sketch of this parameter and ‘custom_objects_directory’ appears after this parameter list.

  • custom_objects_directory – Path to the directory with all the python files required for the custom objects. Can be passed as a zip file as well (will be extracted during the run before loading the model). If the model path given is of a store object, the custom objects files will be read from the logged custom object artifact of the model.

  • context – MLRun context to work with. If no context is given, it will be retrieved via ‘mlrun.get_or_create_ctx(None)’.

  • model_format – The format to use for saving and loading the model. Default: “pkl”.

  • artifacts – A list of artifacts plans to produce during the run.

  • metrics – A list of metrics to calculate during the run.

  • x_test – The validation data for producing and calculating artifacts and metrics post-training. Without this, validation will not be performed.

  • y_test – The ground truth of the test data, used for producing and calculating artifacts and metrics post-training or post predict / predict_proba calls.

  • sample_set – A sample set of inputs for the model, used for logging its stats along with the model for model monitoring.

  • y_columns – List of the names of all the columns in the ground truth labels in case it is a pd.DataFrame, or a list of integers in case the dataset is a np.ndarray. If not given but ‘y_train’ / ‘y_test’ is given, then the labels / indices in it will be used by default.

  • feature_vector – Feature store feature vector uri (store://feature-vectors/<project>/<name>[:tag])

  • feature_weights – List of feature weights, one per input column.

  • labels – Labels to log with the model.

  • parameters – Parameters to log with the model.

  • extra_data – Extra data to log with the model.

  • auto_log – Whether to apply MLRun’s auto logging on the model. Auto logging will add the default artifacts and metrics to the lists of artifacts and metrics. Default: True.

  • mlrun_logging_callback_kwargs – Keyword arguments for the MLRun callback. For further information see the documentation of the class ‘MLRunLoggingCallback’. Note that ‘context’ is already given here.
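
As referenced in the ‘custom_objects_map’ entry above, a sketch of how the map and ‘custom_objects_directory’ work together when loading a logged model (the store URI, file name, and class name are illustrative assumptions):

    from mlrun.frameworks.lgbm import apply_mlrun

    # './custom_objects/custom_model.py' is expected to define the class 'MyModel':
    model_handler = apply_mlrun(
        model_path="store://models/my-project/my-model:latest",
        custom_objects_map={"custom_model.py": "MyModel"},
        custom_objects_directory="./custom_objects",
    )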

Returns

If a model was provided via ‘model’ or ‘model_path’, the model handler initialized with the provided model will be returned. Otherwise, None.
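
For example, an evaluation sketch using the returned handler (the store URI is illustrative, and ‘x_test’ / ‘y_test’ are assumed to be prepared beforehand; predictions made with the wrapped model are then evaluated against ‘y_test’ and logged):

    # Load a previously logged model for evaluation:
    model_handler = apply_mlrun(
        model_path="store://models/my-project/my-model:latest",
        y_test=y_test,
    )
    y_pred = model_handler.model.predict(x_test)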