mlrun.frameworks.xgboost
- mlrun.frameworks.xgboost.apply_mlrun(model: xgboost.XGBModel = None, model_name: str = 'model', tag: str = '', model_path: str = None, modules_map: Union[Dict[str, Union[None, str, List[str]]], str] = None, custom_objects_map: Union[Dict[str, Union[str, List[str]]], str] = None, custom_objects_directory: str = None, context: MLClientCtx = None, artifacts: Union[List[MLPlan], List[str], Dict[str, dict]] = None, metrics: Union[List[Metric], List[Union[Tuple[Union[Callable, str], dict], Callable, str]], Dict[str, Union[Tuple[Union[Callable, str], dict], Callable, str]]] = None, x_test: Union[list, tuple, dict, ndarray, DataFrame, Series, scipy.sparse.base.spmatrix, xgboost.DMatrix] = None, y_test: Union[list, tuple, dict, ndarray, DataFrame, Series, scipy.sparse.base.spmatrix, xgboost.DMatrix] = None, sample_set: Union[list, tuple, dict, ndarray, DataFrame, Series, scipy.sparse.base.spmatrix, xgboost.DMatrix, DataItem, str] = None, y_columns: Union[List[str], List[int]] = None, feature_vector: str = None, feature_weights: List[float] = None, labels: Dict[str, Union[str, int, float]] = None, parameters: Dict[str, Union[str, int, float]] = None, extra_data: Dict[str, Union[str, bytes, Artifact, DataItem]] = None, auto_log: bool = True, **kwargs) -> XGBoostModelHandler
Wrap the given model with MLRun’s interface, providing it with MLRun’s additional features.
- Parameters
model – The model to wrap. Can be loaded from the model path given as well.
model_name – The model name to use for storing the model artifact. Default: “model”.
tag – The model’s tag to log with.
model_path – The model’s store object path. Mandatory for evaluation (to know which model to update). If model is not provided, it will be loaded from this path.
modules_map –
A dictionary of all the modules required for loading the model. Each key is a path to a module and its value is the object name to import from it. All the modules will be imported globally. If multiple objects need to be imported from the same module, a list can be given. The map can also be passed as a path to a JSON file. For example:
{
    "module1": None,  # import module1
    "module2": ["func1", "func2"],  # from module2 import func1, func2
    "module3.sub_module": "func3",  # from module3.sub_module import func3
}
If the model path given is of a store object, the modules map will be read from the logged modules map artifact of the model.
custom_objects_map –
A dictionary of all the custom objects required for loading the model. Each key is a path to a Python file and its value is the custom object name to import from it. If multiple objects need to be imported from the same Python file, a list can be given. The map can also be passed as a path to a JSON file. For example:
{
    "/.../custom_model.py": "MyModel",
    "/.../custom_objects.py": ["object1", "object2"]
}
All the paths will be accessed from the given ‘custom_objects_directory’, meaning each Python file will be read from ‘custom_objects_directory/<MAP KEY>’. If the model path given is of a store object, the custom objects map will be read from the logged custom object map artifact of the model. Note: The custom objects will be imported in the order they appear in this dictionary (or JSON). If a custom object depends on another, make sure to put it below the one it relies on.
custom_objects_directory – Path to the directory with all the python files required for the custom objects. Can be passed as a zip file as well (will be extracted during the run before loading the model). If the model path given is of a store object, the custom objects files will be read from the logged custom object artifact of the model.
context – MLRun context to work with. If no context is given, it will be retrieved via ‘mlrun.get_or_create_ctx(None)’.
artifacts – A list of artifact plans to produce during the run.
metrics – A list of metrics to calculate during the run.
x_test – The validation data for producing and calculating artifacts and metrics post training. Without this, validation will not be performed.
y_test – The test data ground truth for producing and calculating artifacts and metrics post training or post predict / predict_proba.
sample_set – A sample set of inputs for the model, used to log its statistics alongside the model for model monitoring.
y_columns – List of names of all the columns in the ground truth labels in case it’s a pd.DataFrame, or a list of integers in case the dataset is a np.ndarray. If not given but ‘y_train’ / ‘y_test’ is given, then the labels / indices in it will be used by default.
feature_vector – Feature store feature vector URI (store://feature-vectors/<project>/<name>[:tag]).
feature_weights – List of feature weights, one per input column.
labels – Labels to log with the model.
parameters – Parameters to log with the model.
extra_data – Extra data to log with the model.
auto_log – Whether to apply MLRun’s auto logging on the model. Auto logging will add the default artifacts and metrics to the lists of artifacts and metrics. Default: True.
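To make the shape of the modules_map parameter described above more concrete, the following is a minimal sketch of how such a map could be resolved into imports, and how it could be written to a JSON file and read back. It is not MLRun's implementation: the helper resolve_modules_map is hypothetical, it returns the imported objects in a dictionary instead of importing them globally, and standard-library modules stand in for real model dependencies.

```python
import importlib
import json
import os
import tempfile
from typing import Dict, List, Union

ModulesMap = Dict[str, Union[None, str, List[str]]]


def resolve_modules_map(modules_map: ModulesMap) -> Dict[str, object]:
    """Hypothetical helper: resolve a modules map into imported objects.

    Illustrates the documented semantics only; MLRun's own loader may differ.
    """
    resolved: Dict[str, object] = {}
    for module_path, names in modules_map.items():
        module = importlib.import_module(module_path)
        if names is None:
            # None means "import <module_path>".
            resolved[module_path] = module
            continue
        if isinstance(names, str):
            # A single string means "from <module_path> import <name>".
            names = [names]
        for name in names:
            # A list means "from <module_path> import <name1>, <name2>, ...".
            resolved[name] = getattr(module, name)
    return resolved


# Standard-library modules are used here so the sketch runs anywhere.
modules_map: ModulesMap = {
    "json": None,                     # import json
    "os.path": ["join", "basename"],  # from os.path import join, basename
    "math": "sqrt",                   # from math import sqrt
}

# The map can also live in a JSON file; None is stored as JSON null.
with tempfile.TemporaryDirectory() as tmp_dir:
    map_path = os.path.join(tmp_dir, "modules_map.json")
    with open(map_path, "w") as map_file:
        json.dump(modules_map, map_file)
    with open(map_path) as map_file:
        loaded_map = json.load(map_file)

resolved = resolve_modules_map(loaded_map)
```

A JSON file with this layout is the kind of file whose path could be passed as modules_map in place of the dictionary itself.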
- Returns
The model handler initialized with the provided model.
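As an illustration of the call described above, here is a minimal usage sketch. It assumes that mlrun, xgboost and scikit-learn are installed and that an MLRun environment is available for the default context. The function name train_with_mlrun, the model name "iris_xgb" and the Iris dataset are illustrative choices, not part of the API.

```python
def train_with_mlrun():
    """Illustrative sketch: wrap an XGBoost model with MLRun before training."""
    # Third-party imports are kept inside the function so that defining it
    # does not require mlrun, xgboost, or scikit-learn to be installed.
    import xgboost as xgb
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    from mlrun.frameworks.xgboost import apply_mlrun

    # Small example dataset, split into training and test parts.
    x, y = load_iris(return_X_y=True)
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.2, random_state=0
    )

    model = xgb.XGBClassifier()

    # Wrap the model. With auto_log left at its default (True), the default
    # artifacts and metrics are added; x_test / y_test enable validation.
    handler = apply_mlrun(
        model=model,
        model_name="iris_xgb",  # hypothetical artifact name
        x_test=x_test,
        y_test=y_test,
    )

    # Train as usual; the wrapped model is expected to log through MLRun.
    model.fit(x_train, y_train)
    return handler
```

Because no context is passed, the call relies on the documented fallback of retrieving one via mlrun.get_or_create_ctx. Inside an MLRun job handler, the job's own context would typically be passed through the context parameter instead.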