class mlrun.datastore.DataItem(key: str, store: mlrun.datastore.base.DataStore, subpath: str, url: str = '', meta=None, artifact_url=None)[source]

Bases: object

Data input/output class abstracting access to various local/remote data sources

DataItem objects are passed into functions and can be used inside the function. When a function run completes, users can access the run data via run.artifact(key), which returns a DataItem object. Users can also convert a data URL (e.g. s3://bucket/key.csv) to a DataItem using mlrun.get_dataitem(url).


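import mlrun
from mlrun.datastore import DataItem

# using data item inside a function
def my_func(context, data: DataItem):
    df = data.as_df()

# reading run results using DataItem (run.artifact())
# train_func is a placeholder name for an MLRun function object,
# dataset is a placeholder for the input data url
train_run = train_func.run(inputs={'dataset': dataset},
                           params={'label_column': 'label'})

test_set = train_run.artifact('test_set').as_df()

# create and use DataItem from uri
data = mlrun.get_dataitem('http://xyz/data.json').get()

property artifact_url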

DataItem artifact url (when it is an artifact) or url for simple dataitems

as_df(columns=None, df_module=None, format='', **kwargs)[source]

return a dataframe object (generated from the dataitem).

  • columns – optional, list of columns to select

  • df_module – optional, dataframe class (e.g. pd, dd, cudf, ..)

  • format – file format, if not specified it will be deduced from the suffix
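As a minimal sketch of how these parameters combine, the helper below loads only selected columns and passes an explicit format for cases where the suffix is missing or misleading. The helper name `load_columns` and the example URL are illustrative assumptions; only `as_df` and its `columns` and `format` parameters come from the signature above.

```python
def load_columns(item, columns, file_format=""):
    """Return a dataframe holding only `columns` from a DataItem-like object.

    `item` is assumed to expose as_df(columns=..., format=...) as documented
    above; `load_columns` is a hypothetical helper, not part of MLRun.
    """
    # An empty format string leaves format detection to the file suffix.
    return item.as_df(columns=columns, format=file_format)


# Usage sketch (assumes mlrun is installed and the URL is reachable):
#   import mlrun
#   item = mlrun.get_dataitem("s3://bucket/key.csv")
#   df = load_columns(item, ["label"], file_format="csv")
```

Taking the DataItem as an argument keeps the helper independent of where the item came from (a function input, `run.artifact(key)`, or `mlrun.get_dataitem(url)`).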


download to the target dir/path


  • target_path – local target path for the downloaded item
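The sketch below shows one way the download described above might be used to copy an item into a local directory. It assumes the method is exposed as `download(target_path)`, matching the parameter name listed here; `fetch_to_dir` and its arguments are hypothetical names introduced for illustration.

```python
import os


def fetch_to_dir(item, target_dir, filename):
    """Download a DataItem-like object into `target_dir` and return the path.

    Assumes `item.download(target_path)` exists as described above;
    `fetch_to_dir` is a hypothetical helper, not part of MLRun.
    """
    # Create the directory first so the download has somewhere to land.
    os.makedirs(target_dir, exist_ok=True)
    target_path = os.path.join(target_dir, filename)
    item.download(target_path)
    return target_path


# Usage sketch (assumes mlrun is installed and the URL is reachable):
#   import mlrun
#   path = fetch_to_dir(mlrun.get_dataitem("s3://bucket/key.csv"), "/tmp/data", "key.csv")
```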

get(size=None, offset=0, encoding=None)[source]

read all or a byte range and return the content

  • size – number of bytes to get

  • offset – fetch from offset (in bytes)

  • encoding – encoding (e.g. “utf-8”) for converting bytes to str
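To illustrate the byte-range parameters, here is a small sketch that reads a slice of an object instead of the whole content. The helper name `read_range` is an assumption made for this example; the `size`, `offset`, and `encoding` arguments are the ones listed above.

```python
def read_range(item, start, length, encoding=None):
    """Read `length` bytes starting at byte `start` from a DataItem-like object.

    Returns bytes, or str when `encoding` is given. `read_range` is a
    hypothetical helper built on the documented get(size, offset, encoding).
    """
    # Note: with a multi-byte encoding, a range that cuts a character in half
    # may fail to decode, so pass encoding=None when slicing arbitrary offsets.
    return item.get(size=length, offset=start, encoding=encoding)


# Usage sketch (assumes mlrun is installed and the URL is reachable):
#   import mlrun
#   head = read_range(mlrun.get_dataitem("http://xyz/data.json"), 0, 256, "utf-8")
```

Reading a bounded range like this can be useful for previewing large objects without transferring them in full, provided the underlying store supports ranged reads.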

property key

DataItem key

property kind

DataItem store kind (file, s3, v3io, ..)


return a list of child file names


get the local path of the file, download to tmp first if it is a remote object


return a list of child file names

property meta

Artifact Metadata, when the DataItem is read from the artifacts store


return fsspec file handler, if supported

put(data, append=False)[source]

write/upload the data, append is only supported by some datastores

  • data – data (bytes/str) to write

  • append – append data to the end of the object, NOT SUPPORTED BY SOME OBJECT STORES!
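A minimal sketch of writing text through `put` follows. The helper `write_lines` is a hypothetical name used only for this example, and whether `append=True` works depends on the datastore, as noted above.

```python
def write_lines(item, lines, append=False):
    """Join `lines` with newlines and write them to a DataItem-like object.

    Returns the number of characters written. `write_lines` is a hypothetical
    helper built on the documented put(data, append=False).
    """
    payload = "".join(f"{line}\n" for line in lines)
    # append=True is only honored by datastores that support appending.
    item.put(payload, append=append)
    return len(payload)


# Usage sketch (assumes mlrun is installed and the path is writable):
#   import mlrun
#   write_lines(mlrun.get_dataitem("/tmp/notes.txt"), ["first", "second"])
```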


show the data object content in Jupyter


  • format – format to use (when there is no/wrong suffix), e.g. ‘png’


return FileStats class (size, modified, content_type)

property store

DataItem store object

property suffix

DataItem suffix (file extension) e.g. ‘.png’


upload the source file (src_path)


  • src_path – source file path to read from and upload

property url

DataItem url e.g. /dir/path, s3
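The read-only properties listed for DataItem (key, kind, suffix, url) can be gathered into a single record for logging or debugging. The sketch below assumes only those documented properties; the helper name `describe` is hypothetical.

```python
def describe(item):
    """Collect the documented identifying properties of a DataItem-like object.

    `describe` is a hypothetical helper; it only reads the key, kind, suffix
    and url properties listed in this reference.
    """
    return {
        "key": item.key,        # DataItem key
        "kind": item.kind,      # store kind, e.g. file, s3, v3io
        "suffix": item.suffix,  # file extension, e.g. '.png'
        "url": item.url,        # DataItem url
    }


# Usage sketch (assumes mlrun is installed):
#   import mlrun
#   print(describe(mlrun.get_dataitem("s3://bucket/key.csv")))
```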

mlrun.datastore.get_store_resource(uri, db=None, secrets=None, project=None)[source]

get store resource object by uri