Machine Learning Manager
API Reference
- class forepaas.ml.Notebook
Bases: object
- get_notebook(notebook_id)
get_notebook returns the configuration of the notebook with the given ID
- Parameters:
notebook_id (str) – ID of the notebook to retrieve
- Returns:
notebook configuration
- Return type:
dict
- list_notebooks()
list_notebooks returns a list of all notebooks
- Returns:
all notebook configurations
- Return type:
list
- forepaas.ml.count_testing_dataset(pipeline_id=None)
Returns the total number of entries in the test dataset: rows for structured data, files for unstructured data.
- Parameters:
pipeline_id (str) – test dataset pipeline id, defaults to None
- Returns:
number of items in the test dataset
- Return type:
int
- forepaas.ml.count_train_dataset(pipeline_id=None)
Returns the total number of entries in the train dataset: rows for structured data, files for unstructured data.
- Parameters:
pipeline_id (str) – train dataset pipeline id, defaults to None
- Returns:
number of items in the train dataset
- Return type:
int
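As a rough illustration of how the two counts can be combined, the sketch below computes the share of entries held out for testing. `holdout_fraction` is a hypothetical helper, not part of forepaas.ml, and the counts are passed in as plain integers so the snippet runs without the platform.

```python
def holdout_fraction(n_train: int, n_test: int) -> float:
    """Return the share of all entries that belong to the test dataset."""
    total = n_train + n_test
    if total == 0:
        raise ValueError("both datasets are empty")
    return n_test / total


# On the platform the counts would come from the functions documented above:
#   n_train = forepaas.ml.count_train_dataset(pipeline_id)
#   n_test = forepaas.ml.count_testing_dataset(pipeline_id)
print(holdout_fraction(800, 200))  # 0.2
```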
- forepaas.ml.create_model(model_configuration)
create_model adds a model to the MLM
- Parameters:
model_configuration (dict) – configuration of the model
- Returns:
request status response
- Return type:
dict
- forepaas.ml.create_pipeline(pipeline_configuration, params={})
create_pipeline creates an ML pipeline from a pipeline configuration
- Parameters:
pipeline_configuration (dict) – pipeline configuration, as a JSON-style dict
params (dict, optional) – additional arguments
- forepaas.ml.delete_model(model)
delete_model removes a model from the MLM
- Parameters:
model (str) – model id to remove
- Returns:
request status response
- Return type:
dict
- forepaas.ml.format_scoring(scoring)
format_scoring returns the scoring function described by a scoring configuration
- Parameters:
scoring (dict) – scoring configuration
- Returns:
scoring function
- Return type:
function
- forepaas.ml.get_estimator(model_configuration, path=None)
get_estimator loads a persistent model file from the data store and returns the model. Supported files are pkl, h5, and pth.
- Parameters:
model_configuration – configuration of the model
path (str, optional) – path to estimator
- Returns:
fitted model
- Return type:
fitted model
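The three supported file types line up naturally with the three frameworks that predict accepts. The sketch below guesses a framework from a file extension. The mapping itself is an assumption based on common conventions (pickle for scikit-learn, HDF5 for Keras, `.pth` for PyTorch); the source does not state it, and `framework_for_path` is a hypothetical helper.

```python
import os

# Assumed pairing of the file types supported by get_estimator with the
# frameworks accepted by predict; not confirmed by the reference itself.
EXTENSION_TO_FRAMEWORK = {".pkl": "sklearn", ".h5": "keras", ".pth": "pytorch"}


def framework_for_path(path: str) -> str:
    """Guess the framework of a persistent model file from its extension."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return EXTENSION_TO_FRAMEWORK[ext]
    except KeyError:
        raise ValueError(f"unsupported model file extension: {ext!r}") from None


print(framework_for_path("models/churn.pkl"))  # sklearn
```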
- forepaas.ml.get_hyper_parameters(train=None)
Retrieves the hyper parameters set during the hyper parameter tuning step of the MLM pipeline, returned as name/value pairs
- Parameters:
train (dict, optional) – train configuration, defaults to None
- Returns:
hyper parameters
- Return type:
dict
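Because the hyper parameters come back as a plain dict, they can be layered over a set of defaults before a model is built. A minimal sketch, assuming the tuned values should win over the defaults; `merge_hyper_parameters` and the parameter names are hypothetical, and the `tuned` dict stands in for the return value of get_hyper_parameters.

```python
def merge_hyper_parameters(defaults: dict, tuned: dict) -> dict:
    """Return a new dict where tuned values override the defaults."""
    merged = dict(defaults)
    merged.update(tuned)
    return merged


defaults = {"n_estimators": 100, "max_depth": None}
tuned = {"max_depth": 8}  # stand-in for forepaas.ml.get_hyper_parameters()
print(merge_hyper_parameters(defaults, tuned))
# {'n_estimators': 100, 'max_depth': 8}
```

With a scikit-learn style estimator, the merged dict could then be unpacked into the constructor as keyword arguments.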
- forepaas.ml.get_model(model)
get_model retrieves a model configuration from the MLM
- Parameters:
model (str) – model id
- Returns:
request status response
- Return type:
dict
- forepaas.ml.get_pipeline(pipeline_id, params={})
get_pipeline returns the pipeline configuration (ML_CONFIG) of a specific ML pipeline
- Parameters:
pipeline_id (str) – ID of the pipeline to retrieve
params (dict, optional) – additional arguments to provide
- Returns:
pipeline configuration
- Return type:
dict
- forepaas.ml.get_pipeline_dataset(pipeline_id, params={})
get_pipeline_dataset returns the dataset section of an ML pipeline configuration
- Parameters:
pipeline_id (str) – pipeline id of the dataset to retrieve
params (dict, optional) – additional arguments
- Returns:
dataset configuration
- Return type:
dict
- forepaas.ml.get_pipeline_train(pipeline_id, params={})
get_pipeline_train returns the training section of an ML pipeline configuration
- Parameters:
pipeline_id (str) – pipeline id of the train to retrieve
params (dict, optional) – additional arguments to call
- Returns:
training configuration
- Return type:
dict
- forepaas.ml.get_testing_dataset(pipeline_id=None)
Returns the testing dataset for a pipeline. If the pipeline is unstructured, the returned value is a dataframe of file paths.
- Parameters:
pipeline_id (str) – test dataset pipeline id, defaults to None
- Returns:
test dataset
- Return type:
pandas.DataFrame
- forepaas.ml.get_train_dataset(pipeline_id=None)
Returns the training dataset for a pipeline. If the pipeline is unstructured, the returned value is a dataframe of file paths.
- Parameters:
pipeline_id (str) – train dataset pipeline id, defaults to None
- Returns:
Train dataset
- Return type:
pandas.DataFrame
- forepaas.ml.get_train_scoring_function(train=None)
Retrieves the scoring function used by a specific train
- Parameters:
train (dict, optional) – train configuration, defaults to None
- Returns:
scoring function
- Return type:
function
- forepaas.ml.list_model(filter=None)
list_model lists all models in the MLM
- Parameters:
filter (dict, optional) – filter for model
- Returns:
request status response
- Return type:
dict
- forepaas.ml.list_pipelines(params={})
list_pipelines returns the ML configurations of all pipelines in the MLM
- Parameters:
params (dict, optional) – additional arguments
- Returns:
pipeline configurations
- Return type:
List[dict]
- forepaas.ml.predict(data, model_id=None, consumer_id=None, framework='sklearn', input_type='json', uri=None, return_type='dict', headers=None, timeout=360)
Sends data to a deployed ML model, identified by its consumer id, and returns the values predicted from those features
- Parameters:
data (pandas.DataFrame or dict) – data to pass to the model
model_id (str) – model id
consumer_id (str) – consumer id
framework (str) – ML framework used by the model; supported: 'sklearn' (default), 'keras', 'pytorch'
input_type (str) – data input type, defaults to 'json'; supported: 'json', 'file'
uri (str) – uri of ml
return_type (str) – how the returned data is structured; accepted values are 'json' or 'dataframe'
headers (dict) – additional headers to be included in request
timeout (int) – time in seconds before request times out
- Returns:
predictions
- Return type:
Union[List[dict], pandas.DataFrame]
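Since predict accepts only a few values for framework and input_type, it can help to check them before sending a request. The sketch below validates the two options against the values listed above and returns them as keyword arguments. `check_predict_options` is a hypothetical helper, and whether predict performs similar checks itself is not stated in this reference.

```python
SUPPORTED_FRAMEWORKS = {"sklearn", "keras", "pytorch"}
SUPPORTED_INPUT_TYPES = {"json", "file"}


def check_predict_options(framework: str = "sklearn", input_type: str = "json") -> dict:
    """Validate two predict options and return them as keyword arguments."""
    if framework not in SUPPORTED_FRAMEWORKS:
        raise ValueError(f"unsupported framework: {framework!r}")
    if input_type not in SUPPORTED_INPUT_TYPES:
        raise ValueError(f"unsupported input_type: {input_type!r}")
    return {"framework": framework, "input_type": input_type}


options = check_predict_options(framework="keras")
print(options)  # {'framework': 'keras', 'input_type': 'json'}

# On the platform the options could then be forwarded, for example:
#   forepaas.ml.predict(data, consumer_id="<consumer id>", **options)
```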
- forepaas.ml.random_split(dataset, params={})
random_split randomly splits a dataset into a train dataset and a test dataset
- Parameters:
dataset (lists, numpy arrays, scipy-sparse matrices or pandas dataframes) – dataset to split
params (dict, optional) – additional parameters for splitting, defaults to {}
- Returns:
Train and test dataset
- Return type:
Tuple(dataset, dataset)
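To make the idea of a random split concrete, here is a pure-Python sketch of the general technique: shuffle the row indices, then cut them into a test part and a train part. It is not the library's implementation, and `random_split_sketch`, `test_size`, and `seed` are hypothetical names; which keys random_split accepts in params is not documented here.

```python
import random


def random_split_sketch(rows, test_size=0.25, seed=None):
    """Shuffle row indices and cut them into a train part and a test part."""
    if not 0 < test_size < 1:
        raise ValueError("test_size must be between 0 and 1")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded for reproducible splits
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


train, test = random_split_sketch(list(range(10)), test_size=0.3, seed=0)
print(len(train), len(test))  # 7 3
```

Fixing the seed makes the split repeatable, which is useful when comparing models trained on the same partition.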
- forepaas.ml.save_model(model, conf, execution_id=None)
save_model saves a model to the data store
- Parameters:
model (scikit-learn model, or a model from another framework that serializes to .pkl) – model to be saved
conf (dict) – configuration of the pipeline
execution_id (str, optional) – execution id
- Returns:
file information
- Return type:
dict
- forepaas.ml.save_model_file(model, model_path)
save_model_file saves a persistent model file based on framework to the data store. Currently supports pkl, h5, and pth.
- Parameters:
model (persistent file) – model file
model_path (str) – full file path to save the model to
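For readers unfamiliar with persistent model files, the sketch below shows what a pkl round trip looks like using only the standard library. It writes to a temporary local directory rather than the data store, and the `fitted` dict is a stand-in for a real fitted model; it does not show how save_model_file works internally.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model: a plain dict of learned parameters.
fitted = {"weights": [0.5, -1.0], "bias": 0.25}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    with open(path, "wb") as fh:
        pickle.dump(fitted, fh)  # write the persistent model file
    with open(path, "rb") as fh:
        restored = pickle.load(fh)  # read it back into memory

print(restored == fitted)  # True
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.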
- forepaas.ml.update_model(model, conf)
update_model updates an existing model
- Parameters:
model (str) – model id to update
conf (dict) – configuration of the model
- Returns:
request status response
- Return type:
dict
- forepaas.ml.update_pipeline_dataset(pipeline_id, dataset_conf, params={})
update_pipeline_dataset updates the dataset section of an ML pipeline configuration
- Parameters:
pipeline_id (str) – pipeline id of the dataset to update
dataset_conf (dict) – dataset configuration
params (dict, optional) – additional arguments
- forepaas.ml.update_pipeline_train(pipeline_id, train_conf, params={})
update_pipeline_train updates the training section of an ML pipeline configuration
- Parameters:
pipeline_id (str) – pipeline id of the train to update
train_conf (dict) – configuration of the training
params (dict, optional) – additional arguments
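The get and update functions for the training section can be combined into a read-modify-write step. The sketch below fetches the current section, applies a few changes on top, and writes the result back. The getter and updater are injected so the example runs without the platform; `patch_train_conf`, the in-memory `store`, and the configuration keys are all hypothetical. Whether update_pipeline_train replaces or merges the section is not stated in this reference, so the sketch sends the full patched section.

```python
def patch_train_conf(pipeline_id, changes, get_train, update_train):
    """Fetch the train section, apply `changes` on top, and write it back."""
    current = get_train(pipeline_id)
    patched = {**current, **changes}
    update_train(pipeline_id, patched)
    return patched


# In-memory stand-ins. On the platform one would pass
# forepaas.ml.get_pipeline_train and forepaas.ml.update_pipeline_train instead.
store = {"p1": {"scoring": "accuracy", "cv": 3}}  # hypothetical keys


def fake_get(pipeline_id):
    return dict(store[pipeline_id])


def fake_update(pipeline_id, train_conf):
    store[pipeline_id] = train_conf


print(patch_train_conf("p1", {"cv": 5}, fake_get, fake_update))
# {'scoring': 'accuracy', 'cv': 5}
```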