SeeMe.ai Python SDK #

The Python SDK is a convenient wrapper to communicate with the SeeMe.ai API and gives easy access to all of your datasets, models, jobs, … on the platform.

This document provides a detailed overview of all the methods and their parameters.

Get Started #

Installation #

Install the SDK from the command line:

pip install --upgrade seeme

or in your Jupyter notebook:

!pip install --upgrade seeme

Verify the version you installed:

import seeme
print(seeme.__version__)

Create a client #

A client lets you interact with the SeeMe.ai API, allowing you to manage models, datasets, predictions, and jobs.

from seeme import Client
client = Client()
| Parameter | Type | Description |
| --- | --- | --- |
| username | str | The username for the account. |
| apikey | str | The API key for the username. |
| backend | str | The backend the client communicates with. Default value: https://api.seeme.ai/api/v1 |
| env_file | str | The .env file containing the username, apikey, and backend (see below). Default: “.env” |

Register #

Register a new user:

my_username  = "my_username"
my_email     = "my_email@mydomain.com"
my_password  = "supersecurepassword"
my_firstname = "firstname"
my_name      = "last name"
client.register(
    username = my_username, 
    email=my_email, 
    password=my_password, 
    firstname=my_firstname, 
    name=my_name
)

Log in #

Username / password #

Use your username and password to log in:

username = ""
password = ""

client.login(username, password)

Username / apikey #

If you have a username and apikey, you can create a client that is ready to go:

my_username = ""
my_apikey = ""

client = Client(username=my_username, apikey=my_apikey)

.env file #

Log in using an .env file:

client = Client(env_file=".my_env_file")

The .env file should be located where the client is created and contain the following values:

| Variable | Type | Description |
| --- | --- | --- |
| SEEME_USERNAME | str | The username for the account you want to use |
| SEEME_APIKEY | str | The API key for the username you want to use |
| SEEME_BACKEND | str | The backend the client communicates with, normally: https://api.seeme.ai/api/v1 |
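
For example, a minimal .env file could look like this (the values below are placeholders):

SEEME_USERNAME=my_username
SEEME_APIKEY=my_api_key
SEEME_BACKEND=https://api.seeme.ai/api/v1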

Log out #

client.logout()

Advanced #

Custom backend #

If you are running a SeeMe.ai enterprise deployment, you can pass in the url you want the SeeMe.ai client to communicate with:

custom_deployment_url = "http://seeme.yourdomain.com/api/v1"

client = Client(backend=custom_deployment_url)

Models #

Get all models #

Get a list of all models you have access to.

models = client.get_models()

These models are divided into three groups:

  • public: models that are available to everyone;
  • owned: models you created;
  • shared: models that are shared privately with you.

Public models

Public models are provided by SeeMe.ai or other users.

public_models = [model for model in models if model.public]
public_models

Your own models

A list of the models you created:

own_models = [ model for model in models if model.user_id == client.user_id]

Shared models

A list of models that others have shared with you in private.

shared_with_me = [model for model in models if model.shared_with_me]

Create a model #

from seeme.types import Model

# Look up the application id for your framework and version (see Applications below)
application_id = client.get_application_id("pytorch", "fastai", "2.1.0", "2.7.13")

my_model = Model(
    name= "Cats and dogs",
    description= "Recognize cats and dogs in pictures.",
    privacy_enabled= False,
    auto_convert= True,
    application_id= application_id
)

my_model = client.create_model(my_model)
| Parameter | Type | Description |
| --- | --- | --- |
| model | Model | Entire model object |

Every model has the following properties:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The model name |
| description | str | The model description |
| notes | str | Notes on the model |
| user_id | str | The user id of the model creator |
| can_inference | bool | Flag indicating whether the model can make predictions or not |
| kind | str | Type of AI application, possible values: “image_classification”, “object_detection”, “text_classification”, “structured”, “language_model”, “ner”, “llm”. |
| has_logo | bool | Flag indicating whether the model has a logo or not |
| logo | str | Name and extension of the logo file (mostly for internal purpose) |
| public | bool | Flag indicating whether the model is public or not |
| config | str | Additional config stored in a JSON string |
| active_version_id | str | The id of the current model version (see versions below) |
| application_id | str | The application id of the active model version (see applications) |
| has_ml_model | bool | Flag indicating whether the model has a Core ML model |
| has_onnx_model | bool | Flag indicating whether the model has an ONNX model |
| has_onnx_int8_model | bool | Flag indicating whether the model has an 8-bit quantized model |
| has_tflite_model | bool | Flag indicating whether the model has a Tensorflow Lite model |
| has_labels_file | bool | Flag indicating whether a file with all the labels (classes) is available |
| shared_with_me | bool | Flag indicating whether the model has been shared with you |
| auto_convert | bool | Flag indicating whether the model will be automatically converted to the supported model formats (see applications). Default value: True. |
| privacy_enabled | bool | Flag indicating whether privacy is enabled. If set to ‘True’, no inputs (images, text files, …) will be stored on the server or the mobile/edge device. Default value: False. |

Get a model #

Use the model id to get all the metadata of the model:

client.get_model(my_model.id)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | Unique id for the model |

Update a model #

Update any property of the model:

my_model = client.get_model(my_model.id)
my_model.description = "Updated for documentation purposes"

client.update_model(my_model)
| Parameter | Type | Description |
| --- | --- | --- |
| model | Model | The entire model object |

Delete a model #

Delete a model using its id.

client.delete_model(my_model.id)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | Unique id for the model |

Upload a model file #

You can upload the model file by calling upload_model. Make sure the application_id is set to the desired AI application, framework, and version.

my_model = client.upload_model(my_model.id, folder="directory/to/model", filename="your_exported_model_file.pkl")
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | Unique id for the model |
| folder | str | Name of the folder that contains the model file (without trailing ‘/’), default value “data” |
| filename | str | Name of the file to be uploaded, default value “export.pkl” |

This returns an updated my_model; if the upload succeeded, can_inference will be set to True.

If auto_convert is enabled, all possible conversions for the selected application_id (see Applications below) will be available.
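
For example, after uploading you can check the conversion flags on the returned model to see which formats are available (a small sketch using the model properties listed above):

print(my_model.can_inference)     # True once the model file has been uploaded successfully
print(my_model.has_onnx_model)    # True if an ONNX conversion is available
print(my_model.has_tflite_model)  # True if a Tensorflow Lite conversion is available
print(my_model.has_ml_model)      # True if a Core ML conversion is available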

Download model file(s) #

Download the file(s) associated with the current active (production) model.

client.download_active_model(my_model, asset_type="pkl", download_folder=".")
| Parameter | Type | Description |
| --- | --- | --- |
| model | Model | The entire model object |
| asset_type | AssetType | The model type you want to download. Default: PKL; Possible values: PKL, MLMODEL, TFLITE, ONNX, ONNX_INT8, LABELS, NAMES, WEIGHTS, CFG, CONVERSION_CFG, LOGO. |
| download_folder | str | The folder where you would like to download the model. Default: “.” (i.e. the current directory) |

If the asset_type exists, the model file will be downloaded to my_model.active_model_id.{asset_type}. One exception: the labels file will receive a .txt extension.

If you want to download a specific model version, have a look at Download a model version.

Make a prediction #

result = client.predict(model_id, item, application_type=ApplicationType.IMAGE_CLASSIFICATION)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | Unique id for the model |
| item | str | The item you wish to make a prediction on. For “IMAGE_CLASSIFICATION” and “OBJECT_DETECTION”: specify the full file location (including directory). For “TEXT_CLASSIFICATION” or “NER”: pass in the string you would like to predict. For “STRUCTURED”: pass in the JSON object you want to use for your prediction. For “LANGUAGE_MODEL”: pass in the initial prompt to generate text. For “LLM”: pass in the next chat message. |
| application_type | ApplicationType | The type of prediction you want to make. Default value: IMAGE_CLASSIFICATION; Possible values: IMAGE_CLASSIFICATION, OBJECT_DETECTION, TEXT_CLASSIFICATION, STRUCTURED, NER, OCR, LANGUAGE_MODEL, LLM. |

Inference properties in more detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | Stores the input value/filename. For “image_classification” and “object_detection”: the original filename; for “text_classification” and “structured”: the actual text/structured input. |
| description | str | The inference description |
| prediction | str | The prediction (deprecated: see inference_items) |
| confidence | float | The prediction confidence (deprecated: see inference_items) |
| model_id | str | The id of the model the prediction was made on. |
| model_version_id | str | The id of the model version the prediction was made on (see ‘Model Versions’ below). |
| extension | str | The extension of the predicted image, in case of “image_classification” and “object_detection” |
| user_id | str | The id of the user that requested the prediction |
| error_reported | bool | Flag indicating whether a user has reported that the prediction is/might be wrong. |
| error | str | A description of the error in case something went wrong. |
| application_id | str | The application_id used to make the prediction. |
| inference_host | str | The name of the inference engine used to make the prediction. |
| inference_time | str | The time it took to make the prediction. |
| end_to_end_time | str | Inference time including upload and return (if relevant) |
| dataset_item_id | str | The id of the dataset_item that was used for the prediction. Used to evaluate datasets (see Datasets below) |
| Result | str | A string version of the object detection prediction (deprecated). |
| inference_items | List[InferenceItem] | A list of individual predictions (see below). |
| hidden | bool | Flag indicating whether this prediction has been hidden in the Data Engine. |
| privacy_enabled | bool | Flag indicating whether this prediction was made while the model was in privacy_enabled mode. |

Inference Item properties

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| prediction | str | The prediction |
| confidence | float | The prediction confidence |
| inference_id | str | The id of the inference the item belongs to |
| coordinates | str | The coordinates for the prediction when using object_detection or ner. |
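
Since the prediction and confidence properties on the inference itself are deprecated, read the individual predictions from inference_items. A small sketch (assuming ApplicationType can be imported from seeme.types, like Model above):

from seeme.types import ApplicationType

result = client.predict(my_model.id, "path/to/file.png", application_type=ApplicationType.IMAGE_CLASSIFICATION)

# Print every individual prediction with its confidence
for inference_item in result.inference_items:
    print(inference_item.prediction, inference_item.confidence)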

Image classification #

On an image classification model:

item = "path/to/file.png"

result = client.predict(my_model.id, item)

Object detection #

On an object detection model:

item = "path/to/file.png"

result = client.predict(my_model.id, item, application_type=ApplicationType.OBJECT_DETECTION)

Text classification #

On a text classification model:

item = "The text you want to classify."

result = client.predict(my_model.id, item, application_type=ApplicationType.TEXT_CLASSIFICATION)

Named entity recognition #

On a text named entity recognition model:

item = "The text where I want to extract information out of."

result = client.predict(my_model.id, item, application_type=ApplicationType.NER)

Tabular #

On a structured/tabular data model:

inputs = {
    "temperature": "30",
    "day": "Tuesday"
}

result = client.predict(my_model.id, inputs, application_type=ApplicationType.STRUCTURED)

Language model #

On a language model:

item = "The story of AI language models is"

result = client.predict(my_model.id, item, application_type=ApplicationType.LANGUAGE_MODEL)

LLM #

On a large language model:

item = "The story of AI language models is"

result = client.predict(my_model.id, item, application_type=ApplicationType.LLM)

Upload a model logo #

Upload a logo for your model:

my_model = client.upload_logo(my_model.id, folder="directory/to/logo", filename="logo_filename.jpg")

| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | Unique id for the model |
| folder | str | Name of the folder that contains the logo file (without trailing ‘/’), default value “data” |
| filename | str | Name of the file to be uploaded, default value “logo.jpg”. Supported formats: jpg, jpeg, png. |

Get the model logo #

Download the logo for a model:

client.get_logo(my_model)

| Parameter | Type | Description |
| --- | --- | --- |
| model | Model | The entire model object |

Model Versions #

An AI Model has one or multiple versions associated with it:

  • the current live version
  • previous versions
  • future versions

Get all model versions #

Get a list of all versions for a specific model.

versions = client.get_model_versions(my_model.id)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | The model id |

Create a model version #

new_version = ModelVersion(
    name="A higher accuracy achieved",
    application_id="b4b9aaf0-cb37-4629-8f9b-8877aeb09a53"
)

new_version = client.create_model_version(my_model.id, new_version)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | The model id |
| version | ModelVersion | The model version object |

Every model version has the following properties. Note that these are partially similar to the model properties:

Shared with the model entity:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the model version |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The model version name |
| description | str | The model version description |
| user_id | str | The user id of the model version creator |
| can_inference | bool | Flag indicating whether the model version can make predictions or not |
| has_logo | bool | Flag indicating whether the model has a logo or not (not used for now) |
| logo | str | Name and extension of the logo file (mostly for internal purpose) |
| config | str | Additional config stored in a JSON string |
| application_id | str | The application ID (see applications below) |
| has_ml_model | bool | Flag indicating whether the model version has a Core ML model |
| has_onnx_model | bool | Flag indicating whether the model version has an ONNX model |
| has_onnx_int8_model | bool | Flag indicating whether the model version has an 8-bit quantized model |
| has_tflite_model | bool | Flag indicating whether the model version has a Tensorflow Lite model |
| has_labels_file | bool | Flag indicating whether a file with all the labels (classes) is available |

Different from the model entity:

| Property | Type | Description |
| --- | --- | --- |
| model_id | str | The id of the model this version belongs to. |
| version | str | The label of the version |
| version_number | int | Automatically incrementing number of the version. |
| dataset_version_id | str | The id of the dataset version this model version was trained on. |
| job_id | str | The id of the job used to build this model version. |
| metrics | List[Metric] | A list of Metrics for this model version. |

Every metric has the following properties:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the metric |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The metric name |
| description | str | The metric description |
| model_version_id | str | The model version id for the metric |
| value | float | The metric value |
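
For example, to inspect the metrics of the currently active model version (a small sketch using the properties above):

version = client.get_model_version(my_model.id, my_model.active_version_id)

for metric in version.metrics:
    print(metric.name, metric.value)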

Get model version #

Use the model and version id to get the full model version:

model_version = client.get_model_version(my_model.id, new_version.id)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | The model id |
| version_id | str | The model version id |

Update model version #

Update any property of the model version:

model_version.description = "SOTA comes and goes, but versions are forever!"

client.update_model_version(model_version)
| Parameter | Type | Description |
| --- | --- | --- |
| model_version | ModelVersion | The entire model version object |

Upload model file for a version #

Upload a model file (or model files) for a new version of your AI model.

Make sure the application_id is set to the desired AI application, framework, and version.

client.upload_model_version(new_version, folder="directory/to/model", filename="your_exported_model_file_v2.pkl")
| Parameter | Type | Description |
| --- | --- | --- |
| version | ModelVersion | The entire model version object |
| folder | str | Name of the folder that contains the model file (without trailing ‘/’), default value “data” |
| filename | str | Name of the file to be uploaded, default value “export.pkl” |

Download a model version #

client.download_model_version(my_version, asset_type=AssetType.PKL, download_folder="data")
| Parameter | Type | Description |
| --- | --- | --- |
| version | ModelVersion | The entire model version object |
| asset_type | AssetType | The asset type you want to download. Default: AssetType.PKL; Possible values: PKL, MLMODEL, TFLITE, ONNX, ONNX_INT8, LABELS, NAMES, WEIGHTS, CFG, CONVERSION_CFG, LOGO. |
| download_folder | str | The folder where you would like to download the model. Default: “.” (i.e. the current directory) |

If the asset_type exists, the model file will be downloaded to {my_model.active_model_id}.{asset_type}. One exception: the labels file will receive a .txt extension.

Make a prediction on this version #

client.version_inference(model_version_id, item, application_type=ApplicationType.IMAGE_CLASSIFICATION)
| Parameter | Type | Description |
| --- | --- | --- |
| model_version_id | str | Unique model version id |
| item | str | The item you wish to make a prediction on. For “IMAGE_CLASSIFICATION” and “OBJECT_DETECTION”: specify the full file location (including directory). For “TEXT_CLASSIFICATION” or “NER”: pass in the string you would like to predict. For “STRUCTURED”: pass in the JSON object you want to use for your prediction. For “LANGUAGE_MODEL”: pass in the initial prompt to generate text. For “LLM”: pass in the next chat message. |
| application_type | ApplicationType | The type of prediction you want to make. Default value: IMAGE_CLASSIFICATION; Possible values: IMAGE_CLASSIFICATION, OBJECT_DETECTION, TEXT_CLASSIFICATION, STRUCTURED, NER, OCR, LANGUAGE_MODEL, LLM. |

For more details on the Inference object returned, have a look at the Inference properties

Image classification #

On an image classification model version:

item = "path/to/file.png"

result = client.version_inference(new_version.id, item)

Object detection #

On an object detection model version:

item = "path/to/file.png"

result = client.version_inference(new_version.id, item, application_type=ApplicationType.OBJECT_DETECTION)

Text classification #

On a text classification model version:

item = "The text you want to classify."

result = client.version_inference(new_version.id, item, application_type=ApplicationType.TEXT_CLASSIFICATION)

Tabular #

On a structured/tabular data model version:

inputs = {
    "temperature": "30",
    "day": "Tuesday"
}

result = client.version_inference(new_version.id, inputs, application_type=ApplicationType.STRUCTURED)

Delete model version #

Delete a model version:

client.delete_model_version(my_model.id, new_version.id)
| Parameter | Type | Description |
| --- | --- | --- |
| model_id | str | The model id |
| version_id | str | The model version id |

Datasets #

Get all datasets #

Get a list of all your datasets:

datasets = client.get_datasets()

The get_datasets() method does not take any parameter.
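
For example, to list the id and name of every dataset you have access to:

datasets = client.get_datasets()

for dataset in datasets:
    print(dataset.id, dataset.name)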

Create a dataset #

my_dataset = Dataset(
    name= "Cats & dogs dataset",
    description= "A dataset with labelled images of cats and dogs.",
    multi_label= False,
    notes= "Cats and dogs is often used as a demo dataset.",
    default_splits= True,
    content_type= ContentType.IMAGES
)

my_dataset = client.create_dataset(my_dataset)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset | Dataset | The entire dataset object |

Properties in more detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the dataset |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The dataset name |
| description | str | The dataset description |
| user_id | str | The unique id of the dataset creator |
| notes | str | More elaborate notes about the dataset |
| versions | List[DatasetVersion] | A list of all the versions of the dataset (see below) |
| multi_label | bool | Flag indicating whether items can have multiple labels |
| default_splits | bool | Create default splits (“train”, “valid”, “test”) when creating the dataset. |
| has_logo | bool | Flag indicating whether the dataset has a logo or not |
| logo | str | Name and extension of the logo file |
| content_type | DatasetContentType | Type of items in the dataset. Possible values: IMAGES, TEXT, TABULAR, NER. |

Get dataset #

my_dataset = client.get_dataset(my_dataset.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |

Update dataset #

my_dataset.notes += "~25k labelled images of cats and dogs; 22500 for training, 2000 for validation."

client.update_dataset(my_dataset)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset | Dataset | The entire dataset object |

Delete dataset #

client.delete_dataset(my_dataset.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |

Upload a dataset logo #

Upload a logo for your dataset:

my_dataset = client.upload_dataset_logo(my_dataset.id, folder="directory/to/logo", filename="logo_filename.jpg")

| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | Unique id for the dataset |
| folder | str | Name of the folder that contains the logo file (without trailing ‘/’), default value “data” |
| filename | str | Name of the file to be uploaded, default value “logo.jpg”. Supported formats: jpg, jpeg, png. |

Get the dataset logo #

Download the logo for a dataset:

client.get_dataset_logo(my_dataset)

| Parameter | Type | Description |
| --- | --- | --- |
| dataset | Dataset | The entire dataset object |

Dataset Versions #

A dataset can have multiple versions.

Get all dataset versions #

dataset_versions = client.get_dataset_versions(my_dataset.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |

Create dataset version #

new_dataset_version = DatasetVersion(
    name= "v2",
    description= "Even more images of dogs and cats"
)

new_dataset_version = client.create_dataset_version(my_dataset.id, new_dataset_version)
new_dataset_version
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version | DatasetVersion | The dataset version object |

Dataset version properties in detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the dataset version |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The dataset version name |
| description | str | The dataset version description |
| user_id | str | The unique id of the dataset creator |
| labels | List[Label] | A list of the labels in this version |
| dataset_id | str | The id of the dataset this version belongs to |
| splits | List[DatasetSplit] | A list of splits in this dataset version |
| default_split | str | The id of the split that will be displayed by default |
| config | str | Version specific configuration |

Get a dataset version #

dataset_version = client.get_dataset_version(new_dataset_version.dataset_id, new_dataset_version.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |

Update a dataset version #

new_dataset_version.description = "Even more images of cats and dogs."

client.update_dataset_version(my_dataset.id, new_dataset_version)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version | DatasetVersion | The dataset version object |

Delete a dataset version #

client.delete_dataset_version(my_dataset.id, new_dataset_version)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version | DatasetVersion | The dataset version object |

Dataset Splits #

A dataset version can have multiple splits, usually separating training, validation and test data.

Get all splits #

splits = client.get_dataset_splits(my_dataset.id, new_dataset_version.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |

Create a split #

new_split = DatasetSplit(
    name= "train",
    description= "training data for our model to learn from" 
)

new_split = client.create_dataset_split(my_dataset.id, new_dataset_version.id, new_split)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| split | DatasetSplit | The split object |

Dataset split properties in more detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the dataset split |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The dataset split name |
| description | str | The dataset split description |
| user_id | str | The unique id of the dataset split creator |
| version_id | str | The unique id of the dataset version the split belongs to |

Get a split #

my_split = client.get_dataset_split(my_dataset.id, new_dataset_version.id, new_split.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| split_id | str | The split id |

Update a split #

my_split.description = "Training data"

client.update_dataset_split(my_dataset.id, new_dataset_version.id, my_split)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| split | DatasetSplit | The split object |

Delete a split #

client.delete_dataset_split(my_dataset.id, new_dataset_version.id, my_split)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| split | DatasetSplit | The split object |

Dataset Labels #

A dataset version can have multiple labels.

Get all labels #

labels = client.get_dataset_labels(my_dataset.id, new_dataset_version.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |

Create a label #

new_label = Label(
    name= "Birds",
    description= "Adding birds to the mix"
)

new_label = client.create_dataset_label(my_dataset.id, new_dataset_version.id, new_label)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| label | Label | The label object |

Label properties in more detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the dataset label |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | The dataset label name |
| description | str | The dataset label description |
| user_id | str | The unique id of the dataset label creator |
| version_id | str | The unique id of the dataset version the label belongs to |
| color | str | The hex code for the color to be used/associated with this label |
| index | int | Make sure we can always sort the labels in the same sequence (handled automatically) |
| shortcut | str | The shortcut character that can be used during labelling in the UI |

Get a label #

my_label = client.get_dataset_label(my_dataset.id, new_dataset_version.id, new_label.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| label_id | str | The label id |

Update a label #

my_label.color = "#00ff00"

my_label = client.update_dataset_label(my_dataset.id, new_dataset_version.id, my_label)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| label | Label | The label object |

Delete a label #

client.delete_dataset_label(my_dataset.id, new_dataset_version.id, my_label)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| label_id | str | The label id |

Dataset Items #

A dataset version can contain many dataset items. Items are used to access and store the actual items in your dataset version such as images or text.

Get dataset items #

items = client.get_dataset_items(my_dataset.id, new_dataset_version.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| params | dict | Additional query parameters. Default value: None. Accepted keys: pageSize (int): number of items to be returned, default value 10; pageCount (int): the number of the page you want to view; onlyUnlabelled (bool): if True, only return items that are not labelled/annotated; labelId (str): only return items that are labelled with this id; splitId (str): only return items that are part of this split. |
# All params are optional, but here we combine them together for demo purposes. 
params = {
    "onlyUnlabelled": True,
    "pageSize": 25,
    "pageCount": 0,
    "labelId": new_label.id,
    "splitId": my_split.id
}

client.get_dataset_items(my_dataset.id, new_dataset_version.id, params)

Create a dataset item #

item = DatasetItem(
    name= "An optional name",
    splits= [my_split]
)

item = client.create_dataset_item(my_dataset.id, new_dataset_version.id, item)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| item | DatasetItem | The dataset item object |

Dataset item properties in detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the dataset item |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| name | str | An optional name |
| description | str | The dataset item description |
| user_id | str | The unique id of the dataset item creator |
| splits | List[DatasetSplit] | The list of dataset splits the dataset item belongs to |
| annotations | List[Annotation] | The list of annotations for the dataset item |
| extension | str | The extension for the dataset item. |

Get a dataset item #

item = client.get_dataset_item(my_dataset.id, new_dataset_version.id, item.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| item_id | str | The dataset item id |

Update a dataset item #

item.description = "A better description"

client.update_dataset_item(my_dataset.id, new_dataset_version.id, item)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| item | DatasetItem | The dataset item object |

Delete a dataset item #

client.delete_dataset_item(my_dataset.id, new_dataset_version.id, item.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| item | str | The dataset item id |

Upload a dataset item image #

client.upload_dataset_item_image(my_dataset.id, new_dataset_version.id, item.id, folder="directory/to/item", filename="item_filename.jpg")
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| item_id | str | The dataset item id |
| folder | str | Name of the folder that contains the item image (without trailing ‘/’) |
| filename | str | Name of the file to be uploaded |

Download a dataset item image #

download_location = f"{item.id}.{item.extension}"

client.download_dataset_item_image(my_dataset.id, new_dataset_version.id, item.id, download_location, thumbnail=False)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| item_id | str | The dataset item id |
| download_location | str | The full location and filename of where to save the item image. |
| thumbnail | bool | Flag indicating whether to download the full image or its thumbnail. |

Annotations #

Annotations link your dataset items to one or more labels in your dataset.

Create an annotation #

Create an annotation with label_id, split_id, and item_id.

annotation = Annotation(
    label_id= my_label.id,
    split_id= my_split.id,
    item_id= item.id
)

annotation = client.annotate(my_dataset.id, new_dataset_version.id, annotation)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| annotation | Annotation | The annotation object |

Annotation properties in detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| label_id | str | The label ID the annotation belongs to |
| item_id | str | The dataset item ID the annotation belongs to |
| split_id | str | The dataset split ID the annotation belongs to |
| coordinates | str | The coordinates of the annotation, used for object_detection and ner. |
| user_id | str | The unique id of the annotation creator |

Update an annotation #

Update a given annotation:

annotation.coordinates = "14 20 34 48"

annotation = client.update_annotation(my_dataset.id, new_dataset_version.id, annotation)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| annotation | Annotation | The annotation object |

Delete an annotation #

Delete a given annotation:

client.delete_annotation(my_dataset.id, new_dataset_version.id, annotation.id)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| annotation_id | str | The annotation id |

Export Dataset Version #

client.download_dataset(
    my_dataset.id,
    new_dataset_version.id,
    split_id="",
    extract_to_dir="data",
    download_file="dataset.zip",
    remove_download_file=True,
    export_format=""
)
| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| split_id | str | (Optional) Specify the split_id if you only want to download that dataset split. |
| extract_to_dir | str | The directory to extract to. Default value: “data” |
| download_file | str | The name of the download file. Default value: “dataset.zip” |
| remove_download_file | bool | Flag indicating whether to remove or keep the downloaded zip file. Default value: True |
| export_format | DatasetFormat | The format of your dataset. Supported export formats: FOLDERS, YOLO, CSV, SPACY_NER |
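
For example, to export only the training split in FOLDERS format (a sketch that reuses the split created above and assumes the DatasetFormat enum exposes a FOLDERS value):

client.download_dataset(
    my_dataset.id,
    new_dataset_version.id,
    split_id=my_split.id,
    extract_to_dir="data/train_only",
    export_format=DatasetFormat.FOLDERS
)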

A note on the exported formats.

Image classification #

export_format: FOLDERS

The .zip file contains a folder for every dataset split: “train”, “valid”, “test”.

Every dataset split folder contains a ‘label-named’ folder for every label: “cats”, “dogs”.

Every label folder contains all the images of the dataset items for the given label in the given dataset split.

Dataset items are named by the “name” and “extension” property of the dataset item. If the “name” property is empty, the “id” is used to name the file.

Dataset items that have no label will be added to the split folder to which they belong.

.
+-- train/
|  +-- cats/
|     +-- cat1.jpg
|     +-- cat12.jpeg
|  +-- dogs/
|     +-- dog2.jpg
|     +-- dog4.png
|  +-- cat17.jpeg
|  +-- dog15.jpg
+-- valid/
|  +-- cats/
|     +-- cat4.jpg
|     +-- cat8.jpg
|  +-- dogs/
|     +-- dog9.jpg
|     +-- dog14.png
+-- test/
|  +-- cats/
|     +-- cat90.jpg
|     +-- cat34.jpeg
|  +-- dogs/
|     +-- dog81.jpg
|     +-- dog98.png

Text classification #

export_format: FOLDERS

The .zip file contains a folder for every dataset split in your dataset, e.g. “train”, “test”, “unsup”.

Every dataset split folder contains a ‘label-named’ folder for every label: “pos”, “neg”.

Every label folder contains all the text files of the dataset items for the given label in the given dataset split.

Dataset items are named by the “name” and “extension” property of the dataset item. If the “name” property is empty, the “id” is used to name the file.

Dataset items that have no label will be added to the split folder to which they belong.

.
+-- train/
|  +-- pos/
|     +-- 1.txt
|     +-- 3.txt
|  +-- neg/
|     +-- 2.txt
|     +-- 4.txt
+-- test/
|  +-- pos/
|     +-- 5.txt
|     +-- 7.txt
|  +-- neg/
|     +-- 6.txt
|     +-- 8.txt
+-- unsup/
|     +-- 13.txt
|     +-- 14.txt

Object detection #

export_format: YOLO

Object detection datasets are exported in YOLO v4 format.

For every dataset split a dataset_split_name.txt file gets created containing all the filenames for that dataset split.

Every dataset item will have an image and a txt file associated with it. The txt file contains a list of annotations in Yolo format: label_index relative_x relative_y relative_width relative_height.

The .names file contains the list of labels, where the index corresponds to the label_index in the annotation .txt files.

The config.json file contains a JSON object with the color for every label.

.
+-- train.txt
+-- valid.txt
+-- test.txt
+-- 1.jpg
+-- 1.txt
+-- 3.jpg
+-- 3.txt
+-- ...
+-- `dataset_version_id`.names
+-- config.json

A little more detail on the config.json file:

{ 
    "colors": { 
        "label_1": "#36dfd4" , 
        "label_2": "#f0699e" 
    }
}

Tabular #

export_format: CSV

Tabular datasets are exported in a .zip file that contains a dataset_version_id.csv file accompanied by a config.json, which provides more details on how the data should be interpreted.

.
+-- dataset_version_id.csv
+-- config.json

A little more detail about the config.json file:

{
    "multi_label": false,
    "label_column": "labels",
    "split_column": "split",
    "label_separator": " ",
    "filename": "dataset_version_id.csv",
    "csv_separator": ","
}
| Parameter | Type | Description |
| --- | --- | --- |
| multi_label | bool | Is the dataset multi label? |
| label_column | str | The column name that contains the labels |
| split_column | str | The column name that contains the name of the split the row belongs to |
| label_separator | str | If multi_label, use this separator to split the labels |
| filename | str | The name of the .csv file that contains the data |
| csv_separator | str | Use this separator to split each row into columns |
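
A small sketch of how the config.json could be used to read the exported data, using only the standard library (the file names are the ones from the listing above):

import csv
import json

with open("config.json") as f:
    config = json.load(f)

with open(config["filename"], newline="") as f:
    rows = list(csv.DictReader(f, delimiter=config["csv_separator"]))

for row in rows:
    labels = row[config["label_column"]]
    if config["multi_label"]:
        labels = labels.split(config["label_separator"])
    print(row[config["split_column"]], labels)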

Named entity recognition #

export_format: SPACY_NER

For a named entity recognition dataset with splits:

  • train
  • valid
  • test

the zip file should be structured in the following way:

.
+-- train.json
+-- valid.json
+-- test.json
+-- config.json
config.json #

The config.json file contains a list of dataset splits, as well as a color code for every label.

{
    "splits": [
        "train",
        "valid",
        "test"
    ],
    "colors": {
        "label_name": "#82ebfd",
        "label_name2": "#e95211"
    }
}
split_name.json #

For every dataset split, there is a ‘split_name’.json file with the following structure:

[
    {
        "id": "the_dataset_item_id",
        "name": "the_original_filename" ,
        "text": "The textual content of the file that has been annotated.",
        "annotations": [{
            "start": 4,
            "end": 11,
            "label": "label_name",
        },
         {
             ...
         }
        ]
    },
    {
        ...
    }
]
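
A small sketch of how such a split file could be loaded and turned into (text, entities) training tuples; the tuple format below is the common spaCy-style convention and is an assumption, not part of the SDK:

import json

with open("train.json") as f:
    records = json.load(f)

training_data = []
for record in records:
    entities = [(a["start"], a["end"], a["label"]) for a in record["annotations"]]
    training_data.append((record["text"], {"entities": entities}))

print(training_data[0])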

Import Dataset Version #

Image classification #

format: FOLDERS

For an image classification dataset with splits:

  • train
  • valid
  • test

and labels:

  • cats
  • dogs

the zip file should be structured in the following way:

.
+-- train/
|  +-- cats/
|     +-- cat1.jpg
|     +-- cat12.jpeg
|  +-- dogs/
|     +-- dog2.jpg
|     +-- dog4.png
|  +-- cat17.jpeg
|  +-- dog15.jpg
+-- valid/
|  +-- cats/
|     +-- cat4.jpg
|     +-- cat8.jpg
|  +-- dogs/
|     +-- dog9.jpg
|     +-- dog14.png
+-- test/
|  +-- cats/
|     +-- cat90.jpg
|     +-- cat34.jpeg
|  +-- dogs/
|     +-- dog81.jpg
|     +-- dog98.png

Text classification #

format: FOLDERS

For a text classification dataset with splits:

  • train
  • test
  • unsup

and labels:

  • pos
  • neg

the zip file should be structured in the following way:

.
+-- train/
|  +-- pos/
|     +-- 1.txt
|     +-- 3.txt
|  +-- neg/
|     +-- 2.txt
|     +-- 4.txt
+-- test/
|  +-- pos/
|     +-- 5.txt
|     +-- 7.txt
|  +-- neg/
|     +-- 6.txt
|     +-- 8.txt
+-- unsup/
|     +-- 13.txt
|     +-- 14.txt

Object detection #

format: YOLO

Object detection datasets are imported in YOLO format.

For every dataset split a dataset_split_name.txt file gets created containing all the filenames for that dataset split.

Every dataset item will have an image and a txt file associated with it. The txt file contains a list of annotations in Yolo format: label_index relative_x relative_y relative_width relative_height.

The .names file contains the list of labels, where the index corresponds to the label_index in the annotation .txt files.

The config.json file contains a JSON object with the color for every label.

.
+-- train.txt
+-- valid.txt
+-- test.txt
+-- 1.jpg
+-- 1.txt
+-- 3.jpg
+-- 3.txt
+-- ...
+-- dataset_version_id.names
+-- config.json

A little more detail on the config.json file:

{ 
    "colors": { 
        "label_name1": "#36dfd4" , 
        "label_name2": "#f0699e" 
    }
}

Tabular #

format: CSV

Tabular datasets are imported from .csv files accompanied by a config.json file that provides more details on how the data should be interpreted.

.
+-- dataset.csv
+-- config.json
Config file #

A little more detail about the config.json file:

{
    "multi_label": false,
    "label_column": "labels",
    "split_column": "split",
    "label_separator": " ",
    "filename": "dataset.csv",
    "csv_separator": ","
}
| Parameter | Type | Description |
| --- | --- | --- |
| multi_label | bool | Is the dataset multi label? |
| label_column | str | The column name that contains the labels |
| split_column | str | The column name that contains the name of the split the row belongs to |
| label_separator | str | If multi_label, use this separator to split the labels |
| filename | str | The name of the .csv file that contains the data |
| csv_separator | str | Use this separator to split each row into columns |

Named entity recognition #

format: SPACY_NER

For an named entity recognition dataset with splits:

  • train
  • valid
  • test

the zip file should be structured in the following way:

.
+-- train.json
+-- valid.json
+-- test.json
+-- config.json
config.json #

The config.json file contains a list of dataset splits, as well as a color code for every label.

{
    "splits": [
        "train",
        "valid",
        "test"
    ],
    "colors": {
        "label_name": "#82ebfd",
        "label_name2": "#e95211"
    }
}
split_name.json #

For every dataset split, there is a ‘split_name’.json file with the following structure:

[{
    "id": "the_dataset_item_id",
    "name": "the_original_filename" ,
    "text": "The textual content of the file that has been annotated.",
    "annotations": [{
        "start": 4,
        "end": 11,
        "label": "label_name",
    },
     {
         ...
     }
    ]
},
{
    ...
}
]
| Parameter | Type | Description |
| --- | --- | --- |
| multi_label | bool | Is the dataset multi label? |
| dataset_id | str | The dataset id |
| dataset_version_id | str | The dataset version id |
| folder | str | The folder that contains the .zip file. Default value: “data” |
| filename | str | The name of the upload file. Default value: “dataset.zip” |
| format | DatasetFormat | The format of your dataset. Supported import formats: FOLDERS, YOLO, CSV, NER |

Jobs #

Get all jobs #

jobs = client.get_jobs(
    application_id="",
    states=[JobStatus.WAITING, JobStatus.STARTED, JobStatus.FINISHED, JobStatus.ERROR],
    job_types=[JobType.TRAINING]
)
| Parameter | Type | Description |
| --- | --- | --- |
| application_id | str | Only return jobs for this application id |
| states | List[JobStatus] | Only return jobs for the listed job states |
| job_types | List[JobType] | Only return jobs for the listed job types |

Create a job #

Create a training job:

my_job = Job(
    name= "Train a new model for cats and dogs",
    description= "",
    job_type= JobType.TRAINING,
    application_id= "acf26cf4-e19f-425e-b5cb-031830a46df4", # See Applications to get the correct application_id for your job
    dataset_id= "8037b73a-5512-4a45-89e2-29761771fff6", # Update to your dataset_id
    dataset_version_id= "1d3bf8d6-e39b-498e-9c08-680d2f8a3c47", # Update to your dataset_version_id
    items= [
        JobItem(
            name= "image_size",
            value= "224",
            value_type= ValueType.INT,
            label= "Image Size"
        ),
        JobItem(
            name= "arch",
            value= "resnet50",
            value_type= ValueType.TEXT,
            label= "Architecture"
        ),
        JobItem(
            name= "batch_size",
            value= "50",
            value_type= ValueType.INT,
            label= "Batch size"
        )
    ]
)

my_job = client.create_job(my_job)
| Parameter | Type | Description |
| --- | --- | --- |
| job | Job | The entire Job entity |

Job properties in detail.

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the job |
| name | str | Name of your job |
| description | str | Describe the job you are running |
| job_type | JobType | The type of job: TRAINING, VALIDATION, CONVERSION. |
| application_id | str | The application id of the job (see applications). |
| status | JobStatus | The status of your job: WAITING, STARTED, FINISHED, ERROR. |
| status_message | str | More information about the status of your job; gets updated by the agent handling the job. |
| user_id | str | The id of the user that requested the job. |
| cpu_start_time | str | The CPU compute starting time. |
| cpu_end_time | str | The CPU compute end time. |
| gpu_start_time | str | The GPU compute starting time. |
| gpu_end_time | str | The GPU compute end time. |
| agent_name | str | The agent responsible for handling the job. |
| dataset_id | str | The id of the dataset being used. |
| dataset_version_id | str | The id of the dataset version being used. |
| model_id | str | The id of the model the finished job will be added to. If left blank upon job creation, a new model will be created, and its id will be updated in this property. |
| model_version_id | str | The id of the model version the job resulted in. Leave blank upon job creation. The property will/should be updated after uploading the model. |
| start_model_id | str | The model id used to continue training from. (requires continual_training support in the specified application_id) |
| start_model_version_id | str | The model version id used to continue training from. (requires continual_training support in the specified application_id) |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| items | List[JobItem] | A list of job specific steps and settings. |

JobItem in more detail:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the job item |
| name | str | Name of your job |
| description | str | Describe the job you are running |
| job_id | str | The job id it belongs to |
| value | str | The item value |
| default_value | str | The item default value |
| value_type | ValueType | The type of the item value: INT, FLOAT, TEXT, MULTI, BOOL, STRING_ARRAY |
| label | str | The label for the item |
| created_at | str | The creation date |
| updated_at | str | Last updated date |

Supported job items per applications #

Every job can be configured depending on the AI application you are using. Below we list all the possibilities per AI application.

Image classification #

Architecture:

job_item = JobItem(
    name= "arch",
    value= "resnet50",
    value_type= ValueType.TEXT,
    label= "Architecture"
)

Image size:

job_item = JobItem(
    name= "image_size",
    value= "224",
    value_type= ValueType.INT,
    label= "Image Size"
)

Image resize:

job_item = JobItem(
    name= "image_resize",
    value= "460",
    value_type= ValueType.INT,
    label= "Image Resize"
)

Batch size:

job_item = JobItem(
    name= "batch_size",
    value= "32",
    value_type= ValueType.INT,
    label= "Batch Size"
)

Number of epochs:

job_item = JobItem(
    name= "nb_epochs",
    value= "50",
    value_type= ValueType.INT,
    label= "Number of epochs"
)

Mix up %:

job_item = JobItem(
    name= "mixup",
    value= "0",
    value_type= ValueType.INT,
    label= "Mix up %"
)

Seed:

job_item = JobItem(
    name= "seed",
    value= "12",
    value_type= ValueType.INT,
    label= "Seed"
)

Object detection #

Architecture:

job_item = JobItem(
    name= "arch",
    value= "yolov4-tiny",
    value_type= ValueType.TEXT,
    label= "Architecture"
)

Image size:

job_item = JobItem(
    name= "image_size",
    value= "416",
    value_type= ValueType.INT,
    label= "Image Size"
)

Batch size:

job_item = JobItem(
    name= "batch_size",
    value= "64",
    value_type= ValueType.INT,
    label= "Batch Size"
)

Subdivisions:

job_item = JobItem(
    name= "subdivisions",
    value= "8",
    value_type= ValueType.INT,
    label= "Subdivisions"
)

Named entity recognition (NER) #

Language:

job_item = JobItem(
    name= "language",
    value= "en",
    value_type= ValueType.TEXT,
    label= "Language"
)

Optimize for:

job_item = JobItem(
    name= "optimize",
    value= "accuracy",
    value_type= ValueType.MULTI,
    label= "Optimize for"
)

Tabular #

Fast.ai Tabular #

Architecture:

job_item = JobItem(
    name= "arch",
    value= "[200 - 100]",
    value_type= ValueType.TEXT,
    label= "Architecture"
)

Batch size:

job_item = JobItem(
    name "batch_size",
    value= "32",
    value_type= ValueType.INT,
    label= "Batch Size"
)

Number of epochs:

job_item = JobItem(
    name= "nb_epochs",
    value= "50",
    value_type= ValueType.INT,
    label= "Number of epochs"
)

Minimal feature importance (%):

job_item = JobItem(
    name= "min_feat_importance",
    value= "0",
    value_type= ValueType.INT,
    label= "Minimal feature importance (%)"
)

Add info:

job_item = JobItem(
    name= "add_info",
    value= True,
    value_type= ValueType.BOOL,
    label= "Add info"
)
XGBoost #

Booster:

job_item = JobItem(
    name= "booster",
    value= "gbtree", # "gbtree", "dart", "gblinear"
    value_type= ValueType.MULTI,
    label= "Booster"
)

Nb Estimators:

job_item = JobItem(
    name= "nb_estimators",
    value= "100",
    value_type= ValueType.INT,
    label= "Nb Estimators"
)

Max depth:

job_item = JobItem(
    name= "max_depth",
    value= "6",
    value_type= ValueType.INT,
    label= "Number of epochs"
)

Learning rate (eta):

job_item = JobItem(
    name= "learning_rate",
    value= "0.3",
    value_type= ValueType.FLOAT,
    label= "Learning rate (eta)"
)

Subsample (0 - 1):

job_item = JobItem(
    name= "subsample",
    value= "0.5",
    value_type= ValueType.FLOAT,
    label= "Subsample (0 - 1)"
)

Minimal feature importance (%):

job_item = JobItem(
    name= "min_feat_importance",
    value= "0",
    value_type= ValueType.INT,
    label= "Minimal feature importance (%)"
)

Add info:

job_item = JobItem(
    name= "add_info",
    value= True,
    value_type= ValueType.BOOL,
    label= "Add info"
)
CatBoost #

Nb Estimators:

job_item = JobItem(
    name= "nb_estimators",
    value= "100",
    value_type= ValueType.INT,
    label= "Nb Estimators"
)

Learning rate:

job_item = JobItem(
    name= "learning_rate",
    value= "0.1",
    value_type= ValueType.FLOAT,
    label= "Learning rate"
)

Minimal feature importance (%):

job_item = JobItem(
    name= "min_feat_importance",
    value= "0",
    value_type= ValueType.INT,
    label= "Minimal feature importance (%)"
)

Add info:

job_item = JobItem(
    name= "add_info",
    value= True,
    value_type= ValueType.BOOL,
    label= "Add info"
)
LightGBM #

Boosting Type:

job_item = JobItem(
    name= "booster",
    value= "gbdt", # "gbdt", "dart", "rf"
    value_type= ValueType.MULTI,
    label= "Boosting Type"
)

Nb Estimators:

job_item = JobItem(
    name= "nb_estimators",
    value= "100",
    value_type= ValueType.INT,
    label= "Nb Estimators"
)

Max depth:

job_item = JobItem(
    name= "max_depth",
    value= "-1",
    value_type= ValueType.INT,
    label= "Number of epochs"
)

Learning rate:

job_item = JobItem(
    name= "learning_rate",
    value= "0.1",
    value_type= ValueType.FLOAT,
    label= "Learning rate"
)

Max number of leaves:

job_item = JobItem(
    name= "num_leaves",
    value= "31",
    value_type= ValueType.INT,
    label= "Max number of leaves"
)

Minimal feature importance (%):

job_item = JobItem(
    name= "min_feat_importance",
    value= "0",
    value_type= ValueType.INT,
    label= "Minimal feature importance (%)"
)

Add info:

job_item = JobItem(
    name= "add_info",
    value= True,
    value_type= ValueType.BOOL,
    label= "Add info"
)

Get a job #

my_job = client.get_job(my_job.id)
| Parameter | Type | Description |
| --- | --- | --- |
| job_id | str | The job id |
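
For example, a small sketch that polls a job until it reaches a final state, using the JobStatus values listed above (the sleep interval is arbitrary):

import time

my_job = client.get_job(my_job.id)

# Keep polling until the job has finished or failed
while my_job.status not in [JobStatus.FINISHED, JobStatus.ERROR]:
    time.sleep(30)
    my_job = client.get_job(my_job.id)

print(my_job.status, my_job.status_message)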

Update a job #

my_job.description = "Update any property"

my_job = client.update_job(my_job)
| Parameter | Type | Description |
| --- | --- | --- |
| job | Job | The entire job object |

Delete a job #

client.delete_job(my_job.id)
| Parameter | Type | Description |
| --- | --- | --- |
| job_id | str | The job id |

Applications #

SeeMe.ai supports multiple types of AI models, frameworks, and framework versions. To access, manage, and describe these, we use applications:

Get all supported applications #

Print a list of the applications in your SeeMe.ai client:

client.applications

Every application has the following properties:

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique id for the application |
| created_at | str | The creation date |
| updated_at | str | Last updated date |
| framework | str | The framework used to train the model |
| framework_version | str | The framework version used to train the model |
| base_framework | str | The base framework used by the framework |
| base_framework_version | str | The base framework version used by the framework |
| application | ApplicationType | The type of application: “image_classification”, “object_detection”, “text_classification”, “structured”. |
| inference_host | str | The internal host of the inference engine (if not used at the edge) |
| can_convert_to_onnx | bool | Models can automatically be converted to ONNX |
| can_convert_to_onnx_int8 | bool | Models can automatically be converted to ONNX int8 |
| can_convert_to_coreml | bool | Models can automatically be converted to Core ML |
| can_convert_to_tensorflow | bool | Models can automatically be converted to Tensorflow (mostly for further conversion, to Tensorflow Lite for example) |
| can_convert_to_tflite | bool | Models can automatically be converted to Tensorflow Lite |
| has_embedding_support | bool | The application supports embeddings |
| continual_training | bool | Continue training from your own previous model version. |
| has_labels_file | bool | A labels file is available |
| inference_extensions | str | The list of files with these extensions that need to be uploaded before the model can perform predictions. |
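
For example, to list every supported combination of AI application, framework, and version (a small sketch using the properties above):

for app in client.applications:
    print(app.application, app.framework, app.framework_version, app.base_framework, app.base_framework_version)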

You can update the local list of applications by:

client.update_applications()

Get the application id #

Before you can upload and use your model to make predictions, you need to add an application_id:

from torch import __version__ as torch_version
from fastai import __version__ as fastai_version

# Get the application_id for your framework (version).
application_id = client.get_application_id(
    base_framework=Framework.PYTORCH,
    framework=Framework.FASTAI,
    base_framework_version=torch_version,
    framework_version=fastai_version,
    application=ApplicationType.IMAGE_CLASSIFICATION
)

If your combination is not supported, you will get a NotImplementedError with the contact information of support@seeme.ai.

Feedback #

For questions, feedback, corrections: contact support@seeme.ai.