SeeMe.ai Python SDK #
The Python SDK is a convenient wrapper around the SeeMe.ai API, giving easy access to all of your datasets, models, jobs, … on the platform.
This document provides a detailed overview of all the methods and their parameters.
Get Started #
Installation #
Install the SDK from the command line:
$ pip install --upgrade seeme
or in your Jupyter notebook:
!pip install -Uq seeme
Verify the version you installed:
import seeme
seeme.__version__
Create a client #
Create a client to interact with the SeeMe.ai API, allowing you to manage models, datasets, predictions and jobs.
from seeme import Client
client = Client()
Parameters | Required | Description |
---|---|---|
username | No | username for the account you want to use |
apikey | No | API key for the username you want to use |
backend | No | backend the client communicates with, default value https://api.seeme.ai/api/v1 |
env_file | No | .env file containing the username , apikey , and backend , default value “.env” |
Register #
Register a new user:
my_username = "my_username"
my_email = "my_email@mydomain.com"
my_password = "supersecurepassword"
my_firstname = "firstname"
my_name = "last name"
client.register(
username=my_username,
email=my_email,
password=my_password,
firstname=my_firstname,
name=my_name
)
Log in #
Username / password #
Use your username and password to log in:
username = ""
password = ""
client.login(username, password)
Username / apikey #
If you have a username and apikey, you can create a client that is ready to go:
my_username = ""
my_apikey = ""
client = Client(username=my_username, apikey=my_apikey)
.env file #
Log in using an .env file:
client = Client(env_file=".my_env_file")
The .env
file should be located where the client is created and contain the following values:
Variables | Required | Description |
---|---|---|
SEEME_USERNAME | Yes | username for the account you want to use |
SEEME_APIKEY | Yes | API key for the username you want to use |
SEEME_BACKEND | Yes | backend the client communicates with, usually: https://api.seeme.ai/api/v1 |
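For example, a minimal .env file (the values below are placeholders) could look like this:
SEEME_USERNAME=my_username
SEEME_APIKEY=my-api-key
SEEME_BACKEND=https://api.seeme.ai/api/v1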
Log out #
client.logout()
Advanced #
Custom backend #
If you are running your own SeeMe.ai deployment, you can pass in the url you want the SeeMe.ai client to communicate with:
alternative_url = "http://seeme.yourdomain.com/api/v1"
client = Client(backend=alternative_url)
Models #
Get all models #
Get a list of all models you have access to.
models = client.get_models()
These models are divided into three groups:
- public: models that are available to everyone;
- owned: models you created;
- shared: models that are shared privately with you.
Public models
Public models are provided by SeeMe.ai or other users.
public_models = [model for model in models if model["public"]]
public_models
Your own models
A list of the models you created:
own_models = [ model for model in models if model["user_id"] == client.user_id]
Shared models
A list of models that others have shared with you in private.
shared_with_me = [model for model in models if model["shared_with_me"]]
Create a model #
application_id = client.get_application_id("pytorch", "fastai", "2.0.0", "2.7.12")
application_id
my_model = {
"name": "Cats and dogs",
"description": "Recognize cats and dogs in pictures.",
"privacy_enabled": False,
"auto_convert": True,
"application_id": application_id
}
my_model = client.create_model(my_model)
Every model has the following properties:
{
'id': '22791060-71f5-4e71-935e-28b52ef4c047',
'created_at': '2021-10-28T08:49:06.698391719Z',
'updated_at': '2021-10-28T08:49:06.698391719Z',
'name': 'Cats and dogs',
'description': 'Recognize cats and dogs in pictures.',
'active_version_id': 'fc92b1fe-2ee0-4d99-9a09-0a37d0c68ea1',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'can_inference': False,
'kind': '',
'has_logo': False,
'logo': '',
'public': False,
'config': '',
'application_id': '',
'has_ml_model': False,
'has_onnx_model': False,
'has_tflite_model': False,
'has_labels_file': False,
'shared_with_me': False,
'auto_convert': True,
'privacy_enabled': False
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the model |
created_at | The creation date |
updated_at | Last updated date |
name | The model name |
description | The model description |
accuracy | The model accuracy (deprecated) |
user_id | The user id of the model creator |
can_inference | Flag indicating whether the model can make predictions or not |
kind | Type of AI model, possible values: “image_classification”, “object_detection”, “text_classification”, “structured”, “language_model” |
has_logo | Flag indicating whether the model has a logo or not |
logo | Name and extension of the logo file (mostly for internal purpose) |
public | Flag indicating whether the model is public or not |
config | Additional config stored in a JSON string |
active_version_id | The id of the current model version (see versions below) |
application_id | The application id (see applications) |
has_ml_model | Flag indicating whether the model has a Core ML model |
has_onnx_model | Flag indicating whether the model has an ONNX model |
has_tflite_model | Flag indicating whether the model has a Tensorflow Lite model |
has_labels_file | Flag indicating whether a file with all the labels (classes) is available |
shared_with_me | Flag indicating whether the model has been shared with you |
auto_convert | Flag indicating whether the model will be automatically converted to the supported model formats (see applications). Default value: True . |
privacy_enabled | Flag indicating whether privacy is enabled. If set to ‘True’, no inputs (images, text files, …) will be stored on the server, or the mobile/edge device. Default value: False . |
# DEPRECATED
# just calls `create_model` underneath
my_model = client.create_full_model(my_model)
Get a model #
Use the model id to get all the metadata of the model:
client.get_model(my_model["id"])
Parameters | Description |
---|---|
model_id | Unique id for the model |
Update a model #
Update any property of the model:
my_model = client.get_model(my_model["id"])
my_model["description"] = "Updated for documentation purposes"
client.update_model(my_model)
Parameters | Description |
---|---|
model | The entire model object |
Delete a model #
Delete a model using its id.
client.delete_model(my_model["id"])
Parameters | Description |
---|---|
model_id | Unique id for the model |
Upload a model file #
You can upload the model file by calling upload_model. Make sure the application_id is set to the desired AI application, framework, and version.
my_model = client.upload_model(my_model["id"], folder="directory/to/model", filename="your_exported_model_file.pkl")
Parameters | Description |
---|---|
model_id | Unique id for the model |
folder | Name of the folder that contains the model file (without trailing ‘/’), default value “data” |
filename | Name of the file to be uploaded, default value “export.pkl” |
This returns an updated my_model; if the upload was successful, can_inference should now be set to True.
If auto_convert is enabled, and depending on the application_id used (see Applications below), the model will also have updated values for has_ml_model, has_tflite_model, has_onnx_model, and has_labels_file.
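A quick way to verify the upload and the automatic conversions, using only the flags documented above:
print(my_model["can_inference"])  # True once the model can make predictions
print(my_model["has_onnx_model"], my_model["has_tflite_model"], my_model["has_ml_model"], my_model["has_labels_file"])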
Download model file(s) #
Download the file(s) associated with the current active (production) model.
client.download_active_model(my_model, asset_type="pkl", download_folder="my_docs")
Parameters | Description |
---|---|
model | The entire model object |
asset_type | The model type you want to download. Default: pkl ; Possible values: mlmodel , tflite , onnx , names , labels |
download_folder | The folder where you would like to download the model. Default: . (i.e. current directory) |
If the asset_type exists, the model file will be downloaded to my_model["active_model_id"].{asset_type}. One exception: the labels file will receive a .txt extension.
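To fetch several formats in one go, you can loop over the asset types listed above (a small sketch; only the formats that were actually converted will be available on the server):
for asset_type in ["pkl", "onnx", "tflite", "labels"]:
    client.download_active_model(my_model, asset_type=asset_type, download_folder="my_docs")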
Make a prediction #
result = client.predict(model_id, item, input_type="image_classification")
Parameters | Description |
---|---|
model_id | Unique id for the model |
item | The item you wish to make a prediction on: For "image_classification" and "object_detection" specify the full file location (including directory). For "text_classification" pass in the string you would like to predict. For "structured" pass in a JSON dump of the JSON object you want to use for your prediction. For "language_model" pass in the initial seed to generate text. |
input_type | The type of prediction you want to make. Default value: "image_classification"; Possible values: "image_classification", "object_detection", "text_classification", "structured", "ner", "ocr", "language_model". |
Image classification #
On an image classification model:
item = "path/to/file.png"
result = client.predict(my_model["id"], item)
# Or
# result = client.predict(my_model["id"], item, input_type="image_classification")
Object detection #
On an object detection model:
!ls
item = "path/to/file.png"
result = client.predict(my_model["id"], item, input_type="object_detection")
Text classification #
On a text classification model:
item = "The text you want to classify."
result = client.predict(my_model["id"], item, input_type="text_classification")
Named entity recognition #
On a text named entity recognition model:
item = "The text where I want to extract information out of."
result = client.predict(my_model["id"], item, input_type="ner")
Language model #
On a language model:
item = "The story of AI language models is"
result = client.predict(my_model["id"], item, input_type="language_model")
Tabular #
On a structured/tabular data model:
import json
inputs = {
"temperature": "30",
"day": "Tuesday"
}
item = json.dumps(inputs)
result = client.predict(my_model["id"], item, input_type="structured")
An example reply for a prediction:
{
'id': '891ed37a-fe77-415c-a4ab-3764e68aaa40',
'created_at': '2021-10-28T12:33:17.277236872Z',
'update_at': '2021-10-28T12:33:17.563600762Z',
'name': '10789.jpg',
'description': '',
'prediction': 'cats',
'confidence': 0.9846015,
'model_id': '56086c08-c552-4159-aafa-ac4b25a64fda',
'model_version_id': '1a38b7ab-070f-4686-9a47-4ba7a76a5168',
'extension': 'jpg',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'error_reported': False,
'error': '',
'application_id': 'b4b9aaf0-cb37-4629-8f9b-8877aeb09a53',
'inference_host': 'image-pt-1-8-1-fa-2-3-1',
'inference_time': '284.418389ms',
'end_to_end_time': '',
'dataset_item_id': '',
'result': '',
'inference_items': None,
'hidden': False,
'privacy_enabled': False
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the model |
created_at | The creation date |
updated_at | Last updated date |
name | Stores the input value/filename. For "image_classification" and "object_detection" the original filename; for "text_classification" and "structured" the actual text/structured input. |
description | Additional info. (Optional) |
prediction | The model prediction or value |
confidence | The prediction confidence |
model_id | The id of the model the prediction was made on. |
model_version_id | The id of the model_version the prediction was made on (see 'Model Versions' below). |
extension | The extension of the predicted image, in case of “image_classification” and “object_detection” |
user_id | The id of the user that requested the prediction |
error_reported | Flag indicating whether a user has reported the prediction is/might be wrong. |
error | Contains the error if something went wrong. |
application_id | The application_id used to make the prediction. |
inference_host | The name of the inference engine used to make the prediction. |
inference_time | The time it took to make the prediction. |
end_to_end_time | Inference time including upload and return (if relevant) |
dataset_item_id | The id of the dataset_item that was used for the prediction. Used to evaluate datasets (see Datasets below) |
result | A string version of the object detection prediction (legacy). |
inference_items | A list of individual predictions, when using “object_detection”. |
hidden | Flag indicating whether this prediction has been hidden in the Data Engine (TODO: link to Data Engine docs) |
privacy_enabled | Flag indicating whether this prediction was made when the model was in privacy_enabled mode. |
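Reading a result comes down to accessing the dictionary fields listed above, for example:
print(result["prediction"], result["confidence"])
# For "object_detection", the individual detections are returned in result["inference_items"]
# (their exact structure is not documented here).
if result["inference_items"]:
    for detection in result["inference_items"]:
        print(detection)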
# DEPRECATED
# Used internally by `predict` but will be removed.
result = client.inference(my_model["id"], item, input_type="image_classification")
Upload model logo #
my_model = client.upload_logo(my_model["id"], folder="directory/to/logo", filename="logo_filename.jpg")
Parameters | Description |
---|---|
model_id | Unique id for the model |
folder | Name of the folder that contains the logo file (without trailing ‘/’), default value “data” |
filename | Name of the file to be uploaded, default value “logo.jpg”. Supported formats: jpg , jpeg , png . |
Download model logo #
client.get_logo(my_model)
Parameters | Description |
---|---|
model | The entire model object |
Model Versions #
An AI Model has one or multiple versions associated with it:
- the current live version
- previous versions
- future versions
Get all model versions #
Get a list of all versions for a specific model.
versions = client.get_model_versions(my_model["id"])
Parameters | Description |
---|---|
model_id | The model id |
Create a model version #
new_version = {
"name": "A higher accuracy achieved",
"application_id": "b4b9aaf0-cb37-4629-8f9b-8877aeb09a53"
}
new_version = client.create_model_version(my_model["id"], new_version)
Parameters | Description |
---|---|
model_id | The model id |
version | The model version object |
Every model version has the following properties. Note that these are partially similar to the model properties:
{
'id': '53535e2f-4b36-4235-b8d6-ed87a5d2f323',
'created_at': '2021-10-29T14:16:46.603463402Z',
'update_at': '2021-10-29T14:16:46.603463402Z',
'name': 'A higher accuracy achieved',
'description': '',
'model_id': '22791060-71f5-4e71-935e-28b52ef4c047',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'can_inference': False,
'has_logo': False,
'logo': '',
'config': '',
'application_id': '878aea66-16b7-4f10-9d82-a2a92a35728a',
'version': '',
'version_number': 4,
'has_ml_model': False,
'has_onnx_model': False,
'has_tflite_model': False,
'has_labels_file': False,
'dataset_version_id': '',
'job_id': ''
}
Properties in more detail.
Shared with the model entity:
Property | Description |
---|---|
id | Unique id for the model version |
created_at | The creation date |
updated_at | Last updated date |
name | The model version name |
description | The model version description |
accuracy | The model version accuracy |
user_id | The user id of the model version creator |
can_inference | Flag indicating whether the model version can make predictions or not |
has_logo | Flag indicating whether the model has a logo or not (not used for now) |
logo | Name and extension of the logo file (mostly for internal purpose) |
config | Additional config stored in a JSON string |
application_id | The application ID (see applications below) |
has_ml_model | Flag indicating whether the model has a Core ML model |
has_onnx_model | Flag indicating whether the model has an ONNX model |
has_tflite_model | Flag indicating whether the model has a Tensorflow Lite model |
has_labels_file | Flag indicating whether a file with all the labels (classes) is available |
Different from the model entity:
Property | Description |
---|---|
model_id | The id of the model this version belongs to. |
version | The label of the version |
version_number | Automatically incrementing number of the version. |
dataset_version_id | The id of the dataset version this model version was trained on. |
job_id | The id of the job used to build this model version. |
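To promote a newly uploaded version to be the live one, one approach is to point the model's active_version_id (see the model properties above) at that version and save the model. This is a sketch based on those documented fields; it assumes the backend accepts updating active_version_id via update_model:
my_model = client.get_model(my_model["id"])
my_model["active_version_id"] = new_version["id"]
my_model = client.update_model(my_model)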
Get model version #
Use the model and version id to get the full model version:
model_version = client.get_model_version(my_model["id"], new_version["id"])
Parameters | Description |
---|---|
model_id | The model id |
version_id | The model version id |
Update model version #
Update any property of the model version:
model_version["description"] = "We hit SOTA!"
client.update_model_version(model_version)
Parameters | Description |
---|---|
model version | The entire model version object |
Upload model file for a version #
Upload a model file (or model files) for a new version of your AI model.
Make sure the application_id is set to the desired AI application, framework, and version.
client.upload_model_version(new_version, folder="directory/to/model", filename="your_exported_model_file_v2.pkl")
Parameters | Description |
---|---|
version | The entire model version object |
folder | Name of the folder that contains the model file (without trailing ‘/’), default value “data” |
filename | Name of the file to be uploaded, default value “export.pkl” |
Download a specific version #
client.download_model(my_version, asset_type="pkl", download_folder="data")
Parameters | Description |
---|---|
version | The entire model version object |
asset_type | The model type you want to download. Default: pkl ; Possible values: pkl , mlmodel , tflite , onnx , names , labels |
download_folder | The folder where you would like to download the model. Default: . (i.e. current directory) |
If the asset_type exists, the model file will be downloaded to my_model["active_model_id"].{asset_type}. One exception: the labels file will receive a .txt extension.
# DEPRECATED
client.download_version(my_version, asset_type="pkl")
Make a prediction on this version #
client.version_inference(model_version_id, item, input_type="image_classification")
Parameters | Description |
---|---|
model_version_id | Unique id for the model version |
item | The item you wish to make a prediction on: For "image_classification" and "object_detection" specify the full file location (including directory). For "text_classification" pass in the string you would like to predict. For "structured" pass in a JSON dump of the JSON object you want to use for your prediction. |
input_type | The type of prediction you want to make. Default value: "image_classification"; Possible values: "image_classification", "object_detection", "text_classification", "structured". |
Image classification #
On an image classification model version:
item = "path/to/file.png"
result = client.version_inference(new_version["id"], item)
# Or
# result = client.version_inference(new_version["id"], item, input_type="image_classification")
Object detection #
On an object detection model version:
item = "path/to/file.png"
result = client.version_inference(new_version["id"], item, input_type="object_detection")
Text classification #
On a text classification model version:
item = "The text you want to classify."
result = client.version_inference(new_version["id"], item, input_type="text_classification")
Tabular #
On a structured/tabular data model version:
import json
inputs = {
"temperature": "30",
"day": "Tuesday"
}
item = json.dumps(inputs)
result = client.version_inference(new_version["id"], item, input_type="structured")
An example reply for a prediction:
{
'id': '891ed37a-fe77-415c-a4ab-3764e68aaa40',
'created_at': '2021-10-28T12:33:17.277236872Z',
'update_at': '2021-10-28T12:33:17.563600762Z',
'name': '10789.jpg',
'description': '',
'prediction': 'cats',
'confidence': 0.9846015,
'model_id': '56086c08-c552-4159-aafa-ac4b25a64fda',
'model_version_id': '1a38b7ab-070f-4686-9a47-4ba7a76a5168',
'extension': 'jpg',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'error_reported': False,
'error': '',
'application_id': 'b4b9aaf0-cb37-4629-8f9b-8877aeb09a53',
'inference_host': 'image-pt-1-8-1-fa-2-3-1',
'inference_time': '284.418389ms',
'end_to_end_time': '',
'dataset_item_id': '',
'result': '',
'inference_items': None,
'hidden': False,
'privacy_enabled': False
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the model |
created_at | The creation date |
updated_at | Last updated date |
name | Stores the input. For "image_classification" and "object_detection" the original filename; for "text_classification" and "structured" the actual text/structured input. |
description | Additional info. (Optional) |
prediction | The model prediction or value |
confidence | The prediction confidence |
model_id | The id of the model the prediction was made on. |
model_version_id | The id of the model_version the prediction was made on (see 'Model Versions' below). |
extension | The extension of the predicted image, in case of “image_classification” and “object_detection” |
user_id | The id of the user that requested the prediction |
error_reported | Flag indicating whether a user has reported the prediction is/might be wrong. |
error | Contains the error if something went wrong. |
application_id | The application_id used to make the prediction. |
inference_host | The name of the inference engine used to make the prediction. |
inference_time | The time it took to make the prediction. |
end_to_end_time | Inference time including upload and return (if relevant) |
dataset_item_id | The id of the dataset_item that was used for the prediction. Used to evaluate datasets (see Datasets below) |
result | A string version of the object detection prediction (legacy). |
inference_items | A list of individual predictions, when using “object_detection”. |
hidden | Flag indicating whether this prediction has been hidden in the Data Engine (TODO: link to Data Engine docs) |
privacy_enabled | Flag indicating whether this prediction was made when the model was in privacy_enabled mode. |
# DEPRECATED
# Used internally by `version_inference` but will be removed.
result = client.version_p(my_model["id"], item, input_type="image_classification")
Delete model version #
Delete a model version:
client.delete_model_version(my_model["id"], new_version["id"])
Parameters | Description |
---|---|
model_id | The model id |
version_id | The model version id |
Datasets #
Get all datasets #
Get a list of all your datasets:
datasets = client.get_datasets()
The get_datasets() method does not take any parameters.
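As with models, you can filter the returned list client-side, for example to keep only the datasets you created:
own_datasets = [dataset for dataset in datasets if dataset["user_id"] == client.user_id]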
Create a dataset #
from seeme import DATASET_CONTENT_TYPE_IMAGES
my_dataset = {
"name": "Cats & dogs dataset",
"description": "A dataset with labelled images of cats and dogs.",
"multi_label": False,
"notes": "Cats and dogs is often used as a demo dataset.",
"default_splits": True,
"content_type": DATASET_CONTENT_TYPE_IMAGES
}
my_dataset = client.create_dataset(my_dataset)
Parameters | Description |
---|---|
dataset | The entire dataset object |
Every dataset has the following properties:
{
'id': '2bb8b9c1-c027-44c2-b438-6d50911964bd',
'created_at': '2021-11-02T10:52:56.386712086Z',
'update_at': '2021-11-02T10:52:56.386712086Z',
'deleted_at': None,
'name': 'Cats & dogs dataset',
'description': 'A dataset with labelled images of cats and dogs.',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'notes': 'Cats and dogs is often used as a demo dataset.',
'versions': [],
'multi_label': False,
'default_splits': True,
'has_logo': False,
'logo': '',
'content_type': 'images'
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the dataset |
created_at | The creation date |
updated_at | Last updated date |
name | The dataset name |
description | The dataset description |
user_id | The unique id of the dataset creator |
notes | More elaborate notes about the dataset |
versions | A list of all the versions of the dataset (see below) |
multi_label | Flag indicating whether items can have multiple labels |
default_splits | Create default splits (“train”, “valid”, “test”) when creating the dataset. |
has_logo | Flag indicating whether the dataset has a logo or not |
logo | Name and extension of the logo file (mostly for internal purpose) |
content_type | Type of items in the dataset. Possible values DATASET_CONTENT_TYPE_IMAGES , DATASET_CONTENT_TYPE_TEXT , DATASET_CONTENT_TYPE_TABULAR , DATASET_CONTENT_TYPE_NER . |
Get dataset #
my_dataset = client.get_dataset(my_dataset["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
Update dataset #
my_dataset["notes"] += "~25k labelled images of cats and dogs; 22500 for training, 2000 for validation."
client.update_dataset(my_dataset)
Parameters | Description |
---|---|
dataset | The entire dataset object |
Delete dataset #
client.delete_dataset(my_dataset)
Parameters | Description |
---|---|
dataset | The entire dataset object |
Upload dataset logo #
my_dataset = client.upload_dataset_logo(my_dataset["id"], folder="directory/to/logo", filename="logo_filename.jpg")
Parameters | Description |
---|---|
dataset_id | Unique id for the dataset |
folder | Name of the folder that contains the logo file (without trailing ‘/’), default value “data” |
filename | Name of the file to be uploaded, default value “logo.jpg”. Supported formats: jpg , jpeg , png . |
Download dataset logo #
client.get_dataset_logo(my_dataset)
Parameters | Description |
---|---|
dataset | The entire dataset object |
Dataset Versions #
A dataset can have multiple versions.
Get all dataset versions #
dataset_versions = client.get_dataset_versions(my_dataset["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
Create dataset version #
new_dataset_version = {
"name": "v2",
"description": "Even more images of dogs and cats"
}
new_dataset_version = client.create_dataset_version(my_dataset["id"], new_dataset_version)
new_dataset_version
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version | The dataset version object |
Every dataset_version has the following properties:
{
'id': 'd774d5bd-40fd-4c5f-8043-f1a8ed88e0d2',
'created_at': '2021-11-02T13:17:29.24027517Z',
'update_at': '2021-11-02T13:17:29.24027517Z',
'name': 'v2',
'description': 'Even more images of dogs and cats',
'labels': None,
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'dataset_id': '2bb8b9c1-c027-44c2-b438-6d50911964bd',
'splits': None,
'default_split': '',
'config': ''
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the dataset |
created_at | The creation date |
updated_at | Last updated date |
name | The dataset version name |
description | The dataset version description |
user_id | The unique id of the dataset creator |
labels | A list of the labels in this version |
dataset_id | The id of the dataset this version belongs to |
splits | A list of splits in this dataset version |
default_split | The id of the split that will be shown by default |
config | Version specific configuration |
Get a dataset version #
dataset_version = client.get_dataset_version(new_dataset_version["dataset_id"], new_dataset_version["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
Update a dataset version #
new_dataset_version["description"] = "Even more image of cats and dogs."
client.update_dataset_version(my_dataset["id"], new_dataset_version)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version | The dataset version object |
Delete a dataset version #
client.delete_dataset_version(my_dataset["id"], new_dataset_version)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version | The dataset version object |
Dataset Splits #
A dataset version can have multiple splits, usually separating training, validation and test data.
Get all splits #
splits = client.get_dataset_splits(my_dataset["id"], new_dataset_version["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
Create a split #
new_split = {
"name": "train",
"description": "training data for our model to learn from"
}
new_split = client.create_dataset_split(my_dataset["id"], new_dataset_version["id"], new_split)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
split | The split object |
Every dataset split has the following properties:
{
'id': '3faa6ddb-f5d6-4dda-92a9-beb383126072',
'created_at': '2021-11-02T14:02:15.303555455Z',
'updated_at': '2021-11-02T14:02:15.303555455Z',
'name': 'train',
'description': 'training data for our model to learn from',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'version_id': 'd774d5bd-40fd-4c5f-8043-f1a8ed88e0d2'
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the dataset split |
created_at | The creation date |
updated_at | Last updated date |
name | The dataset split name |
description | The dataset split description |
user_id | The unique id of the dataset split creator |
version_id | The unique id of the dataset version the split belongs to |
Get a split #
my_split = client.get_dataset_split(my_dataset["id"], new_dataset_version["id"], new_split["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
split_id | The split id |
Update a split #
my_split["description"] = "Training data"
client.update_dataset_split(my_dataset["id"], new_dataset_version["id"], my_split)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
split | The split object |
Delete a split #
client.delete_dataset_split(my_dataset["id"], new_dataset_version["id"], my_split)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
split | The split object |
Dataset Labels #
A dataset version can have multiple labels.
Get all labels #
labels = client.get_dataset_labels(my_dataset["id"], new_dataset_version["id"])
labels
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
Create a label #
new_label = {
"name": "Birds",
"description": "Adding birds to the mix"
}
new_label = client.create_dataset_label(my_dataset["id"], new_dataset_version["id"], new_label)
new_label
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
label | The label object |
Every dataset label has the following properties:
{
'id': 'ea3be37d-ce48-44bc-4435-39d88a47a2d6',
'created_at': '2021-11-02T14:24:06.187758716Z',
'updated_at': '2021-11-02T14:24:06.187758716Z',
'name': 'Birds',
'description': 'Adding birds to the mix',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'version_id': 'd774d5bd-40fd-4c5f-8043-f1a8ed88e0d2',
'color': '',
'index': 2
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the dataset label |
created_at | The creation date |
updated_at | Last updated date |
name | The dataset label name |
description | The dataset label description |
user_id | The unique id of the dataset label creator |
version_id | The unique id of the dataset version the label belongs to |
color | The hex code for the color to be used/associated with this label |
index | Index used to keep the labels in a consistent order (handled automatically) |
Get a label #
my_label = client.get_dataset_label(my_dataset["id"], new_dataset_version["id"], new_label["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
label_id | The label id |
Update a label #
my_label["color"] = "#00ff00"
my_label = client.update_dataset_label(my_dataset["id"], new_dataset_version["id"], my_label)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
label | The label object |
Delete a label #
client.delete_dataset_label(my_dataset["id"], new_dataset_version["id"], my_label)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
label | The label object |
Dataset Items #
A dataset version can contain many dataset items. Items are used to access and store the actual items in your dataset version such as images or text.
Get dataset items #
items = client.get_dataset_items(my_dataset["id"], new_dataset_version["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
params | Additional query parameters. Default value: None. Accepted parameters: pageSize (integer): number of items to return, default 10; pageCount (integer): the number of the page you want to view; onlyUnlabelled (boolean): if True, only return items that are not labelled/annotated; labelId (string): only return items labelled with this id; splitId (string): only return items that are part of this split. |
# All params are optional, but here we combine them together for demo purposes.
params = {
"onlyUnlabelled": True,
"pageSize": 25,
"pageCount": 0,
"labelId": new_label["id"],
"splitId": my_split["id"]
}
client.get_dataset_items(my_dataset["id"], new_dataset_version["id"], params)
Create a dataset item #
item = {
"name": "An optional name",
"splits": [my_split]
}
item = client.create_dataset_item(my_dataset["id"], new_dataset_version["id"], item)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
item | The dataset item object |
Every dataset item has the following properties:
{
'id': 'e92dcc17-cbc2-4a1e-a332-4c62a3232ede',
'created_at': '2021-11-02T16:07:15.88589873Z',
'updated_at': '2021-11-02T16:07:15.88589873Z',
'name': 'An optional name',
'description': '',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
'text': '',
'splits': [],
'annotations': None,
'extension': ''
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id for the dataset item |
created_at | The creation date |
updated_at | Last updated date |
name | An optional name |
description | The dataset item description |
user_id | The unique id of the dataset item creator |
splits | The list of dataset splits the dataset item belongs to |
annotations | The list of annotations for the dataset item |
extension | The extension for the dataset item. |
Get a dataset item #
item = client.get_dataset_item(my_dataset["id"], new_dataset_version["id"], item["id"])
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
item_id | The dataset item id |
Update a dataset item #
item["description"] = "A better description"
client.update_dataset_item(my_dataset["id"], new_dataset_version["id"], item)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
item | The dataset item object |
Delete a dataset item #
client.delete_dataset_item(my_dataset["id"], new_dataset_version["id"], item)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
item | The dataset item object |
Upload a dataset item image #
client.upload_dataset_item_image(my_dataset["id"], new_dataset_version["id"], item["id"], folder="directory/to/item", filename="item_filename.jpg")
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
item_id | The dataset item id |
folder | Name of the folder that contains the image (without trailing '/') |
filename | Name of the image file to be uploaded |
Download a dataset item image #
download_location = f"{item['id']}.{item['extension']}"
client.download_dataset_item_image(my_dataset["id"], new_dataset_version["id"], item["id"], download_location, thumbnail=False)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
item_id | The dataset item id |
download_location | The full location and filename of where to save the item image. |
thumbnail | Flag indicating whether to download the full image or its thumbnail. |
Annotations #
Annotations link your dataset items to one or more labels in your dataset.
Create an annotation #
Create an annotation with label_id, split_id, and item_id.
annotation = {
"label_id": my_label["id"],
"split_id": my_split["id"],
"item_id": item["id"]
}
annotation = client.annotate(my_dataset["id"], new_dataset_version["id"], annotation)
Every annotation has the following properties:
{
'id': 'e92dcc17-cbc2-4a1e-a332-4c62a3232ede',
'created_at': '2021-11-02T16:07:15.88589873Z',
'updated_at': '2021-11-02T16:07:15.88589873Z',
'label_id': 'ecd4f023-d13f-440a-8b9c-c341edd2c28b',
'item_id': 'dab56b75-387c-4660-989f-2b3f953772c4',
'split_id': 'b2ccc18d-4dcc-475a-b1a9-a281e574f695',
'coordinates': '',
'user_id': 'd7159432-f218-44ac-aebe-e5d661d62862',
}
Properties in more detail:
Property | Description |
---|---|
id | Unique id |
created_at | The creation date |
updated_at | Last updated date |
label_id | The label ID the annotation belongs to |
item_id | The dataset item ID the annotation belongs to |
split_id | The dataset split ID the annotation belongs to |
user_id | The unique id of the annotation creator |
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
annotation | The annotation object |
Update an annotation #
Update a given annotation:
annotation["coordinates"] = "14 20 34 48"
annotation = client.update_annotation(my_dataset["id"], new_dataset_version["id"], annotation)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
annotation | The annotation object |
Delete an annotation #
Delete a given annotation:
client.delete_annotation(my_dataset["id"], new_dataset_version["id"], annotation)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
annotation | The annotation object |
Export Dataset Version #
client.download_dataset(
my_dataset["id"],
new_dataset_version["id"],
split_id="",
extract_to_dir="data",
download_file="dataset.zip",
remove_download_file=True,
export_format=""
)
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
split_id | (Optional) Specify the split_id if you only want to download that dataset split. |
extract_to_dir | The directory to extract to. Default value: “data” |
download_file | The name of the download file. Default value: “dataset.zip” |
remove_download_file | Flag indicating whether to remove or keep the downloaded zip file. Default value: True (= remove) |
export_format | The format of your dataset. Supported export formats: DATASET_FORMAT_FOLDERS, DATASET_FORMAT_YOLO, DATASET_FORMAT_CSV, DATASET_FORMAT_SPACY_NER |
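For example, to export an image classification dataset version as label-named folders (assuming the DATASET_FORMAT_FOLDERS constant can be imported from seeme, like the dataset content type constants shown earlier):
from seeme import DATASET_FORMAT_FOLDERS
client.download_dataset(
    my_dataset["id"],
    new_dataset_version["id"],
    extract_to_dir="data",
    export_format=DATASET_FORMAT_FOLDERS
)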
A note on the exported formats.
Image classification #
export_format: DATASET_FORMAT_FOLDERS
The .zip file contains a folder for every dataset split: "train", "valid", "test".
Every dataset split folder contains a 'label-named' folder for every label: "cats", "dogs".
Every label folder contains all the images of the dataset items for the given label in the given dataset split.
Dataset items are named by the "name" and "extension" property of the dataset item. If the "name" property is empty, the "id" is used to name the file.
Dataset items that have no label will be added to the split folder to which they belong.
.
+-- train/
| +-- cats/
| +-- cat1.jpg
| +-- cat12.jpeg
| +-- dogs/
| +-- dog2.jpg
| +-- dog4.png
| +-- cat17.jpeg
| +-- dog15.jpg
+-- valid/
| +-- cats/
| +-- cat4.jpg
| +-- cat8.jpg
| +-- dogs/
| +-- dog9.jpg
| +-- dog14.png
+-- test/
| +-- cats/
| +-- cat90.jpg
| +-- cat34.jpeg
| +-- dogs/
| +-- dog81.jpg
| +-- dog98.png
Text classification #
export_format: DATASET_FORMAT_FOLDERS
The .zip file contains a folder for every dataset split: "train", "test", "unsup".
Every dataset split folder contains a 'label-named' folder for every label: "pos", "neg".
Every label folder contains all the text files of the dataset items for the given label in the given dataset split.
Dataset items are named by the "name" and "extension" property of the dataset item. If the "name" property is empty, the "id" is used to name the file.
Dataset items that have no label will be added to the split folder to which they belong.
.
+-- train/
| +-- pos/
| +-- 1.txt
| +-- 3.txt
| +-- neg/
| +-- 2.txt
| +-- 4.txt
+-- test/
| +-- pos/
| +-- 5.txt
| +-- 7.txt
| +-- neg/
| +-- 6.txt
| +-- 8.txt
+-- unsup/
| +-- 13.txt
| +-- 14.txt
Object detection #
export_format: DATASET_FORMAT_YOLO
Object detection datasets are exported in YOLO format.
For every dataset split, a dataset_split_name.txt file gets created containing all the filenames for that dataset split.
Every dataset item will have an image and a .txt file associated with it. The .txt file contains a list of annotations in YOLO format: label_index relative_x relative_y relative_width relative_height.
The .names file contains the list of labels, where the index corresponds to the label_index in the annotation .txt files.
The config.json file contains a JSON object with the color for every label.
.
+-- train.txt
+-- valid.txt
+-- test.txt
+-- 1.jpg
+-- 1.txt
+-- 3.jpg
+-- 3.txt
+-- ...
+-- dataset_version_id.names
+-- config.json
Tabular #
export_format: DATASET_FORMAT_CSV
Tabular datasets are exported in a .zip file that contains a dataset_version_id.csv file accompanied by a config.json file, which provides more details on how the data should be interpreted.
.
+-- dataset_version_id.csv
+-- config.json
A few more details about the config.json file:
{
"multi_label": false,
"label_column": "labels",
"split_column": "split",
"label_separator": " ",
"filename": "dataset_version_id.csv",
"csv_separator": ","
}
Properties | Description |
---|---|
multi_label | Is the dataset multi label? |
label_column | The column name that contains the labels |
split_column | The column name that contains the name of the split the row belongs to |
label_separator | If multi_label , use this separator to split the labels |
filename | The name of the .csv file that contains the data |
csv_separator | Use this separator to split each row into columns |
Named entity recognition #
export_format: DATASET_FORMAT_SPACY_NER
For a named entity recognition dataset with splits:
- train
- valid
- test
the zip file should be structured in the following way:
.
+-- train.json
+-- valid.json
+-- test.json
+-- config.json
config.json #
The config.json file contains a list of dataset splits, as well as a color code for every label.
{
"splits": [
"train",
"valid",
"test"
],
"colors": {
"label_name": "#82ebfd",
"label_name2": "#e95211"
}
}
split_name.json #
For every dataset split, there is a ‘split_name’.json file with the following structure:
[{
"id": "the_dataset_item_id",
"name": "the_original_filename" ,
"text": "The textual content of the file that has been annotated.",
"annotations": [{
"start": 4,
"end": 11,
"label": "label_name",
},
{
...
}
]
},
{
...
}
]
Import Dataset Version #
Image classification #
format: DATASET_FORMAT_FOLDERS
For an image classification dataset with splits:
- train
- valid
- test
and labels:
- cats
- dogs
the zip file should be structured in the following way:
.
+-- train/
| +-- cats/
| +-- cat1.jpg
| +-- cat12.jpeg
| +-- dogs/
| +-- dog2.jpg
| +-- dog4.png
| +-- cat17.jpeg
| +-- dog15.jpg
+-- valid/
| +-- cats/
| +-- cat4.jpg
| +-- cat8.jpg
| +-- dogs/
| +-- dog9.jpg
| +-- dog14.png
+-- test/
| +-- cats/
| +-- cat90.jpg
| +-- cat34.jpeg
| +-- dogs/
| +-- dog81.jpg
| +-- dog98.png
Text classification #
For a text classification dataset with splits:
- train
- valid
- unsup
and labels:
- pos
- neg
the zip file should be structured in the following way:
.
+-- train/
| +-- pos/
| +-- 1.txt
| +-- 3.txt
| +-- neg/
| +-- 2.txt
| +-- 4.txt
+-- test/
| +-- pos/
| +-- 5.txt
| +-- 7.txt
| +-- neg/
| +-- 6.txt
| +-- 8.txt
+-- unsup/
| +-- 13.txt
| +-- 14.txt
Object detection #
format: DATASET_FORMAT_YOLO
Object detection datasets are imported in YOLO format.
For every dataset split, a dataset_split_name.txt file gets created containing all the filenames for that dataset split.
Every dataset item will have an image and a .txt file associated with it. The .txt file contains a list of annotations in YOLO format: label_index relative_x relative_y relative_width relative_height.
The .names file contains the list of labels, where the index corresponds to the label_index in the annotation .txt files.
The config.json file contains a JSON object with the color for every label.
.
+-- train.txt
+-- valid.txt
+-- test.txt
+-- 1.jpg
+-- 1.txt
+-- 3.jpg
+-- 3.txt
+-- ...
+-- dataset_version_id.names
+-- config.json
Tabular #
format: DATASET_FORMAT_CSV
Tabular datasets are imported from a .csv file accompanied by a config.json file that provides more details on how the data should be interpreted.
.
+-- dataset.csv
+-- config.json
Config file #
A few more details about the config.json file:
{
"multi_label": false,
"label_column": "labels",
"split_column": "split",
"label_separator": " ",
"filename": "dataset.csv",
"csv_separator": ","
}
Properties | Description |
---|---|
multi_label | Is the dataset multi label? |
label_column | The column name that contains the labels |
split_column | The column name that contains the name of the split the row belongs to |
label_separator | If multi_label , use this separator to split the labels |
filename | The name of the .csv file that contains the data |
csv_separator | Use this separator to split each row into columns |
Named entity recognition #
format: DATASET_FORMAT_SPACY_NER
For a named entity recognition dataset with splits:
- train
- valid
- test
the zip file should be structured in the following way:
.
+-- train.json
+-- valid.json
+-- test.json
+-- config.json
config.json #
The config.json file contains a list of dataset splits, as well as a color code for every label.
{
"splits": [
"train",
"valid",
"test"
],
"colors": {
"label_name": "#82ebfd",
"label_name2": "#e95211"
}
}
split_name.json #
For every dataset split, there is a ‘split_name’.json file with the following structure:
[{
"id": "the_dataset_item_id",
"name": "the_original_filename" ,
"text": "The textual content of the file that has been annotated.",
"annotations": [{
"start": 4,
"end": 11,
"label": "label_name",
},
{
...
}
]
},
{
...
}
]
Parameters | Description |
---|---|
dataset_id | The dataset id |
dataset_version_id | The dataset version id |
folder | The folder that contains the .zip file. Default value: “data” |
filename | The name of the upload file. Default value: “dataset.zip” |
format | The format of your dataset. Supported import formats: DATASET_FORMAT_FOLDERS, DATASET_FORMAT_YOLO, DATASET_FORMAT_CSV, DATASET_FORMAT_SPACY_NER |
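This document does not show the import call itself. Assuming it mirrors download_dataset and accepts the parameters listed above, an import might look like the sketch below; the method name upload_dataset is an assumption, not confirmed here:
from seeme import DATASET_FORMAT_FOLDERS
# Hypothetical method name; the parameters follow the table above.
client.upload_dataset(
    my_dataset["id"],
    new_dataset_version["id"],
    folder="data",
    filename="dataset.zip",
    format=DATASET_FORMAT_FOLDERS
)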
Jobs #
Get all jobs #
from seeme import JOB_STATUS_WAITING, JOB_STATUS_STARTED, JOB_STATUS_FINISHED, JOB_STATUS_ERROR, JOB_TYPE_TRAINING
jobs = client.get_jobs(
application_id="",
states=[JOB_STATUS_WAITING, JOB_STATUS_STARTED, JOB_STATUS_FINISHED, JOB_STATUS_ERROR],
job_types=[JOB_TYPE_TRAINING]
)
This method returns a list of job objects with the following properties:
{
'id': '26cb7e7a-25a0-4faa-8e3a-d9ee2f0d0c71',
'created_at': '2022-09-20T21:32:53.561561+02:00',
'updated_at': '2022-09-21T00:48:10.614159+02:00',
'name': 'Cats and dogs classification',
'description': '',
'job_type': 'training',
'application_id': '10b957dd-5dfe-49b0-a500-61bb8f6564c6',
'status': 'finished',
'status_message': 'updating training request: finished',
'user_id': '2b02eb44-8ad0-409c-9515-0405fb25d470',
'cpu_start_time': '2022-09-20T19:37:42.461233',
'cpu_end_time': '2022-09-20T22:48:10.610324',
'gpu_start_time': '2022-09-20T19:38:19.492219',
'gpu_end_time': '2022-09-20T22:47:22.688223',
'agent_name': 'g7',
'dataset_id': '8037b73a-5512-4a45-89e2-29761771fff6',
'dataset_version_id': '1d3bf8d6-e39b-498e-9c08-680d2f8a3c47',
'model_id': '010f68fe-87d6-418a-a622-d37923ec7164',
'model_version_id': '82e31602-79ea-4cc2-971a-0baee4c0fbd7',
'items': [{
'id': '8493abc5-da89-4842-9406-c4914e2917c3',
'created_at': '2022-09-20T21:32:53.562095+02:00',
'updated_at': '2022-09-21T00:48:10.614535+02:00',
'name': 'arch',
'description': 'resnet50',
'job_id': '26cb7e7a-25a0-4faa-8e3a-d9ee2f0d0c71',
'value': 'resnet50',
'default_value': 'resnet50',
'value_type': 'text',
'label': 'Architecture'
}, {
'id': 'aaae29eb-315e-407b-9f7b-d25eb11ce902',
'created_at': '2022-09-20T21:32:53.562095+02:00',
'updated_at': '2022-09-21T00:48:10.614535+02:00',
'name': 'image_size',
'description': '',
'job_id': '26cb7e7a-25a0-4faa-8e3a-d9ee2f0d0c71',
'value': '224',
'default_value': '224',
'value_type': 'number',
'label': 'Image size'
}, {
'id': 'a582644b-31f7-421e-86f2-54dc6ddb9e0b',
'created_at': '2022-09-20T21:32:53.562095+02:00',
'updated_at': '2022-09-21T00:48:10.614535+02:00',
'name': 'batch_size',
'description': '',
'job_id': '26cb7e7a-25a0-4faa-8e3a-d9ee2f0d0c71',
'value': '32',
'default_value': '32',
'value_type': 'number',
'label': 'Batch size'
}, {
'id': 'c1d6291a-1bd4-477d-a425-132cb2422f8f',
'created_at': '2022-09-20T21:32:53.562095+02:00',
'updated_at': '2022-09-21T00:48:10.614535+02:00',
'name': 'nb_epochs',
'description': '',
'job_id': '26cb7e7a-25a0-4faa-8e3a-d9ee2f0d0c71',
'value': '50',
'default_value': '50',
'value_type': 'number',
'label': 'Number of epochs'
}]
}
Job properties in more detail.
Property | Description |
---|---|
id | Unique id for the job |
name | Name of your model (version) |
description | Describe the job you are running |
job_type | The type of the job. Default: JOB_TYPE_TRAINING (“training”) |
application_id | The application id of the job (see Applications) |
status | The status of your job. Supported states: JOB_STATUS_WAITING, JOB_STATUS_STARTED, JOB_STATUS_FINISHED, JOB_STATUS_ERROR. |
status_message | More information about the status of your job, gets updated by the agent handling the job. |
user_id | The id of the user that requested the job. |
cpu_start_time | The CPU compute starting time. |
cpu_end_time | The CPU compute end time. |
gpu_start_time | The GPU compute starting time. |
gpu_end_time | The GPU compute end time. |
agent_name | The agent responsible for handling the job. |
dataset_id | The id of the dataset being used. |
dataset_version_id | The id of the dataset version being used. |
model_id | The id of the model the finished job will be added to. If left blank upon job creation, a new model will be created, and its id will be updated in this property. |
model_version_id | The id of the model version the job resulted in. Leave blank upon job creation. The property will/should be updated after uploading the model. |
created_at | The creation date |
updated_at | Last updated date |
items | A list of job specific steps and settings. |
Job Item in more detail:
Property | Description |
---|---|
id | Unique id for the job item |
name | Name of your model (version) |
description | Describe the job you are running |
job_id | The job id it belongs to |
value | The item value |
default_value | The item default value |
value_type | The type of the item value: text or number |
label | The label for the item |
created_at | The creation date |
updated_at | Last updated date |
Create a job #
Create a training job:
from seeme import JOB_TYPE_TRAINING
my_job = {
"name": "Train a new model for cats and dogs",
"description": "",
"job_type": JOB_TYPE_TRAINING, # "training"
"application_id": "acf26cf4-e19f-425e-b5cb-031830a46df4", # See Applications to get the correct application_id for your job
"dataset_id": "8037b73a-5512-4a45-89e2-29761771fff6", # Update to your dataset_id
"dataset_version_id": "1d3bf8d6-e39b-498e-9c08-680d2f8a3c47", # Update to your dataset_version_id
"items": [
{
"name": "image_size",
"value": "224",
"value_type": "number",
"label": "Image Size"
},
{
"name": "arch",
"value": "resnet50",
"value_type": "text",
"label": "Architecture"
},
{
"name": "batch_size",
"value": "50",
"value_type": "number",
"label": "Batch size"
}
]
}
my_job = client.create_job(my_job)
Supported job items per application #
Every job item can be customised depending on the AI application you are using. Below we list all the possibilities per AI application.
Image classification #
Architecture:
job_item = {
"name": "arch",
"value": "resnet50",
"value_type": "text",
"label": "Architecture"
}
Image size:
job_item = {
"name": "image_size",
"value": "224",
"value_type": "number",
"label": "Image Size"
}
Image resize:
job_item = {
"name": "image_resize",
"value": "460",
"value_type": "number",
"label": "Image Resize"
}
Batch size:
job_item = {
"name": "batch_size",
"value": "32",
"value_type": "number",
"label": "Batch Size"
}
Number of epochs:
job_item = {
"name": "nb_epochs",
"value": "50",
"value_type": "number",
"label": "Number of epochs"
}
Mix up %:
job_item = {
"name": "mixup",
"value": "0",
"value_type": "number",
"label": "Mix up %"
}
Seed:
job_item = {
"name": "seed",
"value": "12",
"value_type": "number",
"label": "Seed"
}
Object detection #
Architecture:
job_item = {
"name": "arch",
"value": "yolov4-tiny",
"value_type": "number",
"label": "Architecture"
}
Image size:
job_item = {
"name": "image_size",
"value": "416",
"value_type": "number",
"label": "Image Size"
}
Batch size:
job_item = {
"name": "batch_size",
"value": "64",
"value_type": "number",
"label": "Batch Size"
}
Subdivisions:
job_item = {
"name": "subdivisions",
"value": "8",
"value_type": "number",
"label": "Subdivisions"
}
Named entity recognition (NER) #
Language:
job_item = {
"name": "language",
"value": "en",
"value_type": "text",
"label": "Language"
}
Optimize for:
job_item = {
"name": "optimize",
"value": "accuracy",
"value_type": "multi",
"label": "Optimize for"
}
Tabular #
Fast.ai Tabular #
Architecture:
job_item = {
"name": "arch",
"value": "[200 - 100]",
"value_type": "text",
"label": "Architecture"
}
Batch size:
job_item = {
"name": "batch_size",
"value": "32",
"value_type": "number",
"label": "Batch Size"
}
Number of epochs:
job_item = {
"name": "nb_epochs",
"value": "50",
"value_type": "number",
"label": "Number of epochs"
}
Minimal feature importance (%):
job_item = {
"name": "min_feat_importance",
"value": "0",
"value_type": "number",
"label": "Minimal feature importance (%)"
}
Add info:
job_item = {
"name": "add_info",
"value": True,
"value_type": "boolean",
"label": "Add info"
}
XGBoost #
Booster:
job_item = {
"name": "booster",
"value": "gbtree", # "gbtree", "dart", "gblinear"
"value_type": "multi",
"label": "Booster"
}
Nb Estimators:
job_item = {
"name": "nb_estimators",
"value": "100",
"value_type": "number",
"label": "Nb Estimators"
}
Max depth:
job_item = {
"name": "max_depth",
"value": "6",
"value_type": "number",
"label": "Number of epochs"
}
Learning rate (eta):
job_item = {
"name": "learning_rate",
"value": "0.3",
"value_type": "number",
"label": "Learning rate (eta)"
}
Subsample (0 - 1):
job_item = {
"name": "subsample",
"value": "0.5",
"value_type": "number",
"label": "Subsample (0 - 1)"
}
Minimal feature importance (%):
job_item = {
"name": "min_feat_importance",
"value": "0",
"value_type": "number",
"label": "Minimal feature importance (%)"
}
Add info:
job_item = {
"name": "add_info",
"value": True,
"value_type": "boolean",
"label": "Add info"
}
CatBoost #
Nb Estimators:
job_item = {
"name": "nb_estimators",
"value": "100",
"value_type": "number",
"label": "Nb Estimators"
}
Learning rate:
job_item = {
"name": "learning_rate",
"value": "0.1",
"value_type": "number",
"label": "Learning rate"
}
Minimal feature importance (%):
job_item = {
"name": "min_feat_importance",
"value": "0",
"value_type": "number",
"label": "Minimal feature importance (%)"
}
Add info:
job_item = {
"name": "add_info",
"value": True,
"value_type": "boolean",
"label": "Add info"
}
LightGBM #
Boosting Type:
job_item = {
"name": "booster",
"value": "gbdt", # "gbdt", "dart", "rf"
"value_type": "multi",
"label": "Boosting Type"
}
Nb Estimators:
job_item = {
"name": "nb_estimators",
"value": "100",
"value_type": "number",
"label": "Nb Estimators"
}
Max depth:
job_item = {
"name": "max_depth",
"value": "-1",
"value_type": "number",
"label": "Number of epochs"
}
Learning rate:
job_item = {
"name": "learning_rate",
"value": "0.1",
"value_type": "number",
"label": "Learning rate"
}
Max number of leaves:
job_item = {
"name": "num_leaves",
"value": "31",
"value_type": "number",
"label": "Max number of leaves"
}
Minimal feature importance (%):
job_item = {
"name": "min_feat_importance",
"value": "0",
"value_type": "number",
"label": "Minimal feature importance (%)"
}
Add info:
job_item = {
"name": "add_info",
"value": True,
"value_type": "boolean",
"label": "Add info"
}
Get a job #
my_job = client.get_job(my_job["id"])
Update a job #
my_job["description"] = "Update any property"
my_job = client.update_job(my_job)
Delete a job #
client.delete_job(my_job["id"])
Applications #
SeeMe.ai supports multiple types of AI models, frameworks, and framework versions. To access, manage, and describe these, we use applications:
Get all supported applications #
Print a list of the applications in your SeeMe.ai client:
client.applications
Every application has the following properties:
{
'id': '878aea66-16b7-4f10-9d82-a2a92a35728a',
'created_at': '2022-09-19T17:57:14.323965Z',
'updated_at': '2021-09-19T17:57:14.323965Z',
'base_framework': 'pytorch',
'base_framework_version': '1.12.1',
'framework': 'fastai',
'framework_version': '2.7.9',
'application': 'image_classification',
'inference_host': 'image-pt-1-12-1-fa-2-7-9',
'can_convert_to_onnx': True,
'can_convert_to_coreml': True,
'can_convert_to_tensorflow': True,
'can_convert_to_tflite': True,
'has_labels_file': True,
'inference_extensions': 'pkl'
}
Properties in more detail.
Property | Description |
---|---|
id | Unique id for the application |
created_at | The creation date |
updated_at | Last updated date |
framework | The framework used to train the model |
framework_version | The framework version used to train the model |
base_framework | The base framework used by the framework |
base_framework_version | The base framework version used by the framework |
application | The type of application: “image_classification”, “object_detection”, “text_classification”, “structured”. |
inference_host | The internal host of the inference engine (if not used at the edge) |
can_convert_to_onnx | Flag indicating whether uploaded models can automatically be converted to ONNX |
can_convert_to_coreml | Flag indicating whether uploaded models can automatically be converted to Core ML |
can_convert_to_tensorflow | Flag indicating whether uploaded models can automatically be converted to Tensorflow (mostly for further conversion to Tensorflow Lite for example) |
can_convert_to_tflite | Flag indicating whether uploaded models can automatically be converted to Tensorflow Lite |
has_labels_file | Flag indicating whether there is file with all the labels available |
inference_extensions | The list of file extensions for the files that need to be uploaded before the model can perform predictions/inference. |
You can update the local list of applications with:
client.applications = client.get_applications()
Get the application id #
Before you can upload and use your model to make predictions, you need to add an application_id:
from torch import __version__ as torch_version
from fastai import __version__ as fastai_version
# Get the application_id for your framework (version).
application_id = client.get_application_id(
base_framework="pytorch",
framework="fastai",
base_framework_version=torch_version,
framework_version=fastai_version,
application="image_classification"
)
If your combination is not supported, you will get a NotImplementedError.
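You may want to handle that case explicitly:
try:
    application_id = client.get_application_id(
        base_framework="pytorch",
        framework="fastai",
        base_framework_version=torch_version,
        framework_version=fastai_version,
        application="image_classification"
    )
except NotImplementedError:
    print("This framework/version combination is not supported.")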
Feedback #
For questions, feedback, corrections: contact support@seeme.ai.