Automated Labeling

Labeling data is often the biggest bottleneck in a machine learning project. Post-processors eliminate most of that manual work by using existing AI models to pre-label your data automatically.

The Problem

Manual labeling is slow and expensive:

| Labeling Method | Speed | Cost | Quality |
| --- | --- | --- | --- |
| Fully manual | ~100 items/hour | High | High (if expert) |
| Post-processor + review | ~1,000 items/hour | Low | High |
| Post-processor only | Unlimited | Minimal | Moderate |

The sweet spot: use post-processors to generate labels, then have humans review and correct them. This is 10x faster than labeling from scratch.

How It Works

graph LR
    A[Upload Raw Data] --> B[Post-Processor Runs]
    B --> C[Auto-Generated Labels]
    C --> D[Human Review]
    D --> E[Corrected Labels]
    E --> F[Training-Ready Dataset]
  1. Upload your unlabeled data to a dataset
  2. Post-processors run automatically on every item
  3. Labels appear as annotations on each item
  4. Review and correct mistakes in the annotation interface
  5. Train on the corrected dataset

Setup

Step 1: Choose Your Labeling Strategy

Pick the right post-processor type for your task:

| Task | Post-Processor Type | What It Produces |
| --- | --- | --- |
| Categorize images | classification | Category labels |
| Find objects in images | detection | Bounding boxes |
| Extract entities from text | ner | Entity spans |
| Classify text/documents | classification | Category labels |
| Extract structured data | llm | Custom fields |
| Read text from images | ocr | Text content |

Step 2: Create the Post-Processor
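
A post-processor is created with the create_post_processor call used throughout the examples below. As a sketch, here is the configuration a basic image-classification labeler would need; the field values are illustrative, and the commented-out call assumes the client, dataset, and model objects from the surrounding examples:

```python
# Illustrative configuration for a classification labeler.
# Field names mirror the create_post_processor examples later in this guide.
processor_config = {
    "name": "Auto Labeler",
    "model_type": "classification",   # pick from the task table above
    "output_target": "annotations",   # write predictions as annotations
    "auto_create_labels": True,       # create label definitions as they appear
    "confidence_threshold": 0.7,      # discard low-confidence predictions
    "enabled": True,
}

# processor = client.create_post_processor(
#     dataset_id=dataset.id,
#     model_id=model.id,
#     **processor_config,
# )
```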

Step 3: Upload Data

Upload your unlabeled data. Post-processors run automatically:

import glob

# Upload images
for image_path in glob.glob("./unlabeled_images/*.jpg"):
    client.create_dataset_item(
        version_id=version.id,
        split_id=split.id,
        file_path=image_path
    )

# Monitor processing
jobs = client.get_post_processor_jobs(
    dataset_id=dataset.id,
    status="pending"
)
print(f"{len(jobs)} items queued for labeling")

Step 4: Review and Correct

After processing completes, review the auto-generated labels in the annotation interface and correct any mistakes before training.
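
One practical way to organize the review pass is to sort auto-labeled items by confidence so the least certain predictions come first. A minimal sketch in plain Python (the annotation dicts are illustrative, not a platform API):

```python
def review_queue(annotations):
    """Order auto-generated annotations for human review,
    least confident first, so effort goes where errors are likeliest."""
    return sorted(annotations, key=lambda a: a["confidence"])

items = [
    {"item_id": 1, "label": "cat", "confidence": 0.97},
    {"item_id": 2, "label": "dog", "confidence": 0.55},
    {"item_id": 3, "label": "cat", "confidence": 0.81},
]

for a in review_queue(items):
    print(a["item_id"], a["label"], a["confidence"])
```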

Using an LLM as a Labeling Oracle

Large language models can label data with remarkable accuracy, especially for tasks that benefit from reasoning:

# Use a large model (e.g., Ollama-hosted LLM) as a labeling oracle
llm_processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="LLM Oracle",
    model_type="llm",
    model_id=large_llm.id,
    prompt="""
    You are an expert annotator. Look at this image and:
    1. Identify what the main subject is
    2. Classify it into one of these categories:
       {label_list}
    3. Rate your confidence (low/medium/high)

    Return JSON: {"label": "...", "confidence": "..."}
    """,
    output_target="annotations",
    auto_create_labels=True,
    enabled=True
)

This approach is central to Model Distillation—the LLM generates the training data, and you train a smaller, faster model on those labels.

Chaining Post-Processors for Complex Labeling

For multi-step labeling tasks, chain processors in sequence:

# Step 1: OCR to extract text from documents
ocr_processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="Extract Text",
    model_type="ocr",
    model_id=ocr_model.id,
    output_target="text",
    order=1
)

# Step 2: NER to find entities in extracted text
ner_processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="Find Entities",
    model_type="ner",
    model_id=ner_model.id,
    output_target="annotations",
    auto_create_labels=True,
    order=2
)

# Step 3: LLM to classify based on content
classify_processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="Classify Document",
    model_type="llm",
    model_id=llm_model.id,
    prompt="Based on this document, classify it as: invoice, contract, letter, or report. Return only the category.",
    output_target="annotations",
    auto_create_labels=True,
    order=3
)

Setting Confidence Thresholds

Not all predictions are equally reliable. Use confidence thresholds to control quality:

# High threshold: only keep very confident predictions
# Fewer auto-labels, but higher accuracy
processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="Conservative Labeler",
    model_type="classification",
    model_id=model.id,
    output_target="annotations",
    confidence_threshold=0.9,  # Only keep 90%+ confidence
    auto_create_labels=True,
    enabled=True
)

# Low threshold: keep more predictions
# More auto-labels, but more corrections needed
processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="Aggressive Labeler",
    model_type="classification",
    model_id=model.id,
    output_target="annotations",
    confidence_threshold=0.5,  # Keep 50%+ confidence
    auto_create_labels=True,
    enabled=True
)
  • 0.9+: Use when label accuracy is critical and you’d rather label manually than be wrong
  • 0.7-0.9: Good default for most tasks—review flagged items
  • 0.5-0.7: Use when you have time to review and want maximum coverage
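The effect of a threshold is easy to see locally. This hypothetical helper splits predictions into auto-accepted labels and a review pile; raising the threshold moves items from the first bucket into the second:

```python
def split_by_confidence(predictions, threshold):
    """Partition predictions: confident ones are kept as labels,
    the rest are flagged for human review."""
    kept = [p for p in predictions if p["confidence"] >= threshold]
    flagged = [p for p in predictions if p["confidence"] < threshold]
    return kept, flagged

preds = [
    {"label": "invoice", "confidence": 0.95},
    {"label": "letter", "confidence": 0.72},
    {"label": "report", "confidence": 0.48},
]

kept, flagged = split_by_confidence(preds, threshold=0.9)
print(len(kept), len(flagged))  # conservative: 1 kept, 2 flagged
```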

Active Learning Pattern

Combine automated labeling with iterative model improvement:

graph TD
    A[Start: Pre-trained Model] --> B[Auto-label batch of data]
    B --> C[Human reviews corrections]
    C --> D[Retrain model on corrected data]
    D --> E{Model improved?}
    E -->|Yes| F[Auto-label next batch]
    F --> C
    E -->|No| G[Need more diverse data]
    G --> H[Upload new examples]
    H --> B
  1. Start with a pre-trained or LLM-based post-processor
  2. Auto-label a batch of data
  3. Review and correct the labels
  4. Train a new model on the corrected data
  5. Replace the post-processor with your improved model
  6. Repeat—each iteration produces better labels faster
# Iteration 1: Use LLM for initial labels
# (see setup above)

# After review and training...

# Iteration 2: Use your trained model (faster, cheaper)
client.update_post_processor(
    processor_id=processor.id,
    model_id=trained_model_v1.id,  # Your newly trained model
    model_type="classification"
)

# Upload next batch - now labeled by your own model
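
A simple stopping rule for step 6 is to keep iterating while review accuracy improves by more than a small margin. A sketch with illustrative numbers, where accuracy is measured against human-corrected labels:

```python
def should_continue(accuracy_history, min_gain=0.01):
    """Stop the active-learning loop once the last iteration's
    accuracy gain over the previous one falls below min_gain."""
    if len(accuracy_history) < 2:
        return True  # not enough iterations to judge yet
    return accuracy_history[-1] - accuracy_history[-2] > min_gain

print(should_continue([0.72, 0.81]))         # big gain: keep going
print(should_continue([0.72, 0.81, 0.815]))  # marginal gain: stop
```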

Best Practices

  1. Start with a sample - Run on 100 items first and check quality before processing thousands
  2. Use confidence thresholds - Don’t trust every prediction equally
  3. Always review - Even 95% accurate auto-labeling means 1 in 20 items is wrong
  4. Track accuracy - Compare auto-labels vs. human-corrected labels to measure quality
  5. Iterate - Replace the labeling model with your retrained model each cycle
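
For point 4, agreement between the auto-labels and the human-corrected versions of the same items is a direct quality measure. A minimal sketch (label lists are illustrative):

```python
def auto_label_accuracy(auto_labels, corrected_labels):
    """Fraction of auto-generated labels the human reviewer kept unchanged."""
    if not auto_labels:
        return 0.0
    matches = sum(a == c for a, c in zip(auto_labels, corrected_labels))
    return matches / len(auto_labels)

auto = ["cat", "dog", "cat", "bird", "dog"]
corrected = ["cat", "dog", "dog", "bird", "dog"]
print(auto_label_accuracy(auto, corrected))  # 0.8
```

If this number climbs across iterations, the active-learning loop above is paying off; if it plateaus, that is the signal to add more diverse data.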

Next Step

Once you have a high-quality labeled dataset, use it for Model Distillation—train a smaller, faster model that matches the quality of the large model that labeled the data.