Setup Teacher Model

Select and configure the large model that will label your training data.

Choosing a Teacher Model

The teacher model should be:

  • Accurate on your task (the student can only be as good as the teacher)
  • Available for inference (you’ll run it on many examples)
  • Appropriate for your data type

Teacher Options

| Option | Accuracy | Cost | Speed | Best For |
|---|---|---|---|---|
| Your large model | Known | Low | Fast | You already have a good model |
| LLM (Ollama/local) | High | Low | Medium | Flexible tasks, local processing |
| LLM (OpenAI/Anthropic) | Very High | High | Fast | Best accuracy, budget available |
| Pre-trained model | Good | Low | Fast | Standard tasks |
| Ensemble | Highest | Medium | Slow | Maximum accuracy needed |

Option 1: Use Your Existing Large Model

If you already have a large, accurate model, use it directly as the teacher: reference it by its model ID in the prediction and batch-labeling calls shown below, exactly as the LLM example does.

Option 2: Use a Local LLM (Ollama)

For flexible labeling with reasoning:

# First, ensure Ollama is running with your model
# ollama run llama3.1

# Look up the local LLM that will act as the teacher.
llm_model = client.get_model("ollama-llama3.1")  # Your Ollama model ID

# Prompt sent with every test image; the model must answer with a
# single category name so the reply can be used directly as a label.
CATEGORY_PROMPT = """
        Classify this image into one category:
        - defect_scratch
        - defect_dent
        - defect_discoloration
        - good

        Return only the category name.
        """

# Sanity-check the teacher on a handful of examples before bulk labeling.
for image_path in ("test1.jpg", "test2.jpg", "test3.jpg"):
    response = client.predict(
        model_id=llm_model.id,
        item=image_path,
        prompt=CATEGORY_PROMPT,
    )
    print(f"{image_path}: {response.prediction}")

Writing Effective LLM Prompts for Labeling

# Single-label classification prompt: role framing, a closed category list
# with one-line definitions, step-by-step instructions, and a strict
# output-format constraint so replies parse as a bare category name.
classification_prompt = """
You are an expert quality inspector. Classify this product image.

**Categories:**
- good: Product has no visible defects
- scratch: Visible scratch marks on surface
- dent: Physical dent or deformation
- discoloration: Color inconsistency or staining
- crack: Visible cracks or fractures

**Instructions:**
1. Examine the image carefully
2. Look for any defects
3. If multiple defects, report the most severe
4. If no defects are visible, classify as "good"

**Return only the category name, nothing else.**
"""

# Multi-label prompt: asks for ALL applicable defects and pins the output
# shape (comma-separated list, with "none" as the explicit empty case).
multilabel_prompt = """
List ALL defects visible in this image.

**Possible defects:**
- scratch
- dent
- discoloration
- crack
- contamination

**Return as comma-separated list, or "none" if no defects.**
"""

# Entity-extraction prompt: spells out an exact JSON schema (field names,
# allowed values, null handling) and forbids any text outside the JSON.
extraction_prompt = """
Extract information from this document image.

**Return JSON with these fields:**
{
  "document_type": "invoice|receipt|contract|letter|other",
  "date": "YYYY-MM-DD or null",
  "total_amount": "number or null",
  "company_name": "string or null"
}

**Return only valid JSON.**
"""

Option 3: Use External LLM (OpenAI, Anthropic)

For maximum accuracy (at higher cost):

ℹ️
External LLMs cost money per prediction. For 10,000 images with GPT-4 Vision, expect ~$50-100. For initial experiments, use a local LLM or smaller external model first.

Option 4: Use an Ensemble

Combine multiple models for highest accuracy:

# Create a workflow that fans the input out to several teacher models and
# lets an LLM aggregate their votes into a single label.
workflow = client.create_workflow(
    name="Ensemble Teacher",
    description="Combine predictions from multiple models"
)
version = client.create_workflow_version(workflow_id=workflow.id, name="v1")

# One node per teacher model; all receive the raw input unchanged.
# Built in a loop instead of three copy-pasted create_workflow_node calls,
# so adding or removing an ensemble member is a one-line change.
# NOTE: node names must match the {{Model X}} placeholders used by the
# aggregator template below.
ensemble_members = [
    ("Model A", efficientnet_b4),
    ("Model B", vit_base),
    ("Model C", resnet50),
]
model_nodes = [
    client.create_workflow_node(
        version_id=version.id,
        name=node_name,
        entity_type="model",
        entity_id=member_model.id,
        config={"input_template": "{{input}}"},
    )
    for node_name, member_model in ensemble_members
]

# Aggregation with LLM: majority vote, with Model A breaking three-way ties.
aggregate = client.create_workflow_node(
    version_id=version.id,
    name="Aggregate",
    entity_type="model",
    entity_id=llm_model.id,
    config={
        "input_template": """
Three models classified an image:
- Model A: {{Model A}}
- Model B: {{Model B}}
- Model C: {{Model C}}

Return the majority prediction. If all three disagree, return the prediction from Model A.
Return only the category name.
"""
    }
)

# Data edges: every teacher output feeds the aggregator.
for model_node in model_nodes:
    client.create_workflow_edge(
        version_id=version.id,
        begin_node_id=model_node.id,
        end_node_id=aggregate.id,
        edge_type="data"
    )

Validate Teacher Quality

Before using the teacher to label thousands of examples, validate on a small test set:

# Create a small validation set with ground truth labels
val_items = [
    ("test_images/good_001.jpg", "good"),
    ("test_images/scratch_001.jpg", "scratch"),
    ("test_images/dent_001.jpg", "dent"),
    # ... 50-100 examples
]

# Run teacher predictions
correct = 0
total = 0
errors = []

for image_path, ground_truth in val_items:
    result = client.predict(
        model_id=teacher_model.id,
        item=image_path
    )

    if result.prediction == ground_truth:
        correct += 1
    else:
        errors.append({
            "image": image_path,
            "expected": ground_truth,
            "predicted": result.prediction,
            "confidence": result.confidence
        })
    total += 1

teacher_accuracy = correct / total
print(f"Teacher accuracy on validation: {teacher_accuracy:.2%}")

if teacher_accuracy < 0.90:
    print("⚠️  Warning: Teacher accuracy is low. Consider:")
    print("   - Using a better model")
    print("   - Improving the prompt (for LLM)")
    print("   - Reviewing error cases")

# Review errors
print(f"\nErrors ({len(errors)}):")
for e in errors[:10]:
    print(f"  {e['image']}: expected {e['expected']}, got {e['predicted']}")

Configure for Batch Labeling

Once validated, configure the teacher for efficient batch processing:

# High-throughput labeling settings, unpacked into the processor call below.
teacher_labeler_settings = {
    # Quality controls
    "confidence_threshold": 0.8,  # Only keep high-confidence predictions
    "auto_create_labels": True,
    # Performance
    "batch_size": 32,             # Process in batches
    "timeout": 60,                # Timeout per item
    # Error handling
    "retry_on_error": True,
    "max_retries": 3,
}

# Attach the validated teacher to the dataset as a post-processor that
# writes its predictions out as annotations.
processor = client.create_post_processor(
    dataset_id=dataset.id,
    name="Teacher Labeler",
    model_type="classification",
    model_id=teacher_model.id,
    output_target="annotations",
    **teacher_labeler_settings
)

print(f"Teacher configured: {processor.id}")
print("Ready to label data in the next step.")

Next Step

With your teacher model configured, proceed to Label Data to generate training labels.