Node Types

Workflows support several node types for different operations. Each type has specific configuration options and behaviors.

Available Node Types

| Type | Purpose | Input | Output |
|---|---|---|---|
| Model | Run inference | Text or file | Predictions |
| Dataset | Read/write data | Query or results | Items or confirmation |
| Start | Entry point | N/A | Passes input through |
| End | Exit point | Final result | N/A |

Model Nodes

Model nodes run inference on your trained models. They’re the most common node type.

Configuration

model_node = client.create_workflow_node(
    version_id=version.id,
    name="Sentiment Analysis",
    entity_type="model",
    entity_id=model.id,
    config={
        # Input configuration
        "input_template": "Analyze sentiment: {{input}}",

        # Execution settings
        "timeout": 60,           # Max seconds
        "on_failure": "stop",    # or "continue"

        # Model version (optional)
        "model_version_id": version.id
    }
)

Input Template

The input_template determines what gets sent to the model:

# Simple: pass input directly
"input_template": "{{input}}"

# With context
"input_template": "Previous: {{prev_node}}\nNew: {{input}}"

# Complex prompt
"input_template": """
You are a helpful assistant. Analyze the following:

{{input}}

Consider these factors:
{{#each factors}}
- {{name}}: {{value}}
{{/each}}
"""

Supported Model Types

| Model Type | Input | Output Format |
|---|---|---|
| Image Classification | Image file | {predictions: [{label, confidence}]} |
| Object Detection | Image file | {detections: [{label, bbox, confidence}]} |
| NER | Text | {entities: [{text, label, start, end}]} |
| Text Classification | Text | {predictions: [{label, confidence}]} |
| STT | Audio file | {text, segments: [{text, start, end}]} |
| LLM | Text prompt | {response, tokens_used} |
| OCR | Image/PDF | {text, pages: [{text, confidence}]} |

Output Access

Model outputs are stored in variables[node_id]:

# Reference in subsequent nodes
"input_template": "OCR result: {{ocr_node_id}}"

# Access specific fields (for structured output)
"input_template": "Found entities: {{ner_node_id.entities}}"

Dataset Nodes

Dataset nodes read from or write to datasets. They can be input sources, context providers, or output destinations.

Input Mode (Read from Dataset)

Read items from a dataset to process:

input_dataset_node = client.create_workflow_node(
    version_id=version.id,
    name="Read Documents",
    entity_type="dataset",
    entity_id=dataset.id,
    config={
        "dataset_input": {
            "dataset_version_id": version.id,
            "input_field": "text",      # Column to use as input
            "filter_query": "",          # Optional filter
            "operation": "iterate"       # Process each item
        }
    }
)

Operations:

| Operation | Behavior |
|---|---|
| iterate | Process each item separately (batch mode) |
| lookup | Find single matching item |
| join | Find all matching items |
| filter | Filter items by condition |
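In plain Python terms, the four operations behave roughly as follows. This is an illustrative sketch over an in-memory list, not the service's implementation; the `matches` helper stands in for a filter_query like "status == 'pending'":

```python
items = [
    {"id": 1, "status": "pending", "text": "first"},
    {"id": 2, "status": "done", "text": "second"},
    {"id": 3, "status": "pending", "text": "third"},
]

def matches(item):  # stands in for filter_query "status == 'pending'"
    return item["status"] == "pending"

# iterate: downstream nodes run once per item
iterated = [item["text"] for item in items]

# lookup: only the first matching item flows on
looked_up = next(item for item in items if matches(item))

# join: all matching items flow on together, as a single value
joined = [item for item in items if matches(item)]

# filter: matching items continue through the workflow individually
filtered = [item for item in items if matches(item)]
```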

Output Mode (Write Results)

Store workflow results in a dataset:

output_dataset_node = client.create_workflow_node(
    version_id=version.id,
    name="Store Results",
    entity_type="dataset",
    entity_id=output_dataset.id,
    config={
        "output_dataset_id": output_dataset.id,
        "output_version_id": output_version.id,
        "output_split_id": split.id,        # Optional
        "column_mapping": {
            "original_text": "{{input}}",
            "sentiment": "{{sentiment_node}}",
            "entities": "{{ner_node}}",
            "summary": "{{llm_node}}"
        },
        "error_handling": "label_in_split"  # or "halt"
    }
)

Context Mode

Provide reference data to other nodes (see Edge Types):

rules_dataset_node = client.create_workflow_node(
    version_id=version.id,
    name="Processing Rules",
    entity_type="dataset",
    entity_id=rules_dataset.id,
    config={
        "context_config": {
            "field_mapping": {
                "rule_name": "name",
                "rule_text": "description"
            },
            "context_name": "rules",
            "iterate_context": False  # Pass all at once
        }
    }
)

Start Nodes

Start nodes mark the entry point of a workflow. They’re optional but useful for clarity.

start_node = client.create_workflow_node(
    version_id=version.id,
    name="Start",
    entity_type="start",
    entity_id="",  # No entity for start nodes
    config={}
)

Behavior:

  • Receives the initial input
  • Passes through unchanged
  • Useful for visual clarity in complex workflows

End Nodes

End nodes mark workflow completion. They’re optional.

end_node = client.create_workflow_node(
    version_id=version.id,
    name="End",
    entity_type="end",
    entity_id="",
    config={}
)

Behavior:

  • Marks workflow completion
  • Output becomes final workflow result
  • Useful for aggregating multiple branches

Node Configuration Reference

Common Properties

All node types share these properties:

{
    # Identity
    "name": "Human-readable name",
    "description": "Optional description",

    # Execution
    "timeout": 60,              # Seconds, default varies by type
    "on_failure": "stop",       # "stop" or "continue"

    # UI
    "position": {"x": 100, "y": 200}  # Canvas position
}

Model Node Config

{
    "input_template": "...",         # Required
    "model_version_id": "...",       # Optional, uses active if not set
}

Dataset Node Config (Input)

{
    "dataset_input": {
        "dataset_version_id": "...",
        "input_field": "text",
        "filter_query": "status == 'pending'",
        "operation": "iterate",
        "max_items": 100
    }
}

Dataset Node Config (Output)

{
    "output_dataset_id": "...",
    "output_version_id": "...",
    "output_split_id": "...",        # Optional
    "column_mapping": {...},
    "error_handling": "label_in_split"
}

Dataset Node Config (Context)

{
    "context_config": {
        "field_mapping": {...},
        "context_name": "rules",
        "iterate_context": False,
        "max_parallel": 5            # For iterate_context=True
    }
}

Node Execution Order

Nodes execute in topological order based on edges:

graph LR
    A[Node 1] --> C[Node 3]
    B[Node 2] --> C
    C --> D[Node 4]

  1. Nodes 1 and 2 have no dependencies (could run in parallel)
  2. Node 3 waits for both 1 and 2
  3. Node 4 waits for 3
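That ordering is a standard topological sort. A minimal sketch (Kahn's algorithm) of how an engine could compute it from the edge list — the function name and edge representation here are illustrative:

```python
from collections import defaultdict, deque

def execution_order(edges):
    """Return node IDs in a valid execution order (Kahn's algorithm).
    `edges` is a list of (source, target) pairs."""
    indegree = defaultdict(int)
    children = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1
        nodes.update([src, dst])
    # Start with all nodes that have no incoming edges
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:  # all dependencies satisfied
                ready.append(child)
    return order

print(execution_order([("node1", "node3"), ("node2", "node3"), ("node3", "node4")]))
# ['node1', 'node2', 'node3', 'node4']
```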

Error Handling

Per-Node Failure Handling

node = client.create_workflow_node(
    version_id=version.id,
    name="Optional Enhancement",
    entity_type="model",
    entity_id=model.id,
    config={
        "on_failure": "continue",  # Workflow continues even if this fails
        "timeout": 30
    }
)
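The on_failure setting amounts to the following control flow. This is an illustrative sketch of a sequential runner, not the platform's executor; `NodeError`, `run_workflow`, and `run_node` are hypothetical names:

```python
class NodeError(Exception):
    """Raised when a node configured with on_failure="stop" fails."""

def run_workflow(nodes, run_node):
    """Execute nodes in order. `nodes` is a list of (node_id, config)
    pairs; `run_node` runs one node and returns its output."""
    results = {}
    for node_id, config in nodes:
        try:
            results[node_id] = run_node(node_id)
        except Exception as exc:
            if config.get("on_failure", "stop") == "continue":
                results[node_id] = None  # downstream nodes see an empty output
                continue
            raise NodeError(f"{node_id} failed: {exc}") from exc
    return results

def flaky(node_id):
    # Simulated node execution: the enhancement step fails
    if node_id == "enhance":
        raise RuntimeError("model unavailable")
    return f"{node_id}-ok"

results = run_workflow(
    [("main", {"on_failure": "stop"}), ("enhance", {"on_failure": "continue"})],
    flaky,
)
# The workflow completes despite the failed enhancement step
```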

Timeout Configuration

Set appropriate timeouts based on model type:

| Model Type | Recommended Timeout |
|---|---|
| Image Classification | 10-30s |
| Object Detection | 10-30s |
| NER | 5-15s |
| STT | 30-120s (depends on audio length) |
| LLM | 30-120s |
| OCR | 15-60s |
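On the client side, a wall-clock bound like these can be approximated by waiting on a future with a timeout. A hedged sketch, separate from how the service enforces its own limit:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def run_with_timeout(fn, timeout, *args):
    """Wait at most `timeout` seconds for fn(*args). Note this bounds
    only the wait: the worker thread is not killed if it overruns."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, *args).result(timeout=timeout)
    finally:
        pool.shutdown(wait=False)

print(run_with_timeout(lambda: "done", 1.0))
# done
```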

Best Practices

  1. Name nodes clearly - Use descriptive names for debugging
  2. Set realistic timeouts - Account for model complexity
  3. Use on_failure="continue" - For optional/enhancement steps
  4. Validate templates - Test input_template with sample data
  5. Handle empty outputs - Some models may return nothing
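Point 4 can be automated before deploying: render the template against sample data and flag unresolved placeholders. A hypothetical helper, assuming plain {{name}} substitution (block helpers like {{#each}} are not handled):

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")

def validate_template(template, sample):
    """Render `template` against `sample` values; return the rendered
    string plus any placeholders that had no sample value."""
    missing = []
    def sub(match):
        name = match.group(1)
        if name not in sample:
            missing.append(name)
            return match.group(0)  # leave unresolved placeholder visible
        return str(sample[name])
    return PLACEHOLDER.sub(sub, template), missing

rendered, missing = validate_template(
    "Analyze sentiment: {{input}} ({{prev_node}})",
    {"input": "great product"},
)
# `missing` reveals that no sample value was supplied for prev_node
```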

Next Steps