Selecting the right pre-trained model is critical for finetuning success. The base model provides the starting point—choose one that’s appropriate for your task, data, and deployment requirements.
Model Selection Criteria
Consider these factors when choosing a base model:
| Factor | Question to Ask |
|---|---|
| Task match | Was it trained on similar data? |
| Size | Can it run on your deployment target? |
| Accuracy | Does it perform well on standard benchmarks? |
| Speed | Is inference fast enough for your use case? |
| License | Can you use it commercially? |
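These criteria can be combined into a simple screening pass over candidate models. A minimal sketch — the `Candidate` fields, example numbers, and thresholds are illustrative, not part of any platform API:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    size_mb: int          # on-disk size
    accuracy: float       # benchmark accuracy, 0-1
    commercial_ok: bool   # license permits commercial use

def screen(candidates, max_size_mb, min_accuracy):
    """Keep only models that fit the deployment target and license needs."""
    return [
        c for c in candidates
        if c.size_mb <= max_size_mb
        and c.accuracy >= min_accuracy
        and c.commercial_ok
    ]

models = [
    Candidate("mobilenet_v2", 14, 0.72, True),
    Candidate("efficientnet_b4", 75, 0.83, True),
    Candidate("vit_base", 330, 0.85, False),
]

# Edge deployment: small footprint, modest accuracy floor
print([c.name for c in screen(models, max_size_mb=50, min_accuracy=0.70)])
```

Hard constraints (size, license) are worth checking before accuracy: a model that cannot ship is not a candidate, however well it benchmarks.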
Image Classification Models
Model Comparison
| Model | Parameters | Size | Top-1 Accuracy (ImageNet) | Inference Speed | Best For |
|---|---|---|---|---|---|
| MobileNet v2 | 3.4M | 14 MB | 72% | ⚡⚡⚡⚡⚡ | Mobile, edge, real-time |
| EfficientNet B0 | 5.3M | 21 MB | 77% | ⚡⚡⚡⚡ | Balanced accuracy/speed |
| EfficientNet B2 | 9.2M | 36 MB | 80% | ⚡⚡⚡ | Better accuracy, still efficient |
| ResNet-18 | 11.7M | 45 MB | 70% | ⚡⚡⚡⚡ | Simple, well-understood |
| ResNet-50 | 25.6M | 98 MB | 76% | ⚡⚡⚡ | Good balance |
| EfficientNet B4 | 19.3M | 75 MB | 83% | ⚡⚡ | High accuracy |
| ViT-Base | 86.6M | 330 MB | 85% | ⚡ | Maximum accuracy |
| ConvNeXt-Tiny | 28.6M | 110 MB | 82% | ⚡⚡ | Modern architecture |
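A common way to use the comparison table is to pick the smallest model that clears an accuracy target. A minimal sketch, with the size and accuracy numbers transcribed from the table above (model key names are illustrative):

```python
# Size (MB) and top-1 accuracy transcribed from the comparison table above
MODELS = {
    "mobilenet_v2":    {"size_mb": 14,  "accuracy": 0.72},
    "efficientnet_b0": {"size_mb": 21,  "accuracy": 0.77},
    "efficientnet_b2": {"size_mb": 36,  "accuracy": 0.80},
    "resnet18":        {"size_mb": 45,  "accuracy": 0.70},
    "resnet50":        {"size_mb": 98,  "accuracy": 0.76},
    "efficientnet_b4": {"size_mb": 75,  "accuracy": 0.83},
    "vit_base":        {"size_mb": 330, "accuracy": 0.85},
    "convnext_tiny":   {"size_mb": 110, "accuracy": 0.82},
}

def smallest_meeting(target_accuracy):
    """Return the smallest model (by size) whose accuracy meets the target."""
    ok = {name: m for name, m in MODELS.items() if m["accuracy"] >= target_accuracy}
    return min(ok, key=lambda name: ok[name]["size_mb"]) if ok else None

print(smallest_meeting(0.80))  # smallest model reaching 80% top-1
```

Note how the EfficientNet family dominates several size/accuracy trade-off points: ResNet-50 is both larger and less accurate than EfficientNet B2.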
Recommendations
```mermaid
graph TD
    A{Deployment Target?} --> B[Mobile / Edge]
    A --> C[Cloud Server]
    A --> D[On-Premise GPU]
    B --> E[MobileNet v2]
    B --> F[EfficientNet B0]
    C --> G{Priority?}
    G --> H[Speed: EfficientNet B0-B2]
    G --> I[Accuracy: EfficientNet B4+]
    D --> J{GPU Memory?}
    J --> K["<8GB: EfficientNet B4"]
    J --> L["8GB+: ViT-Base"]
```
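The decision tree can also be mirrored in code so the recommendation is reproducible in a pipeline. A minimal sketch — the function name, target strings, and the 8 GB threshold are illustrative assumptions taken from the diagram, not a platform API:

```python
def recommend_base_model(target, priority="speed", gpu_memory_gb=None):
    """Mirror the recommendation tree: deployment target first, then priority."""
    if target in ("mobile", "edge"):
        return ["mobilenet_v2", "efficientnet_b0"]
    if target == "cloud":
        if priority == "speed":
            return ["efficientnet_b0", "efficientnet_b2"]
        return ["efficientnet_b4"]
    if target == "on_prem_gpu":
        if gpu_memory_gb is not None and gpu_memory_gb < 8:
            return ["efficientnet_b4"]
        return ["vit_base_patch16_224"]
    raise ValueError(f"unknown deployment target: {target!r}")

print(recommend_base_model("cloud", priority="accuracy"))
```

Encoding the tree this way keeps model choice auditable: changing a recommendation is a one-line diff rather than tribal knowledge.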
Example: Choosing for Manufacturing Defect Detection
```python
# For real-time inspection on edge devices
config = {"base_model": "mobilenet_v2", "image_size": 224}

# For high-accuracy cloud-based analysis
config = {"base_model": "efficientnet_b4", "image_size": 380}

# For maximum accuracy (research/offline)
config = {"base_model": "vit_base_patch16_224", "image_size": 224}
```
Object Detection Models
| Model | Parameters | Speed (GPU) | mAP@0.5 (COCO) | Best For |
|---|---|---|---|---|
| YOLOv4-tiny | 6M | 3 ms | ~40% | Real-time, edge |
| YOLOv4 | 64M | 12 ms | ~65% | Balanced |
| YOLOv5s | 7M | 4 ms | ~56% | Fast, modern |
| YOLOv5m | 21M | 8 ms | ~64% | Balanced, modern |
| YOLOv5l | 47M | 15 ms | ~68% | Higher accuracy |
| Faster R-CNN | 41M | 50 ms | ~67% | Two-stage, more accurate |
```python
# Real-time detection on camera feed
config = {"base_model": "yolov4_tiny", "image_size": 416}

# Production detection with good accuracy
config = {"base_model": "yolov5m", "image_size": 640}
```
Text Classification Models
| Model | Parameters | Size | Speed | Best For |
|---|---|---|---|---|
| DistilBERT | 66M | 250 MB | ⚡⚡⚡⚡ | Fast inference, good accuracy |
| BERT-base | 110M | 420 MB | ⚡⚡⚡ | Standard choice |
| RoBERTa-base | 125M | 480 MB | ⚡⚡⚡ | Better pre-training |
| BERT-large | 340M | 1.3 GB | ⚡⚡ | Higher accuracy |
| DeBERTa-base | 139M | 530 MB | ⚡⚡ | State-of-the-art |
```python
# Production API with latency requirements
config = {"base_model": "distilbert-base-uncased", "max_length": 256}

# Best accuracy for offline processing
config = {"base_model": "deberta-base", "max_length": 512}
```
NER Models
| Model | Languages | Speed | Best For |
|---|---|---|---|
| spaCy sm | Per-language | ⚡⚡⚡⚡⚡ | Production, speed |
| spaCy lg | Per-language | ⚡⚡⚡ | Better accuracy |
| BERT-NER | Multilingual | ⚡⚡ | Custom entities |
| Flair | Multilingual | ⚡ | Research, complex NER |
```python
# Fast entity extraction
config = {"base_model": "en_core_web_sm"}  # spaCy small

# Custom entities with BERT
config = {
    "base_model": "bert-base-cased",
    "ner_architecture": "token_classification",
}
```
Checking Available Models
List models available for finetuning:
1. Navigate to Jobs > New Training Job
2. Select your dataset
3. Under Base Model, browse available options
4. Filter by task type, size, or architecture
```python
# List available base models for your task type
base_models = client.list_base_models(
    task_type="image_classification"  # or: object_detection, text_classification, ner
)

for model in base_models:
    print(f"{model.name}")
    print(f"  Parameters: {model.parameters / 1e6:.1f}M")
    print(f"  Size: {model.size_mb:.0f} MB")
    print(f"  Benchmark: {model.benchmark_accuracy:.1%}")
    print()
```
Domain-Specific Base Models
For specialized domains, look for models pre-trained on similar data:
| Domain | Base Models | Why Better |
|---|---|---|
| Medical imaging | MedCLIP, BiomedCLIP | Pre-trained on medical images |
| Satellite imagery | SatMAE, SSL4EO | Understands aerial perspective |
| Documents | LayoutLM, DiT | Understands document structure |
| Scientific text | SciBERT, PubMedBERT | Scientific vocabulary |
| Legal text | LegalBERT | Legal terminology |
| Code | CodeBERT, GraphCodeBERT | Programming languages |
```python
# Medical image classification
config = {"base_model": "medclip_vit_base", "image_size": 224}

# Document understanding
config = {"base_model": "layoutlm_base", "image_size": 224}
```