Sound Classification

Build models to classify audio and speech.

What You’ll Build

AI models for audio analysis:

  • Sound classification - Identify sounds (birds, machines, alarms)
  • Speech-to-text - Transcribe audio to text
  • Speaker diarization - Identify who spoke when

Prerequisites

  • A SeeMe.ai account (sign up)
  • Audio files (WAV, MP3, etc.)
  • (Optional) A Python environment with the seeme SDK installed

Supported Tasks

Speech-to-Text Quick Start

Supported Formats

  • WAV, MP3, FLAC, OGG, M4A
  • Up to 30 minutes per file (longer files are chunked automatically)
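
As a rough illustration of the limits above, the following sketch checks a file's extension locally and shows how a long recording could be divided into spans of at most 30 minutes. It uses only the Python standard library and does not call the seeme SDK. The helper names (`is_supported`, `wav_duration_seconds`, `chunk_boundaries`) are hypothetical, and the fixed-length splitting is an assumption made for illustration; the platform's actual chunking strategy may differ.

```python
import wave
from pathlib import Path

# Formats and per-file limit taken from the list above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}
MAX_SECONDS = 30 * 60  # 30 minutes


def is_supported(path):
    """Return True if the file extension is one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds, read from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def chunk_boundaries(duration, max_seconds=MAX_SECONDS):
    """Split [0, duration) into consecutive (start, end) spans.

    Hypothetical illustration only: fixed-length splitting is an
    assumption, not the platform's documented behavior.
    """
    duration = float(duration)
    spans = []
    start = 0.0
    while start < duration:
        end = min(start + max_seconds, duration)
        spans.append((start, end))
        start = end
    return spans
```

For example, `chunk_boundaries(4000)` yields three spans: two full 30-minute spans followed by a final span of 400 seconds. Only WAV durations are read here because the standard library has no decoders for the other formats; a third-party audio library would be needed for those.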