Small STT Eval Audio Dataset

A small speech-to-text evaluation dataset containing 92 audio samples with ground truth transcriptions. Designed for evaluating STT systems on technical vocabulary, code-switching (English/Hebrew), and various speaking styles.

Dataset Description

This dataset contains audio recordings with accompanying transcriptions across multiple categories:

Category      Count   Description
tech_github   5       GitHub-related technical vocabulary…

See the full description on the dataset page: https://huggingface.co/datasets/danielrosehill/Small-STT-Eval-Audio-Dataset.
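Since the dataset pairs each audio sample with a ground truth transcription, an STT system's output can be scored against it. The sketch below uses word error rate (WER), a common metric for this kind of evaluation; the dataset page does not say which metric the author uses, so treat WER as an assumption. The function name `word_error_rate` and the lowercase, whitespace-based tokenization are illustrative choices, not part of the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length.

    Assumption: words are compared case-insensitively after splitting on
    whitespace. Punctuation is NOT stripped, so "GitHub." != "GitHub".
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from the dataset.
    truth = "open a pull request on github"
    stt_output = "open the pull request on github"
    print(f"WER: {word_error_rate(truth, stt_output):.3f}")
```

Because the tokenization only splits on whitespace, the same function applies to mixed English/Hebrew transcripts, although any text normalization (punctuation, numerals, casing) would need to be decided separately for a real evaluation.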


Project Details

Created: Dec 10, 2025

Platform: Hugging Face Dataset

Type: Dataset
