Data Annotation and Validation Process

How Shovels ensures high-quality data through annotation and validation

Our data annotation process involves multiple independent annotators labeling each record. When their responses diverge, we manually review and resolve the discrepancies. The validation sample size is proportionate to each category's representation in the dataset (1-5% of overall data). To avoid bias, annotators independently solve tasks rather than validating model outputs. This approach creates a "golden dataset" of correct answers that can benchmark any new model outputs across iterations without requiring new validation rounds.
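For illustration only, the sketch below shows how a golden dataset of resolved annotations could be used to score a new set of model outputs without running another validation round. The record IDs, labels, and function names here are hypothetical examples, not Shovels' actual pipeline.

```python
def resolve_label(annotations):
    """Return the unanimous label, or None if annotators diverge
    (diverging records are sent to manual review)."""
    return annotations[0] if len(set(annotations)) == 1 else None

def benchmark(model_outputs, golden):
    """Compare model outputs against the golden dataset and return accuracy."""
    if not golden:
        return 0.0
    correct = sum(1 for record_id, label in golden.items()
                  if model_outputs.get(record_id) == label)
    return correct / len(golden)

# Hypothetical records: a golden dataset built from resolved annotations,
# and outputs from a new model iteration to benchmark against it.
golden = {"record_001": "roofing", "record_002": "solar"}
model_outputs = {"record_001": "roofing", "record_002": "hvac"}
print(f"Accuracy vs. golden dataset: {benchmark(model_outputs, golden):.0%}")
```

Because the golden dataset holds independently verified answers, the same benchmark can be rerun against each new model iteration, which is what removes the need for a fresh validation round every time.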