Generate Incorrect SQL Queries

Expanded Database Coverage!

We now support multiple database schemas from the BIRD benchmark with hundreds of additional seed queries!

Available SQLite databases: Superhero, California Schools, Toxicology, Student Club


Generation Parameters

Number of Natural Language Queries

How many random natural language queries to select (max 100)

Incorrect Queries Per NL Query

Maximum number of incorrect SQL queries to generate for each natural language query

Random Seed

Seed for random selection (for reproducibility)

Database Filter

Restrict the random selection to specific databases
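
One plausible way the parameters above fit together: the database filter narrows the pool of seed queries, the seed fixes the random draw, and the requested count is capped at 100. The Python sketch below is illustrative only. The `select_queries` helper, the `db_id` and `question` keys, and the sample questions are assumptions, not the tool's actual code or data.

```python
import random

MAX_NL_QUERIES = 100  # cap stated in the parameter description

def select_queries(seed_queries, num_queries, seed, databases=None):
    """Pick a reproducible random subset of natural language queries.

    `databases` is an optional set of database ids; None means no filter.
    """
    pool = [q for q in seed_queries
            if databases is None or q["db_id"] in databases]
    k = min(num_queries, MAX_NL_QUERIES, len(pool))
    rng = random.Random(seed)  # local generator, global random state untouched
    return rng.sample(pool, k)

# Made-up seed queries for illustration (hypothetical ids and questions).
seed_queries = [
    {"db_id": "superhero", "question": "How many superheroes have blue eyes?"},
    {"db_id": "superhero", "question": "Which publisher has the most heroes?"},
    {"db_id": "toxicology", "question": "How many molecules are carcinogenic?"},
    {"db_id": "student_club", "question": "List all members majoring in Physics."},
    {"db_id": "california_schools", "question": "Which county has the most charter schools?"},
]

picked = select_queries(seed_queries, num_queries=2, seed=42,
                        databases={"superhero", "student_club"})
for q in picked:
    print(q["db_id"], "|", q["question"])
```

Calling `select_queries` again with the same seed and filter returns the same selection, which is what makes a run reproducible.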

About

This tool generates realistic but incorrect SQL queries based on natural language questions from the BIRD benchmark.

The tool uses GPT-4o to generate queries that represent common mistakes a human might make when writing SQL, with full support for multiple database schemas.

Features:

Generates syntactically valid but semantically incorrect SQL queries
Validates that incorrect queries produce different results from correct queries
Analyzes error patterns and categorizes common mistakes
Exports results in JSONL or JSON format
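
The validation feature can be pictured as running the correct and the incorrect query against the same SQLite database and comparing what they return. The Python sketch below is an assumption about how such a check might work, not the tool's implementation. The `hero` table, its rows, and the `run_query` and `results_differ` helpers are invented for illustration.

```python
import sqlite3
from collections import Counter

def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def results_differ(conn, correct_sql, candidate_sql):
    """True if the candidate executes and returns different rows.

    Rows are compared as multisets, so row order is ignored.
    """
    expected = run_query(conn, correct_sql)
    actual = run_query(conn, candidate_sql)
    return actual is not None and actual != expected

# Tiny in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hero (name TEXT, eye_colour TEXT)")
conn.executemany("INSERT INTO hero VALUES (?, ?)",
                 [("A", "Blue"), ("B", "Blue"), ("C", "Green")])

correct = "SELECT COUNT(*) FROM hero WHERE eye_colour = 'Blue'"
wrong = "SELECT COUNT(*) FROM hero WHERE eye_colour = 'Green'"
broken = "SELECT COUNT(*) FROM heroes"  # references a table that does not exist

print(results_differ(conn, correct, wrong))    # True: executes, different result
print(results_differ(conn, correct, correct))  # False: identical result
print(results_differ(conn, correct, broken))   # False: does not execute
```

Comparing multisets ignores row order, which suits most questions. A question whose answer depends on ordering (for example one that needs ORDER BY) would need an ordered comparison instead.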
