What is the "SQL Data-Quality Checks for Any Table" prompt?

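This prompt asks the model to write SQL data-quality checks for a table you describe, aimed at catching silent breakage before it shows up months later in a dashboard. When you use the result, add the past-incident check FIRST in your test suite: if you don't have a check that would have caught your last data-quality nightmare, you're not finished yet. The prompt targets ChatGPT (GPT-4) and lives in the Coding & Development category on mycopyprompt.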

What AI model is this prompt for?

This prompt is written for ChatGPT (GPT-4). It's a text/chat prompt: paste it into ChatGPT (GPT-4), or a compatible LLM such as Claude, to get the expected output.

How do I use this prompt?

1. Click the Copy button on this page to copy the full prompt.
2. Open ChatGPT (GPT-4).
3. Paste the prompt into a new conversation.
4. Replace any {placeholders} with your specifics, then send.

Most prompts produce the right output on the first try; complex ones may need 1-2 iterations.

Is this prompt free to use?

Yes, every prompt on mycopyprompt is free forever. No paywall, no signup wall for browsing or copying. You can use it for personal or commercial work; just don't redistribute the entire mycopyprompt library.

Can I modify the prompt?

Absolutely — most prompts are templates. Look for {placeholders} (curly braces) and swap them with your own values. You can also reword sections, add constraints, or chain it with other prompts.

What kind of output does this produce?

See the "Sample output" panel above — that's a real example of what ChatGPT (GPT-4) returns when this prompt runs. Your output will vary in wording but should follow the same structure and depth.

SQL Data-Quality Checks for Any Table — ChatGPT (GPT-4) prompt

The Prompt (1,346 chars)

Write me SQL data-quality checks for this table. I want to catch the kind of silent breakage that shows up 3 months later in a dashboard.

DATABASE: {postgres / mysql / bigquery / snowflake / sqlserver / sqlite}
TABLE NAME: {schema.table}
ROW COUNT (approx): {number}
COLUMNS (name + type, paste all): {paste}
USED FOR: {dashboard / api / pipeline / reporting}
KNOWN UPSTREAM SOURCES: {where_the_data_comes_from}
WHAT BAD DATA LOOKS LIKE in this table (past incidents): {real_examples_or_'unknown'}
KEYS / UNIQUENESS RULES: {primary_key + business_keys}

DELIVER:
1. **Schema-level checks** — row count vs expected range, freshness (last row added < X), duplicate primary keys.
2. **Column-level checks** — for each important column: null rate, distinct-value count, type validity, allowed values.
3. **Referential checks** — for FKs, orphans + dangling references.
4. **Business-logic checks** — pairs of fields that must agree (e.g. order_status = 'shipped' implies shipped_at IS NOT NULL).
5. **Distribution checks** — column means / counts against expected baselines (Z-score deviation).
6. **A single 'all checks in one query' version** using UNION ALL, where each check returns: check_name, status, fail_count, sample_bad_rows. Sorted so 'fail' comes first.
7. **How to schedule it** — Airflow / dbt tests / cron, with 'fail loud' guidance.
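
The following is not part of the prompt. It is a minimal sketch of the shape item 6 asks for: several checks combined with UNION ALL, each returning check_name, status, fail_count and sample_bad_rows, with failures sorted first. Everything specific in it is an assumption made for illustration: the `orders` and `customers` tables, their columns, the sample rows, and the choice of SQLite driven from Python's built-in `sqlite3` module. What the model actually returns will depend on the database and columns you paste in.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER);
CREATE TABLE orders (
    order_id     INTEGER,  -- intended key, left unenforced so duplicates can exist
    customer_id  INTEGER,
    order_status TEXT,
    shipped_at   TEXT
);
INSERT INTO customers VALUES (1), (2);
INSERT INTO orders VALUES
    (100, 1, 'shipped', '2024-01-05'),
    (101, 2, 'pending', NULL),
    (101, 2, 'pending', NULL),          -- duplicate order_id
    (102, 9, 'shipped', '2024-01-06'),  -- customer 9 does not exist
    (103, 1, 'shipped', NULL);          -- shipped but no shipped_at
""")

# One row per check. sample_bad_rows is unbounded here; on a large table
# it would need a cap (for example, a LIMIT inside a subquery).
CHECKS_SQL = """
WITH results AS (
    SELECT 'duplicate_order_id' AS check_name,
           COUNT(*) AS fail_count,
           GROUP_CONCAT(order_id) AS sample_bad_rows
    FROM (SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1)
    UNION ALL
    SELECT 'orphan_customer_id', COUNT(*), GROUP_CONCAT(o.order_id)
    FROM orders o
    LEFT JOIN customers c ON c.customer_id = o.customer_id
    WHERE o.customer_id IS NOT NULL AND c.customer_id IS NULL
    UNION ALL
    SELECT 'shipped_without_shipped_at', COUNT(*), GROUP_CONCAT(order_id)
    FROM orders
    WHERE order_status = 'shipped' AND shipped_at IS NULL
    UNION ALL
    SELECT 'null_order_status', COUNT(*), GROUP_CONCAT(order_id)
    FROM orders
    WHERE order_status IS NULL
)
SELECT check_name,
       CASE WHEN fail_count > 0 THEN 'fail' ELSE 'pass' END AS status,
       fail_count,
       sample_bad_rows
FROM results
ORDER BY CASE WHEN fail_count > 0 THEN 0 ELSE 1 END, check_name;
"""

rows = conn.execute(CHECKS_SQL).fetchall()
for row in rows:
    print(row)

# Names of the checks that failed, in the order the query returned them.
failed = [name for name, status, _count, _sample in rows if status == "fail"]
```

In a scheduled job, one way to "fail loud" would be to exit with a non-zero status whenever `failed` is non-empty, so the scheduler marks the run as failed instead of logging quietly.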
