# Introduction
Everyone focuses on writing SQL that “works,” but very few test whether it keeps working tomorrow. A single new row, a changed assumption, or a refactor can break a query silently. This article walks through a complete workflow, showing how to treat SQL like software: versioned, tested, and automated. We’ll use a real Amazon interview question about identifying customers with the highest daily spending. Then we will convert the SQL into a testable component, define expected outputs, and automate testing with continuous integration and continuous deployment (CI/CD).
# Step 1: Solving an Interview-Style SQL Question
// Understanding the Problem
In this interview question from Amazon, you are asked to find the customers with the highest total daily order cost within a given date range (here, 2019-02-01 to 2019-05-01).
// Understanding the Dataset
There are two tables in this project: customers and orders. The customers table holds one row per customer, including an id, first_name, last_name, and city, along with a few profile columns the query does not use. The orders table holds one row per order, with, among other fields, an id, a cust_id foreign key pointing back to customers, an order_date, and a total_order_cost.
This problem is perfect for illustrating how SQL can be treated like software: the query must be correct, stable, and resistant to regressions.
// Writing the SQL Solution
The logic breaks down into three parts:
- Aggregate each customer’s total spending per day
- Rank customers by total spending for each date
- Return only the daily top spenders
Here is the final PostgreSQL solution:
WITH customer_daily_totals AS (
    SELECT
        o.cust_id,
        o.order_date,
        SUM(o.total_order_cost) AS total_daily_cost
    FROM orders o
    WHERE o.order_date BETWEEN '2019-02-01' AND '2019-05-01'
    GROUP BY o.cust_id, o.order_date
),
ranked_daily_totals AS (
    SELECT
        cust_id,
        order_date,
        total_daily_cost,
        RANK() OVER (
            PARTITION BY order_date
            ORDER BY total_daily_cost DESC
        ) AS rnk
    FROM customer_daily_totals
)
SELECT
    c.first_name,
    rdt.order_date,
    rdt.total_daily_cost AS max_cost
FROM ranked_daily_totals rdt
JOIN customers c ON rdt.cust_id = c.id
WHERE rdt.rnk = 1
ORDER BY rdt.order_date;
// Defining the Expected Output
The expected output contains one row per date in the range: the top spender's first_name, the order_date, and that customer's total spend for the day as max_cost. At this stage, most people stop.
# Step 2: Making the SQL Logic Reliable with Unit Tests
SQL breaks more easily than most people think. A changed default, a renamed column, or a new data source can introduce silent errors. Testing protects you from these issues. There are three testing steps we will cover: converting the logic into a reusable component, defining the expected output, and writing a unit test suite.
// Turning the Query into a Reusable Component
To test the SQL, we move it into a Python script so it can be executed against a controlled database and checked with a lightweight testing framework such as unittest. First, we define the query that we want to test:
query = """
WITH customer_daily_totals AS (
SELECT
o.cust_id,
o.order_date,
SUM(o.total_order_cost) AS total_daily_cost
FROM orders o
WHERE o.order_date BETWEEN '2019-02-01' AND '2019-05-01'
GROUP BY o.cust_id, o.order_date
),
ranked_daily_totals AS (
SELECT
cust_id,
order_date,
total_daily_cost,
RANK() OVER (
PARTITION BY order_date
ORDER BY total_daily_cost DESC
) AS rnk
FROM customer_daily_totals
)
SELECT
c.first_name,
rdt.order_date,
rdt.total_daily_cost AS max_cost
FROM ranked_daily_totals rdt
JOIN customers c ON rdt.cust_id = c.id
WHERE rdt.rnk = 1
ORDER BY rdt.order_date;
"""
// Defining Test Input and Expected Output
Next, we must create a controlled sample dataset to test against.
test_customers = [
    # (id, first_name)
    (15, "Mia"),
    (7, "Jill"),
    (3, "Farida")
]

test_orders = [
    # (id, cust_id, order_date, total_order_cost)
    (1, 3, "2019-03-04", 100),
    (2, 3, "2019-03-01", 80),
    (4, 7, "2019-02-01", 25),
    (6, 15, "2019-02-01", 100)
]
We also create the expected output:
expected = [
    # (first_name, order_date, max_cost)
    ("Mia", "2019-02-01", 100),
    ("Farida", "2019-03-01", 80),
    ("Farida", "2019-03-04", 100)
]
Why? Because defining the expected output up front creates a fixed benchmark: if the query's behavior ever drifts, the comparison against these rows fails immediately.
// Writing SQL Unit Tests
Now we have the query defined, the test inputs, and the expected outputs. We can write an actual unit test. The idea is simple:
- Create an isolated, in-memory database
- Load controlled test data
- Execute the SQL query
- Assert that the result obtained matches the expected output
Python’s built-in unittest framework is highly effective because it allows us to keep dependencies minimal while providing structure and repeatability. We start by creating an in-memory SQLite database:
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
Using :memory: ensures that:
- the test database is fully isolated
- no external state can affect the result
- the database is discarded automatically once the test finishes
Next, we recreate only the tables required by the query:
CREATE TABLE customers (...)
CREATE TABLE orders (...)
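The column lists are elided above. A plausible sketch of the two tables, mirroring a slightly wider production-style schema (the columns the query never touches, such as address, phone_number, and order_details, are illustrative assumptions), could be:

CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    city TEXT,
    address TEXT,       -- illustrative, not used by the query
    phone_number TEXT   -- illustrative, not used by the query
);

CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    cust_id INTEGER,           -- references customers.id
    order_date TEXT,           -- ISO-8601 date stored as text in SQLite
    order_details TEXT,        -- illustrative, not used by the query
    total_order_cost INTEGER
);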
Even though the query only uses a subset of columns, the schema mirrors a realistic production table. This reduces the risk of false confidence caused by oversimplified schemas. We then insert the controlled test data defined earlier:
cursor.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?, ?)", test_customers)
cursor.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", test_orders)
conn.commit()
At this point, the database contains a known, deterministic state, which is essential for meaningful tests. Before executing the query, we load and print the test tables using Pandas:
customers_df = pd.read_sql("SELECT id, first_name, last_name, city FROM customers", conn)
orders_df = pd.read_sql("SELECT * FROM orders", conn)
While this step is not strictly required for automation, it is very useful during development and debugging. When a test fails, being able to inspect the exact input data immediately often saves more time than re-reading the SQL, because you can see precisely what the query was given. Now we run the query under test:
result = pd.read_sql(query, conn)
The result is loaded into a DataFrame, which provides:
- structured access to rows and columns
- easy comparison with expected outputs
- readable printing for debugging
Next, we verify the results row by row. The verification logic manually compares the query output with the expected result:

all_correct = True

if len(result) != len(expected):
    all_correct = False
The first check confirms whether the number of rows returned by the query matches what we expect. A mismatch here immediately indicates missing or extra records. Next, we iterate through the expected output and compare it to the actual query result row by row:
# each expected row is (first_name, order_date, max_cost)
for i, (fname, date, cost) in enumerate(expected):
    if i < len(result):
        actual = result.iloc[i]
        if not (
            actual["first_name"] == fname
            and actual["order_date"] == date
            and actual["max_cost"] == cost
        ):
            all_correct = False
Each row is checked on all relevant dimensions:
- customer name
- order date
- aggregated daily cost
If any value differs from the expected, the test is marked as failed. Finally, the test result is summarized in a clear pass/fail message:
if all_correct and len(result) == len(expected):
    print("ALL TESTS PASSED")
else:
    print("SOME TESTS FAILED")
The database connection is then closed with conn.close(). If the tests pass, the script prints:

ALL TESTS PASSED
This test carries some assumptions worth noting:
- a stable row order (ORDER BY order_date)
- exact matches on all values
- no tolerance for ties or duplicate winners per day
The full script, ready to be used, can be seen here.
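Since Step 3 will run the suite with python -m unittest discover, the manual checks need to live in a test case the runner can discover. Here is a minimal sketch of that wrapper, assuming the query, test_customers, test_orders, and expected variables defined above sit in the same module (the class name and the simplified schema are illustrative):

import sqlite3
import unittest

import pandas as pd

# Assumes query, test_customers, test_orders, and expected are defined
# earlier in this module, exactly as shown above.


class TestDailyTopSpenders(unittest.TestCase):
    def setUp(self):
        # Fresh, isolated in-memory database for every test
        self.conn = sqlite3.connect(":memory:")
        cursor = self.conn.cursor()
        # Simplified schema: only the columns the query needs
        cursor.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT)")
        cursor.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, "
            "order_date TEXT, total_order_cost INTEGER)"
        )
        cursor.executemany("INSERT INTO customers VALUES (?, ?)", test_customers)
        cursor.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", test_orders)
        self.conn.commit()

    def tearDown(self):
        self.conn.close()

    def test_daily_top_spenders(self):
        result = pd.read_sql(query, self.conn)
        self.assertEqual(len(result), len(expected))
        for row, (fname, date, cost) in zip(result.itertuples(index=False), expected):
            self.assertEqual(row.first_name, fname)
            self.assertEqual(row.order_date, date)
            self.assertEqual(row.max_cost, cost)


if __name__ == "__main__":
    unittest.main()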
# Step 3: Automating SQL Tests with Continuous Integration and Continuous Deployment
A test suite is only useful if it runs consistently whenever it is needed. We use CI/CD to run the tests automatically whenever the code changes.
// Organizing the Project
A minimal repository structure can look like this (the test file name is illustrative; the workflow file is created in the next step):

.
├── .github/
│   └── workflows/
│       └── test_sql.yml
├── tests/
│   └── test_sql.py
└── requirements.txt
// Creating the GitHub Actions Workflow
The next step is to ensure these tests run automatically whenever the code changes. For this, we use GitHub Actions. This tool allows us to define a CI workflow that runs the SQL tests every time code is pushed or a pull request is opened.
Create the workflow file: In your repository, create the following folder structure if it doesn’t already exist: .github/workflows/. Inside this folder, create a new file called test_sql.yml. The name is not special; GitHub only cares that the file lives inside the .github/workflows/ directory. You can name it anything, but test_sql.yml keeps things clear and simple.
Define when the workflow should run: The workflow file begins by naming the workflow and declaring its triggers:
name: Run SQL Tests

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
This section defines when the workflow runs:
- on every push to the main branch
- on every pull request targeting main
In practice, this means:
- pushing directly to main will trigger the tests
- opening or updating a pull request will also trigger the tests
This helps catch SQL regressions before they get merged.
Define the test job: Next, we define a job called test:
jobs:
  test:
    runs-on: ubuntu-latest
This tells GitHub to:
- create a fresh Linux machine
- run all test steps inside it
Each workflow run starts from a clean environment, which prevents “it works on my machine” problems.
// Adding the Workflow Steps
Now we define the steps the machine should execute. In the actual file, these are nested under the test job's steps: key:
- name: Checkout repository
  uses: actions/checkout@v4
This step downloads your repository’s code into the runner so it can access your SQL files and tests.
- name: Set up Python
  uses: actions/setup-python@v5
  with:
    python-version: "3.10"
This installs Python 3.10, ensuring a consistent runtime across all runs.
- name: Install dependencies
  run: |
    python -m pip install --upgrade pip
    pip install -r requirements.txt
This installs all required Python libraries (such as Pandas) defined in requirements.txt.
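Since sqlite3 and unittest ship with the Python standard library, requirements.txt for this project can stay very small; a minimal version (the version pin is an assumption, any reasonably recent Pandas release works) might contain just:

pandas>=1.5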
- name: Run unit tests
  run: python -m unittest discover
Finally, this command:
- automatically discovers test files
- runs all SQL tests defined in the tests/ folder
- fails the workflow if any test fails
The full workflow can be found here.
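For reference, assembling the snippets above into one file (the steps: key that nests them under the test job is standard GitHub Actions syntax, not shown in the fragments), .github/workflows/test_sql.yml reads:

name: Run SQL Tests

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run unit tests
        run: python -m unittest discover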
Running the workflow: You don’t need to run this file manually. Once committed:
- pushing to main will trigger the workflow
- opening a pull request will trigger the workflow
You can view the results directly in GitHub by navigating to your repository’s Actions tab.
Each run will show whether your SQL tests passed or failed.
# Step 4: Automating Data Quality
Unit tests confirm whether the logic still returns the expected output, and CI ensures these tests run automatically. But in real data environments, the input data itself can cause failures: late-arriving rows, malformed dates, missing keys, and unexpected duplicates can break queries long before the SQL logic does. This is where data quality automation comes in. Testing and versioning form a safety net for code changes; data quality automation extends that safety net to the data itself, preventing downstream issues before they impact results.
// Understanding Why Data Quality Checks Matter for SQL Workflows
In our interview problem, the following issues could make the query return incorrect results:
- A customer’s first name is no longer unique.
- An order arrives with a negative cost.
- Dates fall outside the expected range.
- Daily aggregates contain duplicate rows for the same customer and date.
- A customer exists in orders but not in customers.
Without automated checks, these issues may silently distort results. Because SQL doesn’t raise obvious exceptions in many of these scenarios, errors spread unnoticed. Automated data quality checks detect these issues early and prevent the pipeline from running with corrupted or incomplete data.
// Turning Data Assumptions into Automated Rules
Every SQL query relies on assumptions about the data. The problem is that these assumptions are rarely written down and almost never enforced. In our daily spenders query, correctness depends not only on SQL logic, but also on the shape and validity of the input data. Instead of trusting those assumptions implicitly, we can turn them into automated data quality rules. The idea is simple:
- express each assumption as a SQL check
- run those checks automatically
- fail fast if any assumption is violated
First names must be unique: Our query joins customers by ID, but returns first_name as an identifier. If first names are no longer unique, the output becomes ambiguous.
SELECT first_name, COUNT(*)
FROM customers
GROUP BY first_name
HAVING COUNT(*) > 1;
If this query returns any rows, the assumption is broken.
Order costs must be non-negative: Negative order values usually indicate ingestion or upstream transformation issues.
SELECT *
FROM orders
WHERE total_order_cost < 0;
Even a single row here invalidates financial aggregates.
Order dates must be valid and within expectations: Dates that are missing or wildly out of range often reveal synchronization or parsing errors.
SELECT *
FROM orders
WHERE order_date IS NULL
OR order_date < '2010-01-01'
OR order_date > CURRENT_DATE;
This protects the query from silently including bad temporal data.
Every order must reference a valid customer: If an order refers to a non-existent customer, joins will silently drop rows.
SELECT o.*
FROM orders o
LEFT JOIN customers c ON c.id = o.cust_id
WHERE c.id IS NULL;
This rule ensures referential integrity before analytics logic runs.
// Converting Rules into an Automated Check
Instead of running these checks manually, we can wrap them into a single Python function that fails immediately if any rule is violated.
import pandas as pd


def run_data_quality_checks(conn):
    checks = {
        "Duplicate first names": """
            SELECT first_name
            FROM customers
            GROUP BY first_name
            HAVING COUNT(*) > 1;
        """,
        "Negative order costs": """
            SELECT *
            FROM orders
            WHERE total_order_cost < 0;
        """,
        "Invalid order dates": """
            SELECT *
            FROM orders
            WHERE order_date IS NULL
               OR order_date < '2010-01-01'
               OR order_date > CURRENT_DATE;
        """,
        "Orders without customers": """
            SELECT o.*
            FROM orders o
            LEFT JOIN customers c ON c.id = o.cust_id
            WHERE c.id IS NULL;
        """
    }

    for rule_name, query in checks.items():
        result = pd.read_sql(query, conn)
        if not result.empty:
            raise ValueError(f"Data quality check failed: {rule_name}")

    print("All data quality checks passed.")
This function:
- executes each rule
- checks whether any rows are returned
- raises an error immediately if a violation is found
At this point, data quality rules behave just like unit tests: they either pass or fail. If all checks pass, you will see:

All data quality checks passed.
Because the data quality checks run inside Python, they are automatically picked up by the existing GitHub Actions workflow:
- name: Run unit tests
  run: python -m unittest discover
The CI pipeline will stop immediately on bad data, as long as the following holds (a sketch of this wiring follows the list):
- the function is imported or executed by your test file
- a failure raises an exception
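A minimal sketch of that wiring, assuming run_data_quality_checks lives in a module named data_quality.py and reusing the controlled in-memory database pattern from Step 2 (the module name and file layout are assumptions):

import sqlite3
import unittest

from data_quality import run_data_quality_checks  # hypothetical module name


class TestDataQuality(unittest.TestCase):
    def test_data_quality_rules(self):
        # Build the same controlled in-memory database used in Step 2
        conn = sqlite3.connect(":memory:")
        cursor = conn.cursor()
        cursor.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT)")
        cursor.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, "
            "order_date TEXT, total_order_cost INTEGER)"
        )
        cursor.executemany(
            "INSERT INTO customers VALUES (?, ?)",
            [(15, "Mia"), (7, "Jill"), (3, "Farida")],
        )
        cursor.executemany(
            "INSERT INTO orders VALUES (?, ?, ?, ?)",
            [(1, 3, "2019-03-04", 100), (4, 7, "2019-02-01", 25)],
        )
        conn.commit()

        # Raises ValueError (and therefore fails the test) if any rule is violated
        run_data_quality_checks(conn)
        conn.close()


if __name__ == "__main__":
    unittest.main()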
# Concluding Remarks
Most people stop once the SQL query produces a correct answer. But real data environments reward those who make their queries stable, testable, and version-controlled.
Combining the following practices ensures the query continues to deliver reliable results, even as the data changes over time:
- a clear solution
- a reusable component
- unit tests
- automated CI
- automated data quality checks
Correctness is good, but reliability is essential.
Nate Rosidi is a data scientist and works in product strategy. He’s also an adjunct professor teaching analytics, and is the founder of StrataScratch, a platform helping data scientists prepare for their interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.
