
    Versioning and Testing Data Solutions: Applying CI and Unit Tests on Interview-style Queries

    By gvfx00@gmail.com | February 11, 2026 | 11 min read


    [Image by Author]

     


    # Introduction

     
    Everyone focuses on solving the problem, but almost no one tests the solution. Sometimes, a perfectly working script can break with just one new row of data or a slight change in the logic.

    In this article, we will solve a Tesla interview question in Python and show how versioning and unit tests turn a fragile script into a reliable solution by following three steps. We will start with the interview question and end with automated testing using GitHub Actions.

     

    [Image by Author]

    First, we will solve a real interview question from Tesla. Next, we will add unit tests so the solution stays reliable over time. Finally, we will use GitHub Actions and version control to automate testing and track changes.

     

    # Solving A Real Interview Question From Tesla

     

    New Products
    Calculate the net change in the number of products launched by companies in 2020 compared to 2019. Your output should include the company names and the net difference.
    (Net difference = Number of products launched in 2020 – The number launched in 2019.)

     

    In this interview question from Tesla, you are asked to measure product growth across two years.

    The task is to return each company’s name along with the difference in product count between 2020 and 2019.

     

    // Understanding The Dataset

    Let us first look at the dataset we are working with. Here are the column names.

     

    Column Name     Data Type
    year            int64
    company_name    object
    product_name    object

     

    Let us preview the dataset.
     

    year    company_name    product_name
    2019    Toyota          Avalon
    2019    Toyota          Camry
    2020    Toyota          Corolla
    2019    Honda           Accord
    2019    Honda           Passport

     

    This dataset contains three columns: year, company_name, and product_name. Each row represents a car model released by a company in a given year.
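To experiment locally, the five preview rows can be rebuilt as a small DataFrame. This is only a sketch: it contains the preview rows shown above, not the full dataset, and the variable name car_launches is chosen to match the code that follows.

```python
import pandas as pd

# Only the five preview rows shown above, not the full interview dataset
car_launches = pd.DataFrame({
    "year": [2019, 2019, 2020, 2019, 2019],
    "company_name": ["Toyota", "Toyota", "Toyota", "Honda", "Honda"],
    "product_name": ["Avalon", "Camry", "Corolla", "Accord", "Passport"],
})

# Inspect the column types and the number of distinct products per company and year
print(car_launches.dtypes)
print(car_launches.groupby(["company_name", "year"])["product_name"].nunique())
```

Having a tiny frame like this at hand makes it easy to check each step of the solution by eye.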

     

    // Writing The Python Solution

    We will use basic pandas operations to group, compare, and calculate the net product change per company. The code first splits the data into subsets for 2019 and 2020.

    Next, it merges the two subsets on company_name, so each 2020 product of a company is paired with each of that company's 2019 products.

    import pandas as pd
    
    # car_launches is assumed to be an already-loaded DataFrame
    # with the columns year, company_name, and product_name
    df_2020 = car_launches[car_launches['year'].astype(str) == '2020']
    df_2019 = car_launches[car_launches['year'].astype(str) == '2019']
    
    # Pair each company's 2020 products with its 2019 products
    df = pd.merge(df_2020, df_2019, how='outer', on=[
        'company_name'], suffixes=['_2020', '_2019']).fillna(0)

     

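    The remaining lines drop pairs in which the same product name appears in both years, count the unique products per company for each year, and subtract the 2019 count from the 2020 count to get the net difference. Here is the entire code.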

    import pandas as pd
    
    # car_launches is assumed to be an already-loaded DataFrame
    # with the columns year, company_name, and product_name
    df_2020 = car_launches[car_launches['year'].astype(str) == '2020']
    df_2019 = car_launches[car_launches['year'].astype(str) == '2019']
    
    # Pair each company's 2020 products with its 2019 products
    df = pd.merge(df_2020, df_2019, how='outer', on=[
        'company_name'], suffixes=['_2020', '_2019']).fillna(0)
    
    # Drop pairs where the same product name appears in both years
    df = df[df['product_name_2020'] != df['product_name_2019']]
    
    # Count distinct product names per company for each year
    df = df.groupby(['company_name']).agg(
        {'product_name_2020': 'nunique', 'product_name_2019': 'nunique'}).reset_index()
    
    # Net difference: 2020 count minus 2019 count
    df['net_new_products'] = df['product_name_2020'] - df['product_name_2019']
    result = df[['company_name', 'net_new_products']]
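For comparison, the same quantity can be computed without a merge by counting distinct products per company and year and then pivoting the years into columns. This is an alternative sketch, not the solution used in this article: the function name net_new_products_by_counts is hypothetical, and the sample frame holds only the five preview rows.

```python
import pandas as pd

def net_new_products_by_counts(car_launches: pd.DataFrame) -> pd.DataFrame:
    """Alternative sketch: count distinct products per company and year, then subtract."""
    counts = (
        car_launches.assign(year=car_launches["year"].astype(int))
        .groupby(["company_name", "year"])["product_name"]
        .nunique()
        .unstack(fill_value=0)                         # one column per year
        .reindex(columns=[2019, 2020], fill_value=0)   # guarantee both years exist
    )
    net = (counts[2020] - counts[2019]).rename("net_new_products")
    return net.reset_index()

# Preview rows only, so the numbers below are not the full expected output
sample = pd.DataFrame({
    "year": [2019, 2019, 2020, 2019, 2019],
    "company_name": ["Toyota", "Toyota", "Toyota", "Honda", "Honda"],
    "product_name": ["Avalon", "Camry", "Corolla", "Accord", "Passport"],
})
result = net_new_products_by_counts(sample)
print(result)  # Honda: -2, Toyota: -1
```

In this version, a company with no launches in one of the two years gets a count of zero for that year, which is why Honda comes out at -2 on the preview rows.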

     

    // Viewing The Expected Output

    Here is the expected output.
     

    company_name    net_new_products
    Chevrolet       2
    Ford            -1
    Honda           -3
    Jeep            1
    Toyota          -1

     

    # Making The Solution Reliable With Unit Tests

     
    Solving a data problem once does not mean it will keep working. A new row or a logic tweak can silently break your script. For instance, imagine you accidentally rename a column in your code, changing this line:

    df['net_new_products'] = df['product_name_2020'] - df['product_name_2019']

     

    to this:

    df['new_products'] = df['product_name_2020'] - df['product_name_2019']

     

    The subtraction still runs, but the final column selection asks for net_new_products, which no longer exists, so the code raises a KeyError. Unit tests catch this kind of break early. They check whether the same input still gives the same output, every time. If something breaks, the test fails and shows where. We will do this in three steps, from turning the interview question’s solution into a function to writing a test that checks the output against what we expect.
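The rename can be reproduced on a tiny frame to see what actually fails. This is a minimal sketch: the one-row DataFrame is made up and stands in for the aggregated counts.

```python
import pandas as pd

# Hypothetical one-row frame standing in for the aggregated counts
df = pd.DataFrame({
    "company_name": ["Toyota"],
    "product_name_2020": [2],
    "product_name_2019": [2],
})

# The accidental rename: the subtraction itself still succeeds
df["new_products"] = df["product_name_2020"] - df["product_name_2019"]

error = None
try:
    # The selection still asks for the old column name
    result = df[["company_name", "net_new_products"]]
except KeyError as exc:
    error = exc

print(type(error).__name__)  # KeyError
```

The arithmetic is untouched by the rename; it is the later lookup of the old column name that fails.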

     

    [Image by Author]

     

    // Turning The Script Into A Reusable Function

    Before writing tests, we need to make our solution reusable and easy to test. Converting it into a function allows us to run it with different datasets and verify the output automatically, without having to rewrite the same code every time. We changed the original code into a function that accepts a DataFrame and returns a result. Here is the code.

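    import pandas as pd
    
    def calculate_net_new_products(car_launches):
        """Return each company's 2020 product count minus its 2019 product count."""
        df_2020 = car_launches[car_launches['year'].astype(str) == '2020']
        df_2019 = car_launches[car_launches['year'].astype(str) == '2019']
    
        # Pair each company's 2020 products with its 2019 products.
        # Caveat: for a company missing from one year, fillna(0) leaves a
        # placeholder 0 that nunique below may count as one product.
        df = pd.merge(df_2020, df_2019, how='outer', on=[
            'company_name'], suffixes=['_2020', '_2019']).fillna(0)
    
        # Drop pairs where the same product name appears in both years
        df = df[df['product_name_2020'] != df['product_name_2019']]
    
        # Count distinct product names per company for each year
        df = df.groupby(['company_name']).agg({
            'product_name_2020': 'nunique',
            'product_name_2019': 'nunique'
        }).reset_index()
    
        df['net_new_products'] = df['product_name_2020'] - df['product_name_2019']
        return df[['company_name', 'net_new_products']]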

     

    // Defining Test Data And Expected Output

    Before running any tests, we need to know what “correct” looks like. Defining the expected output gives us a clear benchmark to compare our function’s results against. So, we will build a small test input and clearly define what the correct output should be.

    import pandas as pd
    
    # Sample test data
    test_data = pd.DataFrame({
        'year': [2019, 2019, 2020, 2020],
        'company_name': ['Toyota', 'Toyota', 'Toyota', 'Toyota'],
        'product_name': ['Camry', 'Avalon', 'Corolla', 'Yaris']
    })
    
    # Expected output
    expected_output = pd.DataFrame({
        'company_name': ['Toyota'],
        'net_new_products': [0]  # 2 in 2020 - 2 in 2019
    })

     

    // Writing And Running Unit Tests

    The following test code checks whether your function returns exactly what you expect. If not, the test fails and tells you why, down to the last row or column.

    The test below uses the function from the previous step (calculate_net_new_products()) and the expected output we defined.

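    import unittest
    
    import pandas as pd
    
    # Imports assume the file layout used later in this article:
    # solution.py holds the function, expected_output.py holds the test data
    from solution import calculate_net_new_products
    from expected_output import test_data, expected_output
    
    class TestProductDifference(unittest.TestCase):
        def test_net_new_products(self):
            result = calculate_net_new_products(test_data)
            # Sort and reset the index so row order does not affect the comparison
            result = result.sort_values('company_name').reset_index(drop=True)
            expected = expected_output.sort_values('company_name').reset_index(drop=True)
    
            # Raises AssertionError describing the first mismatch found
            pd.testing.assert_frame_equal(result, expected)
    
    if __name__ == '__main__':
        unittest.main()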
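The sort_values and reset_index calls in the test matter because pd.testing.assert_frame_equal compares row order and index labels, not just the set of rows. The sketch below illustrates this with made-up values; the normalize helper is hypothetical and simply wraps the two calls used in the test.

```python
import pandas as pd

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Sort by company and rebuild the index so row order cannot affect the comparison."""
    return df.sort_values("company_name").reset_index(drop=True)

# Made-up expected values for illustration
expected = pd.DataFrame({
    "company_name": ["Honda", "Toyota"],
    "net_new_products": [-2, -1],
})
shuffled = expected.iloc[::-1]  # same rows, reversed order and index

try:
    pd.testing.assert_frame_equal(shuffled, expected)
    raw_comparison_passed = True
except AssertionError as exc:
    raw_comparison_passed = False
    print(exc)  # describes which part of the frames differs

# After normalizing both sides, the comparison passes silently
pd.testing.assert_frame_equal(normalize(shuffled), normalize(expected))
```

Normalizing both frames keeps the test focused on the values rather than on the order in which groupby or a merge happens to return rows.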

     

    # Automating Tests With Continuous Integration

     
    Writing tests is a good start, but only if they actually run. You could run the tests manually after every change, but that does not scale, it is easy to forget, and team members may use different setups. Continuous Integration (CI) solves this by running tests automatically whenever code changes are pushed to the repository.

    GitHub Actions is a CI tool, free for public repositories, that runs your tests automatically on every push, so your solution stays reliable even when the code, data, or logic changes. Here is how to apply CI with GitHub Actions.

     

    [Image by Author]

     

    // Organizing Your Project Files

    To apply CI to an interview query, you first need to push your solution to a GitHub repository. (To learn how to create a GitHub repo, please read this).

    Then, set up the following files:

    • solution.py: The interview question’s solution, wrapped in the calculate_net_new_products() function
    • expected_output.py: The test input and expected output
    • test_solution.py: The unit test written with unittest
    • requirements.txt: Dependencies (e.g., pandas)
    • .github/workflows/test.yml: GitHub Actions workflow file
    • data/car_launches.csv: Input dataset used by the solution

     

    // Understanding The Repository Layout

    The repository is organized this way so GitHub Actions can find everything it needs in your GitHub repository without extra setup. It keeps things simple, consistent, and easy for both you and others to work with.

    my-query-solution/
    ├── data/
    │   └── car_launches.csv
    ├── solution.py
    ├── expected_output.py 
    ├── test_solution.py
    ├── requirements.txt
    └── .github/
        └── workflows/
            └── test.yml
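Before pushing, it can help to confirm that the layout above is complete. The helper below is a hypothetical convenience script, not part of the article's repository: it only checks that each path from the tree exists under a given root.

```python
from pathlib import Path

# Paths taken from the repository tree above
EXPECTED_FILES = [
    "data/car_launches.csv",
    "solution.py",
    "expected_output.py",
    "test_solution.py",
    "requirements.txt",
    ".github/workflows/test.yml",
]

def missing_files(repo_root: str) -> list[str]:
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]

if __name__ == "__main__":
    # Run from the repository root; an empty list means nothing is missing
    print(missing_files("."))
```

An empty list means every file from the tree is present; anything else names the files still to be added.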

     

    // Creating A GitHub Actions Workflow

    With the other files in place, the last one you need is test.yml. This file tells GitHub Actions how to run your tests automatically when code changes.

    First, we name the workflow and tell GitHub when to run it.

    name: Run Unit Tests
    
    on:
      push:
        branches: [ main ]
      pull_request:
        branches: [ main ]

     

    This means the tests will run every time someone pushes to the main branch or opens a pull request that targets it. Next, we create a job that defines what will happen inside the workflow.

    jobs:
      test:
        runs-on: ubuntu-latest

     

    The job runs on GitHub’s Ubuntu environment, which gives you a clean setup each time. Now we add steps inside that job. The first one checks out your repository so GitHub Actions can access your code.

        steps:
        - name: Checkout repository
          uses: actions/checkout@v4

     

    Then we set up Python and choose the version we want to use.

        - name: Set up Python
          uses: actions/setup-python@v5
          with:
            python-version: "3.10"

     

    After that, we install all the dependencies listed in requirements.txt.

        - name: Install dependencies
          run: |
            python -m pip install --upgrade pip
            pip install -r requirements.txt

     

    Finally, we run all unit tests in the project.

        - name: Run unit tests
          run: python -m unittest discover

     

    This last step runs your tests automatically and shows any errors if something breaks. Here is the full file for reference:

    name: Run Unit Tests
    
    on:
      push:
        branches: [ main ]
      pull_request:
        branches: [ main ]
    
    jobs:
      test:
        runs-on: ubuntu-latest
        
        steps:
        - name: Checkout repository
          uses: actions/checkout@v4
          
        - name: Set up Python
          uses: actions/setup-python@v5
          with:
            python-version: "3.10"
            
        - name: Install dependencies
          run: |
            python -m pip install --upgrade pip
            pip install -r requirements.txt
            
        - name: Run unit tests
          run: python -m unittest discover
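The discover command collects files whose names match test*.py, which is why the test file is called test_solution.py. The sketch below reproduces that discovery from Python on a throwaway directory; the run_discovery helper and the demo test file are hypothetical and exist only for this illustration.

```python
import tempfile
import textwrap
import unittest
from pathlib import Path

def run_discovery(start_dir: str) -> unittest.TestResult:
    """Collect test*.py files in start_dir and run them, like `python -m unittest discover`."""
    suite = unittest.defaultTestLoader.discover(start_dir, pattern="test*.py")
    return unittest.TextTestRunner(verbosity=2).run(suite)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical test file, written only for this demonstration
    Path(tmp, "test_discovery_demo.py").write_text(textwrap.dedent('''
        import unittest

        class TestDemo(unittest.TestCase):
            def test_subtraction(self):
                self.assertEqual(2 - 2, 0)
    '''))
    result = run_discovery(tmp)

print(result.testsRun, result.wasSuccessful())  # 1 True
```

A test file that does not match the pattern, for example one named check_solution.py, would not be collected by the default discovery.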

     

    // Reviewing Test Results In GitHub Actions

    Once you have pushed all the files to your GitHub repository, open the Actions tab.

    [Screenshot: the Actions tab in the repository]

    There you will see a green checkmark next to the workflow run if everything ran successfully.

    [Screenshot: a workflow run with a green checkmark]
     

    Click the workflow run (named “Update test.yml” in this example) to see what actually happened. You will get a full breakdown, from setting up Python to running the tests. If all tests pass:

    • Each step will have a check mark.
    • That confirms your code behaves as intended at every stage, based on the tests you defined.

    [Screenshot: step-by-step breakdown of the successful run]

    As you can see, our unit test completed in just 1 second, and the entire CI process finished in 17 seconds, verifying everything from setup to test execution.

     

    // When A Small Change Breaks The Test

    Not every change will pass the test. Let us say you accidentally rename a column in solution.py and push the change to GitHub, for example:

    # Original (works fine)
    df['net_new_products'] = df['product_name_2020'] - df['product_name_2019']
    
    # Accidental change
    df['new_products'] = df['product_name_2020'] - df['product_name_2019']

     

    Let us now see the test results in the Actions tab.

    [Screenshot: the failed workflow run]

    The run failed. Let us click it to see the details.

    [Screenshot: details of the failed run]

    The unit tests did not pass, so let us click “Run unit tests” to see the full error message.

    [Screenshot: error output of the “Run unit tests” step]

    As you can see, our tests found the issue with a KeyError: 'net_new_products'. The function still tries to return a column named net_new_products, which no longer exists after the rename.

    That is how you keep your code under constant check. If you or someone on your team makes a mistake, the tests act as your safety net.

     

    # Using Version Control To Track And Test Changes

     
    Versioning helps you track every change you make, whether it is in your logic, your tests, or your dataset. Say you want to try a new way to group the data. Instead of editing the main script directly, create a new branch:

    git checkout -b refactor-grouping

     

    Here is what is next:

    • Make your changes, commit them, and run the tests.
    • If all tests pass, meaning the code works as expected, merge it.
    • If not, fix the branch or delete it; the main branch stays unaffected either way.

    That is the power of version control: every change is tracked, testable, and reversible.

     

    # Final Thoughts

     
    Most people stop after getting the right answer. But real-world data solutions ask more than that. They reward those who can build queries that hold up over time, not just once.

    With versioning, unit tests, and a simple CI setup, even a one-off interview question becomes a reliable, reusable part of your portfolio.
     
     

    Nate Rosidi is a data scientist who also works in product strategy. He is an adjunct professor teaching analytics and the founder of StrataScratch, a platform that helps data scientists prepare for interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.


