    All About Pyjanitor’s Method Chaining Functionality, and Why It’s Useful

    By gvfx00@gmail.com, April 10, 2026

    # Introduction

     
    Working intensively with data in Python teaches an important lesson: data cleaning usually feels less like data science and more like working as a digital janitor. A typical session goes like this: you load a dataset, discover that many column names are messy, run into missing values, and end up with a pile of temporary variables, only the last of which holds the final, clean dataset.

    Pyjanitor offers a cleaner way to carry out these steps. Combined with method chaining, it turns otherwise tedious data cleaning into pipelines that are compact and readable.

    This article demystifies method chaining and shows how to apply it with Pyjanitor for data cleaning.

     

    # Understanding Method Chaining

     
    Method chaining is nothing new in programming; it is a well-established coding pattern. It consists of calling multiple methods one after another on an object, all in a single statement. You don’t need to reassign a variable after each step, because each method returns an object on which the next method in the chain is called.

    The following example illustrates the core idea. Observe how we would apply several simple modifications to a short string using “standard” Python:

    text = "  Hello World!  "
    text = text.strip()
    text = text.lower()
    text = text.replace("world", "python")

     

    The resulting value in text will be: "hello python!".

    Now, with method chaining, the same process would look like:

    text = "  Hello World!  "
    cleaned_text = text.strip().lower().replace("world", "python")

     

    Notice that the logical flow of operations applied goes from left to right: all in a single, unified chain of thought!
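Chaining works on strings because every string method returns a new string. The same idea carries over to your own classes: a method becomes chainable as soon as it returns an object. The sketch below is only an illustration; `TextCleaner` is a hypothetical helper invented for this example and is not part of Python or Pyjanitor.

```python
class TextCleaner:
    """Hypothetical helper whose methods return self so calls can be chained."""

    def __init__(self, text: str) -> None:
        self.text = text

    def strip(self) -> "TextCleaner":
        self.text = self.text.strip()
        return self  # returning the object is what makes chaining possible

    def lower(self) -> "TextCleaner":
        self.text = self.text.lower()
        return self

    def replace(self, old: str, new: str) -> "TextCleaner":
        self.text = self.text.replace(old, new)
        return self


result = TextCleaner("  Hello World!  ").strip().lower().replace("world", "python").text
print(result)  # hello python!
```

If any of these methods returned None instead of an object, the next call in the chain would fail, which is why in-place operations do not chain.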

    That is all there is to method chaining. Let’s now carry the idea over to data science with Pandas. Without chaining, a standard multi-step cleaning of a DataFrame typically looks like this:

    # Traditional, step-by-step Pandas approach
    import pandas as pd

    df = pd.read_csv("data.csv")
    df.columns = df.columns.str.lower().str.replace(' ', '_')  # normalize column names
    df = df.dropna(subset=['id'])  # drop rows without an id
    df = df.drop_duplicates()      # drop exact duplicate rows

     

    As we will see shortly, method chaining lets us build a single pipeline in which the DataFrame operations are wrapped in one pair of parentheses. We no longer need intermediate variables holding non-final DataFrames, which leaves fewer places for bugs such as reusing a stale variable. Pyjanitor makes this process smoother still.
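As a rough sketch of that idea in plain Pandas, the step-by-step snippet can be rewritten as one parenthesized chain. The small in-memory DataFrame below is a hypothetical stand-in for data.csv, used only so the example runs on its own.

```python
import pandas as pd

# Hypothetical in-memory stand-in for data.csv so the sketch is self-contained
raw = pd.DataFrame({
    "ID": [1, 2, 2, None],
    "Full Name": ["Ann", "Ben", "Ben", "Cy"],
})

cleaned = (
    raw
    .rename(columns=lambda c: c.lower().replace(" ", "_"))  # normalize column names
    .dropna(subset=["id"])                                   # drop rows without an id
    .drop_duplicates()                                       # drop exact duplicate rows
)
print(cleaned)
```

Because the column assignment is replaced by rename(), which returns a new DataFrame, every step can feed the next one and no intermediate variable is needed.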

     

    # Entering Pyjanitor: Application Example

     
    Pandas itself supports method chaining to some extent. However, some of its essential functionality was not designed with this pattern in mind. This is a core motivation behind Pyjanitor, which is based on a nearly namesake R package: janitor.
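To make the limitation concrete: assigning to df.columns is a statement, so it cannot sit in the middle of a chain. One plain-Pandas workaround is DataFrame.pipe, which passes the DataFrame through any function. The sketch below assumes a hypothetical helper, `snake_case_columns`, written for this example; it is not a Pandas or Pyjanitor function.

```python
import pandas as pd


def snake_case_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with lowercase, underscore-separated names."""
    out = frame.copy()
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_")
    return out


df = pd.DataFrame({"First Name": ["Alice", "Bob"], "Total Sales": [10, 20]})

result = (
    df
    .pipe(snake_case_columns)   # custom step slotted into the chain
    .query("total_sales > 10")  # later steps can rely on the cleaned names
)
print(result)
```

This works, but every project ends up writing its own helpers. Pyjanitor's appeal is that it ships such cleaning steps as ready-made DataFrame methods.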

    In essence, Pyjanitor is an extension for Pandas that adds a set of data-cleaning operations in a method chaining-friendly fashion. Examples of its application programming interface (API) methods include clean_names(), rename_column(), and remove_empty(). The method names are chosen to be intuitive, so a chain reads close to a plain description of the cleaning steps. Besides, Pyjanitor relies entirely on free, open-source tools and runs without trouble in cloud and notebook environments such as Google Colab.

    Let’s see how method chaining in Pyjanitor is applied through an example. We first create a small, synthetic dataset that is intentionally messy and load it into a Pandas DataFrame object.

    IMPORTANT: to avoid common errors caused by incompatible library versions, make sure you have the latest available versions of both Pandas and Pyjanitor by running !pip install --upgrade pyjanitor pandas first (the leading ! is needed only inside a notebook).

    import numpy as np
    import pandas as pd
    import janitor  # importing janitor registers Pyjanitor's methods on DataFrame

    messy_data = {
        'First Name ': ['Alice', 'Bob', 'Charlie', 'Alice', None],
        '  Last_Name': ['Smith', 'Jones', 'Brown', 'Smith', 'Doe'],
        'Age': [25, np.nan, 30, 25, 40],
        'Date_Of_Birth': ['1998-01-01', '1995-05-05', '1993-08-08', '1998-01-01', '1983-12-12'],
        'Salary ($)': [50000, 60000, 70000, 50000, 80000],
        'Empty_Col': [np.nan, np.nan, np.nan, np.nan, np.nan]
    }
    
    df = pd.DataFrame(messy_data)
    print("--- Messy Original Data ---")
    print(df.head(), "\n")

     

    Now we define a Pyjanitor method chain that applies a series of processing steps to both the column names and the data itself:

    cleaned_df = (
        df
        .rename_column('Salary ($)', 'Salary')  # 1. Manually fix tricky names BEFORE getting them mangled
        .clean_names()                          # 2. Standardize everything (makes it 'salary')
        .remove_empty()                         # 3. Drop empty columns/rows
        .drop_duplicates()                      # 4. Remove duplicate rows
        .fill_empty(                            # 5. Impute missing values
            column_names=['age'],               # CAUTION: after previous steps, assume lowercase name: 'age'
            value=df['Age'].median()            # Pull the median from the original raw df
        )
        .assign(                                # 6. Create a new column using assign
            salary_k=lambda d: d['salary'] / 1000
        )
    )
    
    print("--- Cleaned Pyjanitor Data ---")
    print(cleaned_df)

     

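    The code above is largely self-explanatory: the inline comments describe what each method does at every step of the chain.

    This is the output of our example, showing the original messy data followed by the cleaned version. Note that the stray spaces in the original column names show up as a trailing and a leading underscore (first_name_ and _last_name), and that the missing first name in the last row stays NaN because only age was imputed: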

    --- Messy Original Data ---
      First Name    Last_Name   Age Date_Of_Birth  Salary ($)  Empty_Col
    0       Alice       Smith  25.0    1998-01-01       50000        NaN
    1         Bob       Jones   NaN    1995-05-05       60000        NaN
    2     Charlie       Brown  30.0    1993-08-08       70000        NaN
    3       Alice       Smith  25.0    1998-01-01       50000        NaN
    4         NaN         Doe  40.0    1983-12-12       80000        NaN 
    
    --- Cleaned Pyjanitor Data ---
      first_name_ _last_name   age date_of_birth  salary  salary_k
    0       Alice      Smith  25.0    1998-01-01   50000      50.0
    1         Bob      Jones  27.5    1995-05-05   60000      60.0
    2     Charlie      Brown  30.0    1993-08-08   70000      70.0
    4         NaN        Doe  40.0    1983-12-12   80000      80.0
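For comparison, here is a rough plain-Pandas sketch of a similar pipeline, which may help show what the Pyjanitor methods save you from writing. It is an approximation, not a reimplementation of Pyjanitor: the column-name lambda is a simplified stand-in for clean_names(), and unlike the output above it strips surrounding whitespace, so the names come out as first_name and last_name.

```python
import numpy as np
import pandas as pd

messy = pd.DataFrame({
    'First Name ': ['Alice', 'Bob', 'Charlie', 'Alice', None],
    '  Last_Name': ['Smith', 'Jones', 'Brown', 'Smith', 'Doe'],
    'Age': [25, np.nan, 30, 25, 40],
    'Date_Of_Birth': ['1998-01-01', '1995-05-05', '1993-08-08', '1998-01-01', '1983-12-12'],
    'Salary ($)': [50000, 60000, 70000, 50000, 80000],
    'Empty_Col': [np.nan, np.nan, np.nan, np.nan, np.nan],
})

cleaned = (
    messy
    .rename(columns={'Salary ($)': 'Salary'})                        # fix the tricky name first
    .rename(columns=lambda c: c.strip().lower().replace(' ', '_'))   # simplified name cleanup
    .dropna(axis='columns', how='all')                               # drop fully empty columns
    .dropna(axis='index', how='all')                                 # drop fully empty rows
    .drop_duplicates()                                               # remove duplicate rows
    .assign(
        age=lambda d: d['age'].fillna(messy['Age'].median()),        # median from the raw data
        salary_k=lambda d: d['salary'] / 1000,                       # derived column
    )
)
print(cleaned)
```

The chain is still readable, but each step needs more arguments and lambdas than its Pyjanitor counterpart, which is the gap the library aims to close.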

     

    # Wrapping Up

     
    Throughout this article, we have learned how to use the Pyjanitor library to apply method chaining and simplify otherwise tedious data cleaning processes. The resulting code is cleaner, more expressive, and, in a manner of speaking, self-documenting, so that other developers or your future self can read the pipeline and easily understand each step of the journey from raw to ready dataset.

    Great job!
     
     

    Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.
