Project: Cleaning a customer database with Python Pandas

tramngocnguyenhcm

In this project, I use simple code to clean a database containing customer information. The goal is to create a database for the customer service team, extracted from the company's main database, which they can use to call customers who have given permission to receive consulting services.




Here are the steps involved in the cleaning process:

  1. Import an Excel file into Pandas using Jupyter Notebook.

  2. Remove duplicated rows from the database.

  3. Eliminate unused columns.

  4. Clean the data using functions such as .strip, .replace, and .split.

  5. Convert float-type data into string data using apply() with lambda functions.

  6. Format data values according to our desired styles using apply() with lambda functions.

  7. Split values from one column into multiple columns.

  8. Replace "n/a" data using .replace or fill null data using .fillna.

  9. Remove data based on specific conditions.

  10. Drop null values in a column using .dropna.

  11. Reset the index column using reset_index.
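
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the notebook used in the project: the column names (Customer_ID, Last_Name, Phone, Address, Do_Not_Contact, Internal_Note), the sample rows, and the file name in the comment are all made up for the example. A small in-memory DataFrame stands in for the Excel file so the snippet runs on its own.

```python
import pandas as pd

# Step 1 (assumed): in Jupyter this would be something like
#   raw = pd.read_excel("customers.xlsx")   # hypothetical file name
# Here a hypothetical in-memory sample stands in for the Excel file.
raw = pd.DataFrame({
    "Customer_ID": [1, 2, 3, 3, 4, 5],
    "Last_Name": ["/Nguyen", "Tran_", "...Le", "...Le", "Pham", " Hoang "],
    "Phone": [1234567890.0, 9876543210.0, 5551234567.0,
              5551234567.0, None, 2223334444.0],
    "Address": ["12 Oak St, Springfield", "34 Pine Ave",
                "56 Elm Rd, Riverton", "56 Elm Rd, Riverton",
                "78 Birch Ln, Lakeside", "90 Cedar Ct, Hilltop"],
    "Do_Not_Contact": ["N", "Yes", "n/a", "n/a", "No", "Y"],
    "Internal_Note": ["a", "b", "c", "c", "d", "e"],
})

# Step 2: remove fully duplicated rows.
df = raw.drop_duplicates()

# Step 3: drop a column the customer service team does not need.
df = df.drop(columns=["Internal_Note"])

# Step 4: strip stray characters from both ends of the last name.
df["Last_Name"] = df["Last_Name"].str.strip("/._ ")

# Step 5: convert float phone numbers to strings; missing values stay null.
df["Phone"] = df["Phone"].apply(
    lambda x: str(int(x)) if pd.notna(x) else None
)

# Step 6: format the digits as XXX-XXX-XXXX.
df["Phone"] = df["Phone"].apply(
    lambda s: f"{s[:3]}-{s[3:6]}-{s[6:]}" if isinstance(s, str) else None
)

# Step 7: split the address into street and city at the first comma.
parts = df["Address"].str.split(",", n=1, expand=True)
df["Street"] = parts[0].str.strip()
df["City"] = parts[1].str.strip()
df = df.drop(columns=["Address"])

# Step 8: normalise the contact flag and fill missing cities.
df["Do_Not_Contact"] = df["Do_Not_Contact"].replace(
    {"n/a": "", "Y": "Yes", "N": "No"}
)
df["City"] = df["City"].fillna("")

# Step 9: remove customers who asked not to be contacted.
df = df[df["Do_Not_Contact"] != "Yes"]

# Step 10: drop rows that have no phone number to call.
df = df.dropna(subset=["Phone"])

# Step 11: rebuild a clean 0..n-1 index after the rows were removed.
df = df.reset_index(drop=True)

print(df)
```

Passing drop=True to reset_index discards the old index instead of keeping it as an extra column, which is usually what you want after filtering rows.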


Final result of the database


Disclaimer: This data is dummy data.




©2021 by Tram Nguyen
