Pandas has got to be one of my favourite libraries… ever.
Pandas allows us to deal with data in a way that we humans can understand: with labelled columns and indexes. It lets us effortlessly import data from files such as CSVs, quickly apply complex transformations and filters to our data, and much more. It’s absolutely brilliant.
Along with NumPy and Matplotlib, I feel it helps create a really strong base for data exploration and analysis in Python. SciPy (which will be covered in the next post) is of course a major component and another absolutely fantastic library, but I feel these three are the real pillars of scientific Python.
So without further ado, let’s get on with the third post in this series on scientific Python and take a look at Pandas. Don’t forget to check out the other posts if you haven’t yet!
The first thing to do is to import the star of the show, Pandas.
import pandas as pd # This is the standard
This is the standard way to import Pandas. We don’t want to be writing ‘pandas’ all the time, but we also want to keep its names in their own namespace to avoid naming clashes, so we compromise with ‘pd’. If you look at other people’s code that uses Pandas, you will almost always see this import.
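To make the naming-clash point concrete, here is a small sketch. The helper function `concat` and the values in it are made up purely for illustration; the only real Pandas calls are `pd.Series` and `pd.concat`. Because everything from Pandas lives behind the `pd.` prefix, our own `concat` and Pandas’ `concat` can happily coexist.

```python
import pandas as pd


def concat(prefix, number):
    """Our own (hypothetical) helper; it does not clash with pd.concat."""
    return f"{prefix}{number}"


# Calls our helper, because there is no 'pd.' prefix.
label = concat("col_", 1)

# Calls the Pandas function, because of the 'pd.' prefix.
combined = pd.concat([pd.Series([1, 2]), pd.Series([3])], ignore_index=True)

print(label)
print(combined.tolist())
```

Had we written `from pandas import *` instead, one of the two `concat` definitions would have silently replaced the other, which is exactly the kind of clash the ‘pd’ alias avoids.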
THE PANDAS DATA TYPES
Pandas is based around two data types: the Series and the DataFrame.
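As a quick taste of the two types, here is a minimal sketch. The names, ages and cities are invented for illustration. Roughly speaking, a Series is a one-dimensional labelled array, and a DataFrame is a two-dimensional labelled table whose columns are each a Series.

```python
import pandas as pd

# A Series: one-dimensional data with an index of labels.
ages = pd.Series([31, 25, 47], index=["alice", "bob", "carol"], name="age")

# A DataFrame: a table with labelled rows (index) and labelled columns.
people = pd.DataFrame(
    {"age": [31, 25, 47], "city": ["Leeds", "York", "Bath"]},
    index=["alice", "bob", "carol"],
)

# Look values up by label rather than by position.
bob_age = ages["bob"]
carol_city = people.loc["carol", "city"]

# Pulling a single column out of a DataFrame gives back a Series.
age_column = people["age"]

print(bob_age)
print(carol_city)
print(type(age_column).__name__)
```

Notice that both objects are accessed with labels such as "bob" and "city", which is the human-friendly way of working with data mentioned at the start of the post.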