Efficient Python part 4: Basic Pandas Optimisation
Welcome to the fourth and final part of the series on how to write more efficient Python code. In the first part, we covered the very basics: Pythonic code, built-in functions, and a bit on how NumPy can improve speed. In the second part, we learned how to measure which code is faster so that we can compare efficiency. In the third part, we looked at some ways to make our code more performant using more built-in functions.
As a recap, the series comprises:
- Efficient Python Part 1: Start with the Basics
- Efficient Python Part 2: Tools for Evaluating Your Code
- Efficient Python Part 3: Increasing code performance
- Efficient Python Part 4: Optimisation for Pandas
Basic Pandas Optimisation
Intro to pandas DataFrame iteration
So far we have only used data structures from the standard library. Now we are going to use the pandas library and look at best practices for iterating over pandas DataFrames. To recap:
- pandas is a data analysis library
- Its main data structure is the DataFrame
- A DataFrame holds tabular data with labeled rows and columns
- Built on top of the NumPy array structure
Creating the pandas DataFrame for our studies: baseball data!
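To make the recap concrete, here is a minimal sketch of a DataFrame with labeled rows and columns, sitting on top of NumPy arrays. The name `baseball_df`, the column names, and every value in it are made-up placeholders for illustration only. They are not the real baseball dataset used in this series.

```python
import numpy as np
import pandas as pd

# Hypothetical placeholder rows, NOT the real baseball dataset.
baseball_df = pd.DataFrame(
    {
        "Team": ["AAA", "BBB", "CCC"],
        "Year": [2010, 2011, 2012],
        "W": [90, 75, 81],      # wins (made-up numbers)
        "G": [162, 162, 162],   # games played (made-up numbers)
    }
)

# Rows are labeled by the index, columns by their names.
row_labels = baseball_df.index.tolist()
column_labels = baseball_df.columns.tolist()
print(row_labels)
print(column_labels)

# The values of a column can be exposed as a NumPy array.
wins = baseball_df["W"].to_numpy()
print(type(wins))
```

Since no index is passed, pandas labels the rows with a default integer index. The `to_numpy()` call returns the column's values as a NumPy array.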