Matplotlib, seaborn and plotly: you have probably heard these names. Visualization is essential for anyone who works with data. It supports intuitive analysis, helps tell a story, and makes work that is purely technical in origin accessible to wider audiences. For that to happen, the chart must be well constructed. As an example data source, we will use the "Titanic" dataset, a classic in the data field.
In [1]: import seaborn as sns
        import matplotlib.pyplot as plt
In [2]: titanic_df = sns.load_dataset('titanic')
        titanic_df.head()
Out[2]: [Table: first five rows of the Titanic dataset]
If we want to visualize the relationship between the survival rate and the sex of the passenger, we can do:
In [3]: titanic_df.groupby('sex')['survived'].mean().plot.bar()
Out[3]: [Figure: bar chart of survival rate by passenger sex, without title or y-axis label]
This chart indicates that female passengers had a higher survival rate than male passengers. But we were only able to deduce this because we made the chart ourselves! It has no title, and the vertical axis has no name and no stated scale, which makes it very difficult for an outside reader to interpret.
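The critique above can also be checked programmatically. The following is a minimal sketch, not part of the original notebook: `missing_chart_elements` is a hypothetical helper, and `toy_df` is a tiny made-up sample that stands in for the Titanic data so the example runs without downloading anything. It relies only on the Matplotlib getters `get_title`, `get_xlabel` and `get_ylabel`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample for illustration only; these are NOT real Titanic records.
toy_df = pd.DataFrame({
    "sex": ["female", "female", "female", "male", "male", "male"],
    "survived": [1, 1, 0, 0, 0, 1],
})


def missing_chart_elements(ax):
    """Return the names of basic descriptive elements that the Axes lacks."""
    checks = {
        "title": ax.get_title(),
        "x-axis label": ax.get_xlabel(),
        "y-axis label": ax.get_ylabel(),
    }
    # An element counts as missing when its text is empty or whitespace.
    return [name for name, text in checks.items() if not text.strip()]


# Same chain of calls as in the article, applied to the toy sample.
ax = toy_df.groupby("sex")["survived"].mean().plot.bar()
print(missing_chart_elements(ax))
```

For a chart built this way, the returned list includes the title and the y-axis label, which are exactly the gaps described above.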
So how can we improve? Let's add these elements to the chart.
In [4]: titanic_df.groupby('sex')['survived']\
            .mean()\
            .plot.bar(title='Survival rate by passenger sex')
        plt.ylabel('Survival rate (0 to 1)')
Out[4]: Text(0, 0.5, 'Survival rate (0 to 1)')
        [Figure: the same bar chart, now with a title and a labeled y-axis]
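The same improvements can be taken one step further with Matplotlib's object-oriented interface. This is a sketch under stated assumptions rather than the article's own method: `plot_survival_rate` is a hypothetical helper, `toy_df` is a made-up sample standing in for the Titanic data, and fixing the axis to the 0 to 1 range with percentage ticks is one possible design choice among several.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import PercentFormatter

# Made-up sample for illustration only; these are NOT real Titanic records.
toy_df = pd.DataFrame({
    "sex": ["female", "female", "female", "male", "male", "male"],
    "survived": [1, 1, 0, 0, 0, 1],
})


def plot_survival_rate(df):
    """Draw a titled, labeled bar chart of the mean of 'survived' per 'sex'."""
    # 'survived' holds 0/1 values, so its mean per group is the survival rate.
    rates = df.groupby("sex")["survived"].mean()

    fig, ax = plt.subplots()
    rates.plot.bar(ax=ax, title="Survival rate by passenger sex", rot=0)
    ax.set_xlabel("Passenger sex")
    ax.set_ylabel("Survival rate")

    # Show the full 0 to 1 range and print ticks as percentages,
    # so the scale is explicit to an outside reader.
    ax.set_ylim(0, 1)
    ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
    return fig, ax


fig, ax = plot_survival_rate(toy_df)
```

Returning `fig` and `ax` leaves it to the caller to display the chart with `plt.show()` or save it with `fig.savefig(...)`. With the real data, `titanic_df` could be passed in place of `toy_df`, since the function only needs the `sex` and `survived` columns.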
There is no single rule for improving a chart. You need to develop a critical eye and work out which elements a reader needs in order to understand it, so that you can present them in a clear and self-explanatory way. In this article, we saw that a few small changes to a chart, such as a title and an axis label, can make a big difference. If you enjoyed the content and want to learn more, check out our Python course and come dive into this language with us!