Data visualisation is an important component of data science. Visual representations of data help us convey meaningful insights in a form that is easy to understand. Conveniently for those who use Python as their main programming language, there are many options to explore; however, not all of them will suit every need.
Which package to choose depends on your requirements. For example, will it be used in a Jupyter Notebook or a web app? Does it need to be interactive? In this blog post, we will explore five of the most popular data visualisation packages in Python, weighing the pros and cons of each, followed by a cross-comparison of features.
Matplotlib is perhaps one of the most well-known Python packages in general and is widely used among the community for producing many different types of data visualisations and exporting them to different formats such as PNG and SVG. Furthermore, Matplotlib is versatile and can do anything from basic line charts to complex network diagrams.
- Popular choice and widely used among the community
- Figures can be exported in many different formats
- Versatile and can be used to plot different visualisation types
- Requires a lot of knowledge to build complex plots
- Documentation can be hard to navigate
- Fairly low-level
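To give a feel for the Matplotlib workflow described above, here is a minimal sketch that draws a basic line chart and exports it to both PNG and SVG. The data and the helper names `make_line_chart` and `export_figure` are purely illustrative assumptions, and the `Agg` backend is selected only so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt


def make_line_chart():
    """Build a basic line chart and return the Figure (illustrative data)."""
    x = [0, 1, 2, 3, 4]
    y = [value ** 2 for value in x]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, y, marker="o", label="y = x^2")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title("A basic Matplotlib line chart")
    ax.legend()
    return fig


def export_figure(fig, fmt):
    """Render the figure into an in-memory buffer in the given format."""
    buffer = io.BytesIO()
    fig.savefig(buffer, format=fmt)
    return buffer.getvalue()


fig = make_line_chart()
png_bytes = export_figure(fig, "png")
svg_bytes = export_figure(fig, "svg")
```

Even for a simple chart, the axes, labels and legend are all set explicitly, which illustrates why Matplotlib is often described as fairly low-level. To write a file instead of a buffer, pass a path such as `"chart.png"` to `fig.savefig`.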
Seaborn inherits many of the features of Matplotlib; it describes itself as "a Python data visualization library based on matplotlib". Seaborn attempts to make it easier to produce "attractive and informative" statistical graphics by providing a high-level interface that Matplotlib lacks.
- Easy to get started
- Features a wider selection of data visualisations
- Requires little customisation and is ready out of the box
- Only really used for complex data and not for simple tasks
- In some cases, the data may need to be pre-processed before plotting
- Lack of customisation
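As a rough illustration of the Seaborn points above, the sketch below reshapes a small "wide" table into the long (tidy) form that Seaborn's plotting functions generally expect, then draws a bar plot of the group means. The data and column names are made up for the example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative "wide" data: one column per group.
wide_df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0]})

# Pre-processing step: melt into long form, one row per observation.
long_df = wide_df.melt(var_name="group", value_name="value")

sns.set_theme()  # apply Seaborn's default styling
fig, ax = plt.subplots()
sns.barplot(data=long_df, x="group", y="value", ax=ax)  # bar height is the group mean
ax.set_title("Mean value per group")
```

The plot itself is a single call, and the styling comes from `sns.set_theme()` rather than from manual configuration. Because Seaborn draws onto a regular Matplotlib `Axes`, further tweaks fall back to the Matplotlib API.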
Plotly is a service that provides an open-source Python package for creating complex, web-based, interactive data visualisations. The package can be used to build dashboards, charts and maps, which can be integrated with the company's online service.
- Used across multiple programming languages, not just Python
- Allows users to create interactive plots
- Customisable plots
- Users may be nudged towards a paid service
- A little challenging to use and requires a lot of research
- Requires writing a lot of code
- Can be used to create many different data visualisations
- Heavily customisable
- Supports many different web browsers
- Not as well known
- Complex framework for modelling data
- Resulting plots don't look as nice as other packages
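Assuming the points above refer to Bokeh, the remaining package named in the conclusion, a minimal sketch of its workflow might look like the following. It builds a figure from glyphs (a line plus point markers) and renders it to a standalone HTML page. The data and labels are illustrative.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Illustrative data.
x_values = [0, 1, 2, 3, 4]
y_values = [value ** 2 for value in x_values]

plot = figure(
    title="A basic Bokeh line chart",
    x_axis_label="x",
    y_axis_label="y",
)
plot.line(x_values, y_values, legend_label="y = x^2", line_width=2)
plot.scatter(x_values, y_values, size=8)

# Render to a standalone HTML page that loads BokehJS from a CDN.
html = file_html(plot, CDN, "Bokeh line chart")
```

The resulting page can be opened in any modern browser and includes Bokeh's default pan and zoom tools.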
Much like Plotly, Streamlit is a Python package and service designed to help create interactive data visualisation apps which can be shared with others within a web browser. It provides full GUI controls allowing users to adjust data sets and plot multiple figures in real-time using a single Python script.
- Easy and simple to use
- Apps can be shared with others
- Much faster at runtime (due to caching)
- Only designed for web-based interactions
- Sharing data visualisation apps requires creating an account
- No easy option for self-hosting
To compare the packages listed in this blog post, the following attributes are used to rank the usability of certain features and highlight the differences between each package:
- Interactive (Yes/No): Does the package support interactive plots?
- Web-based (Yes/No): Can the plot be used in a web browser?
- Customisable (Out of 5 Stars): How well can the plots be customised?
- Quality of Output (Out of 5 Stars): Do the plots look decent when produced?
- Ease of Use (Out of 5 Stars): How easy is it to get started?
- Popularity: The number of stars on GitHub (as of 04/08/2022). A proxy for how large the community is.
| Package | Interactive | Web-Based | Customisable | Quality of Output | Ease of Use | Popularity |
There are many options to consider depending on your needs, and some may be better suited to you than others. While packages like Matplotlib and Seaborn are considered to be good all-rounders, they lack the interactive features seen in Plotly, Bokeh and Streamlit. Ultimately, the decision is yours! This is just one opinion of many within the data science community, and I'm sure you will disagree on some points; however, I have had the opportunity to use all five at some point in my career, so these simply serve as recommendations.