
Chapter 1 - Python for Data Ecosystem

This chapter briefly introduces the basic concepts relevant to data visualization without going into much detail.

Chapter two

chapter three to seven

chapter eight, nine, and ten

This book is organized around hands-on examples drawn from a variety of problem domains, from education to healthcare and from national culture to world development. Social researchers will find the topics interesting and relevant, and the exposure to social issues will be helpful for technical professionals such as data engineers and data scientists.

1.1 Background

Plotly refers to several related things. At its core, it refers to Plotly.js, a JavaScript library for data visualization. Plotly.js is itself built on another JavaScript library, D3.js. D3 stands for Data-Driven Documents. D3.js brings data to life by manipulating documents using HTML, SVG, CSS, and JavaScript.

In the context of this book, Plotly refers to Plotly Python, a Python library based on Plotly.js. Besides Python, Plotly.js has been made available for other programming languages, including R, Julia, and MATLAB.

Plotly also refers to the Canadian company that developed the aforementioned Plotly Python library, of which Plotly Express is a part. The company also developed Dash and Chart Studio on top of Plotly Python. Dash provides a Python framework for developing interactive dashboards and web applications. Chart Studio is a web-based drag-and-drop tool for generating interactive visualizations without the need for coding.

1.2 Python Data Visualization Ecosystem

There is no shortage of data visualization libraries in Python. Matplotlib, Seaborn, Plotly, Altair, and Bokeh are among the most popular.

Visualization libraries can be classified along two dimensions: static vs. interactive and low-level vs. high-level.

Static libraries produce only static visualizations: still images with no support for user interaction. Interactive libraries produce dynamic visualizations that users can interact with via mouse movement and screen touch, which makes them great tools for exploratory data analysis. Interactive libraries also let users download charts as static images for embedding in publications.

Low-level libraries provide more options and flexibility for customization but are more complex and have a steeper learning curve. High-level libraries are easier to learn, simpler to use, and require fewer lines of code.

Table 1 provides a summary of Python data visualization libraries along these two dimensions.

|            | Static     | Interactive                   |
| ---------- | ---------- | ----------------------------- |
| Low-level  | Matplotlib | Plotly                        |
| High-level | Seaborn    | Plotly Express, Altair, Bokeh |

Matplotlib was one of the earliest Python libraries for data visualization, and its interface is modelled on MATLAB, a commercial package for numerical computing and visualization. As a low-level library, Matplotlib is rich in functionality and features, which comes at the cost of complexity and a steeper learning curve. Many data engineers and scientists in the Python camp make Matplotlib their first choice for learning and using data visualization.

Seaborn is a high-level library built on the foundation of Matplotlib. It is simpler to learn and easier to use than Matplotlib. Since it is based on Matplotlib, you can always fall back on Matplotlib for more advanced and complex use cases where Seaborn falls short.

The main disadvantage of Matplotlib and Seaborn is that their visualizations are static and not interactive.

Interactivity is where Plotly and Plotly Express shine. Their relationship mirrors that of Matplotlib and Seaborn: Plotly is a low-level library, while Plotly Express is a high-level library built on the foundation of Plotly.

1.3 Benefits of Plotly Express: Best of Both Worlds

1.3.1 Both Static and Interactive

With Plotly Express, you can create highly interactive, publication-quality visualizations. The visualizations can be published to a website for sharing across the Internet. In addition, the visualizations can be exported as static images for embedding in online blogs, business reports, academic papers, presentations, and books.

1.3.2 Simple to Code yet Highly Customizable

As shown in Table 1, Plotly is a low-level interactive library and Plotly Express is a high-level interactive library. Plotly Express is built on top of Plotly and enjoys the best of both worlds. Plotly Express is simple yet powerful. It provides dozens of built-in interactive chart types, each available with a single line of code, while still offering advanced functionality and customization through access to the low-level functions of Plotly.

1.3.3 Backed by a Viable Company and a Lively Ecosystem

Plotly is also the name of a Montreal-based company that builds several related open source products based on Plotly.js, a JavaScript library for interactive data visualization. These products include Plotly, Plotly Express, Dash, and Chart Studio.

Dash is a Python library for generating interactive dashboards using Plotly and Plotly Express. Interactive dashboards can be published as websites for sharing on the Internet.

Plotly Chart Studio provides registered users with an online environment for authoring Plotly visualizations using a drag-and-drop design tool, without writing code. The visualizations are hosted in the cloud and can be shared. Users can also write Plotly Express code that saves visualizations to Chart Studio's cloud environment for sharing.

For more information about Plotly, Plotly Express, Dash, and Chart Studio, check out the company website: http://www.plotly.com