{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "oseVPs8Bk0_v"
},
"source": [
"# Chapter 4 - Fun Ride on Plotly Express\n",
"\n",
"Before we dive deep into the wonderland of Plotly data visualization, let's enjoy a fun ride on Plotly Express, a simple yet powerful module for creating beautiful, interactive data visualizations, often with just one line of code.\n",
"\n",
"One big plus of Plotly Express is its tight integration with Pandas data frames. Pandas is the de facto tool for data preparation and analysis in the Python ecosystem and can be thought of as Microsoft Excel on steroids.\n",
"\n",
"In Plotly Express, you specify the data source by passing a data frame, and you specify the variables to plot using the column names of that data frame.\n",
"\n",
"Plotly Express comes with several built-in sample datasets in the form of Pandas data frames. We will use the `gapminder` dataset, which contains the Population, GDP per Capita, and Life Expectancy of countries from 1952 to 2007 at five-year intervals.\n",
"\n",
"We will begin by examining a scatter plot to gain a comprehensive understanding of a Plotly Express visualization. \n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pRYZa852m8_k"
},
"source": [
"## 4.1 A Bare-Minimum Scatter Plot\n",
"\n",
"First, let's create the simplest possible scatter plot, with Life Expectancy on the Y axis and GDP per Capita on the X axis, using data from 2007.\n",
"\n",
"\n",
"We are using Google Colab as our development environment. Colab is a free Jupyter Notebook environment hosted in the cloud by Google. You need a Google account to use this free service.\n",
"\n",
"At the time of this writing, Google Colab comes with an older version of Plotly (4.4.1) pre-installed, so we upgrade it to the latest version (5.3.1) by running the system command `pip install --upgrade plotly`.\n",
"\n",
"Because the system command is run from a Jupyter Notebook environment, we need to prefix it with an exclamation mark `!`, like this: `!pip install --upgrade plotly`.\n",
"\n",
"The `--upgrade` option removes the old version and installs the latest version.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "IzT8bTR3MxIu"
},
"outputs": [],
"source": [
"# Upgrade the Plotly library, since the Google Colab environment ships with an older version\n",
"\n",
"!pip install --upgrade plotly"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "euRxw1HnNj4T"
},
"outputs": [],
"source": [
"# Display the Plotly version number\n",
"\n",
"import plotly\n",
"\n",
"print(plotly.__version__)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8fCH4JjeE-B4"
},
"outputs": [],
"source": [
"# Load the Plotly Express module\n",
"# Give it a shortcut alias px for easy reference later\n",
"\n",
"import plotly.express as px"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 519
},
"id": "4E5FWqwHFB2M",
"outputId": "63866a89-dc4e-41e5-94b9-73f87edbc5ba"
},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | country | \n", "continent | \n", "year | \n", "lifeExp | \n", "pop | \n", "gdpPercap | \n", "iso_alpha | \n", "iso_num | \n", "
---|---|---|---|---|---|---|---|---|
0 | \n", "Afghanistan | \n", "Asia | \n", "1952 | \n", "28.801 | \n", "8425333 | \n", "779.445314 | \n", "AFG | \n", "4 | \n", "
1 | \n", "Afghanistan | \n", "Asia | \n", "1957 | \n", "30.332 | \n", "9240934 | \n", "820.853030 | \n", "AFG | \n", "4 | \n", "
2 | \n", "Afghanistan | \n", "Asia | \n", "1962 | \n", "31.997 | \n", "10267083 | \n", "853.100710 | \n", "AFG | \n", "4 | \n", "
3 | \n", "Afghanistan | \n", "Asia | \n", "1967 | \n", "34.020 | \n", "11537966 | \n", "836.197138 | \n", "AFG | \n", "4 | \n", "
4 | \n", "Afghanistan | \n", "Asia | \n", "1972 | \n", "36.088 | \n", "13079460 | \n", "739.981106 | \n", "AFG | \n", "4 | \n", "
5 | \n", "Afghanistan | \n", "Asia | \n", "1977 | \n", "38.438 | \n", "14880372 | \n", "786.113360 | \n", "AFG | \n", "4 | \n", "
6 | \n", "Afghanistan | \n", "Asia | \n", "1982 | \n", "39.854 | \n", "12881816 | \n", "978.011439 | \n", "AFG | \n", "4 | \n", "
7 | \n", "Afghanistan | \n", "Asia | \n", "1987 | \n", "40.822 | \n", "13867957 | \n", "852.395945 | \n", "AFG | \n", "4 | \n", "
8 | \n", "Afghanistan | \n", "Asia | \n", "1992 | \n", "41.674 | \n", "16317921 | \n", "649.341395 | \n", "AFG | \n", "4 | \n", "
9 | \n", "Afghanistan | \n", "Asia | \n", "1997 | \n", "41.763 | \n", "22227415 | \n", "635.341351 | \n", "AFG | \n", "4 | \n", "
10 | \n", "Afghanistan | \n", "Asia | \n", "2002 | \n", "42.129 | \n", "25268405 | \n", "726.734055 | \n", "AFG | \n", "4 | \n", "
11 | \n", "Afghanistan | \n", "Asia | \n", "2007 | \n", "43.828 | \n", "31889923 | \n", "974.580338 | \n", "AFG | \n", "4 | \n", "
12 | \n", "Albania | \n", "Europe | \n", "1952 | \n", "55.230 | \n", "1282697 | \n", "1601.056136 | \n", "ALB | \n", "8 | \n", "
13 | \n", "Albania | \n", "Europe | \n", "1957 | \n", "59.280 | \n", "1476505 | \n", "1942.284244 | \n", "ALB | \n", "8 | \n", "
14 | \n", "Albania | \n", "Europe | \n", "1962 | \n", "64.820 | \n", "1728137 | \n", "2312.888958 | \n", "ALB | \n", "8 | \n", "