{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "oseVPs8Bk0_v" }, "source": [ "# Chapter 6 - Amenities on Plotly Express\n", "\n", "Plotly Express visualizations share many common features, and each chart type can also have features of its own.\n", "All of these features can be customized through the input parameters of the visualization functions.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "pRYZa852m8_k" }, "source": [ "## 3.1 A Bare-minimal Scatter Plot\n", "\n", "First, let's create the simplest possible scatter plot, with Life Expectancy on the Y axis and GDP per Capita on the X axis, using data from 2007.\n", "\n", "We are using Google Colab as our development environment. Colab is a free Jupyter Notebook environment hosted in the cloud and provided by Google. You need a Google account to use this free service.\n", "\n", "At the time of this writing, Google Colab has an older version of Plotly (4.4.1) pre-installed, so we want to upgrade it to the latest version (5.3.1) by running the system command `pip install --upgrade plotly`.\n", "\n", "Because the system command runs inside a Jupyter Notebook environment, we need to prefix it with an exclamation mark `!`, like this: `!pip install --upgrade plotly`.\n", "\n", "The `--upgrade` option replaces the installed version with the latest available version.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "IzT8bTR3MxIu" }, "outputs": [], "source": [ "# Upgrade the Plotly library, since the Google Colab environment ships an older version\n", "\n", "!pip install --upgrade plotly" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "euRxw1HnNj4T" }, "outputs": [], "source": [ "# Display the Plotly version number\n", "\n", "import plotly\n", "\n", "print(plotly.__version__)" ] }, { 
"cell_type": "code", "execution_count": null, "metadata": { "id": "8fCH4JjeE-B4" }, "outputs": [], "source": [ "# Load the Plotly Express module\n", "# Give it the short alias px for easy reference later\n", "\n", "import plotly.express as px" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 519 }, "id": "4E5FWqwHFB2M", "outputId": "63866a89-dc4e-41e5-94b9-73f87edbc5ba" }, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>country</th><th>continent</th><th>year</th><th>lifeExp</th><th>pop</th><th>gdpPercap</th><th>iso_alpha</th><th>iso_num</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>Afghanistan</td><td>Asia</td><td>1952</td><td>28.801</td><td>8425333</td><td>779.445314</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>1</th><td>Afghanistan</td><td>Asia</td><td>1957</td><td>30.332</td><td>9240934</td><td>820.853030</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>2</th><td>Afghanistan</td><td>Asia</td><td>1962</td><td>31.997</td><td>10267083</td><td>853.100710</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>3</th><td>Afghanistan</td><td>Asia</td><td>1967</td><td>34.020</td><td>11537966</td><td>836.197138</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>4</th><td>Afghanistan</td><td>Asia</td><td>1972</td><td>36.088</td><td>13079460</td><td>739.981106</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>5</th><td>Afghanistan</td><td>Asia</td><td>1977</td><td>38.438</td><td>14880372</td><td>786.113360</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>6</th><td>Afghanistan</td><td>Asia</td><td>1982</td><td>39.854</td><td>12881816</td><td>978.011439</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>7</th><td>Afghanistan</td><td>Asia</td><td>1987</td><td>40.822</td><td>13867957</td><td>852.395945</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>8</th><td>Afghanistan</td><td>Asia</td><td>1992</td><td>41.674</td><td>16317921</td><td>649.341395</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>9</th><td>Afghanistan</td><td>Asia</td><td>1997</td><td>41.763</td><td>22227415</td><td>635.341351</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>10</th><td>Afghanistan</td><td>Asia</td><td>2002</td><td>42.129</td><td>25268405</td><td>726.734055</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>11</th><td>Afghanistan</td><td>Asia</td><td>2007</td><td>43.828</td><td>31889923</td><td>974.580338</td><td>AFG</td><td>4</td></tr>\n",
"<tr><th>12</th><td>Albania</td><td>Europe</td><td>1952</td><td>55.230</td><td>1282697</td><td>1601.056136</td><td>ALB</td><td>8</td></tr>\n",
"<tr><th>13</th><td>Albania</td><td>Europe</td><td>1957</td><td>59.280</td><td>1476505</td><td>1942.284244</td><td>ALB</td><td>8</td></tr>\n",
"<tr><th>14</th><td>Albania</td><td>Europe</td><td>1962</td><td>64.820</td><td>1728137</td><td>2312.888958</td><td>ALB</td><td>8</td></tr>\n",
"</tbody>\n",
"</table>
" ], "text/plain": [ " country continent year ... gdpPercap iso_alpha iso_num\n", "0 Afghanistan Asia 1952 ... 779.445314 AFG 4\n", "1 Afghanistan Asia 1957 ... 820.853030 AFG 4\n", "2 Afghanistan Asia 1962 ... 853.100710 AFG 4\n", "3 Afghanistan Asia 1967 ... 836.197138 AFG 4\n", "4 Afghanistan Asia 1972 ... 739.981106 AFG 4\n", "5 Afghanistan Asia 1977 ... 786.113360 AFG 4\n", "6 Afghanistan Asia 1982 ... 978.011439 AFG 4\n", "7 Afghanistan Asia 1987 ... 852.395945 AFG 4\n", "8 Afghanistan Asia 1992 ... 649.341395 AFG 4\n", "9 Afghanistan Asia 1997 ... 635.341351 AFG 4\n", "10 Afghanistan Asia 2002 ... 726.734055 AFG 4\n", "11 Afghanistan Asia 2007 ... 974.580338 AFG 4\n", "12 Albania Europe 1952 ... 1601.056136 ALB 8\n", "13 Albania Europe 1957 ... 1942.284244 ALB 8\n", "14 Albania Europe 1962 ... 2312.888958 ALB 8\n", "\n", "[15 rows x 8 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Use the built-in sample dataset gapminder\n", "\n", "df = px.data.gapminder()\n", "\n", "df.head(15)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 542 }, "id": "b8P6ahRvljiT", "outputId": "cb908b9c-0b25-4a38-b224-726a3be9596c" }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Use Plotly Express's scatter function to create a bare-minimal scatter plot.\n", "# Provide the data frame and specify the columns for the axes.\n", "\n", "fig = px.scatter(\n", "    df.query(\"year == 2007\"),  # Only use 2007 data\n", "    x=\"gdpPercap\",  # gdpPercap is the column name for GDP per Capita\n", "    y=\"lifeExp\"  # lifeExp is the column name for Life Expectancy\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "IIF958XolUR9" }, "source": [ "## 3.2 A Feature-rich Scatter Plot\n", "\n", "Plotly Express's scatter function provides many parameters for customization. In this example, we use the following features:\n", "- Use the country name as the hover name, so that when we mouse over a dot we can see which country it represents.\n", "- Use the population to set the size (geometric area) of each dot. Since the dots now look like bubbles, this kind of scatter plot is also known as a bubble plot.\n", "- Use the continent to set the color of each dot. Plotly Express provides a color legend, and you can click its entries to select or deselect continents.\n", "- Provide a title for the scatter plot. Since life expectancy is an indicator of health and GDP per capita is an indicator of wealth, we give the visualization the title \"Health vs Wealth\".\n", "- Use the year as the animation frame, so that we can play the animation to see the changes over time. Since the ranges of X and Y change from year to year, we fix the axis ranges to the minimum and maximum of X and Y across all years so that the bubbles don't move outside of the visualization.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 617 }, "id": "Ds4SNabxFMpd", "outputId": "9598519e-3a37-454c-84bd-743e9625fcec" }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = px.scatter(\n", "    df,  # All years, one animation frame per year\n", "    template=\"plotly_dark\",  # Built-in dark theme\n", "    x=\"gdpPercap\",\n", "    y=\"lifeExp\",\n", "    color=\"continent\",  # One color per continent, shown in the legend\n", "    hover_name=\"country\",  # Country name shown on mouse hover\n", "    title=\"Health vs Wealth\",\n", "    height=600,\n", "    animation_frame=\"year\",\n", "    size=\"pop\",  # Bubble size follows population\n", "    size_max=55,  # Maximum bubble size\n", "    log_x=True,  # Logarithmic X axis\n", "    # Fix both axis ranges across all years so the bubbles stay in view\n", "    range_x=(df[\"gdpPercap\"].min(), df[\"gdpPercap\"].max()),\n", "    range_y=(df[\"lifeExp\"].min(), df[\"lifeExp\"].max())\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "H1beOBUakgm1" }, "source": [ "## 3.3 Interact with the Visualization\n", "- Mouse hover\n", "- Crop\n", "- Using the Tool Bar" ] }, { "cell_type": "markdown", "metadata": { "id": "tssDKNJzjM-1" }, "source": [ "## 3.4 Customize the Visualization\n", "- Scene Template\n", "- Title\n", "- X Axis\n", "  - title\n", "  - ticks\n", "- Y Axis\n", "  - title\n", "  - ticks\n", "- Marker\n", "  - shape\n", "  - color\n", "- Legend\n" ] }, { "cell_type": "markdown", "metadata": { "id": "KjjA5IRXylk0" }, "source": [ "## 3.5 Download the Visualization\n" ] } ], "metadata": { "colab": { "authorship_tag": "ABX9TyPB30D7ivW4NtXIM39FI9xD", "include_colab_link": true, "name": "Chapter_02.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }