Chapter 2 - Data Visualization Concepts

This chapter briefly introduces basic concepts relevant to data visualization without going into much detail.

2.1 Data Type: Numerical vs Categorical

Data can be classified as structured, semi-structured, or unstructured. In this book, we will deal only with structured data.

Structured data can be broadly classified into two types:

Numerical
- Interval
- Ratio
Categorical
- Ordinal
- Nominal

There is a special type of structured data that measures a point in time, for example year, month, week, day, or hour. This is called the temporal data type. It can be represented as either numerical or categorical data.
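The classification above can be sketched with pandas. This is a minimal illustration, not a prescribed method: the sample values and column names (`temperature_c`, `weight_kg`, `size`, `color`, `date`) are hypothetical and chosen only to show one column per data type.

```python
import pandas as pd

# Hypothetical sample data; all column names and values are illustrative.
df = pd.DataFrame({
    "temperature_c": [21.5, 18.0, 25.3],   # numerical, interval (no true zero)
    "weight_kg": [70.2, 55.0, 82.4],       # numerical, ratio (true zero exists)
    "size": ["small", "large", "medium"],  # categorical, ordinal
    "color": ["red", "blue", "red"],       # categorical, nominal
    "date": ["2021-01-15", "2021-02-20", "2021-03-05"],  # temporal
})

# Ordinal: the categories carry a meaningful order.
df["size"] = pd.Categorical(
    df["size"], categories=["small", "medium", "large"], ordered=True
)

# Nominal: the categories have no inherent order.
df["color"] = df["color"].astype("category")

# Temporal: parse the strings into points in time.
df["date"] = pd.to_datetime(df["date"])

# The same temporal data viewed as categorical (month name)
# or as numerical (month number).
df["month"] = df["date"].dt.month_name()
df["month_number"] = df["date"].dt.month

print(df.dtypes)
```

Marking `size` as ordered lets comparisons and sorting follow the small, medium, large order rather than alphabetical order.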

2.2 Data Format: Long vs Wide

Structured data are typically organized in one of two formats, depending on how a categorized numerical measure is represented.

In long format, a categorical variable stores the categories and a numerical variable stores the measure. This results in more rows.

In wide format, multiple numerical variables represent the measure, with each variable representing a single category.

Data in long format are also called tidy data. The process of transforming data from wide format to long format is called tidying the data. Data in tidy format tend to be more conducive to data analysis and visualization.

However, each format has its own advantages and disadvantages. Plotly Express works well with either format.
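As a rough sketch of the two formats, the example below builds a small wide table and converts it to long format with pandas `melt`, then back again with `pivot`. The data and the column names (`year`, `apples`, `oranges`, `fruit`, `sales`) are hypothetical, and `melt` is only one of several ways to tidy a table.

```python
import pandas as pd

# Wide format: one numerical column per category (hypothetical data).
wide = pd.DataFrame({
    "year": [2020, 2021],
    "apples": [10, 12],
    "oranges": [7, 9],
})

# Long (tidy) format: one categorical column holds the categories,
# one numerical column holds the measure.
long = wide.melt(id_vars="year", var_name="fruit", value_name="sales")

# Going back from long to wide.
back = long.pivot(index="year", columns="fruit", values="sales").reset_index()

print(wide)
print(long)
```

The wide table has 2 rows and the long table has 4, one row per combination of year and fruit.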

2.3 Visual Encoding: Marks & Channels

To bring data to life, data visualization employs marks and channels to visually encode data points.

A mark is a geometric shape that helps visualize the persona of a data point. Typical marks are:

Dot
Bar
Line
Area
Square
Triangle

A channel represents a visual property of a mark which enriches the persona of a data point. Typical channels are:

Size
Location
- X Coordinate
- Y Coordinate
Color
Opacity
Text Annotation
