Overview
A data visualization project exploring the relationship between job vacancies, educational attainment, and income levels across Canadian provinces and territories, using open datasets from Statistics Canada.
The goal was to identify patterns in labour market supply and demand that aren't obvious from summary statistics. Two questions drove the analysis: which regions have the largest gaps between vacancy rates and labour force participation, and how does education level correlate with income disparity between provinces?
Data Sources
All data was sourced from Statistics Canada's open data portal:
- Job Vacancy and Wage Survey (JVWS): quarterly vacancy counts and average offered wages by occupation and province
- Labour Force Survey (LFS): employment rates, unemployment rates, and participation rates
- Census education data: post-secondary attainment rates by age group and province
- Income data: median household income and income decile distributions by province
Analysis Pipeline
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
# Load and merge datasets on province code + year
vacancies = pd.read_csv('jvws_2023.csv')
income = pd.read_csv('income_by_province.csv')
education = pd.read_csv('census_education.csv')
merged = (
    vacancies
    .merge(income, on=['province_code', 'year'])
    .merge(education, on=['province_code', 'year'])
)
Merging across three surveys required reconciling different province encoding schemes (two-letter abbreviations vs. numeric codes vs. full names) and different temporal granularities (quarterly vs. annual). A cleaning layer normalized all encodings before the merge.
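As a rough illustration of what such a cleaning layer could look like, the sketch below maps all three encoding styles onto two-letter abbreviations. The numeric codes follow Statistics Canada's Standard Geographical Classification; the helper name `normalize_province` and the sample frame are hypothetical and not taken from the project's code.

```python
import pandas as pd

# Statistics Canada SGC numeric codes -> two-letter abbreviations.
SGC_TO_ABBR = {
    10: "NL", 11: "PE", 12: "NS", 13: "NB", 24: "QC", 35: "ON", 46: "MB",
    47: "SK", 48: "AB", 59: "BC", 60: "YT", 61: "NT", 62: "NU",
}
NAME_TO_ABBR = {
    "newfoundland and labrador": "NL", "prince edward island": "PE",
    "nova scotia": "NS", "new brunswick": "NB", "quebec": "QC",
    "ontario": "ON", "manitoba": "MB", "saskatchewan": "SK",
    "alberta": "AB", "british columbia": "BC", "yukon": "YT",
    "northwest territories": "NT", "nunavut": "NU",
}
VALID_ABBR = set(SGC_TO_ABBR.values())


def normalize_province(value):
    """Return a two-letter province code, or None if the value is unrecognized."""
    text = str(value).strip()
    if text.upper() in VALID_ABBR:
        return text.upper()
    if text.isdigit():
        return SGC_TO_ABBR.get(int(text))
    return NAME_TO_ABBR.get(text.lower())


# Hypothetical frame mixing all three schemes, plus one bad value.
raw = pd.DataFrame({"province": ["AB", 35, "Quebec", " 59 ", "Atlantis"]})
raw["province_code"] = raw["province"].map(normalize_province)

# Rows that failed to map should be inspected rather than silently dropped.
unmatched = raw[raw["province_code"].isna()]
```

Returning `None` for unknown values, instead of raising, keeps the unmatched rows visible so they can be reviewed before the merge. Accented spellings such as "Québec" or floats such as `35.0` would need extra handling that this sketch leaves out.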
Key Findings
Vacancy-to-participation gap. The Prairie provinces (Alberta, Saskatchewan, Manitoba) consistently show high vacancy rates alongside above-average participation rates, which points to tight labour markets with strong demand. Quebec shows a contrasting pattern: lower vacancy rates, together with the highest post-secondary attainment in the dataset.
Education-income correlation. The correlation between province-level post-secondary attainment and median household income is strong (r ≈ 0.72), but the causal direction is ambiguous: higher attainment may raise incomes, but provinces with stronger economies also attract educated workers and fund better post-secondary institutions.
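A province-level correlation of this kind can be computed directly with pandas. The sketch below is only illustrative: the column names are assumed, and the five rows are made-up placeholders rather than the project's data. It also computes a rank-based (Spearman) coefficient as a rough robustness check, since a handful of provinces and territories gives a very small sample.

```python
import pandas as pd

# Placeholder values for illustration only; not the project's data.
df = pd.DataFrame({
    "province_code": ["P1", "P2", "P3", "P4", "P5"],
    "postsec_attainment_pct": [55.0, 60.0, 62.0, 68.0, 70.0],
    "median_household_income": [70000, 68000, 80000, 85000, 83000],
})

# Pearson correlation (the pandas default for Series.corr).
pearson_r = df["postsec_attainment_pct"].corr(df["median_household_income"])

# Spearman correlation, computed as the Pearson correlation of the ranks.
spearman_rho = (
    df["postsec_attainment_pct"].rank()
    .corr(df["median_household_income"].rank())
)

print(f"Pearson r = {pearson_r:.2f}, Spearman rho = {spearman_rho:.2f}, n = {len(df)}")
```

Reporting `n` next to the coefficient makes the small sample size visible to the reader.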
Occupational vacancy hotspots. Healthcare and construction vacancies dominate across all provinces. Software and IT vacancies are concentrated in Ontario and BC, accounting for a disproportionate share of high-wage vacancies.
Visualization Design
Charts were designed for a non-specialist audience — policy researchers and students. Design choices:
- Choropleth maps for geographic distribution of vacancy rates (province fill = rate)
- Slope charts for income change 2019→2023, emphasizing direction rather than absolute levels
- Faceted bar charts for occupational breakdown, sorted by vacancy rate within each province
- Consistent color palette (diverging: low→high vacancy) across all charts
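To make the slope-chart idea concrete, here is a minimal matplotlib sketch, assuming one start value and one end value per province. The function name `slope_chart`, the colours, and the sample incomes are placeholders and not the project's actual plotting code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick


def slope_chart(start, end, labels, years=("2019", "2023")):
    """Draw one line per province between two years and return the figure."""
    fig, ax = plt.subplots(figsize=(4, 6))
    for y0, y1, label in zip(start, end, labels):
        # Colour encodes direction of change, not the absolute level.
        color = "tab:blue" if y1 >= y0 else "tab:red"
        ax.plot([0, 1], [y0, y1], marker="o", color=color)
        ax.text(1.03, y1, label, va="center")
    ax.set_xticks([0, 1])
    ax.set_xticklabels(years)
    ax.set_xlim(-0.2, 1.4)
    # Format the y-axis as dollar amounts with thousands separators.
    ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("${x:,.0f}"))
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    return fig


# Placeholder incomes, not real figures.
fig = slope_chart([70000, 82000], [76000, 80000], ["P1", "P2"])
```

Calling `fig.savefig("income_slope.png", dpi=150, bbox_inches="tight")` would write the chart to disk. Colouring by direction is one way to keep the reader's attention on rises and falls instead of on the income levels themselves.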
Lessons Learned
Data cleaning consumed more time than analysis — roughly 3:1. The main challenges were:
- Province encoding inconsistency across datasets from the same government source
- Temporal misalignment — quarterly vacancy data needed to be annualized to merge with census education data
- Suppressed cells — Statistics Canada suppresses small-count cells for privacy, requiring careful handling of missing values in low-population territories
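The annualization and suppression issues interact, so one possible way to handle them together is sketched below. It assumes suppressed cells have already been read in as `NaN`, averages the quarters that were published, and blanks out any annual value built from too few quarters. The threshold `MIN_QUARTERS`, the column names, and the sample numbers are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Placeholder quarterly data; NaN marks a suppressed cell.
quarterly = pd.DataFrame({
    "province_code": ["ON"] * 4 + ["NU"] * 4,
    "year": [2023] * 8,
    "quarter": [1, 2, 3, 4] * 2,
    "vacancies": [100.0, 110.0, 120.0, 130.0, 5.0, np.nan, np.nan, np.nan],
})

MIN_QUARTERS = 3  # assumption: require at least 3 published quarters per year

grouped = quarterly.groupby(["province_code", "year"])["vacancies"]
annual = grouped.agg(
    vacancies_mean="mean",      # mean skips NaN, so it averages published quarters
    quarters_observed="count",  # count also ignores NaN
).reset_index()

# Do not report an annual figure that rests on too few quarters.
too_sparse = annual["quarters_observed"] < MIN_QUARTERS
annual.loc[too_sparse, "vacancies_mean"] = np.nan
```

Keeping `quarters_observed` in the output makes it easy to flag or footnote low-coverage territories in the charts instead of presenting a misleading average.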
This project reinforced that data journalism requires as much judgment about what not to show as what to show — several early analyses were misleading due to sample size artifacts.