I was inspired by David Robinson’s latest screencast, in which he made a heatmap of French train delays using geom_tile, and wanted to try it out for myself.
In this week's #tidytuesday screencast, I analyze delays in French train stations 🇫🇷🚄
I show how to create heatmaps of delays (inspired by @noccaea!), and embarrass myself with even the simplest French pronunciations https://t.co/zBrkIkdcCz #rstats pic.twitter.com/RI7ZpxV89X
David Robinson (@drob), February 26, 2019
I don’t have many opportunities to use heatmaps, but Statistics Canada recently released its Journey to Work data as part of the 2016 Census. I wanted to see whether a heatmap could help me understand commuting patterns in communities in southern Ontario.
I downloaded the commuting table directly from Statistics Canada because I couldn’t find it through the cancensus package.
library(tidyverse)

# read the commuting table and keep only the columns needed:
# residence code, place of residence, place of work, and total commuters
raw_commute <- read_csv("~/projects/R stuff/commute/98-400-X2016391_English_CSV_data.csv") %>%
  janitor::clean_names() %>%
  select(code = geo_code_por,
         live = geo_name,
         work = geo_name_1,
         total = dim_sex_3_member_id_1_total_sex)

# filter only those whose code starts w/ 35 (Ontario)
ontario_commute <- raw_commute %>%
  filter(str_detect(code, pattern = "^35"))
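As a quick check of how the prefix filter behaves, here is a minimal sketch on a few hand-typed rows rather than the census file. The `toy_codes` table is an assumption made for illustration, and the code is converted with `as.character()` so the regular expression is explicitly matched against text.

```r
library(dplyr)
library(stringr)

# Illustrative rows typed by hand (not read from the census file)
toy_codes <- tibble(
  code = c(3520, 3521, 2466, 5915),
  live = c("Toronto", "Peel", "Montréal", "Greater Vancouver")
)

# "^35" anchors the match at the start of the string,
# so only codes whose first two digits are 35 are kept
ontario_only <- toy_codes %>%
  filter(str_detect(as.character(code), pattern = "^35"))

ontario_only$live
```

Anchoring with `^` matters here: without it, a code such as 2435 would also match because it contains "35" further along.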
I was able to retrieve the working-age population and census division geographies using cancensus.
library(cancensus)
library(sf)
## Linking to GEOS 3.7.0, GDAL 2.3.2, PROJ 5.2.0
# create list of ontario census divisions to pass to cancensus
regions_list_ontario <- list_census_regions("CA16") %>%
  filter(str_detect(region, pattern = "^35")) %>%
  as_census_region_list()
## Querying CensusMapper API for regions data...
pop_data <- get_census("CA16",
                       regions = regions_list_ontario,
                       vectors = "v_CA16_61",
                       level = "CD",
                       geo_format = "sf", labels = "short") %>%
  janitor::clean_names()

# clean: name the population column, make the join key numeric,
# and keep only the columns needed later
pop_data_clean <- pop_data %>%
  mutate(working_age = v_ca16_61) %>%
  mutate(code = as.double(geo_uid)) %>%
  select(code, working_age, shape_area, geometry)
# remove commuting within cd, compute totals
ontario_commute_clean <- ontario_commute %>%
  filter(live != work) %>%
  group_by(live) %>%
  mutate(total_commuters = sum(total),
         prop_commuters = total / total_commuters) %>%
  ungroup() %>%
  left_join(pop_data_clean, by = "code") %>%
  group_by(work) %>%
  mutate(total_commuters_destination = sum(total)) %>%
  mutate(live_prop = total_commuters / working_age,
         work_prop = total / working_age) %>%
  ungroup()
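To make the grouped step easier to follow, here is a small sketch of the same pattern on made-up flows. The divisions "A", "B" and "C" and their counts are hypothetical, not census figures; only the filter, group_by and mutate logic mirrors the code above.

```r
library(dplyr)

# Hypothetical flows between three made-up divisions
toy_flows <- tibble(
  live  = c("A", "A", "A", "B", "B"),
  work  = c("A", "B", "C", "A", "C"),
  total = c(500, 30, 10, 20, 60)
)

toy_shares <- toy_flows %>%
  filter(live != work) %>%          # drop people who work where they live
  group_by(live) %>%
  mutate(
    total_commuters = sum(total),   # all out-commuters from this home division
    prop_commuters  = total / total_commuters
  ) %>%
  ungroup()

toy_shares
```

Because `mutate()` runs inside `group_by(live)`, `sum(total)` is computed once per home division and repeated on each of its rows, so the shares within each home division add up to 1.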
Let’s see what visualizations will work.
ontario_commute_clean %>%
  filter(total >= 3000) %>%
  mutate(live = str_wrap(live, width = 15)) %>%
  mutate(live = fct_reorder(live, total)) %>%
  mutate(work = fct_reorder(work, total)) %>%
  ggplot(aes(live, work, fill = total)) +
  geom_tile(alpha = 0.7) +
  theme_light() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_viridis_c(labels = scales::comma_format()) +
  labs(fill = "# of commuters",
       title = "The largest number of commuters in Ontario are those who live in York, Peel and Durham who commute to Toronto for work",
       subtitle = "Number of employed labour force aged 15+ who commute by Census Division - Minimum 3,000 commuters to be represented",
       x = "Live - Census Division",
       y = "Work - Census Division",
       caption = "Data from Statistics Canada 2016 Canadian Census - Journey to Work \n https://www12.statcan.gc.ca/census-recensement/2016/rt-td/jtw-ddt-eng.cfm")
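The axis ordering in the heatmap comes from fct_reorder, which by default sorts factor levels by the median of the second argument within each level. The sketch below shows that on a tiny made-up table; `toy_flows` and its values are assumptions for illustration only.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Made-up flows: three home divisions, two work divisions
toy_flows <- tibble(
  live  = c("A", "A", "B", "B", "C", "C"),
  work  = c("X", "Y", "X", "Y", "X", "Y"),
  total = c(10, 20, 300, 500, 40, 60)
)

# Reorder the home divisions by their median flow (the fct_reorder default)
toy_ordered <- toy_flows %>%
  mutate(live = fct_reorder(live, total))

levels(toy_ordered$live)

# Same geom_tile idea as above, on the toy table
toy_plot <- ggplot(toy_ordered, aes(live, work, fill = total)) +
  geom_tile() +
  scale_fill_viridis_c()
```

If ordering by the largest flow or by the sum would read better, fct_reorder accepts a different summary through its `.fun` argument, for example `.fun = sum`.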
I’m always trying to improve my mapping skills, and geom_sf makes it a lot easier.
# drop the northern Ontario census divisions so the map focuses on the south
ontario_commute_clean %>%
  filter((!live %in% c("Rainy River", "Kenora", "Thunder Bay", "Algoma", "Nipissing",
                       "Cochrane", "Greater Sudbury / Grand Sudbury", "Manitoulin",
                       "Timiskaming", "Sudbury", "Parry Sound"))) %>%
  ggplot() +
  geom_sf(aes(fill = live_prop)) +
  scale_fill_viridis_c("% Labour force who commute to work", labels = scales::percent) +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank()) +
  coord_sf(datum = NA) +
  labs(title = "The % of the labour force (aged 15+) who commute by Census Division in Southern Ontario")
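One thing worth noting about the map: the cleaned table has one row per live/work pair, so each home division's polygon appears on several rows with the same live_prop. A possible refinement, sketched here on a made-up table without any geometry, is to keep one row per home division before plotting. The `toy_clean` table is hypothetical, and applying this to the real data assumes the geometry column is carried along by `.keep_all = TRUE`.

```r
library(dplyr)

# Made-up stand-in for the cleaned table: live_prop repeats within each home division
toy_clean <- tibble(
  live      = c("A", "A", "B", "B", "B"),
  work      = c("B", "C", "A", "C", "D"),
  live_prop = c(0.4, 0.4, 0.1, 0.1, 0.1)
)

# Keep the first row for each home division, along with its other columns
one_row_per_division <- toy_clean %>%
  distinct(live, .keep_all = TRUE)

one_row_per_division
```

The map should look the same either way, since the repeated rows carry identical fill values; the deduplicated version would simply draw each polygon once.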
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/colemanrob/robcoleman.ca, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".