Dimension reduction

Day 6 of 30DayMapChallenge
R
30DayMapChallenge
spatial
datavisualization
Author

Michaël

Published

2025-11-06

Modified

2025-11-05

A photo of a dodecahedron in the twilight, openwork and lit from within

Floura - Light Bloom - The Art of Hybycozo - Desert Botanical Garden – CC-BY-NC by Alan English CPA

Day 6 of 30DayMapChallenge: "Dimensions" (previously).

According to Wikipedia, uniform manifold approximation and projection (UMAP) is a nonlinear dimensionality reduction technique. It lets us project many dimensions (well, only 3 in this example) onto a 2D plane.

library(sf)
library(umap)
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggrepel)
library(glue)

options(scipen = 100)

Data

We’ll use the French communes (get the data from this post).

com <- read_sf("~/data/adminexpress/adminexpress_cog_simpl_000_2022.gpkg",
               layer = "commune") |>
  st_centroid() |>
  mutate(x = st_coordinates(geom)[, 1],
         y = st_coordinates(geom)[, 2])
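
To make the centroid step less abstract, here is a minimal sketch on a made-up layer. The `toy` object and its single square polygon are hypothetical stand-ins for the communes; only `st_centroid()` and `st_coordinates()` are the same calls as above.

```r
library(sf)

# Hypothetical toy layer: one 2 x 2 square standing in for a commune polygon
square <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))

# agr = "constant" only silences the attribute warning that st_centroid() would emit
toy <- st_sf(nom = "A", geom = st_sfc(square), agr = "constant")

# Reduce the polygon to a point, then read its coordinates as a plain matrix
pts <- st_centroid(toy)
xy <- st_coordinates(pts)

print(xy)
```

`st_coordinates()` returns a matrix with one row per point, which is why the code above picks the first and second columns for x and y.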

UMAP

The dimensions taken into account are location (x, y) and population. These variables should normally be scaled, so that no single one dominates the distances UMAP works with, but the result is prettier without scaling…

umaps_params <- umap.defaults
umaps_params$random_state <- 20251106

com_umap <- com |>
  st_drop_geometry() |>
  select(x, y, population) |>
  # scale() |> 
  umap(config = umaps_params)

res <- com_umap$layout |>
  as_tibble(.name_repair = "universal") |>
  bind_cols(com) |>
  rename(UMAP1 = 1,
         UMAP2 = 2)
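
For readers who want to compare with the scaled version, here is a rough sketch on synthetic data. The `toy` data frame, its size and its value ranges are assumptions made up for illustration; they are not the communes data and the resulting layouts will not look like the figure below.

```r
library(umap)

set.seed(1)
n <- 200

# Hypothetical stand-in for the communes table:
# coordinates in metres and a strongly skewed population
toy <- data.frame(
  x = runif(n, 100000, 1200000),
  y = runif(n, 6000000, 7100000),
  population = round(rlnorm(n, meanlog = 6.5, sdlog = 1.5))
)

params <- umap.defaults
params$random_state <- 20251106

# scale() centres each column and divides it by its standard deviation
toy_scaled <- scale(toy)

# Same UMAP call on raw and on scaled inputs
layout_raw <- umap(as.matrix(toy), config = params)$layout
layout_scaled <- umap(toy_scaled, config = params)$layout

# Both layouts have one row per observation and two columns
dim(layout_raw)
dim(layout_scaled)
```

After `scale()`, every variable has mean 0 and standard deviation 1, so location and population weigh the same in the neighbour search. Whether that gives a nicer picture is a matter of taste.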

Map

res |>
  ggplot(aes(UMAP1, UMAP2, color = population)) +
  geom_point() +
  geom_text_repel(data = filter(res, 
                                statut %in% c("Préfecture", 
                                              "Préfecture de région",
                                              "Capitale d'état")),
                  aes(label = nom),
                  size = 3, force = .5, force_pull = 0.5, max.overlaps = 1e6,
                  bg.colour = "#ffffffaa", bg.r = .2, alpha = .6) +
  scale_color_viridis_c(trans = "log1p", option = "H",
                        breaks = c(1000, 50000, 500000, 2000000)) +
  coord_equal() +
  labs(title = "Uniform manifold approximation and projection of French communes",
       subtitle = "by location and population",
       caption = glue("https://r.iresmi.net/ - {Sys.Date()}
                      data from IGN Adminexpress 2022")) +
  theme_minimal() +
  theme(plot.caption = element_text(size = 6, 
                                    color = "darkgrey"))
Plot on a 2D plane where each point is a French town. The patterns look like fireworks
Figure 1: A UMAP representation of the French communes
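
The color scale in the plot uses `trans = "log1p"`. As a small base-R illustration of what that transform does to the legend breaks (the extra 0 is a hypothetical empty commune, added only to show the edge case):

```r
# Legend breaks used in the plot, plus 0 for a hypothetical empty commune
pops <- c(0, 1000, 50000, 500000, 2000000)

# log1p(x) computes log(1 + x), so a population of 0 maps to 0 instead of -Inf
transformed <- log1p(pops)

# expm1() is the inverse of log1p() and recovers the original values
back <- expm1(transformed)

print(data.frame(population = pops, log1p = transformed, back = back))
```

The transform compresses the huge gap between villages and the largest cities, which is why the colors stay readable across the whole population range.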