Open and merge multiple shapefiles, updated

One code to bind them all
R
spatial
Author

Michaël

Published

2023-04-05

Modified

2024-12-25

A puzzle

CC BY-SA by INTVGene

This old post still gets a little traffic from search engines, but it has become a mess after many edits as the packages evolved.

So, how can we (choose your term) append, merge, union or combine many shapefiles or other spatial vector data in 2023 with R, preferably using tidyverse functions?

For good measure, we want to add the source file as an attribute.

First, make some data using the geopackage available here. We generate several shapefiles (one per région) in a temporary directory:

dep <- sf::read_sf("~/data/adminexpress/adminexpress_cog_simpl_000_2022.gpkg", 
                   layer = "departement")

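# Write one shapefile per région.
# In group_walk(), grp_name is a one-row tibble holding the group key.
dep |> 
  dplyr::group_by(insee_reg) |> 
  dplyr::group_walk(\(grp_data, grp_name) {
    sf::write_sf(grp_data, glue::glue("~/temp/reg_{grp_name$insee_reg}.shp"))
  })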

This is the way

Current and concise.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  dplyr::bind_rows()
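To try this without downloading the geopackage, here is a minimal sketch on toy data. Everything in it is made up for illustration (the three points, the `grp` column, the `toy_shp` directory and the `toy_*.shp` file names), and it stores only the file name in `source` via `basename()` rather than the full path.

```r
library(sf)

# Hypothetical toy data: three points split into two groups
pts <- sf::st_sf(
  name = c("a", "b", "c"),
  grp  = c("x", "x", "y"),
  geometry = sf::st_sfc(sf::st_point(c(0, 0)),
                        sf::st_point(c(1, 1)),
                        sf::st_point(c(2, 2)),
                        crs = 4326)
)

# Write one shapefile per group in a throwaway directory
tmp <- file.path(tempdir(), "toy_shp")
dir.create(tmp, showWarnings = FALSE)
for (g in unique(pts$grp)) {
  sf::write_sf(pts[pts$grp == g, ], file.path(tmp, paste0("toy_", g, ".shp")))
}

# Same pattern as above: read each file, tag it with its source, bind the rows
merged <- fs::dir_ls(tmp, regexp = "\\.shp$") |>
  purrr::map(\(f) sf::read_sf(f) |>
               dplyr::mutate(source = basename(f), .before = 1)) |>
  dplyr::bind_rows()
```

`merged` should hold the three points again, with a `source` column telling which file each one came from.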

Approved

Recommended, as seen in the purrr help for map_dfr() (which I liked better, see below), but more verbose: the sf class is lost along the way, so we have to state explicitly that we want an sf-tibble.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  purrr::list_rbind() |>
  dplyr::as_tibble() |> 
  sf::st_sf()
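If you want to see what happens to the class on your own installation, a small in-memory sketch may help. The two one-point layers and their names (`first`, `second`) are hypothetical. It also uses the `names_to` argument of `list_rbind()` to fill the `source` column from the list names, which is a variation on the `mutate()` call above rather than the same approach.

```r
library(sf)

# Hypothetical layers standing in for two files read with sf::read_sf()
parts <- list(
  first  = sf::st_sf(id = 1, geometry = sf::st_sfc(sf::st_point(c(0, 0)), crs = 4326)),
  second = sf::st_sf(id = 2, geometry = sf::st_sfc(sf::st_point(c(1, 1)), crs = 4326))
)

# Bind the rows; the list names go into a "source" column
bound <- purrr::list_rbind(parts, names_to = "source")
class(bound)  # inspect what came back before restoring anything

# Explicitly rebuild an sf-tibble, as in the pipeline above
restored <- bound |>
  dplyr::as_tibble() |>
  sf::st_sf()
class(restored)
```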

Superseded

… Sadly: map_dfr() was superseded in purrr 1.0.0, yet this version is short and understandable.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map_dfr(\(f) sf::read_sf(f) |> 
                   dplyr::mutate(source = f, .before = 1)) 

Older

I liked it, too.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  dplyr::tibble(source = _) |>
  dplyr::mutate(shp = purrr::map(source, sf::read_sf)) |>
  tidyr::unnest(shp) |>
  sf::st_sf()

Oldest

These work only if all files share the same attribute structure.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  do.call(what = rbind)

or

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  purrr::reduce(rbind)
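The restriction on attribute structure for the `rbind()` variants can be illustrated with two hypothetical in-memory layers, one of which carries an `extra` column the other lacks. As a sketch: `rbind()` is expected to stop with an error here (captured below so the script keeps running), while `dplyr::bind_rows()` fills the missing column with `NA`.

```r
library(sf)

# Two hypothetical layers with different attribute columns
a <- sf::st_sf(id = 1,
               geometry = sf::st_sfc(sf::st_point(c(0, 0)), crs = 4326))
b <- sf::st_sf(id = 2, extra = "z",
               geometry = sf::st_sfc(sf::st_point(c(1, 1)), crs = 4326))

# rbind() needs matching columns; keep the error message instead of stopping
rbind_result <- tryCatch(rbind(a, b), error = \(e) conditionMessage(e))

# bind_rows() takes the union of the columns and pads with NA
filled <- dplyr::bind_rows(a, b)
```

This is one reason to prefer the `bind_rows()` version when the files come from different sources.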