Catégories
R

Where and when (not) to eat in France ?

Results of sanitary controls in France can be found on data.gouv.fr however, only the running year is available… Thanks to @cquest@amicale.net we can access the archives since 2017.

First a global view of the dataset :

About 800 controls per week, except during the lock-down in 2020, and a slightly lower control pressure in 2021 and 2022.

Poor results (To be improved or To be corrected urgently) are stable at around 6 %, except a recent spike in 2023?

It seems that poor results increase during the year, from June to November.

This surprising periodic phenomenon is also visible by day :

So for the « when », it is : not in summer or autumn.

What about the « where »? It seems you also could be careful in some départements

Not good in Guadeloupe, Guyane, Réunion and the southern lower Seine valley, west of Paris.

Can we see more in details ? Using a 30 km kernel smoothing :

It confirms some hot-spots west of Paris, in Alsace, in Indre, Cher, Alpes-Maritimes and between Gironde and Landes. You are safer in Paris and Bretagne

Has it changed ?

No real trend…

Download the supporting data and R scripts :

Catégories
R

Cherry blossom

It’s cherry blossom time… A nice dataset going back to the year 812 in Japan can be found here. It describes the phenological data for full flowering date of cherry tree (Prunus jamasakura) in Kyoto, showing springtime climate changes.

Let’s draw…

# params ------------------------------------------------------------------

url_kyoto <- "http://atmenv.envi.osakafu-u.ac.jp/aono/kyophenotemp4/"
file_kyoto <- "kyoto.rds"   # cache data
icon_sakura <- "sakura.png" # cache icon


# config ------------------------------------------------------------------

library(tidyverse)
library(fs)
library(janitor)
library(httr)
library(rvest)
library(glue)
library(ggimage)

Sys.setlocale("LC_TIME", "en_GB.UTF-8")


# data --------------------------------------------------------------------

# icon
if (!file_exists(icon_sakura)) {
  GET("https://www.flaticon.com/download/icon/7096433?icon_id=7096433&author=232&team=232&keyword=Sakura&pack=packs%2Fsakura-festival-59&style=522&format=png&color=&colored=2&size=512&selection=1&premium=0&type=standard&search=cherry+blossom",
      write_disk(icon_sakura))
}

# blossom dates, scraped from the web page (more up to date than the xls files)
if (!file_exists(file_kyoto)) {
  GET(url_kyoto) |> 
    content() |> 
    html_element("pre") |>
    html_text2() |>  
    str_replace_all("\xc2\xa0", " ") |> # bad encoding needs correction
    read_fwf(fwf_cols("ad"   = c(7, 10), 
                      "fifd" = c(17, 20), 
                      "fufd" = c(22, 25), 
                      "work" = c(27, 30),
                      "type" = c(32, 35), 
                      "ref"  = c(37, Inf)),
             skip = 26,
             na = c("", "NA", "-")) |> 
    remove_empty() |>  
    mutate(full_flowering_date = ymd(glue("{str_pad(ad, 4, 'left', '0')}{fufd}")),
           full_flowering_date_doy = yday(full_flowering_date)) |> 
    write_rds(file_kyoto)
}


# plot --------------------------------------------------------------------

read_rds(file_kyoto) |> 
  mutate(random_size = sample(c(0.015, 0.02, 0.025, 0.03), length(ad), replace = TRUE)) |> 
  ggplot(aes(ad, parse_date_time(full_flowering_date_doy, orders = "j"))) +
  geom_smooth(color = NA, fill = "chocolate4", alpha = 0.5) +
  geom_image(aes(size = I(random_size)), image = icon_sakura) +
  geom_smooth(color = "chocolate3", se = FALSE, alpha = 0.5) +
  scale_y_datetime(labels = scales::date_format("%b %d"),
                   breaks = "weeks", minor_breaks = "days") +
  labs(title = "Cherry blossom",
       subtitle = "Kyoto",
       x = "year",
       y = "date",
       caption = glue("http://r.iresmi.net/ {Sys.Date()}
                      data: {url_kyoto}
                      icon by Vitaly Gorbachev")) +
  theme_minimal() +
  theme(plot.background = element_rect(fill = "cornflowerblue"),
        panel.grid = element_line(color = "dodgerblue"),
        text = element_text(color = "pink", family = "Ubuntu"),
        plot.title = element_text(size = 20),
        plot.caption = element_text(size = 7),
        axis.text = element_text(color = "pink"))

ggsave("cherry_blossom.png", width = 25, height = 18, units = "cm", dpi = 300)
Catégories
R

Open and merge multiple shapefiles, updated

This old post sees a little traffic from search engines but is a mess after many editions due to the packages evolutions.

So, how can we (chose your term) append, merge, union or combine many shapefiles or other spatial vector data in 2023 with R, preferably using tidyverse functions ?

For good measure, we want to add the source file as an attribute.

First, make some data using the geopackage available here. We generate several shapefiles (one per région) in a temporary directory :

dep <- sf::read_sf("~/data/adminexpress/adminexpress_cog_simpl_000_2022.gpkg", 
                   layer = "departement")

dep |> 
  dplyr::group_by(insee_reg) |> 
  dplyr::group_walk(\(grp_data, grp_name) sf::write_sf(grp_data, 
                                                       glue::glue("~/temp/reg_{grp_name}.shp")))

This is the way

Current and concise.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  dplyr::bind_rows()

Approved

Recommended, as seen in the purrr help on map_dfr (that I liked better, see below) but verbose because we have to specify that we want a sf-tibble which is lost in translation.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  purrr::list_rbind() |>
  dplyr::as_tibble() |> 
  sf::st_sf()

Superseded

… Sadly. That’s short and understandable.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map_dfr(\(f) sf::read_sf(f) |> 
                   dplyr::mutate(source = f, .before = 1)) 

Older

I liked it, too.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  dplyr::tibble(source = _) |>
  dplyr::mutate(shp = purrr::map(source, sf::read_sf)) |>
  tidyr::unnest(shp) |>
  sf::st_sf()

Oldest

If all the files share the same attributes structure.

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  do.call(what = rbind)

or

fs::dir_ls("~/temp", regexp = ".*\\.shp$") |> 
  purrr::map(\(f) sf::read_sf(f) |> 
               dplyr::mutate(source = f, .before = 1)) |> 
  purrr::reduce(rbind)
Catégories
R

One neighbour

Today I saw this :

So, Portugal has only one terrestrial neighbour. Opencage lists the other countries with only one neighbour.

Can we get this by ourselves ?

library(tidyverse)
library(rnaturalearth)
library(spdep)

# get data from Natural Earth
countries <- ne_countries(scale = 10, returnclass = "sf") |> 
  st_make_valid() |> 
  group_by(sov_a3, sovereignt) |> 
  summarise(.groups = "drop")

# for joining codes and names later
lookup <- countries |> 
  st_drop_geometry()

# get neighbours with {spdep}
neighbours <- countries |> 
  poly2nb()

# we only keep countries with 1 neighbour
# and add names
neighbours |> 
  set_names(countries$sov_a3) |> 
  keep(\(x) length(x) == 1) |> 
  enframe("sov_a3", "nb") |> 
  unnest(nb) |> 
  filter(nb != 0) |> 
  mutate(nb = countries$sov_a3[nb]) |> 
  left_join(lookup, by = "sov_a3") |> 
  left_join(lookup, by = c("nb" = "sov_a3"), suffix = c("", "_nb")) |> 
  relocate(sovereignt, .after = 1)
# A tibble: 16 × 4
   sov_a3 sovereignt         nb    sovereignt_nb           
   <chr>  <chr>              <chr> <chr>                   
 1 BRN    Brunei             MYS   Malaysia                
 2 CAN    Canada             US1   United States of America
 3 DN1    Denmark            DEU   Germany                 
 4 DOM    Dominican Republic HTI   Haiti                   
 5 GMB    Gambia             SEN   Senegal                 
 6 HTI    Haiti              DOM   Dominican Republic      
 7 IRL    Ireland            GB1   United Kingdom          
 8 KOR    South Korea        PRK   North Korea             
 9 LSO    Lesotho            ZAF   South Africa            
10 MCO    Monaco             FR1   France                  
11 PNG    Papua New Guinea   IDN   Indonesia               
12 PRT    Portugal           ESP   Spain                   
13 QAT    Qatar              SAU   Saudi Arabia            
14 SMR    San Marino         ITA   Italy                   
15 TLS    East Timor         IDN   Indonesia               
16 VAT    Vatican            ITA   Italy    

I miss Bahrain / Saudi Arabia because the Natural Earth dataset is not detailed enough…

Border on a small island

And the « new » Canada / Denmark border is also not taken into account…

While we are at it, which country has more neighbors ?

neighbours |> 
  set_names(countries$sov_a3) |> 
  unclass() |> 
  enframe("sov_a3", "nb") |> 
  unnest(nb) |> 
  count(sov_a3, sort = TRUE) |> 
  left_join(lookup, by = "sov_a3") 

So, China…

# A tibble: 209 × 3
   sov_a3     n sovereignt                      
   <chr>  <int> <chr>                           
 1 CH1       15 China                           
 2 RUS       14 Russia                          
 3 BRA       11 Brazil                          
 4 FR1       11 France                          
 5 COD        9 Democratic Republic of the Congo
 6 DEU        9 Germany                         
 7 AUT        8 Austria                         
 8 SDN        8 Sudan                           
 9 SRB        8 Republic of Serbia              
10 TUR        8 Turkey                          
# … with 199 more rows

However it includes the disputed part of « Kashmir », claimed by India and China, so it’s 14 neighbours, like Russia.

Catégories
R

Population growth

I just saw this article in Le Monde :

« a pair of cats can produce 20,000 individuals in just four years ».

(translation)

That seems quite high… Let’s check!

(additionally, the article is about feral cats preying on wildlife but we are shown a wild cat capturing a laboratory mouse!)

This figure could come from a back-of-the-envelope calculation, such as litters of 3.5 kitten twice a year for 4 years producing 3.58 ≈ 22,519 kitten. But that’s a very rough and false estimate : at this rate there will be 76 billions cats in Lyon in 10 years! Moreover we ignore the fact that the first generations can still have litters, the less than perfect survival of feral cats, the delay before the first pregnancy, etc.

We can use Leslie matrix to model the destiny of our founding pair. Leslie matrices are used in population ecology to project a structured (by age) population based on transitions between age classes and fertility.

Cat females reach sexual maturity at 6–8 months (Kaeuffer et al. 2004), can have 2.1 litters each year (Robinson & Cox, 1970) and have a mean of around 4 kitten by litter (Hall & Pierce, 1934 ; Deag et al., 1987) or 9.1 kitten by year (Robinson & Cox, 1970). So we will use quarters as ages classes. We first use an unrealistic 100 % survival and 100 % fecundity.

R base version

The first line of the matrix reads 0 kitten produced between 0-3 months, 0 kitten between 3-6 months, and 9.1 / 2 / 4 = 1.4 kitten by capita by quarter for the 6-9 months class and the adults. The 1s on the diagonal are the survivals between age classes and the lower-right 1 the adult survival.

l <- matrix(c(0, 0, 1.4, 1.4,
              1, 0,   0,   0,
              0, 1,   0,   0,
              0, 0,   1,   1), 
            nrow = 4,
            byrow = TRUE)

quarters <- 16

n0 <- matrix(c(0, 0, 0, 2), 
             ncol = 1)

simul <- function(q) { 
  n_proj <- matrix(0, 
                   nrow = nrow(l), 
                   ncol = q)
  n_proj[, 1] <- n0
  
  for (i in 1:(q - 1)) {
    n_proj[, i + 1] <- l %*% n_proj[, i]
  }
  return(n_proj)
}

res <- simul(quarters)
round(sum(res[, ncol(res)]))
# 2450

With our optimistic parameters, we get a population of 2,450 cats after 16 quarters (4 years). An order of magnitude less than in the article…

With a more realistic matrix, especially for feral cats, with less fertility for the first pregnancy and survival rates totally made-up but not 100 %, we can get quite a more manageable population size :

l <- matrix(c(0,     0,   1, 1.4,
              0.6,   0,   0,   0,
              0,   0.7,   0,   0,
              0,     0, 0.8,  .9), 
            nrow = 4,
            byrow = TRUE)

res <- simul(quarters)
round(sum(res[, ncol(res)]))
# 77

round(res)
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
# [1,]    0    3    3    2    3    4    5    6    8     9    11    14    17    21    26    32
# [2,]    0    0    2    2    1    2    3    3    4     5     6     7     8    10    13    15
# [3,]    0    0    0    1    1    1    1    2    2     3     3     4     5     6     7     9
# [4,]    2    2    2    1    2    3    3    4    5     6     8     9    12    14    17    21

matplot(1:quarters, t(res), type = "l", lty = 1:ncol(l), 
        xlab = "quarter", ylab = "age class size", lwd = 2)
legend("topleft", legend = c("0-3 months", "3-6", "6-9", "adults"), 
       lty = 1:ncol(l), col = 1:ncol(l), lwd = 2)

So only 77 cats now… I let you play with fertility and survival rates to stabilize the population.

But anyway, cats do have an effect on wildlife, so whatever their population size, we must act to reduce their impact.

Tidyverse version

library(tidyverse)

l <- matrix(c(0, 0, 1.4, 1.4,
              1, 0,   0,   0,
              0, 1,   0,   0,
              0, 0,   1,   1), 
            nrow = 4,
            byrow = TRUE)

quarters <- 16

n0 <- matrix(c(0, 0, 0, 2), 
             ncol = 1)

res <- list(l) |> 
  rep(quarters - 1) |> 
  accumulate(`%*%`, .init = n0, .dir = "backward") |> 
  rev() |> 
  bind_cols(.name_repair = "unique_quiet")

res |> 
  summarise(across(last_col(), sum))

res |> 
  mutate(class = c("0-3 months", "3-6", "6-9", "adult"), .before = 1) |> 
  pivot_longer(-1, 
               names_to = "quarter", 
               values_to = "n", 
               names_prefix = "...",
               names_transform = as.integer) |> 
  ggplot(aes(quarter, n)) + 
  geom_line(aes(color = class)) +
  labs(title = "Cat population")
  

References

Renaud Kaeuffer, Dominique Pontier, Sébastien Devillard, Nicolas Perrin. Effective size of two feral domestic cat populations (Felis catus L.): effect of the mating system. Molecular Ecology, 2004, 13(2), pp. 483-490. https://doi.org/10.1046/j.1365-294x.2003.02046.x.hal-00427607

Hall, V.E. and Pierce, G.N., Jr. Litter size, birth weight and growth to weaning in the cat. Anat. Rec., 1934, 60: 111-124. https://doi.org/10.1002/ar.1090600113

Deag, J.M., Lawrence, C.E. and Manning, A. The consequences of differences in litter size for the nursing cat and her kittens. Journal of Zoology, 1987, 213: 153-179. https://doi.org/10.1111/j.1469-7998.1987.tb03687.x

Robinson R., Cox H.W. Reproductive performance in a cat colony over a 10-year period. Laboratory Animals, 1970, 4(1):99-112 https://doi.org/10.1258/002367770781036616

Catégories
R

Map your Strava activities

Use the Strava API to map your outings with R using the {rStrava} package.

library(tidyverse)
library(leaflet)
library(rStrava)
library(sf)

# Get your credentials from https://www.strava.com/settings/api
app_name <- "*******"
client_id <- "*******"
client_secret <- "*******"

#' Convert Google polylines from Strava activities to {sf} polylines
#'
#' @param gp string : encoded polyline
#'
#' @return {sf} polyline
gp2sf <- function(gp) {
  gp |> 
    googlePolylines::decode() |> 
    map_dfr(
      function(df) {
        df |> 
          st_as_sf(coords = c("lon", "lat")) |> 
          st_combine() |> 
          st_cast("LINESTRING") |> 
          st_sf() 
      }) |> 
    pull(1)
}

# Get activities
activities <- httr::config(
  token = strava_oauth(
    app_name,
    client_id,
    client_secret,
    app_scope = "activity:read_all",
    cache = TRUE)) |> 
  get_activity_list() |> 
  compile_activities() 

# Map 
activities |> 
  mutate(geom = gp2sf(map.summary_polyline)) |> 
  st_sf(crs = "EPSG:4326") |> 
  leaflet() |> 
  addTiles() |> 
  addPolylines()
Catégories
R

Minimal

Day 16 of 30DayMapChallenge : « minimal » (previously).

Bare Réunion…

# devtools::install_github("tylermorganwall/rayshader")
library(rayshader)
library(raster)
library(tidyverse)
library(fs)

# ftp://BD_ALTI_ext:docoazeecoosh1Ai@ftp3.ign.fr/BDALTIV2_2-0_25M_ASC_RGR92UTM40S-REUN89_D974_2016-03-11.7z
# We get a bunch of ASCII files : open and merge, convert to matrix used by rayshader
elmat <- dir_ls("~/BDALTIV2_2-0_25M_ASC_RGR92UTM40S-REUN89_D974_2016-03-11/BDALTIV2/1_DONNEES_LIVRAISON_2020-06-00408/BDALTIV2_MNT_25M_ASC_RGR92UTM40S_REUN89_D974/") %>% 
  map(raster) %>% 
  reduce(merge)%>% 
  raster_to_matrix()

elmat %>%
  sphere_shade(texture = "desert") %>%
  add_shadow(ray_shade(elmat, zscale = 3), 0.5) %>%
  add_shadow(ambient_shade(elmat), 0) %>%
  plot_3d(elmat, zscale = 10, fov = 0, theta = 135, zoom = 0.75, phi = 45, windowsize = c(1000, 800))

render_snapshot("~/reunion_3d.jpeg")
Bare Réunion
Catégories
R

Food

Day 15 of 30DayMapChallenge : « food » (previously).

Results of State food controls in restaurants in La Réunion.

library(dplyr)
library(sf)
library(janitor)
library(mapsf)
library(glue)
library(lubridate)

# https://geoservices.ign.fr/adminexpress 
# COG
dep <- read_sf("~/ADE-COG_2-1_SHP_WGS84G_FRA/DEPARTEMENT.shp") %>% 
  filter(INSEE_DEP == "974")

# https://dgal.opendatasoft.com/explore/dataset/export_alimconfiance/download/?format=shp
alim <- read_sf("~/../Downloads/export_alimconfiance.shp") %>% 
  clean_names() %>% 
  filter(str_sub(code_postal, 1, 3) == "974") %>% 
  mutate(date_inspec = ymd_hms(date_inspec),
         synthese_ev = factor(synthese_ev, levels = c("A corriger de manière urgente",               
                                                      "A améliorer",
                                                      "Satisfaisant",
                                                      "Très satisfaisant")))
alim %>% 
  ggplot() +
  geom_sf(data = dep,  fill = "grey90", color = "lightblue3") +
  geom_sf(aes(color = synthese_ev)) +
  scale_color_manual(values = c(
    "A corriger de manière urgente" = "red",               
    "A améliorer" = "orange",
    "Satisfaisant" = "blue",
    "Très satisfaisant" = "green")) +
  labs(title = "Food quality control",
       subtitle = "La Réunion",
       color = "Control result",
       caption = glue("data: alim-confiance.gouv.fr
                      {min(alim$date_inspec)} - {max(alim$date_inspec)}
                  r.iresmi.net - {Sys.Date()}")) +
  theme_minimal() +
  theme(panel.background = element_rect(fill = "lightblue1"),
        plot.caption = element_text(size = 7))


ggsave("food.png", width = 20, height = 20, units = "cm", scale = 1.1)
Food controls
Catégories
R

Sugar

Day 13 of 30DayMapChallenge : « 5 minutes map » (previously).

Sugar cane fields in La Réunion

library(tidyverse)
library(sf)

# RPG région La Réunion édition 2021 
# https://wxs.ign.fr/0zf5kvnyfgyss0dk5dvvq9n7/telechargement/prepackage/RPG_REGION_PACK_DIFF_2021-01-01$RPG_2-0__SHP_RGR92UTM40S_R04_2021-01-01/file/RPG_2-0__SHP_RGR92UTM40S_R04_2021-01-01.7z
parc <- read_sf("~/PARCELLES_GRAPHIQUES.shp")

# https://geoservices.ign.fr/adminexpress 
# COG
dep <- read_sf("~/DEPARTEMENT.shp") %>% 
  filter(INSEE_DEP == "974")

parc %>% 
  ggplot() +
  geom_sf(data = dep, fill = "white", color = "blue") +
  geom_sf(aes(fill = CODE_CULTU %in% c(
              "CSA",
              "CSF",
              "CSI",
              "CSP",
              "CSR")), color = NA) +
  scale_fill_manual(values = list("TRUE" = "goldenrod",
                                  "FALSE" = "lightgrey"),
                    labels = list("TRUE" = "Cane",
                                  "FALSE" = "Other")) +
  labs(title = "Sugar cane fields",
       subtitle = "La Réunion",
       fill = "Crop") +
  theme_minimal() +
  theme(plot.background = element_rect(fill = "lightblue1"))

ggsave("reunion.png", width = 20, height = 20, units = "cm", scale = 1.1)
Cane
Catégories
R

Scales

Day 12 of 30DayMapChallenge : « scales » (previously).

French « départements » at the same size but tidily ordered by area.

library(tidyverse)
library(sf)
library(units)
library(tmap)

# data from https://geoservices.ign.fr/adminexpress
fr <- read_sf("~/adminexpress.gpkg", 
              layer = "departement")

png("scales.png", width = 20, height = 12.36, units = "cm", res = 300)
fr %>%  
  mutate(surf_km2 =  set_units(st_area(geom), km^2), 
         rank = dense_rank(surf_km2), 
         nom_dep = fct_rev(fct_reorder(as_factor(nom_dep), rank))) %>% 
  tm_shape() +
  tm_polygons(fill = "lightgrey") +
  tm_facets(by = "nom_dep", free.scales = TRUE)
dev.off()  
Départements