Sport data from Runalyze

R
datavisualization
webscraping
sport
Author

Michaël

Published

2025-04-15

Modified

2025-04-16

Runners blurred

Runners – CC-BY by ~jar{}

I already explored how to get your activities from Strava. Maybe you use Runalyze instead? In that case you can do some web scraping to export your data.

library(httr)
library(rvest)
library(glue)
library(readr)
library(dplyr)
library(stringr)
library(tidyr)
library(lubridate)
library(hms)
library(janitor)
library(ggplot2)
library(forcats)
library(tibble)
library(purrr)

Runalyze lets you download your activities as a CSV table (so without the GPS tracks). The only tricky part is the login: you first have to fetch the CSRF token from the login form and send it back with your credentials.
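To see what fetching the token amounts to, here is a minimal offline sketch. The HTML form and the token value are made up for illustration; only the `_csrf_token` field name comes from the real request.

```r
library(rvest)

# Made-up login form: the markup and the token value are invented
page <- minimal_html('
  <form action="/login" method="post">
    <input type="text" name="_username">
    <input type="password" name="_password">
    <input type="hidden" name="_csrf_token" value="abc123">
  </form>')

# Select the input by its name attribute, then read its value attribute
token <- page |>
  html_elements("input[name=_csrf_token]") |>
  html_attr("value")

token
```

Against the real site, the same selector is applied to the live login page: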

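# Fetch the login page and read the CSRF token from the login form
token <- GET("https://runalyze.com/login") |> 
  content() |>
  html_elements("input[name=_csrf_token]") |> 
  html_attr("value")

# Log in; httr keeps the session cookie for later requests to the same host
POST("https://runalyze.com/login",
     body = list(
       "_username" = Sys.getenv("RUNALYZE_U"),
       "_password" = Sys.getenv("RUNALYZE_P"),
       "_remember_me" = "on",
       "_csrf_token" = token))

# Download every activity as a CSV into the session's temporary directory
GET("https://runalyze.com/_internal/data/activities/all",
    write_disk(glue("{tempdir()}/runalyze.csv"), overwrite = TRUE))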

Fairly easy. Once logged in, a single request downloads all your activities.
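If the login fails, the downloaded file may well be an HTML page rather than a CSV (this is my assumption; I have not checked what Runalyze actually returns in that case). A minimal sketch of a sanity check follows; `looks_like_csv()` is a hypothetical helper, not part of httr or readr, and the toy file contents are invented.

```r
# Hypothetical helper: TRUE when the file has a first line
# that does not start like an HTML/XML document
looks_like_csv <- function(path) {
  first_line <- readLines(path, n = 1, warn = FALSE)
  length(first_line) == 1 && !grepl("^\\s*<", first_line)
}

# Toy files standing in for a good and a bad download (invented contents)
good <- tempfile(fileext = ".csv")
bad <- tempfile(fileext = ".csv")
writeLines(c("time,sportid,s", "1713168000,400452,3600"), good)
writeLines(c("<!DOCTYPE html>", "<html><body>Login</body></html>"), bad)

looks_like_csv(good)
looks_like_csv(bad)
```

On the real download, something like `stopifnot(looks_like_csv(file.path(tempdir(), "runalyze.csv")))` before reading the file would stop early instead of parsing a login page as data.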

Runalyze also has an API for other uses, but I haven’t tried it. I haven’t tried to get the GPS data either; that seems less straightforward.

# to build from: distinct(activities, sportid)
sports <- tribble(
  ~sportid, ~sport,               ~colour,
  400452,   "running",            "yellow",
  400454,   "cycling",            "orange",
  422335,   "nordic skiing",      "lightblue",
  422336,   "mountain skiing",    "blue",
  453960,   "alpinism",           "darkgreen",
  400453,   "swimming",           "deepskyblue",
  400455,   "stretching",         "pink",
  1304290,  "walking and others", "grey",    # hiking
  400456,   "walking and others", "grey",    # crossfit
  400457,   "walking and others", "grey") |> # hiking
  mutate(sport = fct_rev(as_factor(sport)))

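activities <- read_csv(glue("{tempdir()}/runalyze.csv"),
                       guess_max = 1e4) |> 
  clean_names() |> 
  mutate(across(c(time, created, edited), as_datetime),
         # hms::hms() (loaded after lubridate, so it masks lubridate::hms())
         # reads the numeric values as seconds
         across(c(s, elapsed_time), hms),
         # same day and month in a fixed leap year (2024), so Feb 29 stays valid
         vdate = ymd(paste("2024", month(time), day(time), sep = "-"))) |> 
  left_join(sports, join_by(sportid))

activities |> 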
  group_by(ym = format(time, "%Y-%m"), sport) |> 
  summarise(time_s = sum(s, na.rm = TRUE),
            distance = sum(distance, na.rm = TRUE),
            .groups = "drop") |> 
  mutate(hours = as.numeric(time_s, "hours")) |> 
  ggplot(aes(ym, hours, fill = sport)) +
  geom_col(just = 0) +
  scale_x_discrete(
    breaks = \(x) keep(x, substr(x, 6, 7) == "01"),
    labels = \(x) ifelse(substr(x, 6, 7) == "01", substr(x, 1, 4), "")) +
  scale_fill_manual(values = sports |> 
                      select(sport, colour) |> 
                      deframe()) +
  labs(title = "Activities",
       subtitle = "Monthly time",
       x = "month",
       y = "h",
       fill = "activities") +
  theme(axis.title.y = element_text(angle = 0, vjust = 0.5))
A bar plot of sport activities by month
Figure 1: My data from Runalyze