Eurostat data loaders

June 22, 2025

Eurostat data loaders for easy access to European statistics datasets

The Eurostat data browser is a great source for all European statistics, but it can be difficult to use. This page loads its XML catalog and converts it to an easier CSV format. Click on a specific dataset to create a data loader that returns a highly optimized Parquet file.
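To illustrate the catalog conversion step, here is a minimal sketch in Python. It is not the code this page runs: the element names (`catalog`, `dataset`, `code`, `title`) and the sample entries are hypothetical placeholders, and the real Eurostat XML catalog is larger and structured differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified catalog. The element names and the
# entries below are placeholders, not the real Eurostat schema.
SAMPLE_CATALOG = """\
<catalog>
  <dataset>
    <code>example_a</code>
    <title>Example dataset A, annual</title>
  </dataset>
  <dataset>
    <code>example_b</code>
    <title>Example dataset B</title>
  </dataset>
</catalog>
"""


def catalog_to_csv(xml_text: str) -> str:
    """Flatten every <dataset> entry into one CSV row (code, title)."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["code", "title"])
    for dataset in root.iter("dataset"):
        writer.writerow([
            dataset.findtext("code", default=""),
            dataset.findtext("title", default=""),
        ])
    return out.getvalue()


csv_text = catalog_to_csv(SAMPLE_CATALOG)
print(csv_text)
```

The `csv` module quotes titles that contain commas, so the flat file stays parseable even when dataset titles are free text.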

This page has been taken from Fil/pangea and slightly adapted to work with my Observable setup. All credits to Fil.


The following code can be used as a data loader for this dataset:

Save it as data/eurostat/.parquet.sh. You can then reference /data/eurostat/.parquet in your SQL front matter or DuckDBClient options.

Feel free to adjust the parameters in the query to optimize the file further. The smaller the file, the faster your data app will be! In particular, try adding:

  • WHERE geo='TOTAL' (or some other filter) when you don’t need all the modalities in all the dimensions;
  • ORDER BY 2, 3, 4, 5… where the positions refer to the columns with the fewest distinct values, lowest count first. (You can evaluate these counts with queries such as SELECT COUNT(DISTINCT "geo") FROM table.)
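
The column-ranking idea above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the table name `data`, the dimension names, and the sample rows are all hypothetical. It runs one COUNT(DISTINCT ...) query per dimension and builds an ORDER BY clause from the lowest to the highest cardinality.

```python
import sqlite3

# Hypothetical sample rows; a real Eurostat table has far more rows and
# usually more dimensions.
ROWS = [
    ("A", "DE", 2020, 1.0),
    ("A", "FR", 2020, 2.0),
    ("A", "DE", 2021, 3.0),
    ("A", "FR", 2021, 4.0),
    ("A", "IT", 2021, 5.0),
    ("A", "IT", 2020, 6.0),
]
DIMENSIONS = ["freq", "geo", "time"]  # assumed dimension columns

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE data ("freq" TEXT, "geo" TEXT, "time" INTEGER, "value" REAL)'
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", ROWS)


def distinct_counts(connection, table, columns):
    """Return {column: number of distinct values} for the given columns."""
    counts = {}
    for col in columns:
        query = f'SELECT COUNT(DISTINCT "{col}") FROM "{table}"'
        (n,) = connection.execute(query).fetchone()
        counts[col] = n
    return counts


counts = distinct_counts(conn, "data", DIMENSIONS)

# Lowest cardinality first, so that long runs of identical values end up
# next to each other in the sorted output.
ordered = sorted(DIMENSIONS, key=lambda c: counts[c])
order_by = "ORDER BY " + ", ".join(f'"{c}"' for c in ordered)

print(counts)
print(order_by)
```

Whether this ordering is actually the best one for a given dataset is something to check by comparing the resulting file sizes.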

Using this approach, we have seen some files with 6+ million rows compressed to just 1 MB, which is less than 2 bits per row!
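
As a quick check of that figure, here is the arithmetic as a small Python sketch. It assumes exactly 6 million rows and a decimal megabyte (1,000,000 bytes); both are round numbers chosen for illustration, not measurements.

```python
rows = 6_000_000        # "6+ million rows", taken at the lower bound
file_bytes = 1_000_000  # "1 MB", assuming a decimal megabyte

bits_per_row = file_bytes * 8 / rows
print(round(bits_per_row, 2))  # 1.33
```

With more than 6 million rows, the figure only gets smaller.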