Data Product Case Study

Global Macro Database

A public macroeconomic data product that packages long-run international data, quarterly releases, documentation, and research-ready access into one workflow.

Global Macro Database reads less like a personal tool and more like a public data product. The live site leads with a clear promise: the world's most comprehensive macroeconomic dataset, distributed through data files, packages, documentation, GitHub, and an academic paper.

This case study therefore follows a data-product structure: release signal, coverage metrics, source pipeline, access modes, and citation requirements.

01

The Release Page Is the Interface

GMD is not a static spreadsheet. The live site frames it around a current release, update notes, download paths, paper, documentation, and GitHub.

46 variables across GDP, prices, government finance, trade, labor, credit, and more.
240 countries and territories in the public release coverage.
121 data sources, including 27 contemporary and 94 historical datasets.
1086-2030: historical coverage from 1086 to 2025, with forecasts to 2030.
v2026_03: latest release on the live project site.

The update adds eleven new data sources, two new consumption variables, improved government finance ratio splicing, and feature parity across the Python, R, and Stata packages.

02

A Pipeline for Removing Data Friction

The project exists because macro data is scattered, inconsistent, and expensive to harmonize before analysis can even begin.

The live site describes the bottleneck plainly: users can spend weeks cleaning, harmonizing, and combining data before doing analysis. GMD moves that work into a systematic pipeline for downloading, cleaning, combining, documenting, and releasing data.

Official and institutional data

IMF, World Bank, OECD, UN, BIS, and other contemporary sources give modern coverage.

Historical reconstruction

Yearbooks, archives, handbooks, and academic datasets extend selected series far back in time.

Harmonization logic

Definitions, source ranking, splicing, and metadata turn source fragments into usable series.
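
To make "source ranking" and "splicing" concrete, here is a minimal sketch of one common approach, ratio splicing. It is not taken from the GMD codebase: the function name `splice_sources`, the dictionary layout, and the rule of anchoring on the earliest overlapping year are all assumptions made for illustration, and the numbers are invented.

```python
from typing import Dict, List

Series = Dict[int, float]  # maps year -> value


def splice_sources(ranked_sources: List[Series]) -> Series:
    """Combine series ordered from highest to lowest priority.

    The top-ranked source is kept as-is. Each lower-ranked source only
    extends the result backwards in time, after being rescaled so that it
    matches the result in the earliest year both series cover.
    (Hypothetical rule for illustration, not GMD's documented method.)
    """
    result: Series = dict(ranked_sources[0])
    for source in ranked_sources[1:]:
        overlap = sorted(set(result) & set(source))
        if not overlap:
            continue  # no common year to anchor on, so skip this source
        anchor = overlap[0]
        if source[anchor] == 0:
            continue  # a ratio cannot be formed from a zero value
        scale = result[anchor] / source[anchor]
        first_year = min(result)
        for year, value in source.items():
            if year < first_year:
                result[year] = value * scale
    return dict(sorted(result.items()))


# Invented example values: a modern series and an older one on a different scale.
modern = {2000: 100.0, 2001: 110.0}
historical = {1998: 40.0, 1999: 45.0, 2000: 50.0}
spliced = splice_sources([modern, historical])
print(spliced)
```

In this toy case the historical series is doubled because the two sources differ by a factor of two in 2000, the year they share. A real pipeline would also need to record which source supplied each observation, which is where the metadata step comes in.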

03

Research-Ready Access Paths

A dataset becomes infrastructure when it fits the tools researchers already use.

Download

CSV, Excel, and Stata files support direct inspection, teaching, replication, and offline workflows.

Packages

Python, R, and Stata packages let researchers pull GMD into analysis pipelines without manual download steps.

Documentation

Technical documentation explains variables, sources, construction choices, and limitations.

License

The live site presents the data as free for non-commercial use under CC BY-NC-SA 4.0, with citation required.
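
Of the access paths above, the direct download is the easiest to illustrate without depending on any package API. The sketch below reads a CSV with the Python standard library and pulls out one series. The column names (`ISO3`, `year`, `rGDP`), the helper `load_series`, and the sample values are assumptions for illustration; check the technical documentation for the actual file layout.

```python
import csv
import io

# Hypothetical excerpt standing in for a downloaded CSV file.
# Column names and values are illustrative assumptions, not real GMD data.
SAMPLE_CSV = """ISO3,year,rGDP
USA,2000,100.0
USA,2001,101.5
FRA,2000,50.0
FRA,2001,
"""


def load_series(csv_text, iso3, variable):
    """Return {year: value} for one country and variable, skipping blank cells."""
    series = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["ISO3"] != iso3:
            continue
        raw = row[variable].strip()
        if raw:  # an empty cell means the observation is missing
            series[int(row["year"])] = float(raw)
    return series


usa = load_series(SAMPLE_CSV, "USA", "rGDP")
fra = load_series(SAMPLE_CSV, "FRA", "rGDP")
print(usa, fra)
```

With a real download, the same function could be fed the contents of the file, for example via `open(path, newline="").read()`, where `path` points at wherever the CSV was saved.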

04

Citation Is Part of the Product

Because GMD updates quarterly, the site asks users to cite the exact version they used.

This matters for reproducibility. If a dataset changes over time, a paper or report should point to the version that generated the results, not just the project name.
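
One way to act on this in an analysis script is to store a small provenance record next to the results. The sketch below is an assumption-laden illustration, not something the site prescribes: the function `provenance_record`, the record fields, and the idea of hashing the data file are hypothetical, and the version pattern is inferred only from the `v2026_03` label on the release page.

```python
import hashlib
import json
import re

# Assumed label shape: optional "v", four-digit year, underscore, two-digit month.
VERSION_PATTERN = re.compile(r"^v?(\d{4})_(\d{2})$")


def provenance_record(version, data_bytes):
    """Build a record tying results to a dataset version and file contents."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"unrecognized version label: {version!r}")
    year, month = match.groups()
    if not 1 <= int(month) <= 12:
        raise ValueError(f"month out of range in version label: {version!r}")
    return {
        "dataset": "Global Macro Database",
        "version": f"{year}_{month}",
        # The hash lets a reader confirm they hold the same file contents.
        "sha256": hashlib.sha256(data_bytes).hexdigest(),
    }


# Placeholder bytes standing in for the contents of a downloaded data file.
record = provenance_record("v2026_03", b"ISO3,year,rGDP\nUSA,2000,100.0\n")
print(json.dumps(record, indent=2))
```

Rejecting labels such as "latest" is deliberate: a record that does not name a specific release cannot tell a reader which data produced the results.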

Versioned citation shape

@techreport{GMD2025,
  title = {The Global Macro Database: A New International Macroeconomic Dataset (Version 2026-03)},
  author = {Müller, Karsten and Xu, Chenzi and Lehbib, Mohamed and Chen, Ziliang},
  institution = {National Bureau of Economic Research},
  number = {33714},
  year = {2025},
  doi = {10.3386/w33714}
}

Why it matters to me

GMD is one of the most systematic projects I have participated in: a way to make macroeconomic history easier to study, compare, teach, and replicate.