Data model

Before looking at the API endpoints, it is important to understand the general terms and how they relate to one another.

Data Types

The IHME Portal data model is built around providing access to similarly structured data of different types. A data type categorizes data by the kind of information it covers and the way the data is collected. The following data types are available:

| Data type | Description |
| --- | --- |
| Cause | A single disease or injury, or an aggregation of diseases and injuries, that causes death or disability. |
| Sequela | Health states that result from a cause. For example, diabetes can lead to vision impairment. |
| Risk | A behavioral, environmental, or metabolic factor that is causally associated with the likelihood of experiencing health burden or early mortality due to a given condition. |
| Etiology | Pathogens that lead to a cause. For example, rotavirus leads to diarrhea. |
| Impairment | An aggregate of all related sequelae across various causes. For example, chronic obstructive pulmonary disease (a cause) can lead to various severities of heart failure (sequelae). Ischemic heart disease can also lead to heart failure. |
| Summary exposure value (SEV) | A measure of a population’s exposure to a risk factor that takes into account the extent of exposure by risk level and the severity of that risk’s contribution to disease burden. SEV takes the value zero when no excess risk for a population exists and the value one when the total population is at the highest level of risk; we report SEV on a scale from 0% to 100% to emphasize that it is risk-weighted prevalence. |
| Covariate | Other variables that are used in the modeling process to generate estimates. |
| Population | Total number of people matching certain criteria. |

Data Structure

At the most basic level, the data can be described as a list of table-like records, each with the following structure:

| Filter 1 | Filter 2 | ... | Filter N | Value 1 | Value 2 | ... | Value N |
| --- | --- | --- | --- | --- | --- | --- | --- |

Each record provides the values that correspond to a given filter breakdown. The available filters depend on the type of data. The abstraction that describes a piece of data is called a dataset.
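As a rough illustration of this structure, a record can be modeled as a set of filter values plus the values reported for that combination. The `Record` class and its field names below are hypothetical and are not part of the IHME Portal API; this is a minimal sketch, not the portal's actual representation.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One table-like record: a filter breakdown plus the values reported for it."""

    filters: dict  # filter name -> selected filter value
    values: dict   # value name -> reported number


# Hypothetical record; the keys are illustrative, not actual API field names.
record = Record(
    filters={"Location": "United States", "Year": 2023, "Gender": "Female"},
    values={"Value": 1474164.29},
)

# A dataset can then be treated as a list of such records.
dataset = [record]
for rec in dataset:
    print(rec.filters, "->", rec.values)
```

Keeping filters and values in separate mappings mirrors the "Filter 1 ... Filter N, Value 1 ... Value N" layout described above.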

Filters and Primary Entity

Data of all types except "Population" can be filtered by its primary entity, which usually matches the name of the data type. The rest of the possible data filters are:

| Data filter | Description |
| --- | --- |
| Age group | Defines an age range, e.g. 12-14, neonatal, etc. |
| Forecast scenario | Defines a range of potential health outcomes for forecast data, e.g. better, worse. |
| Gender | Male, female, or both genders combined. |
| Location | Includes country, non-sovereign region, principal administrative unit of a country (e.g. state, province), GBD region, or other custom administrative division, such as World Bank Income Level or WHO region. |
| Measure | The indicator for which results data are produced, e.g. Mortality (Death), DALY, Incidence, etc. |
| Metric | The unit by which a measure is expressed, e.g. number, percent, rate, etc. |
| Round | Corresponds to a data collection and processing iteration, e.g. Prior GBD trends with forecasts, GBD 1990-2024. |
| Year | Data collection year. |

The table below shows which filters are available for each data type. Columns represent the data types, and rows represent filters.

Cause
Sequela
Risk
Etiology
Impairment
SEV
Covariate
Population

Age group

X

X

X

X

X

X

X

X

Cause

Primary

X

X

X

X

Covariate

Primary

Etiology

Primary

Forecast scenario

X

Gender

X

X

X

X

X

X

X

X

Impairment

Primary

Location

X

X

X

X

X

X

X

X

Measure

X

X

X

X

X

X

X

Metric

X

X

X

X

X

X

X

Risk

Primary

Primary

Round

X

X

X

X

X

X

X

X

Sequela

Primary

Year

X

X

X

X

X

X

X

X

Granularity

Knowing the available filters is not enough to describe the data. Different data types, and even different pieces of data of the same data type, may have different filter values available.

For example, because a sequela describes a health state (and someone has to be alive to experience that health state), there is no mortality associated with sequela. Thus fatal measures aren't available for sequelae, and we will never see them among sequela measure filter values.

The combination of all filters and their available values is called the data granularity. It can describe an entire dataset or only a part of a larger one.

| Location | Year | Gender | Age group | Cause | Measure | Metric | Round | Value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| United States | 2023 | Female | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,474,164.29 |
| United States | 2023 | Male | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,617,491.58 |
| United States | 2024 | Female | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,485,926.67 |
| United States | 2024 | Male | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,636,385.89 |

The granularity of the cause outcome data in the table above is:

| Filter | Values |
| --- | --- |
| Location | United States |
| Year | 2023, 2024 |
| Gender | Female, Male |
| Age group | All Ages |
| Cause | All Causes |
| Measure | Deaths |
| Metric | Number |
| Round | GBD 1990-2024 |

Looking at the data granularity, we get an understanding of what the data is without looking at the data itself, which is very useful when dealing with huge amounts of data.
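To make this concrete, the sketch below derives the granularity of the four example records above by collecting the distinct values of each filter. The `granularity` helper is an illustrative name, not an IHME Portal function, and the combination count assumes at most one record per combination of filter values.

```python
from math import prod

FILTERS = ["Location", "Year", "Gender", "Age group",
           "Cause", "Measure", "Metric", "Round"]

# The four example records above (filter columns only, same row order).
rows = [
    ("United States", year, gender, "All Ages",
     "All Causes", "Deaths", "Number", "GBD 1990-2024")
    for year in (2023, 2024)
    for gender in ("Female", "Male")
]
records = [dict(zip(FILTERS, row)) for row in rows]


def granularity(records):
    """Map each filter to its distinct values, in order of first appearance."""
    result = {}
    for record in records:
        for name, value in record.items():
            seen = result.setdefault(name, [])
            if value not in seen:
                seen.append(value)
    return result


gran = granularity(records)
for name, values in gran.items():
    print(f"{name}: {values}")

# Upper bound on the record count, assuming one record per combination.
max_records = prod(len(values) for values in gran.values())
print("Combinations covered:", max_records)
```

Here the count is 1 × 2 × 2 × 1 × 1 × 1 × 1 × 1 = 4, which matches the four records in the example without inspecting any of their values.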

API Representation

If we requested the data above from the IHME Portal API, we would get filter value ids instead of names.

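| location_id | year | gender_id | age_group_id | cause_id | measure_id | metric_id | round_id | value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 102 | 2023 | 2 | 22 | 294 | 1 | 1 | 9 | 1,474,164.29 |
| 102 | 2023 | 1 | 22 | 294 | 1 | 1 | 9 | 1,617,491.58 |
| 102 | 2024 | 2 | 22 | 294 | 1 | 1 | 9 | 1,485,926.67 |
| 102 | 2024 | 1 | 22 | 294 | 1 | 1 | 9 | 1,636,385.89 |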

This is because the API works only with ids. Filter value names and other filter-related information can be requested separately using the filter metadata API endpoints.
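As a minimal sketch of that lookup step, the example below resolves the ids in one API-style record to display names. The `NAMES` table is a hard-coded stand-in inferred by matching the name-based and id-based example tables on this page; in practice the names would come from the filter metadata endpoints, whose paths and response shapes are not shown here. The `resolve_names` helper is a hypothetical name.

```python
# Stand-in for filter metadata; ids and names are matched from the example tables.
NAMES = {
    "location_id": {102: "United States"},
    "gender_id": {1: "Male", 2: "Female"},
    "age_group_id": {22: "All Ages"},
    "cause_id": {294: "All Causes"},
    "measure_id": {1: "Deaths"},
    "metric_id": {1: "Number"},
    "round_id": {9: "GBD 1990-2024"},
}

# First record of the id-based example (United States, 2023, Female).
api_record = {
    "location_id": 102, "year": 2023, "gender_id": 2, "age_group_id": 22,
    "cause_id": 294, "measure_id": 1, "metric_id": 1, "round_id": 9,
    "value": 1474164.29,
}


def resolve_names(record, names):
    """Replace ids with names where a lookup exists; keep other fields as-is."""
    return {
        key: names[key].get(val, val) if key in names else val
        for key, val in record.items()
    }


readable = resolve_names(api_record, NAMES)
print(readable)
```

Fields without a lookup table, such as `year` and `value`, pass through unchanged.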

The data granularity returned by the API would also have only ids:
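The exact response format is not shown on this page. As a hedged sketch, the code below derives the id-based granularity from the four id-based records above; the dictionary layout is an assumption and may differ from what the API actually returns.

```python
# The four id-based example records above (value column omitted).
FIELDS = ["location_id", "year", "gender_id", "age_group_id",
          "cause_id", "measure_id", "metric_id", "round_id"]
rows = [
    (102, 2023, 2, 22, 294, 1, 1, 9),
    (102, 2023, 1, 22, 294, 1, 1, 9),
    (102, 2024, 2, 22, 294, 1, 1, 9),
    (102, 2024, 1, 22, 294, 1, 1, 9),
]

# Collect the distinct ids per filter; sorting is for readability only.
id_granularity = {
    field: sorted({row[i] for row in rows})
    for i, field in enumerate(FIELDS)
}

for field, ids in id_granularity.items():
    print(f"{field}: {ids}")
```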

As you may have noticed, the filter names also changed to their id form. The table below maps the filter names shown on the IHME Portal's "Data Explorer" page to the filter names used by the IHME Portal API:

| "Data Explorer" filter name | API filter name |
| --- | --- |
| Age | age_group_id |
| Cause | cause_id |
| Covariate | covariate_id |
| Etiology | rei_id |
| Forecast scenario | forecast_scenario_id |
| Gender | gender_id |
| Impairment | rei_id |
| Location | location_id |
| Measure | measure_id |
| Metric | metric_id |
| Risk | rei_id |
| Round | round_id |
| Sequela | sequela_id |
| Year | year |

Note that the risk, etiology, and impairment filters all share the same API filter name, rei_id.
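One practical consequence is that a "Data Explorer" filter name cannot always be recovered from the API filter name alone. The sketch below encodes the mapping table above as a dictionary and groups the "Data Explorer" names by API name; the variable names are illustrative and not part of the API.

```python
from collections import defaultdict

# "Data Explorer" filter name -> API filter name, copied from the table above.
API_FILTER_NAME = {
    "Age": "age_group_id",
    "Cause": "cause_id",
    "Covariate": "covariate_id",
    "Etiology": "rei_id",
    "Forecast scenario": "forecast_scenario_id",
    "Gender": "gender_id",
    "Impairment": "rei_id",
    "Location": "location_id",
    "Measure": "measure_id",
    "Metric": "metric_id",
    "Risk": "rei_id",
    "Round": "round_id",
    "Sequela": "sequela_id",
    "Year": "year",
}

# Invert the mapping to find API names shared by several filters.
by_api_name = defaultdict(list)
for explorer_name, api_name in API_FILTER_NAME.items():
    by_api_name[api_name].append(explorer_name)

shared = {api: names for api, names in by_api_name.items() if len(names) > 1}
print(shared)  # {'rei_id': ['Etiology', 'Impairment', 'Risk']}
```

Only `rei_id` maps back to more than one filter, which suggests that the data type being queried is needed to interpret an `rei_id` value.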

The available data types also have different names on the API side:

| "Data Explorer" data type name | API data type name |
| --- | --- |
| Cause | cause_outcome |
| Sequela | sequela_id |
| Risk | risk_attribution |
| Etiology | etiology_records |
| Impairment | impairment_records |
| SEV | sev_records |
| Covariate | covariate_records |
| Population | population_records |

Now proceed to the API Reference section to learn how the data model above is abstracted behind the API HTTP resources.
