Data model
Before looking at the API endpoints, it is important to understand the general terms and how they relate to one another.
Data Types
The IHME Portal data model is built around providing access to similarly structured data of different types. A data type categorizes data by the kind of information it covers and the way the data is collected. The following data types are available:
Cause
A single disease or injury or an aggregation of diseases and injuries that cause death or disability.
Sequela
Health states that result from a cause. For example, diabetes can lead to vision impairment.
Risk
Behavioral, environmental, and metabolic factors that are causally associated with the likelihood of experiencing health burden or early mortality due to a given condition.
Etiology
Pathogens that lead to a cause. For example, rotavirus leads to diarrhea.
Impairment
Aggregate entity of all of the related sequelae, across various causes. For example, chronic obstructive pulmonary disease (a cause) can lead to various severities of heart failure (sequelae). Ischemic heart disease can also lead to heart failure.
Summary exposure value (SEV)
A measure of a population’s exposure to a risk factor that takes into account the extent of exposure by risk level and the severity of that risk’s contribution to disease burden. SEV takes the value zero when no excess risk for a population exists and the value one when the total population is at the highest level of risk; we report SEV on a scale from 0% to 100% to emphasize that it is risk-weighted prevalence.
Covariate
Other variables that are used in the modeling process to generate estimates.
Population
Total number of people matching certain criteria.
Data Structure
At the most basic level, the data can be described as a list of table-like records, each with the following structure:
| Filter 1 | Filter 2 | ... | Filter N | Value 1 | Value 2 | ... | Value N |
Each record provides the values that correspond to a given filter breakdown. The available filters depend on the type of data. The abstraction that describes a piece of data is called a dataset.
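As a rough illustration, a record can be pictured as a set of filter values plus the values reported for that breakdown. The sketch below is in Python. The `Record` class, its field names, and the `select` helper are hypothetical and not part of the IHME Portal API; the sample numbers are taken from the example table in the Granularity section.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One table-like row: a filter breakdown plus the values reported for it."""
    filters: dict  # filter name -> filter value, e.g. {"Year": 2023}
    values: dict   # value name -> number


def select(records, wanted):
    """Return the records whose filters contain every (name, value) pair in `wanted`."""
    return [
        r for r in records
        if all(r.filters.get(name) == value for name, value in wanted.items())
    ]


# Hypothetical in-memory data; a real client would receive records from the API.
records = [
    Record({"Location": "United States", "Year": 2023, "Gender": "Female"},
           {"Value": 1474164.29}),
    Record({"Location": "United States", "Year": 2023, "Gender": "Male"},
           {"Value": 1617491.58}),
]

female = select(records, {"Gender": "Female"})
```

Keeping filters and values in separate mappings mirrors the "Filter 1 ... Filter N, Value 1 ... Value N" layout above, but a flat dictionary per record would work equally well.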
Filters and Primary Entity
Data of all types except "Population" can be filtered by its primary entity, which usually matches the name of the data type. The rest of the possible data filters are:
Age group
Defines an age range, e.g. 12-14, neonatal, etc.
Forecast scenario
Defines a range of potential health outcomes for forecast data, e.g. better, worse.
Gender
Male, female, or both genders combined.
Location
Includes country, non-sovereign region, principal administrative unit of a country (e.g., state, province), GBD region, or other custom administrative division, such as World Bank Income Level or WHO region.
Measure
The indicator for which results data are produced, e.g. Mortality (Death), DALY, Incidence, etc.
Metric
The unit in which a measure is expressed, e.g. number, percent, rate, etc.
Round
Corresponds to a data collection and processing iteration, e.g. Prior GBD trends with forecasts, GBD 1990-2024.
Year
Data collection year.
The table below shows which filters are available for each data type. Columns represent the data types, and rows represent filters.
Age group: X X X X X X X X
Cause: Primary X X X X
Covariate: Primary
Etiology: Primary
Forecast scenario: X
Gender: X X X X X X X X
Impairment: Primary
Location: X X X X X X X X
Measure: X X X X X X X
Metric: X X X X X X X
Risk: Primary Primary
Round: X X X X X X X X
Sequela: Primary
Year: X X X X X X X X
Granularity
Knowing the available filters is not enough to describe the data. Different data types, and even different pieces of data of the same data type, may have different filter values available.
For example, because a sequela describes a health state (and someone has to be alive to experience that health state), there is no mortality associated with sequela. Thus fatal measures aren't available for sequelae, and we will never see them among sequela measure filter values.
The combination of all filters with their values is called data granularity. It can describe the whole data or only a part of a larger dataset.
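| Location | Year | Gender | Age group | Cause | Measure | Metric | Round | Value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| United States | 2023 | Female | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,474,164.29 |
| United States | 2023 | Male | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,617,491.58 |
| United States | 2024 | Female | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,485,926.67 |
| United States | 2024 | Male | All Ages | All Causes | Deaths | Number | GBD 1990-2024 | 1,636,385.89 |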
The granularity of the cause outcome data in the table above is:
Location: United States
Year: 2023, 2024
Gender: Female, Male
Age group: All Ages
Cause: All Causes
Measure: Deaths
Metric: Number
Round: GBD 1990-2024
Looking at the data granularity, we get an understanding of what the data is without looking at the data itself, which is very useful when dealing with huge amounts of data.
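To make the idea concrete, the following Python sketch derives the granularity of the four example records by collecting the distinct values seen for each filter. The `granularity` function and the dictionary layout are assumptions made for illustration; they do not describe how the IHME Portal computes granularity internally.

```python
def granularity(records, value_key="Value"):
    """Collect the distinct values of every filter, in order of first appearance."""
    seen = {}
    for record in records:
        for name, val in record.items():
            if name == value_key:
                continue  # the measured value is not a filter
            values = seen.setdefault(name, [])
            if val not in values:
                values.append(val)
    return seen


# Filter values shared by all four example records.
common = {
    "Location": "United States",
    "Age group": "All Ages",
    "Cause": "All Causes",
    "Measure": "Deaths",
    "Metric": "Number",
    "Round": "GBD 1990-2024",
}

rows = [
    {**common, "Year": 2023, "Gender": "Female", "Value": 1474164.29},
    {**common, "Year": 2023, "Gender": "Male", "Value": 1617491.58},
    {**common, "Year": 2024, "Gender": "Female", "Value": 1485926.67},
    {**common, "Year": 2024, "Gender": "Male", "Value": 1636385.89},
]

result = granularity(rows)
```

Here `result["Year"]` holds both years and `result["Gender"]` holds both genders, while every other filter has a single value, matching the listing above.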
API Representation
If we requested the data above from the IHME Portal API, we would get filter value ids instead of names.
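| location_id | year | gender_id | age_group_id | cause_id | measure_id | metric_id | round_id | value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 102 | 2023 | 2 | 22 | 294 | 1 | 1 | 9 | 1,474,164.29 |
| 102 | 2023 | 1 | 22 | 294 | 1 | 1 | 9 | 1,617,491.58 |
| 102 | 2024 | 2 | 22 | 294 | 1 | 1 | 9 | 1,485,926.67 |
| 102 | 2024 | 1 | 22 | 294 | 1 | 1 | 9 | 1,636,385.89 |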
This is because the API works only with ids. Filter value names and other filter-related information can be requested separately using the filter metadata API endpoints.
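A client would typically translate the ids back into names before displaying the data. The Python sketch below shows one way to do that with local lookup tables. The `NAMES` dictionary and the `resolve` function are hypothetical; the id-to-name pairs are read off by comparing the two example tables above, and in practice they would come from the filter metadata endpoints, which are not shown here.

```python
# Hypothetical lookup tables, one per id-based filter.
NAMES = {
    "location_id": {102: "United States"},
    "gender_id": {1: "Male", 2: "Female"},
    "age_group_id": {22: "All Ages"},
    "cause_id": {294: "All Causes"},
    "measure_id": {1: "Deaths"},
    "metric_id": {1: "Number"},
    "round_id": {9: "GBD 1990-2024"},
}


def resolve(record, names=NAMES):
    """Replace filter value ids with names.

    Keys without a lookup table (such as year and value) and ids missing
    from a lookup table are passed through unchanged.
    """
    resolved = {}
    for key, val in record.items():
        lookup = names.get(key)
        resolved[key] = lookup.get(val, val) if lookup else val
    return resolved


api_row = {
    "location_id": 102, "year": 2023, "gender_id": 2, "age_group_id": 22,
    "cause_id": 294, "measure_id": 1, "metric_id": 1, "round_id": 9,
    "value": 1474164.29,
}

readable = resolve(api_row)
```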
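The data granularity returned by the API would also have only ids:
location_id: 102
year: 2023, 2024
gender_id: 2, 1
age_group_id: 22
cause_id: 294
measure_id: 1
metric_id: 1
round_id: 9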
As you probably noticed, the filter names also changed to their id form. The table below contains the mapping between filter names you see on the IHME Portal's "Data Explorer" page and the IHME Portal API:
| Data Explorer | API |
| --- | --- |
| Age | age_group_id |
| Cause | cause_id |
| Covariate | covariate_id |
| Etiology | rei_id |
| Forecast scenario | forecast_scenario_id |
| Gender | gender_id |
| Impairment | rei_id |
| Location | location_id |
| Measure | measure_id |
| Metric | metric_id |
| Risk | rei_id |
| Round | round_id |
| Sequela | sequela_id |
| Year | year |
Note that the risk, etiology, and impairment filters share the same id name, rei_id.
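The mapping can be kept in code as a plain dictionary. The Python sketch below copies the table as written and adds a small helper that finds API names shared by more than one Data Explorer filter. Both `FILTER_TO_API` and `filters_sharing_id` are illustrative names, not part of the IHME Portal API.

```python
# Data Explorer filter name -> API filter name, as listed in the table above.
FILTER_TO_API = {
    "Age": "age_group_id",
    "Cause": "cause_id",
    "Covariate": "covariate_id",
    "Etiology": "rei_id",
    "Forecast scenario": "forecast_scenario_id",
    "Gender": "gender_id",
    "Impairment": "rei_id",
    "Location": "location_id",
    "Measure": "measure_id",
    "Metric": "metric_id",
    "Risk": "rei_id",
    "Round": "round_id",
    "Sequela": "sequela_id",
    "Year": "year",
}


def filters_sharing_id(mapping):
    """Group display names by API name and keep API names used more than once."""
    groups = {}
    for display_name, api_name in mapping.items():
        groups.setdefault(api_name, []).append(display_name)
    return {api: names for api, names in groups.items() if len(names) > 1}


shared = filters_sharing_id(FILTER_TO_API)
```

Because three display names collapse into `rei_id`, the reverse lookup from API name to display name is not unique for that key.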
The available data types also have different names on the API side:
| Data type | API name |
| --- | --- |
| Cause | cause_outcome |
| Sequela | sequela_id |
| Risk | risk_attribution |
| Etiology | etiology_records |
| Impairment | impairment_records |
| SEV | sev_records |
| Covariate | covariate_records |
| Population | population_records |
Now proceed to the API Reference section to learn how the data model above is abstracted behind the API HTTP resources.