Retrieving Paper References with the OpenAlex API from Python

python
openalex
api
Use the OpenAlex API from Python to retrieve the references cited by a paper.
Published

2026-06-19

Modified

2026-06-19

Preparing an API Key (First-Time Setup)

An API key is required to use the OpenAlex API. Create an OpenAlex account, then obtain an API key from the account settings page.

Importing Libraries and Setting the API Key

This example uses Python’s standard os library and the external libraries requests, pandas, and python-dotenv. The API key is read from the environment variable OPENALEX_API_KEY through os.environ. Before running the code, create a .env file at the project root (the working directory in this example) containing OPENALEX_API_KEY=your_api_key_here. The .env file is a plain text file that defines environment variables, and load_dotenv() loads the variables it defines into the environment. On GitHub Actions, do not use .env; instead, register OPENALEX_API_KEY as a repository secret and pass it to the job from the workflow.

.env
OPENALEX_API_KEY=your_api_key_here
import os
import pandas as pd
import requests
from dotenv import load_dotenv

load_dotenv()

api_key = os.environ["OPENALEX_API_KEY"]
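Accessing os.environ["OPENALEX_API_KEY"] directly raises a bare KeyError when the variable is missing, which can be hard to interpret. The following is a minimal sketch of a more explicit check; the helper name read_api_key and its error message are hypothetical and not part of OpenAlex or python-dotenv.

```python
import os


def read_api_key(env=os.environ, name="OPENALEX_API_KEY"):
    """Return the API key stored under `name`, or raise a descriptive error.

    `read_api_key` is a hypothetical helper written for this article.
    """
    key = env.get(name, "").strip()  # treat whitespace-only values as missing
    if not key:
        raise RuntimeError(
            f"{name} is not set. Define it in .env or export it before running."
        )
    return key
```

After load_dotenv() has run, api_key = read_api_key() could be used in place of the direct dictionary lookup.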

Retrieving a Paper’s References by DOI

In this example, I use the following paper and show how to retrieve its references by DOI.

Yan, Pu, Nianpeng He, Kailiang Yu, Lawren Sack, Lin Jiang, and Marcos Fernández-Martínez. “Plant Elemental Diversity Increases Ecosystem Productivity and Temporal Stability.” Ecological Monographs 96, no. 1 (2026): e70061. https://doi.org/10.1002/ecm.70061.

To retrieve metadata for a paper by DOI, send a GET request to the /works/doi:{doi} endpoint with requests.get(). The select parameter limits the response to the listed fields.

doi = "10.1002/ecm.70061"
url = f"https://api.openalex.org/works/doi:{doi}"
params = {
    "api_key": api_key,
    "select": "id,doi,display_name,publication_year,referenced_works",
}

# Fetch the target paper by DOI
response = requests.get(
    url,
    params=params,
    timeout=30,
)
response.raise_for_status()  # Raise an exception if the request failed

The result is returned in JSON format, so convert it to a Python dictionary with response.json(). The response includes the paper title, publication year, and IDs of cited references.

work = response.json()
print(work["display_name"])  # Print the paper title
print(len(work["referenced_works"]))  # Print the number of references
Output
Plant elemental diversity increases ecosystem productivity and temporal stability
74
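The request above was built from a bare DOI (10.1002/ecm.70061). DOIs copied from a reference list often arrive as a URL, sometimes with a trailing period, as in the citation shown earlier. The sketch below normalizes such strings before building the endpoint URL. The helper name normalize_doi and the list of prefixes are assumptions made for illustration; stripping a trailing period assumes the DOI itself does not end with one.

```python
def normalize_doi(raw):
    """Strip common prefixes so that only the bare DOI (10.xxxx/...) remains.

    `normalize_doi` is a hypothetical helper, not part of any library.
    """
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in prefixes:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Assumption: a trailing period is citation punctuation, not part of the DOI.
    return doi.rstrip(".")


doi = normalize_doi("https://doi.org/10.1002/ecm.70061.")
url = f"https://api.openalex.org/works/doi:{doi}"
```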

The referenced_works field stores the cited reference IDs as a list.

print(work["referenced_works"][:5])  # Print the first five IDs
Output
['https://openalex.org/W1584343945', 'https://openalex.org/W1594032928', 'https://openalex.org/W1885882554', 'https://openalex.org/W1965202842', 'https://openalex.org/W2020948897']

You can use these IDs to retrieve metadata for the cited references.
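Each ID in referenced_works is a full URL whose last path segment (for example W1584343945) is the short OpenAlex ID. The short form is convenient as a dictionary key or for keeping query strings compact. A minimal sketch follows; the helper name short_openalex_id is hypothetical, and whether a given endpoint accepts the short form should be confirmed against the OpenAlex documentation.

```python
def short_openalex_id(openalex_url):
    """Return the last path segment of an OpenAlex ID URL.

    `short_openalex_id` is a hypothetical helper; a string that is
    already a short ID is returned unchanged.
    """
    return openalex_url.rstrip("/").rsplit("/", 1)[-1]


sample_ids = [
    "https://openalex.org/W1584343945",
    "https://openalex.org/W1594032928",
]
short_ids = [short_openalex_id(ref_id) for ref_id in sample_ids]
```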

Retrieving Reference Metadata from Reference IDs

To retrieve metadata such as titles, publication years, citation counts, and DOIs for the obtained reference IDs, call the /works endpoint with the query parameters below.

Joining the IDs with | in the openalex filter matches any of them, so a single request returns the metadata for all references at once (74 here, which fits within the per_page value of 100).

ref_ids = work["referenced_works"]

params = {
    "api_key": api_key,
    "filter": "openalex:" + "|".join(ref_ids),
    "per_page": 100,
    "select": "id,display_name,publication_year,cited_by_count,doi",
}

response = requests.get(
    "https://api.openalex.org/works",
    params=params,
    timeout=30,
)
response.raise_for_status()

refs = response.json()["results"]

df_refs = pd.DataFrame(
    [
        {
            "title": ref.get("display_name"),
            "year": ref.get("publication_year"),
            "citations": ref.get("cited_by_count"),
            "doi": ref.get("doi"),
            "openalex_id": ref.get("id"),
        }
        for ref in refs
    ]
)

print(df_refs.to_string())
Output excerpt
title  year  citations                                                    doi                       openalex_id
0                                                                                                                                           Multimodel Inference  2004      11568               https://doi.org/10.1177/0049124104268644  https://openalex.org/W2158196600
1                                                                                              A Caution Regarding Rules of Thumb for Variance Inflation Factors  2007      10409              https://doi.org/10.1007/s11135-006-9018-6  https://openalex.org/W2140964565
2                                                                                                                          Soil Sampling and Methods of Analysis  2007       6452                  https://doi.org/10.1201/9781420005271  https://openalex.org/W2478908224
3                                                                         performance: An R Package for Assessment, Comparison and Testing of Statistical Models  2021       5235                    https://doi.org/10.21105/joss.03139  https://openalex.org/W3153999239

(rows omitted)

71                                                                     The scaling of elemental stoichiometry and growth rate over the course of bamboo ontogeny  2023         17                      https://doi.org/10.1111/nph.19408  https://openalex.org/W4388921950
72                        Plant Elemental Homeostasis Enhances Species Performance and Community Functioning in Wetlands: Looking Beyond Nitrogen and Phosphorus  2025          4                      https://doi.org/10.1111/ele.70152  https://openalex.org/W4412488598
73                                                                           Optimal set of leaf and aboveground tree elements for predicting forest functioning  2025          1                https://doi.org/10.5194/bg-22-2115-2025  https://openalex.org/W4409963066

In this way, you can use the cited reference IDs to retrieve metadata such as reference titles, publication years, citation counts, and DOIs.
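The single request in this article works because the example paper has 74 references. For a paper with a longer reference list, one filter string may exceed the number of values OpenAlex allows to be combined with |, or the results may not fit on one page. The sketch below splits the IDs into batches and builds one filter string per batch. The function names and the default batch size of 50 are assumptions chosen for illustration; check the OpenAlex documentation for the current limit.

```python
def chunked(items, size):
    """Split a list into consecutive sublists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_filters(ref_ids, batch_size=50):
    """Build one 'openalex:' filter string per batch of reference IDs.

    `batch_size=50` is an assumed, conservative value, not an official limit.
    """
    return [
        "openalex:" + "|".join(batch)
        for batch in chunked(ref_ids, batch_size)
    ]


# Hypothetical IDs, used only to show the batching behaviour.
example_ids = [f"https://openalex.org/W{n}" for n in range(1, 121)]
filters = build_filters(example_ids)
```

Each filter string can then be passed as params["filter"] in its own requests.get() call, and the "results" lists from the responses concatenated before building the DataFrame.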