Using the WFO Plant List API from R
Original Japanese version: WFO Plant List APIをRから利用する方法
World Flora Online (WFO) is a global database of plant information. Its data is also available through the WFO Plant List API, so this article shows how to call that API from R and look up the accepted name for a given scientific name.
Installing Packages
Install the httr2 package to make API requests easier. Because API responses are returned as JSON, also install the jsonlite package.
After installation, load the packages with library().
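The two steps above can be written as follows. The install.packages() call is commented out so that it only runs when you actually need it.

```r
# Run once per machine (uncomment if the packages are not installed yet):
# install.packages(c("httr2", "jsonlite"))

# Run in every R session before the code in this article:
library(httr2)
library(jsonlite)
```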
Specifying the Endpoint
The API endpoint is the following URL.
endpoint <- "https://list.worldfloraonline.org/gql.php"
Creating a Query to Search by Scientific Name
The search is written as a GraphQL query. Here, I create a query that searches for the scientific name "Quercus serrata" (called Konara in Japanese).
query <- '
query {
  taxonNameSuggestion(
    termsString: "Quercus serrata"
    limit: 10
  ) {
    id
    fullNameStringHtml
    currentPreferredUsage {
      hasName {
        id
        fullNameStringHtml
      }
    }
  }
}
'
This query does the following.
- Calls the taxonNameSuggestion field
- Sets the scientific name to search for in the termsString argument
- Sets the maximum number of search results in the limit argument
- Retrieves the id, fullNameStringHtml, and currentPreferredUsage fields as the query result
fullNameStringHtml is the searched scientific name expressed in HTML format. currentPreferredUsage shows which taxon the name is currently treated as. The hasName field inside currentPreferredUsage contains the scientific name of the current taxon. In other words, when an accepted name exists, that name appears in the hasName field inside currentPreferredUsage.
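Because fullNameStringHtml contains markup, it can be handy to strip the tags before printing or comparing names. The following is a minimal sketch in base R. strip_html() is a hypothetical helper, and the sample string is a made-up illustration of tagged text, not actual API output.

```r
# Hypothetical helper: remove HTML tags and tidy up whitespace.
strip_html <- function(html) {
  plain <- gsub("<[^>]+>", "", html)  # drop anything that looks like a tag
  plain <- gsub("\\s+", " ", plain)   # collapse runs of whitespace
  trimws(plain)
}

# Made-up sample string for illustration (not copied from an API response)
sample_html <- "<i>Quercus</i> <i>serrata</i> <span class=\"authors\">Murray</span>"
strip_html(sample_html)
```

A regular expression like this is only suitable for simple, well-formed markup. The API also exposes fullNameStringPlain, which is used later in this article and avoids the need for this step.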
Creating and Sending the Request
Use httr2::request() to create the base request for the API.
req <- request(endpoint)
Add the GraphQL query as JSON.
req <- req_body_json(
  req,
  list(query = query),
  auto_unbox = TRUE
)
Then send it to WFO.
resp <- req_perform(req)
Convert the returned JSON into an R list. This makes the API response usable from R.
x <- resp_body_json(resp, simplifyVector = FALSE)
Checking the Result
The x object stores the API response as an R list.
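A GraphQL server can report query problems in an errors element of the response body, so it can be worth checking for that element before reading data. The sketch below uses a hypothetical helper, stop_on_graphql_errors(), and hand-built lists standing in for x. Whether and how this particular API fills errors is an assumption here, based on the general GraphQL response format.

```r
# Hypothetical helper: stop with the server's messages if the parsed
# response contains a non-empty top-level "errors" element.
stop_on_graphql_errors <- function(x) {
  if (!is.null(x$errors) && length(x$errors) > 0) {
    msgs <- vapply(
      x$errors,
      function(e) {
        if (is.null(e$message)) "unknown error" else as.character(e$message)
      },
      character(1)
    )
    stop("GraphQL error: ", paste(msgs, collapse = "; "), call. = FALSE)
  }
  invisible(x)
}

# Hand-built stand-ins for a parsed response (not real API output)
ok  <- list(data = list(taxonNameSuggestion = list()))
bad <- list(errors = list(list(message = "Syntax Error")))

stop_on_graphql_errors(ok)                       # passes, returns ok invisibly
try(stop_on_graphql_errors(bad), silent = TRUE)  # raises an error
```

In the workflow above, the call would be stop_on_graphql_errors(x) right after resp_body_json().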
print(x)
Ten candidates were returned because the query specified limit: 10. Let’s inspect the first result.
x$data$taxonNameSuggestion[[1]]
The three fields id, fullNameStringHtml, and currentPreferredUsage were returned.
The field to focus on is hasName inside currentPreferredUsage. This field contains the scientific name of the current taxon.
x$data$taxonNameSuggestion[[1]]$currentPreferredUsage
In this example, the hasName field inside currentPreferredUsage displays the scientific name “Quercus serrata Murray”.
x$data$taxonNameSuggestion[[1]]$id
x$data$taxonNameSuggestion[[1]]$currentPreferredUsage$hasName$id
The two values, x$data$taxonNameSuggestion[[1]]$id and x$data$taxonNameSuggestion[[1]]$currentPreferredUsage$hasName$id, are identical.
This shows that the searched name "Quercus serrata" is itself the currently accepted name, because its current usage points back to the same name.
In this way, you can retrieve an accepted name from a searched scientific name.
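The id comparison described above can be wrapped in a small helper. This is a minimal sketch: name_status() and its three labels are hypothetical, "synonym" is used here simply to mean that the current usage points to a different name, and the candidates are hand-built lists with placeholder ids rather than real API output.

```r
# Hypothetical helper: describe how one candidate relates to its current usage.
name_status <- function(candidate) {
  # NULL propagates through `$`, so this is NULL when any level is missing
  accepted_id <- candidate$currentPreferredUsage$hasName$id
  if (is.null(accepted_id)) {
    "no current usage"  # nothing to compare against
  } else if (identical(candidate$id, accepted_id)) {
    "accepted"          # the name points back to itself
  } else {
    "synonym"           # the name points to a different name
  }
}

# Hand-built candidates with placeholder ids (not real WFO IDs)
self_ref <- list(
  id = "id-1",
  currentPreferredUsage = list(hasName = list(id = "id-1"))
)
other <- list(
  id = "id-2",
  currentPreferredUsage = list(hasName = list(id = "id-1"))
)
unplaced <- list(id = "id-3", currentPreferredUsage = NULL)

sapply(list(self_ref, other, unplaced), name_status)
```

With a real response, the same helper could be applied to every candidate with sapply(x$data$taxonNameSuggestion, name_status).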
Creating a Function to Retrieve Accepted Names
In practical work, it is useful to wrap the process in functions.
Here, I turn the steps above into functions and create a function that retrieves the accepted name from a scientific name. First, create get_wfo_suggestions(), which retrieves candidates from a scientific name.
In addition to the query above, this function also retrieves fields such as fullNameStringNoAuthorsPlain, authorsString, and rank. The rank field is especially important because it shows the taxonomic rank of the searched name. For example, it lets you distinguish whether the name is at the rank of species, subspecies, variety, and so on.
When retrieving an accepted name, it is often useful to look for an accepted name at the same rank as the searched name, so this function also retrieves rank information.
get_wfo_suggestions <- function(
  name,
  limit = 10,
  endpoint = "https://list.worldfloraonline.org/gql.php"
) {
  # GraphQL query; $terms and $limit are filled in from `variables` below
  query <- '
    query NameSearch($terms: String!, $limit: Int) {
      taxonNameSuggestion(
        termsString: $terms
        limit: $limit
      ) {
        id
        fullNameStringPlain
        fullNameStringNoAuthorsPlain
        authorsString
        rank
        currentPreferredUsage {
          hasName {
            id
            fullNameStringPlain
            fullNameStringNoAuthorsPlain
            authorsString
            rank
          }
        }
      }
    }
  '
  # Build the request, send it, and parse the JSON response into a list
  req <- request(endpoint)
  req <- req_body_json(
    req,
    list(
      query = query,
      variables = list(
        terms = name,
        limit = limit
      )
    ),
    auto_unbox = TRUE
  )
  resp <- req_perform(req)
  x <- resp_body_json(resp, simplifyVector = FALSE)
  res <- x$data$taxonNameSuggestion
  # No candidates: return an empty data frame
  if (is.null(res) || length(res) == 0) {
    return(data.frame())
  }
  # Field of the candidate itself, or NA when the field is missing
  get_value <- function(z, field) {
    if (is.null(z[[field]])) {
      NA_character_
    } else {
      z[[field]]
    }
  }
  # Field of the accepted name (currentPreferredUsage$hasName), or NA
  get_accepted <- function(z, field) {
    if (is.null(z$currentPreferredUsage)) {
      NA_character_
    } else if (is.null(z$currentPreferredUsage$hasName[[field]])) {
      NA_character_
    } else {
      z$currentPreferredUsage$hasName[[field]]
    }
  }
  # One row per candidate
  out <- data.frame(
    input = name,
    id = sapply(res, get_value, field = "id"),
    name = sapply(res, get_value, field = "fullNameStringPlain"),
    name_no_author = sapply(
      res,
      get_value,
      field = "fullNameStringNoAuthorsPlain"
    ),
    authors = sapply(res, get_value, field = "authorsString"),
    rank = sapply(res, get_value, field = "rank"),
    accepted_id = sapply(res, get_accepted, field = "id"),
    accepted_name = sapply(res, get_accepted, field = "fullNameStringPlain"),
    accepted_name_no_author = sapply(
      res,
      get_accepted,
      field = "fullNameStringNoAuthorsPlain"
    ),
    accepted_authors = sapply(res, get_accepted, field = "authorsString"),
    accepted_rank = sapply(res, get_accepted, field = "rank"),
    stringsAsFactors = FALSE
  )
  # TRUE when the candidate is itself the accepted name
  out$is_accepted <- !is.na(out$accepted_id) & out$id == out$accepted_id
  out
}
Next, create get_accepted_name(), a function that retrieves the accepted name from a scientific name.
The get_accepted_name() function takes a scientific name and rank, then calls get_wfo_suggestions() to retrieve candidates.
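The body of get_accepted_name() is not shown in this section, so the following is only a minimal sketch of one way it could work, not the original implementation. It assumes the data frame layout returned by get_wfo_suggestions() above. The helper pick_accepted_row(), the default rank = "species", the exact-match rule on name_no_author, and the case-insensitive rank comparison are all assumptions made for illustration.

```r
# Hypothetical helper: choose at most one row from the candidates data frame.
pick_accepted_row <- function(suggestions, name, rank = "species") {
  if (nrow(suggestions) == 0) {
    return(suggestions)
  }
  # Keep candidates whose author-less name equals the input ...
  same_name <- !is.na(suggestions$name_no_author) &
    suggestions$name_no_author == name
  # ... whose rank matches (compared case-insensitively) ...
  same_rank <- !is.na(suggestions$rank) &
    tolower(suggestions$rank) == tolower(rank)
  # ... and that have an accepted name at all.
  has_accepted <- !is.na(suggestions$accepted_id)
  hits <- suggestions[same_name & same_rank & has_accepted, , drop = FALSE]
  # Prefer rows where the candidate is itself the accepted name.
  hits <- hits[order(!hits$is_accepted), , drop = FALSE]
  utils::head(hits, 1)
}

# Sketch of the wrapper; it calls get_wfo_suggestions() defined above.
get_accepted_name <- function(name, rank = "species", limit = 10) {
  pick_accepted_row(get_wfo_suggestions(name, limit = limit), name, rank)
}
```

Keeping the row selection in a separate function means the matching rules can be checked against a hand-built data frame without calling the API.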
result <- get_wfo_suggestions("Quercus serrata")
print(result)
Use get_accepted_name() to retrieve the row that contains the accepted name for a scientific name.
accepted_name <- get_accepted_name("Quercus serrata")
print(accepted_name)
Notes on Using the API
The WFO Plant List API is publicly available, but sending a large number of requests in a short time may place load on the server. Searching a small number of names for learning or checking is fine, but when processing many scientific names at once, keep the following points in mind.
- Save or cache results so that the same name is not queried repeatedly.
- Add Sys.sleep() in loops when needed to avoid continuous access.
- For large datasets, consider using the published download data or a local workflow instead of trying to retrieve all data through the API.
- Do not accept candidates returned by the API without review; check rank, WFO ID, currentPreferredUsage, and accepted name.
- Because taxonNameSuggestion() is for candidate search, make the candidate-review rules explicit when doing strict bulk matching.
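The first point in the list above, caching, can be sketched with an environment that remembers results by name. make_cached_lookup() is a hypothetical helper, and fake_lookup() is a stand-in that only counts its calls. In practice you would pass get_accepted_name instead.

```r
# Hypothetical wrapper: remember results per name in an environment so that
# a repeated name does not trigger a second lookup.
make_cached_lookup <- function(lookup) {
  cache <- new.env(parent = emptyenv())
  function(name) {
    if (!exists(name, envir = cache, inherits = FALSE)) {
      assign(name, lookup(name), envir = cache)
    }
    get(name, envir = cache, inherits = FALSE)
  }
}

# Stand-in lookup that counts how often it is called (no network access)
n_calls <- 0
fake_lookup <- function(name) {
  n_calls <<- n_calls + 1
  toupper(name)
}

cached <- make_cached_lookup(fake_lookup)
cached("Quercus serrata")
cached("Quercus serrata")  # served from the cache
n_calls                    # 1: the stand-in was called only once
```

This cache lives only for the current R session. To reuse results across sessions, the collected results could also be written to a file with saveRDS().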
For example, when running a loop, you can add Sys.sleep() as follows to avoid continuous access.
sci_names <- c("Quercus serrata", "Quercus acutissima")
results <- list()
for (nm in sci_names) {
  results[[nm]] <- get_accepted_name(nm)
  Sys.sleep(0.5) # wait briefly to reduce server load
}
The same data served by the API is also published on Zenodo with a citable DOI. For research that processes many names or requires reproducibility, do not rely on the API alone; state which WFO Plant List release you used, and use or cite the data available on Zenodo.
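One simple way to keep that information with your results is to store it as extra columns. The sketch below is only an illustration: add_provenance() and the column names are hypothetical, and the release label and date are placeholders to be replaced with the release you actually used and the real retrieval date.

```r
# Hypothetical helper: attach provenance columns to a results data frame.
add_provenance <- function(results, release, retrieved = Sys.Date()) {
  results$wfo_release <- rep(release, nrow(results))
  results$retrieved_on <- rep(as.character(retrieved), nrow(results))
  results
}

# Stand-in results and placeholder values (not a real release label)
results <- data.frame(input = "Quercus serrata", stringsAsFactors = FALSE)
tagged <- add_provenance(
  results,
  release = "RELEASE-LABEL",
  retrieved = as.Date("2024-01-01")
)
print(tagged)
# write.csv(tagged, "accepted_names.csv", row.names = FALSE)
```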