OpenAlex Funding Data Inconsistency

For roughly a quarter of funders, OpenAlex reports a different publication count in the funders object than in the works object.
bibliometrics
OpenAlex
funding data
Author: Nikita Sorgatz
Published: December 5, 2024

During my work for the KBOPENBIB project, I came across an inconsistency in OpenAlex funding data that might be of interest to the wider community: the number of works per funder doesn’t always match between the works and funders objects.

How to get the relevant data from OpenAlex’s API using openalexR
# load the openalexR client
library(openalexR)

# query the list of all funders, including the works_count field
funders <- oa_request(query_url = "https://api.openalex.org/funders")
# convert the raw API response into a data frame
df <- oa2df(funders, entity = "funders")

# query works, grouped by the funder named in their grants
df2 <- oa_fetch(entity = "works", group_by = "grants.funder")

You would expect the number in the works_count field of the funder object to match the count you get by counting unique work IDs per funder ID in the works table.
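The “manual” count described above can be sketched in base R. This is only an illustration on invented toy data: the long table `grants_long` and its column names `work_id` and `funder_id` are placeholders, not the structure openalexR returns.

```r
# Hypothetical long table: one row per (work, funder) pair.
# Column names and ids are invented for illustration.
grants_long <- data.frame(
  work_id   = c("W1", "W1", "W2", "W3", "W3", "W3"),
  funder_id = c("F1", "F1", "F1", "F2", "F2", "F1")
)

# Drop duplicate (work, funder) pairs so each work counts once per funder
pairs <- unique(grants_long)

# Count the remaining works per funder
manual_count <- aggregate(work_id ~ funder_id, data = pairs, FUN = length)
names(manual_count)[2] <- "manual_count"

print(manual_count)
```

Deduplicating before counting matters whenever a work lists the same funder more than once, for example through several grants.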

Agreement    Number of funders    Share of funders
✖            8025                 24.74%
✔️            24412                75.26%
Table 1: Agreement of funded publications per funder between works_count and “manual” count.
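A comparison like the one summarized in Table 1 could be computed roughly as follows. The sketch runs on small invented data frames rather than the real API results, and it assumes that the funders table has the columns `id` and `works_count` and that the grouped works table has the columns `key` and `count`; check the column names in your own results before reusing it.

```r
# Toy stand-ins for the two query results; all values are invented.
# Assumed columns: `id`, `works_count` (funders) and `key`, `count` (grouped works).
funders_df <- data.frame(
  id = c("F1", "F2", "F3"),
  works_count = c(120, 45, 7)
)
grouped_df <- data.frame(
  key = c("F1", "F2", "F3"),
  count = c(120, 46, 7)
)

# Join on funder id; funders absent from the grouped table get NA counts
cmp <- merge(funders_df, grouped_df, by.x = "id", by.y = "key", all.x = TRUE)

# Positive diff: the manual count finds works that works_count lacks
cmp$diff <- cmp$count - cmp$works_count
cmp$agree <- !is.na(cmp$diff) & cmp$diff == 0

# Number and percentage share of agreeing and disagreeing funders
agreement <- table(cmp$agree)
share <- round(100 * prop.table(agreement), 2)

print(agreement)
print(share)
```

Funders without a match in the grouped table are treated as disagreeing here; whether that is the right choice depends on how missing funders should be interpreted.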

For roughly a quarter of funders, the publication counts do not match. The mean difference between the manual count and works_count is 10.26, so the works_count field is missing about 10 publications per funder on average. However, this figure is driven by a single outlier: if we remove the National Natural Science Foundation of China, which alone has 343,744 missing publications, the mean drops to -0.34, i.e. works_count is then slightly higher than the manual count on average.
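The effect of this single outlier can be checked with a few lines of arithmetic. The sketch assumes that the reported mean is taken over all 32,437 funders from Table 1 (8025 + 24412) and that the difference is computed as manual count minus works_count; both are inferences from the reported figures, not stated in the text.

```r
n_funders <- 8025 + 24412   # all funders in Table 1 (assumed denominator)
mean_all  <- 10.26          # reported mean difference over all funders
nsfc_diff <- 343744         # reported difference for the NSFC alone

# Recover the summed difference, remove the outlier, and average again
total_diff <- mean_all * n_funders
mean_without_nsfc <- (total_diff - nsfc_diff) / (n_funders - 1)

print(round(mean_without_nsfc, 2))  # -0.34
```

Under these assumptions the result reproduces the reported -0.34, which suggests that one funder accounts for the entire positive mean.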

Table 2: Difference in count methods

Looking at the funders with diverging publication counts in Table 2, we see that for the majority of them the difference is only one publication.

I hope that this problem description contributes to the continuous improvement of OpenAlex. Until this inconsistency is addressed, I recommend “manually” counting work IDs per funder instead of relying on the works_count field.

Citation

BibTeX citation:
@online{sorgatz2024,
  author = {Sorgatz, Nikita},
  title = {OpenAlex {Funding} {Data} {Inconsistency}},
  date = {2024-12-05},
  url = {https://social.construction/posts/first post.html},
  langid = {en}
}
For attribution, please cite this work as:
Sorgatz, Nikita. 2024. “OpenAlex Funding Data Inconsistency.” December 5, 2024. https://social.construction/posts/first post.html.