OpenCitations Data Downloads

Access all data dumps from OpenCitations Meta and OpenCitations Index

Data available through Figshare and the Internet Archive

Meta

Bibliographic metadata for all publications in the OpenCitations Index

Latest: February 2025 Dump

Includes data on 121M+ bibliographic entities, 368M+ authors, and 698K+ publication venues

Index

OMID-to-OMID references representing all citations from multiple sources

Latest: March 2025 Dump

Includes data on 2.1 billion+ citations from various sources

OpenCitations Meta

The OpenCitations Meta database stores and delivers bibliographic metadata for all publications that appear in the OpenCitations Index.

Most recent OpenCitations Meta data dump - February 2025

Released on 2025-02-13, this dump extends the previous version with new data from the November 2024 Crossref dump and the November 2024 JaLC (Japan Link Center) dump.

Key Statistics

Bibliographic entities: 121M+
Authors: 368M+
Publication venues: 698K+

2,718,222 editors and 101,612,475 publishers (counted by roles, without disambiguating individuals)

Download Files

Type and format | Archive | Size
Metadata (CSV) | Download tar | 48G (12G zipped) on ext4
Metadata and provenance (RDF) | Download tar.gz | 145G (47G compressed) on ext4
Additional Files

Type and format | Archive | Size
A CSV dump mapping every OMID to its corresponding PID(s) (e.g., DOI, ORCID, PMID) | Download ZIP | 6.5 GB (1.5 GB zipped)
Previous dumps

Compared to the previous dump, this one adds the metadata contained in the Crossref dump dated March 2024.

Available in CSV (metadata) and JSON-LD (metadata and provenance) formats.

Compared to the previous dump, this one incorporates OpenAlex IDs, leveraging data from the OpenAlex dump.

Available in CSV (metadata) and RDF (metadata and provenance) formats.

Compared to the previous dump, this one adds metadata from the Japan Link Center (JaLC).

Available in CSV (metadata) format.

OpenCitations Index

The OpenCitations Index stores citations as OMID-to-OMID links, covering all the references gathered from several sources.

Most recent OpenCitations Index data dump - March 2025

Released on 2025-03-24, this dataset adds the citation data contained in the Crossref dump dated November 2024.

Key Statistics

Citations: 2.1 billion+ (2,155,497,918)

Download Files

Type and format | Archive | Size
Citation data (CSV) | Download ZIP | 220 GB (34.4 GB zipped)
Citation data (N-Triple) | Download ZIP | 1.9 TB (80.6 GB zipped)
Citation data (Scholix) | Download ZIP | 1.9 TB (40 GB zipped)
Provenance data (CSV) | Download ZIP | 410 GB (18 GB zipped)
Provenance data (N-Triple) | Download ZIP | 3.1 TB (95 GB zipped)
Additional Files

Type and format | Archive | Size
Citation data sources' info (N-Triple): the data source collection (e.g., COCI, DOCI, POCI) of all the citation data | Download ZIP | 388 GB (23.7 GB zipped)
Citation data sources' info (CSV): the data source collection (e.g., COCI, DOCI, POCI) of all the citation data | Download ZIP | 97 GB (21 GB zipped)
Citation count data (CSV): the number of incoming citations to each bibliographic entity (identified by an OMID) in OpenCitations Index | TBA
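
The incoming citation count described in the last row can also be derived from the citation data CSV by counting how often each OMID appears as the cited entity. The sketch below illustrates the idea on a tiny made-up sample. It assumes the citation CSV has a header row with `citing` and `cited` columns; those column names, the function name `count_incoming`, and the sample OMIDs are assumptions for illustration and should be checked against the actual files.

```python
import csv
import io
from collections import Counter


def count_incoming(csv_text, cited_field="cited"):
    """Count rows per cited entity in an open text stream of citation CSV data.

    The "cited" column name is an assumption about the CSV header.
    """
    counts = Counter()
    for row in csv.DictReader(csv_text):
        counts[row[cited_field]] += 1
    return counts


# Tiny made-up sample; the OMIDs are placeholders, not real records.
sample = io.StringIO(
    "citing,cited\n"
    "omid:br/1,omid:br/3\n"
    "omid:br/2,omid:br/3\n"
    "omid:br/1,omid:br/2\n"
)
counts = count_incoming(sample)
print(counts.most_common(1))  # [('omid:br/3', 2)]
```

An in-memory counter is fine for a sample like this, but at the scale of the full dump (over two billion citations) it would likely need to be replaced by an on-disk or database-backed aggregation.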
Previous dumps

Available in CSV (citation data), N-Triple (citation data), Scholix (citation data), CSV (provenance data), and N-Triple (provenance data).

Also includes an N-Triple dump with information about the data source collection, and a citation count dump with the number of incoming citations to each bibliographic entity (identified by an OMID).

Available in CSV (citation data), N-Triple (citation data), Scholix (citation data), CSV (provenance data), and N-Triple (provenance data).

Also includes an N-Triple dump with information about the data source collection, and a citation count dump with the number of incoming citations to each bibliographic entity (identified by an OMID).

Available in CSV (citation data), N-Triple (citation data), Scholix (citation data), CSV (provenance data), and N-Triple (provenance data).

Also includes an N-Triple dump with information about the data source collection.