Version: 10.8.2

Cloud access

The OpenAIRE Graph is made available publicly as a dataset on Google Cloud and can be accessed and queried from Google BigQuery.

How is the data structured?

The dataset on Google Cloud is based on the latest available full graph dataset on Zenodo, in this case v9.0.1. We do our best to keep the two aligned: whenever a new dataset is released on Zenodo, a corresponding dataset on Google Cloud follows.

Attention

Please be aware that the tables and fields you will find on Google Cloud follow the data model documentation v9.0.1.

The dataset can be queried from BigQuery using the dataset ID openaire-graph.oag_v9_0_1 followed by a table name (e.g., openaire-graph.oag_v9_0_1.publications), provided you already have a project on Google Cloud (if not, see the section below to get started).
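
As a minimal sketch, a fully qualified table reference can be assembled from the dataset ID and a table name. The helper name `table_ref` is hypothetical; only the dataset ID and the `publications` table name come from this page. The backticks are a conservative choice for a project ID that contains a hyphen.

```python
DATASET_ID = "openaire-graph.oag_v9_0_1"  # dataset ID given on this page


def table_ref(table: str) -> str:
    """Return a backtick-quoted, fully qualified BigQuery table reference."""
    # Reject anything that is not a plain identifier, so the value is safe
    # to interpolate into a query string.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"`{DATASET_ID}.{table}`"


print(table_ref("publications"))
```

The returned string can be placed directly after FROM in a query.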

For each entity type present in the dataset on Zenodo (e.g., publication, dataset, project, relation), you can find a corresponding table on BigQuery. Each table has its own columns, and each field in the data model documentation has a "twin" column on Google Cloud. To simplify queries and make them run more efficiently and at a lower cost, structured properties of the data model have been flattened in advance wherever possible. For example, grant information, which is structured in three fields (i.e., currency, fundedAmount, totalCost), maps to three separate columns with the same names on BigQuery.
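
The flattening described above can be pictured with a small Python sketch. It only illustrates the idea and is not the pipeline used to build the BigQuery tables: the function name `flatten_grant`, the nested key `granted`, and the sample values are all assumptions made for the example. Only the three field names (currency, fundedAmount, totalCost) come from the text.

```python
def flatten_grant(record: dict, key: str = "granted") -> dict:
    """Lift the fields nested under `key` to top-level columns."""
    # Keep every field except the nested structure itself...
    flat = {k: v for k, v in record.items() if k != key}
    # ...then add its inner fields as columns of their own.
    flat.update(record.get(key) or {})
    return flat


# Hypothetical project record; the values are made up for illustration.
project = {
    "id": "example-project",
    "granted": {"currency": "EUR", "fundedAmount": 100000.0, "totalCost": 150000.0},
}

print(flatten_grant(project))
```

With flat columns, a query can select or filter on a single grant field directly instead of unpacking a nested structure first.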

Getting started on Google Cloud

The dataset is hosted on Google Cloud, and OpenAIRE AMKE covers the data storage costs to make it publicly available; however, the costs of data processing (e.g., filtering and aggregating the data) are charged to each user's own Google Cloud project.

Therefore, to access the OpenAIRE Graph on Google Cloud, you need to set up your own Google Cloud environment. But fear not! Although Google Cloud can seem daunting at first (and in some respects it is), a Google account is all you need to get started!

Pricing

Google grants 300 USD in credit to first-time users. As a rule of thumb, a query that processes 1 TB of data costs approximately 6 USD, so the credit leaves ample room for experimenting with the dataset. More information here. Google also runs a scheme that grants credits for research on Google Cloud.
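
The rule of thumb above translates into a quick back-of-the-envelope estimate. This is a rough sketch, not a billing calculator: the rate of 6 USD per TB is the approximate figure quoted on this page, the decimal definition of a terabyte is an assumption, and the function name `estimate_cost_usd` is hypothetical. Check the official pricing page for current rates.

```python
USD_PER_TB = 6.0       # approximate rule-of-thumb rate quoted on this page
BYTES_PER_TB = 10**12  # assumption: decimal terabyte


def estimate_cost_usd(bytes_processed: int, usd_per_tb: float = USD_PER_TB) -> float:
    """Estimate the processing cost of a query from the bytes it reads."""
    return bytes_processed / BYTES_PER_TB * usd_per_tb


# A query reading 250 GB of data, for illustration.
print(round(estimate_cost_usd(250 * 10**9), 2))
```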

First of all, you will need to set up a project on Google Cloud. Then head to BigQuery Studio, open a new query, and copy and paste the following inexpensive query (it reads only a few KB of data, just as a test).

SELECT *
FROM openaire-graph.oag_v9_0_1.communities

Once it has run, you should see several rows appearing right below the query panel. You should also be able to explore the tables and their fields on BigQuery Studio from this link.
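
The same test query can also be checked from Python before running it. The sketch below assumes the `google-cloud-bigquery` client library is installed and that credentials for your own Google Cloud project are already configured. The function name `dry_run_bytes` and the project ID passed to it are placeholders. A dry run asks BigQuery how many bytes a query would process without executing it.

```python
TEST_QUERY = "SELECT * FROM `openaire-graph.oag_v9_0_1.communities`"


def dry_run_bytes(sql: str, billing_project: str) -> int:
    """Return the number of bytes the query would process, without running it."""
    # Imported lazily so this module still loads where the library is absent.
    from google.cloud import bigquery

    # The client is bound to your own project, which is the one that is billed.
    client = bigquery.Client(project=billing_project)
    # dry_run=True validates the query and reports its size only.
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

Calling `dry_run_bytes(TEST_QUERY, "my-project")`, with `my-project` replaced by your own project ID, should report a small number of bytes for the communities table, in line with the "few KB" mentioned above.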

If you reached this point without problems, congratulations! Everything is set up and you can proceed with the following sections! If instead you ran into any trouble, please contact the helpdesk.

Training and materials

This dataset, and how to access and query it from BigQuery, has been previewed on a few occasions where participants had the opportunity to explore it free-of-charge (with processing costs covered by OpenAIRE AMKE). Stay tuned for more events like these by joining our User Forum and following our OpenAIRE Graph social channels (X, Bluesky, Mastodon)!

In June 2025, Dr. Andrea Mannocci hosted a tutorial session at the 20th International Conference on Scientometrics and Informetrics (ISSI 2025), held in Yerevan, Armenia. Here you can find the slides presented on the day. The notebooks and the queries showcased in the tutorial are available on GitHub here.

In October 2024, Dr. Andrea Mannocci and Dr. Alysson Fernandes Mazoni organised an online training focused on using OpenAIRE in scientometric research. A video recording of the day is available on YouTube, as are the slides. Please note, however, that the data structure has since been modified. The queries for the exercises have therefore been updated to reflect the new structure, as seen in the slides from the ISSI 2025 session mentioned above.

For a comprehensive reference on BigQuery syntax, please head to the official reference guide.

Use-case packages

Based on the experience gained from OpenAIRE MONITOR, we created a couple of packages that reproduce a selection of the dashboard visualisations as queries rather than plots and bar charts. Here you can find:

Package for Research Funding Organisations (RFOs), based on the EC OpenAIRE MONITOR instance;
Package for Research Performing Organisations (RPOs), based on the University of Minho OpenAIRE MONITOR instance.
