Version: 5.1.0

EMBL-EBI's Protein Data Bank in Europe

This section describes the mapping implemented for EMBL-EBI's Protein Data Bank in Europe.

The Europe PMC RESTful Web Service provides the datalinks API, which returns data-literature links in Scholix format.

How the data is collected

Starting from the PubMed collection, the API below is queried with each PubMed identifier to obtain the bioentities related to that publication.

Example:

curl -s "https://www.ebi.ac.uk/europepmc/webservices/rest/MED/33024307/datalinks?format=json" | jq '.'
{
  "version": "6.8",
  "hitCount": 9,
  "request": {
    "id": "33024307",
    "source": "MED"
  },
  "dataLinkList": {
    "Category": [
      {
        "Name": "Nucleotide Sequences",
        "CategoryLinkCount": 5,
        "Section": [
          {
            "ObtainedBy": "tm_accession",
            "Tags": [
              "supporting_data"
            ],
            "SectionLinkCount": 5,
            "Linklist": {
              "Link": [
                {
                  "ObtainedBy": "tm_accession",
                  "PublicationDate": "04-11-2022",
                  "LinkProvider": {
                    "Name": "Europe PMC"
                  },
                  "RelationshipType": {
                    "Name": "References"
                  },
                  "Source": {
                    "Type": {
                      "Name": "literature"
                    },
                    "Identifier": {
                      "ID": "33024307",
                      "IDScheme": "MED"
                    }
                  },
                  "Target": {
                    "Type": {
                      "Name": "dataset"
                    },
                    "Identifier": {
                      "ID": "AY278488",
                      "IDScheme": "ENA",
                      "IDURL": "http://identifiers.org/ebi/ena.embl:AY278488"
                    },
                    "Title": "AY278488",
                    "Publisher": {
                      "Name": "Europe PMC"
                    }
                  },
                  [...]
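
As a rough illustration of how the response above could be consumed, the following Python sketch builds the request URL for a PubMed identifier and walks the nested Category / Section / Linklist structure to yield each link. The function names (`datalinks_url`, `fetch_datalinks`, `iter_links`) and the trimmed `SAMPLE_RESPONSE` are hypothetical and only mirror the example shown; this is not the code used by the aggregation workflow.

```python
import json
import urllib.request
from typing import Any, Dict, Iterator

# Base URL taken from the curl example above.
BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest"


def datalinks_url(pmid: str) -> str:
    """Build the datalinks URL for one PubMed identifier (source MED)."""
    return f"{BASE_URL}/MED/{pmid}/datalinks?format=json"


def fetch_datalinks(pmid: str) -> Dict[str, Any]:
    """Download and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(datalinks_url(pmid)) as response:
        return json.load(response)


def iter_links(response: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every Link object found under dataLinkList/Category/Section."""
    categories = response.get("dataLinkList", {}).get("Category", [])
    for category in categories:
        for section in category.get("Section", []):
            for link in section.get("Linklist", {}).get("Link", []):
                yield link


# Trimmed copy of the example response, kept only to exercise iter_links.
SAMPLE_RESPONSE = {
    "dataLinkList": {
        "Category": [
            {
                "Name": "Nucleotide Sequences",
                "Section": [
                    {
                        "Linklist": {
                            "Link": [
                                {
                                    "PublicationDate": "04-11-2022",
                                    "Target": {
                                        "Identifier": {
                                            "ID": "AY278488",
                                            "IDScheme": "ENA",
                                        },
                                        "Title": "AY278488",
                                    },
                                }
                            ]
                        }
                    }
                ],
            }
        ]
    }
}

if __name__ == "__main__":
    for link in iter_links(SAMPLE_RESPONSE):
        print(link["Target"]["Identifier"]["ID"])
```

Keeping the traversal separate from the download means the parsing can be checked against a saved response without calling the service.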

Mapping

The table below describes the mapping from the EBI links records to the OpenAIRE Graph dump format. We select the target links whose pid type is ena, pdb or uniprot. For each selected target, we construct a Bioentity with the following mapping.

| OpenAIRE Result field path | EBI record field xpath | Notes |
|---|---|---|
| id | target/identifier/ID and target/identifier/IDScheme | id in the form SCHEMA_________::md5(pid) |
| pid | target/identifier/ID and target/identifier/IDScheme | classid = classname = schema |
| publicationdate | target/PublicationDate | clean and normalize the format of the date to be YYYY-mm-dd |
| maintitle | target/Title | |
| Instance Mapping | | |
| instance.type | | Bioentity |
| type | | Dataset |
| instance.pid | target/identifier/ID and target/identifier/IDScheme | classid = classname = schema |
| instance.url | target/identifier/IDURL | Copy the value as it is |
| instance.publicationdate | //PubmedPubDate | clean and normalize the format of the date to be YYYY-mm-dd |
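
The mapping above can be sketched in Python as follows. Several details are assumptions made for illustration and are not confirmed by this page: the schema prefix is lowercased and padded with underscores to 12 characters, the md5 is computed over the UTF-8 bytes of the pid value, the input date is in DD-MM-YYYY form (as in the example response, where PublicationDate appears at link level), and the output dictionary keys simply reuse the field paths of the table. The `instance.publicationdate` field, which comes from the PubMed record, is left out. All function names are hypothetical.

```python
import hashlib
from datetime import datetime
from typing import Any, Dict, Optional

# Pid types retained by the mapping (compared in lower case: assumption).
ACCEPTED_SCHEMES = {"ena", "pdb", "uniprot"}


def normalize_date(raw: Optional[str], input_format: str = "%d-%m-%Y") -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if it cannot be parsed.

    The DD-MM-YYYY input format is an assumption based on the example.
    """
    try:
        return datetime.strptime(raw, input_format).strftime("%Y-%m-%d")
    except (TypeError, ValueError):
        return None


def make_id(scheme: str, pid: str, prefix_width: int = 12) -> str:
    """Build an identifier of the form SCHEMA___::md5(pid).

    The padding width of 12 is an assumption, not stated in the table.
    """
    prefix = scheme.lower().ljust(prefix_width, "_")
    digest = hashlib.md5(pid.encode("utf-8")).hexdigest()
    return f"{prefix}::{digest}"


def link_to_bioentity(link: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map one link to a Bioentity record, or None if it is filtered out."""
    target = link.get("Target", {})
    identifier = target.get("Identifier", {})
    scheme = identifier.get("IDScheme", "").lower()
    pid = identifier.get("ID")
    if scheme not in ACCEPTED_SCHEMES or not pid:
        return None
    # classid = classname = schema, as stated in the Notes column.
    pid_struct = {"value": pid, "classid": scheme, "classname": scheme}
    return {
        "id": make_id(scheme, pid),
        "pid": [pid_struct],
        "publicationdate": normalize_date(link.get("PublicationDate")),
        "maintitle": target.get("Title"),
        "type": "Dataset",
        "instance": [
            {
                "type": "Bioentity",
                "pid": [pid_struct],
                "url": identifier.get("IDURL"),  # copied as it is
            }
        ],
    }
```

Returning None for links that are not retained lets a caller drop them with a simple filter while iterating over the response.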

Relation Mapping

| OpenAIRE Relation Semantic and inverse | Source/Target type | Notes |
|---|---|---|
| IsRelatedTo | result/result | We create relationships between the BioEntity and the PubMed publication |
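
A minimal sketch of the relation step, assuming that the relation is emitted once in each direction with the same IsRelatedTo semantic (the table lists no distinct inverse name). The `make_relations` function and the dictionary keys `source`, `target`, `sourceType`, `targetType` and `relType` are hypothetical and do not reflect the actual dump schema.

```python
from typing import Dict, List


def make_relations(bioentity_id: str, publication_id: str) -> List[Dict[str, str]]:
    """Return the IsRelatedTo relation and its inverse between two results."""

    def relation(source: str, target: str) -> Dict[str, str]:
        return {
            "source": source,
            "target": target,
            "sourceType": "result",
            "targetType": "result",
            "relType": "IsRelatedTo",
        }

    return [
        relation(bioentity_id, publication_id),
        relation(publication_id, bioentity_id),
    ]
```

Both identifiers are expected to be OpenAIRE result identifiers that were already computed for the Bioentity and for the PubMed publication.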