Version: 5.2.0

UniProtKB/Swiss-Prot

This section describes the mapping implemented to integrate metadata and links from UniProtKB/Swiss-Prot. The complete data dump "Reviewed (Swiss-Prot)" can be downloaded from here.

From this dataset, only the protein records linked to a PubMed publication are extracted.
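
The filtering step above can be sketched as follows. This is a minimal illustration, not the actual OpenAIRE implementation: it assumes the dump is read as UniProtKB text-format lines, that records are terminated by a `//` line, and that a PubMed link shows up as `PubMed=` on an RX line. The function names and the sample values (including the second accession and both identifiers) are made up for the example.

```python
from typing import Iterable, Iterator, List


def iter_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group dump lines into records, splitting on the '//' terminator line."""
    current: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("//"):
            if current:
                yield current
            current = []
        else:
            current.append(line)
    if current:  # tolerate a final record that lacks a terminator
        yield current


def has_pubmed_reference(record: List[str]) -> bool:
    """Return True when at least one RX line carries a PubMed identifier."""
    return any(line.startswith("RX") and "PubMed=" in line for line in record)


# Made-up two-record sample; only the first record cites PubMed.
SAMPLE_DUMP = """\
AC   A0A0C5B5G6;
RX   PubMed=12345678; DOI=10.1000/example;
//
AC   X00000;
RX   DOI=10.1000/other-example;
//
"""

kept = [r for r in iter_records(SAMPLE_DUMP.splitlines()) if has_pubmed_reference(r)]
```

Streaming the records one at a time keeps memory use flat, which matters for a dump of this size.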

Entity Mapping

The table below describes the mapping from the TEXT metadata format to the OpenAIRE Graph dump format. You can check an example of the text metadata here.

| OpenAIRE Result field path | Text record line | Notes |
|---|---|---|
| BIOEntity Mapping | | |
| id | Line starting with AC | id in the form uniprot_____::md5(id) |
| pid | Line starting with AC | classid = classname = uniprot; the value is the text after AC (example: AC A0A0C5B5G6) |
| publicationdate | Line starting with DT and containing the text "integrated into UniProtKB/Swiss-Prot" | Clean and normalize the date to the format YYYY-MM-DD |
| maintitle | Line starting with GN | Main title |
| Instance Mapping | | |
| instance.type | | Bioentity |
| type | | Dataset |
| instance.pid | Line starting with AC | classid = classname = uniprot |
| instance.url | pid | Prepend https://www.uniprot.org/uniprot/ to the value |
| instance.publicationdate | Line starting with DT and containing the text "integrated into UniProtKB/Swiss-Prot" | Clean and normalize the date to the format YYYY-MM-DD |
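
The entity mapping can be illustrated with a short sketch. It is an approximation under stated assumptions, not the production code: it assumes each line has a two-letter code followed by three spaces, that only the first accession on the AC line is used, that the MD5 is computed over the UTF-8 accession string, and that the GN text is taken as-is for the title. The output dictionary layout, the function names, and the sample date and gene name are hypothetical.

```python
import hashlib
from typing import Dict, List

MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}

ID_PREFIX = "uniprot_____::"
URL_PREFIX = "https://www.uniprot.org/uniprot/"
INTEGRATED_MARKER = "integrated into UniProtKB/Swiss-Prot"


def normalize_date(raw: str) -> str:
    """Turn a date such as 15-MAR-2017 into 2017-03-15."""
    day, month, year = raw.strip().split("-")
    return f"{int(year):04d}-{MONTHS[month.upper()]:02d}-{int(day):02d}"


def map_entity(record: List[str]) -> Dict:
    """Build a simplified entity from the AC, DT and GN lines of one record."""
    entity: Dict = {"type": "Dataset", "instance": {"type": "Bioentity"}}
    for line in record:
        code, value = line[:2], line[5:].strip()
        if code == "AC" and "id" not in entity:
            # Assumption: the first accession on the line is the identifier.
            accession = value.split(";")[0].strip()
            pid = {"classid": "uniprot", "classname": "uniprot", "value": accession}
            digest = hashlib.md5(accession.encode("utf-8")).hexdigest()
            entity["id"] = ID_PREFIX + digest
            entity["pid"] = pid
            entity["instance"]["pid"] = pid
            entity["instance"]["url"] = URL_PREFIX + accession
        elif code == "DT" and INTEGRATED_MARKER in value:
            date = normalize_date(value.split(",")[0])
            entity["publicationdate"] = date
            entity["instance"]["publicationdate"] = date
        elif code == "GN" and "maintitle" not in entity:
            entity["maintitle"] = value
    return entity


# Illustrative record; the date and gene name are made up.
SAMPLE_RECORD = [
    "AC   A0A0C5B5G6;",
    "DT   15-MAR-2017, integrated into UniProtKB/Swiss-Prot.",
    "DT   15-MAR-2017, sequence version 1.",
    "GN   Name=EXAMPLE;",
]

entity = map_entity(SAMPLE_RECORD)
```

An explicit month table is used instead of `datetime.strptime` so that parsing does not depend on the process locale.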

Relation Mapping

| OpenAIRE Relation Semantic and inverse | Source/Target type | Notes |
|---|---|---|
| IsRelatedTo | Line starting with RX | The mapping creates relationships between the BioEntity and the PubMed identifier or DOI, generating an unresolved target identifier |
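
A rough sketch of the relation mapping follows. It assumes that RX lines list cross-references as `Key=value;` pairs and that only the `PubMed` and `DOI` keys are of interest. The shape of the relation dictionary, including how an unresolved target is represented, is hypothetical and chosen only for illustration; the sample identifiers are made up.

```python
import re
from typing import Dict, List

# Matches PubMed=...; and DOI=...; items on an RX line.
RX_ITEM = re.compile(r"(PubMed|DOI)=([^;\s]+);")


def map_relations(source_id: str, record: List[str]) -> List[Dict]:
    """Create one IsRelatedTo relation per PubMed or DOI reference found."""
    relations: List[Dict] = []
    for line in record:
        if not line.startswith("RX"):
            continue
        for pid_type, value in RX_ITEM.findall(line):
            relations.append({
                "source": source_id,
                "relClass": "IsRelatedTo",
                # Hypothetical layout: the target is left unresolved and is
                # described only by its identifier type and value.
                "target": {"pidType": pid_type, "value": value, "resolved": False},
            })
    return relations


# Made-up identifiers, for illustration only.
SAMPLE_LINES = [
    "AC   A0A0C5B5G6;",
    "RX   PubMed=12345678; DOI=10.1000/example;",
]

relations = map_relations("uniprot_____::example", SAMPLE_LINES)
```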