Journal article details
Journal of Biomedical Semantics
PAV ontology: provenance, authoring and versioning
Tim Clark [1], Carole Goble [1], Alasdair JG Gray [1], Khalid Belhajjame [1], Stian Soiland-Reyes [1], Paolo Ciccarese [2]
[1] School of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK
[2] Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
Keywords: Attribution; Semantic web; Annotation; Versioning; Authoring; Provenance
DOI  :  10.1186/2041-1480-4-37
Received 2013-04-26, accepted 2013-10-07, published 2013
【 Abstract 】

Background

Provenance is a critical ingredient for establishing trust in published scientific content. This is true whether we are considering a data set, a computational workflow, a peer-reviewed publication or a simple scientific claim with supporting evidence. Existing vocabularies such as Dublin Core Terms (DC Terms) and the W3C Provenance Ontology (PROV-O) are domain-independent and general-purpose, and they allow and encourage extensions to cover more specific needs. In particular, for tracking authoring and versioning information of web resources, PROV-O provides a basic methodology but no specific classes and properties for identifying or distinguishing between the various roles assumed by agents manipulating digital artifacts, such as author, contributor and curator.

Results

We present the Provenance, Authoring and Versioning ontology (PAV, namespace http://purl.org/pav/): a lightweight ontology for capturing “just enough” descriptions essential for tracking the provenance, authoring and versioning of web resources. We argue that such descriptions are essential for digital scientific content. PAV distinguishes between contributors, authors and curators of content and creators of representations, in addition to the provenance of originating resources that have been accessed, transformed and consumed. We explore five projects (and communities) that have adopted PAV, illustrating their usage through concrete examples. Moreover, we present mappings that show how PAV extends the W3C PROV-O ontology to support broader interoperability.
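
As a rough illustration of the role distinctions described above, the following sketch writes a handful of PAV statements about one web resource as Turtle. It is not taken from the paper: the `example.org` IRIs, the version string and the helper `pav_turtle` are hypothetical, and the property names (`authoredBy`, `curatedBy`, `contributedBy`, `createdBy`, `retrievedFrom`, `version`) are PAV terms as we understand them, which should be checked against the ontology at http://purl.org/pav/.

```python
# Minimal sketch: emit PAV statements for one web resource as Turtle.
# All example.org IRIs and the literal values below are hypothetical.

PAV_NS = "http://purl.org/pav/"  # namespace given in the abstract


def pav_turtle(resource, statements):
    """Serialize (pav_term, value) pairs about `resource` as Turtle.

    Values starting with "http" are written as IRIs; everything else is
    written as a plain string literal (no escaping is attempted here).
    """
    lines = [f"@prefix pav: <{PAV_NS}> .", "", f"<{resource}>"]
    body = []
    for term, value in statements:
        obj = f"<{value}>" if value.startswith("http") else f'"{value}"'
        body.append(f"    pav:{term} {obj}")
    # Predicate-object pairs are separated by ';' and closed with '.'
    lines.append(" ;\n".join(body) + " .")
    return "\n".join(lines)


turtle = pav_turtle(
    "http://example.org/dataset/v2",
    [
        # Roles concerning the content itself
        ("authoredBy", "http://example.org/people/alice"),
        ("curatedBy", "http://example.org/people/bob"),
        ("contributedBy", "http://example.org/people/carol"),
        # Agent that created this particular digital representation
        ("createdBy", "http://example.org/people/dave"),
        # Provenance of the originating resource, and versioning
        ("retrievedFrom", "http://example.org/source/records"),
        ("version", "2.0"),
    ],
)
print(turtle)
```

Keeping the author, curator and contributor of the content separate from the creator of the representation is the point of the sketch; a real application would more likely use an RDF library than string formatting.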

Method

The initial design of the PAV ontology was driven by requirements from the AlzSWAN project, with further requirements incorporated later from the other projects detailed in this paper. The authors strove to keep PAV lightweight and compact by including only those terms that have proven pragmatically useful in existing applications, and by recommending terms from existing ontologies where appropriate.

Discussion

We analyze and compare PAV with related approaches, namely Provenance Vocabulary (PRV), DC Terms and BIBFRAME. We identify similarities and analyze differences between those vocabularies and PAV, outlining strengths and weaknesses of our proposed model. We specify SKOS mappings that align PAV with DC Terms. We conclude the paper with general remarks on the applicability of PAV.

【 License 】

   
2013 Ciccarese et al.; licensee BioMed Central Ltd.

