the Creative Commons Attribution 4.0 License.
Making geoscientific lab data FAIR: A conceptual model for a geophysical laboratory database
Sven Nordsiek
Matthias Halisch
Abstract. The term geoscientific laboratory measurements covers a variety of methods in the geosciences. Accordingly, the resulting data comprise many different types, formats, and sizes. Handling such diverse data, e.g., by storing them in a generally applicable database, is difficult. Some discipline-specific approaches exist, but a laboratory database that is generally applicable across geoscientific disciplines has been missing so far. However, making research data available to scientists beyond a particular community has become increasingly important. Within a pilot project of the NFDI4Earth initiative, we developed a conceptual model for a geoscientific laboratory database. To handle the complex settings of geoscientific laboratory studies, flexibility and extensibility are key attributes of the presented approach. The model is intended to follow the FAIR data principles to facilitate interdisciplinary applicability. In this study, we consider different procedures from existing database models and include these methods in the conceptual model.
Status: open (extended)
RC1: 'Comment on gi-2023-9', Anonymous Referee #1, 20 Sep 2023
General Comments
This paper is well written and describes the creation of a conceptual model for a geophysical database. However, there already exist many papers in the literature, as well as international ISO/OGC/W3C standards, that define a higher-level abstract model. This paper needs to at least mention them, if not consider them and reframe the work in terms of these standards. There are also existing papers on the use of Globally Unique Persistent Resolvable Identifiers and PIDs that are relevant to this paper.
I feel that there needs to be a major revision of this paper, which puts the local context of the NFDI paper into this existing global framework of standards and vocabularies. I did not find many errors within the paper itself; the issue is rather the lack of acknowledgement of what already exists.
If the authors can frame their data model within these global contexts, then it will greatly accelerate FAIR compliance and machine-to-machine interaction of data stored within their proposed data system. I also predict it will lead to greater uptake of this paper and increase its scientific and technical significance and quality.
The abstract is clear (but needs to reference the international frameworks), and the writing and language are clear and precise, but there is just not enough referencing of existing work. Providing evidence of how this data model fits into existing higher-level ontologies and data models would increase my rankings of this paper substantially.
Specific Comments
Section 2.5, Line 96. I am very surprised that this section does not mention generic solutions (abstract models) that already exist for data models on this topic from ISO, OGC and W3C. These enable interdisciplinary integration of analytical and observational data and include:
- ISO 19156:2023 Geographic information — Observations, measurements and samples (https://www.iso.org/standard/82463.html) and the companion OGC/W3C Sensor, Observation, Sample, and Actuator (SOSA) ontology
- Janowicz, Krzysztof and Haller, Armin and Cox, Simon and Phuoc, Danh Le and Lefrancois, Maxime, SOSA: A Lightweight Ontology for Sensors, Observations, Samples, and Actuators (September 12, 2018). Available at SSRN: https://ssrn.com/abstract=3248499 or http://dx.doi.org/10.2139/ssrn.3248499
- Haller, Armin; Janowicz, Krzysztof; Cox, Simon; Lefrançois, Maxime; Taylor, Kerry; Le Phuoc, Danh; Lieberman, Joshua; García Castro, Raúl; Atkinson, Rob; Stadler, Claus. The Modular SSN Ontology: A Joint W3C and OGC Standard Specifying the Semantics of Sensors, Observations, Sampling, and Actuation. Semantic Web Journal. 2018; 10(1):9-32. https://doi.org/10.3233/SW-180320
- Magagna, B., Schindler, S., Stoica, M., Moncoiffe, G., Devaraju, A., Pamment, A. (2023). The I-ADOPT Framework Ontology (InteroperAble Descriptions of Observable Property Terminology). Retrieved from: https://w3id.org/iadopt/ont/1.0.0
Papers 1-3 should be referenced in this paper.
Section 3.3 Line 135: Units of Measure – the paper cites literature from medicine. It does quote Hall, B. D. and Kuster, M.: Representing quantities and units in digital systems, Measurement: Sensors, 23, 100387, doi:10.1016/j.measen.2022.100387, 2022, which mentions QUDT.
However, this issue is widely known. I feel that there are other papers that should be quoted, and perhaps also the CODATA task group on the Digital Representation of Units of Measure (DRUM), which is working internationally with the International Science Council and affiliated Science Unions to reach an agreed understanding and implementation of digital unit representation (see https://codata.org/initiatives/task-groups/drum/). This paper in particular could be cited:
Hanisch, R., Chalk, S., Coulon, R., Cox, S., Emmerson, S., Flamenco Sandoval, F. J., Forbes, A., Frey, J., Hall, B., Hartshorn, R., Heus, P., Hodson, S., Hosaka, K., Hutzschenreuter, D., Kang, C.-S., Picard, S., & White, R. (2022). Stop squandering data: Make units of measurement machine-readable. Nature, 605, 222–224. https://doi.org/10.1038/d41586-022-01233-w
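To illustrate the two preceding comments, the following is a minimal sketch (not part of the manuscript or of the referee's text) of how a single laboratory measurement could be expressed as a SOSA-style observation with a machine-readable QUDT unit. The `sosa:` and `qudt:` terms are taken from the published vocabularies; everything under the `ex:` prefix, the function name `make_observation`, and the sample values are hypothetical and chosen only for illustration.

```python
import json

# Prefixes for published vocabularies (SOSA, QUDT, XSD).
# The "ex:" namespace is a hypothetical placeholder for a lab's own identifiers.
CONTEXT = {
    "sosa": "http://www.w3.org/ns/sosa/",
    "qudt": "http://qudt.org/schema/qudt/",
    "unit": "http://qudt.org/vocab/unit/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "https://example.org/lab/",
}


def make_observation(obs_id, sample_id, property_id, sensor_id,
                     value, unit_id, result_time):
    """Build a JSON-LD-style dict describing one observation.

    All identifiers passed in are assumed to be local names that are
    resolved against the hypothetical "ex:" namespace (or "unit:" for units).
    """
    return {
        "@context": CONTEXT,
        "@id": f"ex:{obs_id}",
        "@type": "sosa:Observation",
        # The sample (feature of interest) the measurement was made on.
        "sosa:hasFeatureOfInterest": {"@id": f"ex:{sample_id}"},
        # The physical property that was observed.
        "sosa:observedProperty": {"@id": f"ex:{property_id}"},
        # The instrument that produced the result.
        "sosa:madeBySensor": {"@id": f"ex:{sensor_id}"},
        "sosa:resultTime": {"@value": result_time, "@type": "xsd:dateTime"},
        # Result with an explicit, machine-readable unit.
        "sosa:hasResult": {
            "@type": "qudt:QuantityValue",
            "qudt:numericValue": value,
            "qudt:unit": {"@id": f"unit:{unit_id}"},
        },
    }


# Hypothetical example: a porosity measurement on a rock sample.
observation = make_observation(
    obs_id="obs-0001",
    sample_id="sample-0042",
    property_id="porosity",
    sensor_id="pycnometer-01",
    value=18.5,
    unit_id="PERCENT",
    result_time="2023-09-20T10:00:00Z",
)

print(json.dumps(observation, indent=2))
```

In such a record the unit is an identifier rather than a free-text string, so software can interpret the value without parsing labels like "%" or "percent".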
Section 3.4 Line 141: Persistent Identifiers – To make activities in laboratories more machine readable and transparent, and to be able to trace the contributions of researchers, funders and institutions, this section should also include identifiers for funding, e.g., the grant identifier of Crossref (https://www.crossref.org/community/grants/) and RAiD (Research Activity Identifier, https://www.raid.org.au/). It should also include the recent Identifiers for Instruments proposal of the RDA Persistent Identifiers for Instruments Working Group (https://www.rd-alliance.org/node/57186/outputs) and their outputs:
- Krahl, R., Darroch, L., Huber, R., Devaraju, A., Klump, J., Habermann, T., Stocker, M., & The Research Data Alliance Persistent Identification of Instruments Working Group members. (2021). Metadata Schema for the Persistent Identification of Instruments. Research Data Alliance. https://doi.org/10.15497/RDA00070
- Stocker, M, et al. 2020. Persistent Identification of Instruments. Data Science Journal, 19: 18, pp. 1–12. DOI: https://doi.org/10.5334/dsj-2020-018
- McCafferty, Siobhann, Poger, David, Yvette, Wharton, Seal, Christopher, Burgess, Robin, & Kenna, Erin. (2023). Best Practices: PIDs for Instruments (1.0). Zenodo. https://doi.org/10.5281/zenodo.7759201. This is a white paper, and hence not necessarily citable, but it also covers the use of PIDs for instrument calibration data.
Section 4.1, line 166. I would recommend that this figure be mapped to the ISO 19156 or SOSA models, as these are widely accepted.
Section 4.2, Line 179: Conversion between community and database schemas. This paper should reference more of the existing international metadata schemas that could be used for this, including ISO 19115 (Geographic metadata), DataCite, and Schema.org. I would suggest that each term be checked closely against existing international standards. Many of these have internationally agreed and published vocabularies and definitions, e.g., Analytical Methods for Geochemistry and Cosmochemistry (https://vocabs.ardc.edu.au/viewById/650). There are many other relevant published vocabularies in Research Vocabularies Australia (https://vocabs.ardc.edu.au/), the NERC Vocab server (https://www.bodc.ac.uk/resources/products/web_services/vocab/) and other vocabulary services.
The FAIR principles are cited in line 268, and FAIR principle I2 of Wilkinson et al. (2016) states that all metadata and data use vocabularies that are themselves FAIR compliant and both machine and human actionable. For line 263, I would therefore recommend that the authors look into which community vocabularies are already published and reference these. Even if the published vocabularies do not cover all terms required by this paper, best community practice is to contact the authors of existing vocabularies and ask whether they would extend them.
Citation: https://doi.org/10.5194/gi-2023-9-RC1
AC1: 'Reply on RC1', Matthias Halisch, 21 Sep 2023
Dear reviewer #1,
many thanks for your valuable and constructive review of our manuscript. We agree with all of your points and will address them in an extensive major revision. Please note that, as far as our own expertise goes, topics such as FAIR data and RDM are not very prominent in geophysics in particular, and more or less do not exist for petrophysical laboratories. Hence we knew that we would not have all sources available. That said, we are delighted to check these new references and incorporate them into our paper.
With kind regards
M. Halisch
Citation: https://doi.org/10.5194/gi-2023-9-AC1
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
294 | 29 | 5 | 328 | 2 | 3 |