<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
Donald,</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
I've been working on CDIF metadata for data bundles in the NASA <a title="https://astromat.org/" id="LPlnk938968" href="https://astromat.org/">
Astromaterials data system</a> (analogous to the bundles in the NASA Planetary Data System or RO-Crate). The only downloadable artifact is a zipped set of files; inside the bundle are data files, supplementary image and text files, and YAML metadata (very
thin content) for each of these. The bundle is a schema:Dataset, and it has a distribution that is a DataDownload for the zip archive, which in turn lists the individual files via hasPart, each represented as a schema:Dataset, DigitalDocument, or ImageObject as appropriate. Variables
described in the bundle are documented in a general way using schema:variableMeasured in the top-level dataset object (following the CDIF discovery profile). Details about the data structure and representation of each variable are in the Dataset object describing the file
that contains that information (linked back to the appropriate variableMeasured/PropertyValue). The structure metadata uses some DDI-CDI elements (this is exploring how the CDIF data integration profile will be implemented).</div>
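<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
To make the nesting concrete, here is a minimal Python sketch that assembles a record with that shape. It is only an illustration, not the attached example: the function name build_bundle_record, the example.org identifiers, the file names and the variable name are all hypothetical, and pointing a file-level Dataset back at the bundle-level PropertyValue by @id is an assumption about one way the link could be expressed. The DDI-CDI structure elements are left out.</div>
<pre>
```python
import json


def build_bundle_record(bundle_id, zip_url, files, variable_names):
    """Return a schema.org JSON-LD dict for one bundle (illustrative sketch).

    files: list of (file_name, schema_type) pairs, e.g. ("data.csv", "Dataset").
    variable_names: names of the variables documented at the bundle level.
    """
    # Bundle-level variables; each gets an @id so file-level metadata can point back.
    variables = [
        {"@type": "PropertyValue", "@id": f"{bundle_id}#var-{name}", "name": name}
        for name in variable_names
    ]

    parts = []
    for file_name, schema_type in files:
        part = {
            "@type": schema_type,
            "@id": f"{bundle_id}#{file_name}",
            "name": file_name,
        }
        if schema_type == "Dataset":
            # Assumption: a data file refers back to the bundle-level variables by @id.
            part["variableMeasured"] = [{"@id": v["@id"]} for v in variables]
        parts.append(part)

    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": bundle_id,
        "variableMeasured": variables,
        "distribution": {
            "@type": "DataDownload",
            "contentUrl": zip_url,
            "encodingFormat": "application/zip",
            # The files inside the zip archive are listed as parts of the download.
            "hasPart": parts,
        },
    }


if __name__ == "__main__":
    # Hypothetical bundle contents, for illustration only.
    record = build_bundle_record(
        "https://example.org/bundle/1",
        "https://example.org/bundle/1.zip",
        [("data.csv", "Dataset"), ("photo.png", "ImageObject"), ("readme.txt", "DigitalDocument")],
        ["SiO2"],
    )
    print(json.dumps(record, indent=2))
```
</pre>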
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
The CDIF schema.org record is meant to support discovery and high-level assessment of the bundle, aggregating information from the various file-level metadata in the bundle. </div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
Example attached. This is a test draft and a work in progress; comments invited!</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
Steve</div>
<div class="elementToProof" id="Signature">
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div class="elementToProof">Stephen M. Richard</div>
<div class="elementToProof">US Geoscience Information Network (USGIN)</div>
<div class="elementToProof">smrTucson@gmail.com</div>
<div class="elementToProof">520-869-8545</div>
</div>
<div><br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<hr style="display: inline-block; width: 98%;">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<b>From:</b> cdif-community &lt;cdif-community-bounces@lists.codata.org&gt; on behalf of Donald Hobern &lt;donald.hobern@adelaide.edu.au&gt;<br>
<b>Sent:</b> Monday, October 13, 2025 5:16 PM<br>
<b>To:</b> cdif-community@lists.codata.org &lt;cdif-community@lists.codata.org&gt;<br>
<b>Subject:</b> [cdif-community] Definition of schema:Dataset </div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="direction: ltr;">I'd like to check that there is a consistent definition for what we label as a schema:Dataset. Schema.org defines it as "<span style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">A
body of structured information describing some topic(s) of interest" (https://schema.org/Dataset). In line with this, RO-Crate seems to use Dataset as a container for one or more Files via schema:hasPart (e.g. https://www.researchobject.org/ro-crate/specification/1.2/introduction.html).
Science on Schema.org doesn't provide a definition at <a data-auth="NotApplicable" class="OWAAutoLink" id="OWAb8c21c74-a12c-80a4-8df5-3317c4af9270" href="https://github.com/ESIPFed/science-on-schema.org/blob/main/guides/Dataset.md">
https://github.com/ESIPFed/science-on-schema.org/blob/main/guides/Dataset.md</a>, but my reading is that it expects a Dataset to be a file that contains PropertyValues.</span></div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Based on RO-Crate usage, I've been expecting to use schema:Dataset to describe the set of data, metadata and other files I expect to store together as the RO-Crate. I have also expected to use Dataset to delimit subsets of the RO-Crate that merit separate description
as self-contained subunits. This would mean that a simple RO-Crate would be a Dataset with multiple Files as parts. A more complicated RO-Crate would be a Dataset that has multiple Files and Datasets as parts (with
the nested Datasets themselves having Files as parts). Under this interpretation, most Datasets have a one-to-one relationship with a Folder.</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Does this align with the expectations of other groups interested in CDIF? In short, is a schema:Dataset 1) a collection of Files representing the results of a study or 2) a File that contains PropertyValues (using e.g. CSV, NetCDF, HDF5, ...)?</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks,</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Donald</div>
<div id="x_Signature">
<div style="direction: ltr; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<p style="direction: ltr; text-align: left; background-color: rgb(255, 255, 255); margin: 0px; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: rgb(82, 138, 45);"><b>Donald Hobern</b></span><span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: black;"><br>
Data Management Director, Australian Plant Phenomics Network<b><br>
</b>University of Adelaide - working from Canberra, ACT</span></p>
<p style="direction: ltr; text-align: left; background-color: rgb(255, 255, 255); margin: 0px; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: black;"><b>P</b> (04) 20511471 |
</span><span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: rgb(101, 141, 27);"><a style="color: rgb(101, 141, 27); margin-top: 0px; margin-bottom: 0px;" data-auth="NotApplicable" class="x_OWAAutoLink" id="OWAeb15a01c-9636-b838-aea0-370145535864" href="http://www.plantphenomics.org.au/">plantphenomics.org.au</a></span><span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: black;">
| </span><span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: rgb(101, 141, 27);"><a style="color: rgb(101, 141, 27); margin-top: 0px; margin-bottom: 0px;" data-auth="NotApplicable" class="x_OWAAutoLink" id="OWA3dfe2025-dbad-c739-22b8-7b3d80a2c089" href="https://www.plantphenomics.org.au/news/#news-from-our-blog">subscribe
to our news</a></span><span style="font-family: Arial, Helvetica, sans-serif; font-size: 8pt; color: black;"> </span></p>
<div style="direction: ltr; margin-top: 1em; margin-bottom: 1em; font-family: Arial, Helvetica, sans-serif; font-size: 8pt;">
<span style="color: rgb(82, 138, 45);">APPN acknowledges the Traditional Custodians of Country throughout Australia and their connections to land, sea and community. We pay our respect to their Elders past and present and extend that respect to all Aboriginal
and Torres Strait Islander peoples today.</span><span style="color: black;"><br>
The Australian Plant Phenomics Network (APPN) is supported by the Australian Government's National Collaborative Research Infrastructure Strategy (</span><span style="color: rgb(153, 159, 163);"><a style="color: rgb(153, 159, 163);" data-linkindex="3" data-auth="NotApplicable" class="x_OWAAutoLink" id="OWA3d3a8810-f220-0efa-bbec-aca61294a00f" href="https://www.education.gov.au/national-collaborative-research-infrastructure-strategy-ncris">NCRIS</a></span><span style="color: black;">)<br>
APPN National Head Office at the </span><span style="color: rgb(153, 159, 163);"><a style="color: rgb(153, 159, 163);" data-linkindex="4" data-auth="NotApplicable" class="x_OWAAutoLink" id="OWAeeef16e2-0789-78e1-1d55-589408575876" href="https://www.thewaite.org/">University
of Adelaide</a></span><span style="color: black;"> (UoA - CRICOS provider number 00123M).</span></div>
</div>
</body>
</html>