| Gavin Walker, Simon Cox | 2 December 2008 | Gavin.Walker@csiro.au |
This document covers the use of Uniform Resource Identifiers (URIs) as names and code spaces in the Water Data Transfer Format (WDTF).
A scheme for identifying features and other objects that appear in WDTF documents has been defined, with the following format
Spatial reference systems used in WDTF shall use the identifier schema defined by OGC. This is a URN scheme with the following pattern:
For a general discussion of OGC URIs see http://www.oostethys.org/development/best-practices/best-practices-ogc-urns/
Spatial reference system URIs can be combined. For example
combines the Bureau-defined survey engineering coordinate system (which defines horizontal and vertical offsets, in metres, relative to a point) with the Bureau-defined gauge datum (the height of the gauge) to fix the engineering coordinate system relative to a fixed height. For a general discussion of vertical datums and combining coordinate systems see http://www.oostethys.org/development/best-practices/Vertical Datums/ and Definition identifier URNs in OGC namespace (07-092r1) v. 1.1.2.
Where the identity of a referent is unknown, as is often the case with procedures, a URI designating a nil value may be used. The OGC URN for nil values has the form
urn:ogc:def:nil:OGC::unknown
The OGC URN scheme identifies five kinds of nil:
WDTF follows the ISO and OGC standards in using two types of XML element to encode a schema derived from UML modelling: objects and properties. Objects are complex elements that may contain many properties. Properties are elements that may contain only one value (but many attributes). That value may be
A reference can have two forms, either
<wdtf:SamplingPoint gml:id="l1">
<gml:name
codeSpace="http://www.bom.gov.au/std/water/xml/wio0.2/feature/SamplingPoint/w00001/">410729/1</gml:name>
...
</wdtf:SamplingPoint>
The featureOfInterest property of, say, a time series observation object could refer to this sampling point object as
<om:featureOfInterest xlink:href="#l1"/>
if it were in the same document or as
<om:featureOfInterest xlink:href="http://www.bom.gov.au/std/water/xml/wio0.2/feature/SamplingPoint/w00001/410729/1"/>
if the sampling point (or location) was in a separate document.
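The two reference forms can be told apart mechanically. The following is a minimal sketch, assuming the only distinction needed is the leading "#" of a same-document reference; the function name classify_href is hypothetical and not part of WDTF.

```python
from typing import Tuple


def classify_href(href: str) -> Tuple[str, str]:
    """Classify an xlink:href value (hypothetical helper, not part of WDTF).

    Returns ("local", gml_id) when the reference points at a gml:id in the
    same document, otherwise ("remote", uri) for a full URI.
    """
    if href.startswith("#"):
        # Same-document reference: strip the "#" to get the gml:id value.
        return ("local", href[1:])
    # Anything else is treated as the persistent URI of an external object.
    return ("remote", href)


print(classify_href("#l1"))
print(classify_href(
    "http://www.bom.gov.au/std/water/xml/wio0.2/feature/SamplingPoint/w00001/410729/1"
))
```

A consumer would look up a local target among the gml:id values of the current document and a remote target among objects it already holds under that URI.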
The top-level object in WDTF is called HydroCollection. It has eight main properties: metadata, definitionMember, transactionMember, siteMember, specimenMember, observationMember, featureMember and conversionMember.
Metadata about the entire HydroCollection document belongs here. Most other major objects also have metadata members.
Complex relationships between properties (or parameters) are defined here. In particular, the property (or parameter) pairings used in conversions (or gaugings) are placed here.
The transactionMember contains a SynchronisationTransaction, which defines the time bounds for the supplied data. This is used to update the appropriate records at the Bureau, allowing deletions as well as additions.
The siteMember allows the choice of a SamplingGroup (or site) or a SamplingPoint (or location). The terminology here comes from the observations and measurements model. They are examples of imported schema fragments customized to the WDTF application. It is useful to think of a SamplingGroup (or site) as a place one can drive to and then walk to a SamplingPoint (or location). The SamplingPoint (or location) is typically the tower the sensors hang off. The SamplingGroup (or site) is a spatially cohesive set of SamplingPoints (or locations). Every SamplingPoint (or location) is required to have a SamplingGroup (or site) and vice versa.
The specimenMember contains water sample information as both samples and bottles (or fragments) of samples. Each Specimen has a relatedSamplingFeature which points to the SamplingPoint (or location) where the water was taken. Specimens are closely tied to Measurements which are the observations on the specimen.
The observationMember allows a choice of all the different types of observations: Measurement, ComplexObservation, TimeSeriesObservation and GeometryObservation. The observation objects embody the observations and measurements model.
The featureMember contains information about water courses and storages. These are the sampledFeatures referred to in the time series. Only a very small amount of information is covered. These will be replaced with full definitions from the Australian Hydrological GeoFabric (AHGF) when it is developed. While these objects are not mandatory, supplying this information where it is available helps tie the water quality and quantity information to the river network.
The conversionMember contains information about conversion tables and the DurationGroups which apply those tables in sequence.
Almost all objects in WDTF use a <gml:name> tag. The name refers to the persistent identity of the object (as opposed to the gml:id, which gives an identity only within the file). More than one object in the same document or across documents may have the same name. This allows objects that belong together to be grouped together. For example, two time series observation objects in different documents may have the same name, say
http://www.bom.gov.au/std/water/xml/wio0.2/feature/TimeSeriesObservation/w00001/410729/1/WaterCourseLevel/1
This means that both time series objects are fragments of the same time series. An object may have multiple names, which would allow it to be part of multiple persistent identities for different purposes; at this stage, however, all WDTF objects have only one name.
In the complex observation (or gauging) objects the name is used to group related gaugings together. That is, all gaugings for the same artefact (such as a weir) that measure the same thing (such as level and flow) under the same conditions have the same name.
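Grouping by name can be sketched as follows. This is an illustration only: it assumes each object has already been reduced to a (code space, name value, payload) triple, and the function name and sample payloads are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_name(objects: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group payloads by persistent identity (hypothetical helper).

    Each input item is (code_space, name_value, payload). Objects sharing
    the same code space and name value land in the same group.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for code_space, name_value, payload in objects:
        # The persistent identity is the code space followed by the name value.
        groups[code_space + name_value].append(payload)
    return dict(groups)


CS = ("http://www.bom.gov.au/std/water/xml/wio0.2/feature/"
      "TimeSeriesObservation/w00001/410729/1/WaterCourseLevel/")

# Two fragments of the same series from different documents, plus one other series.
fragments = [
    (CS, "1", "fragment from document A"),
    (CS, "1", "fragment from document B"),
    (CS, "2", "fragment of another instance"),
]
for uri, members in group_by_name(fragments).items():
    print(uri, members)
```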
Names and a few other elements have a code space attribute. This is a qualifier for the value of the name. It says that the meaning of the name can be found by looking up a dictionary or asking the authority defined by the code space. In WDTF the code space is made up of the object's URI minus the identifier. For example, the code space of the sampling point
http://www.bom.gov.au/std/water/xml/wio0.2/feature/SamplingPoint/w00001/410729/1
is
http://www.bom.gov.au/std/water/xml/wio0.2/feature/SamplingPoint/w00001/
The exception to this rule is the time series observation name. In this case the name value is required to be an integer so the code space becomes everything in the URI except the last integer. For example with time series observation URI
http://www.bom.gov.au/std/water/xml/wio0.2/feature/TimeSeriesObservation/w00001/410729/1/WaterCourseLevel/1
the code space is
http://www.bom.gov.au/std/water/xml/wio0.2/feature/TimeSeriesObservation/w00001/410729/1/WaterCourseLevel/
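The two rules can be sketched in code. This is a minimal sketch under stated assumptions rather than a definitive parser: it assumes, from the examples above only, that every URI starts with the feature root shown and that the general code space ends after the segment following the object type (w00001 in the examples). The names FEATURE_ROOT and split_uri are hypothetical.

```python
from typing import Tuple

# Assumption: common root taken from the example URIs above.
FEATURE_ROOT = "http://www.bom.gov.au/std/water/xml/wio0.2/feature/"


def split_uri(uri: str) -> Tuple[str, str]:
    """Split a WDTF object URI into (code_space, name_value)."""
    if not uri.startswith(FEATURE_ROOT):
        raise ValueError("not under the assumed feature root: " + uri)
    rest = uri[len(FEATURE_ROOT):]
    # Object type, the segment after it, then the opaque identifier
    # (which may itself contain "/").
    object_type, authority_segment, identifier = rest.split("/", 2)
    code_space = FEATURE_ROOT + object_type + "/" + authority_segment + "/"
    if object_type == "TimeSeriesObservation":
        # Exception: the name value is only the trailing integer; the rest
        # of the identifier moves into the code space.
        prefix, _, instance = identifier.rpartition("/")
        if not instance.isdigit():
            raise ValueError("time series name must be an integer: " + uri)
        if prefix:
            code_space += prefix + "/"
        return code_space, instance
    return code_space, identifier


print(split_uri(FEATURE_ROOT + "SamplingPoint/w00001/410729/1"))
print(split_uri(FEATURE_ROOT + "TimeSeriesObservation/w00001/410729/1/WaterCourseLevel/1"))
```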
The values of names (and the values of other elements with code spaces) are the identifiers of URIs and are meant to be opaque to all except the authority that created them. It is the responsibility of the authority to create the identifiers so that the URIs refer uniquely to the required object. An authority may do this either by assigning a unique id or by creating a unique id from a combination of relative identifiers (or primary key parts, in database terms). In WDTF it is recommended that this combination be made by joining the parts with a forward slash "/". See WDTF URI scheme for an example.
Those data providers that also send flood data to the Bureau in the HCS format must adhere to the HCS rules for identifiers. The other exception is in the time series observation where the name value must be an integer and represents an instance number. The instance number distinguishes between different streams of data such as:
In this case the rest of the identifier, which is required to separate data streams across sampling points (or locations) and properties (or parameters), is included in the code space. See the example under code spaces.
In WDTF a URI can be created from a name value and code space by simply adding the name value string to the code space string.
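As a small illustration, the sketch below joins relative identifier parts with "/" and then appends the result to a code space. The helper names are hypothetical, and treating "410729" and "1" as two relative parts of the example identifier is an assumption made for the illustration.

```python
def make_identifier(*parts: str) -> str:
    """Join relative identifier parts with "/" (the recommended separator)."""
    return "/".join(parts)


def make_uri(code_space: str, name_value: str) -> str:
    """Build the full URI by appending the name value to the code space."""
    return code_space + name_value


code_space = "http://www.bom.gov.au/std/water/xml/wio0.2/feature/SamplingPoint/w00001/"
name_value = make_identifier("410729", "1")  # assumed relative parts
print(make_uri(code_space, name_value))
```

Because the operation is plain string concatenation, the code space must already end with "/" for the result to be a well-formed URI, as it does in the examples in this document.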
Data providers that currently send data to the Bureau in HCS format for flood forecasting need to use WDTF in a special way. HCS and WDTF data will be transmitted in parallel for some time, and some of the URIs in WDTF need to be consistent with HCS data. In the HCS format the SiteId has type SR, SSR or SLSR, which correspond to Site, Site-Sensor and Site-Location-Sensor respectively. In WDTF the site corresponds to a sampling group, the location to a sampling point and the sensor to a procedure. However, in HCS the identifiers are relative, whereas in WDTF they are absolute. Also, in HCS the location is not always provided, whereas in WDTF the sampling point (or location) is mandatory. Where HCS provides no location, the data provider should assign the site a location. A relative id of 1 is recommended.
When a time series observation is created by a data provider who also sends the equivalent data via HCS, the following rules for identifiers must be obeyed.
Of course these elements use full URIs, not just the authority's identifier, so the code space should be prepended to the identifiers above. If the site type is SR and no sensor information is present then "urn:ogc:def:nil:OGC::unknown" should be used for the procedure (see unknown values).
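The two defaults described in this section (a location id of 1 when HCS gives none, and the nil URN for a missing sensor) can be sketched as below. This covers only those defaults, not the full identifier rules. The function name, the dictionary keys and the procedure_code_space parameter are hypothetical, and the nil URN is written in the form given in the section on unknown values (urn:ogc:def:nil:OGC::unknown).

```python
from typing import Dict, Optional

NIL_UNKNOWN = "urn:ogc:def:nil:OGC::unknown"  # form given under unknown values
DEFAULT_LOCATION = "1"                         # recommended relative id


def hcs_to_wdtf(site: str,
                location: Optional[str] = None,
                sensor: Optional[str] = None,
                procedure_code_space: str = "") -> Dict[str, str]:
    """Fill in WDTF values for HCS components that may be absent.

    SR supplies only a site, SSR a site and sensor, SLSR all three.
    procedure_code_space is a hypothetical placeholder for whatever code
    space the provider's procedure URIs use.
    """
    return {
        "site": site,
        # WDTF requires a sampling point, so default the location.
        "location": location if location is not None else DEFAULT_LOCATION,
        # With no sensor information the procedure is the nil URN.
        "procedure": (procedure_code_space + sensor
                      if sensor is not None else NIL_UNKNOWN),
    }


print(hcs_to_wdtf("410729"))                                  # SR
print(hcs_to_wdtf("410729", location="2", sensor="level"))   # SLSR
```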