

1 A centre of expertise in digital information management
Approaches To The Validation Of Dublin Core Metadata Embedded In (X)HTML Documents

Background
The Dublin Core Metadata Element Set is a simple set of metadata elements used for resource discovery. It has been widely adopted in digital library applications. One simple mechanism for deploying DC metadata is to embed it in (X)HTML documents, following conventions recommended by DCMI.

The Problem
Many (X)HTML document creators limit their "validation" to checking the presentation of their documents in Web browsers. Even where authors do use (X)HTML syntax validators, such tools do not check that embedded metadata conforms to the conventions recommended by DCMI. Furthermore, to be really useful to the metadata creator, a validation process should check the metadata against the specific requirements of the service that will use that metadata (an "application profile").
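
As a rough illustration of what "embedded" DC metadata looks like and how a tool might read it, the sketch below parses a small sample page with Python's standard html.parser module. The sample page, the class name DCMetaExtractor and the function extract_dc are all hypothetical, and the markup follows the commonly used meta/link pattern (a "DC." name prefix declared through a link element with rel="schema.DC"); it should be read as an assumption about the DCMI conventions, not a statement of them.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the title, creator and date values are invented.
SAMPLE_PAGE = """<html><head>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="DC.title" content="An Example Project Home Page" />
<meta name="DC.creator" content="Example, Author" />
<meta name="DC.date" scheme="W3CDTF" content="2003-09-28" />
<meta name="keywords" content="not Dublin Core" />
</head><body></body></html>"""


class DCMetaExtractor(HTMLParser):
    """Collect <meta> elements whose name starts with the prefix "DC."."""

    def __init__(self):
        super().__init__()
        self.schemas = {}   # prefix -> namespace URI, from <link rel="schema.PREFIX">
        self.records = []   # (name, scheme or None, content) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = attrs.get("rel") or ""
        name = attrs.get("name") or ""
        if tag == "link" and rel.lower().startswith("schema."):
            self.schemas[rel.split(".", 1)[1]] = attrs.get("href")
        elif tag == "meta" and name.upper().startswith("DC."):
            self.records.append((name, attrs.get("scheme"), attrs.get("content")))


def extract_dc(html_text):
    """Return (schemas, records) for the DC metadata found in html_text."""
    parser = DCMetaExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.schemas, parser.records


if __name__ == "__main__":
    schemas, records = extract_dc(SAMPLE_PAGE)
    print(schemas)
    for record in records:
        print(record)
```

A syntax validator would accept this page whether or not the meta names and values make sense as Dublin Core, which is the gap the rest of the poster addresses.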

2 A Simple Approach To Validation

Use of DC-dot
DC-dot is a popular Web-based tool for creating and managing Dublin Core metadata. DC-dot can also be used to carry out simple validation of Dublin Core embedded in HTML resources.

Survey Findings
Use of DC-dot across a digital library programme showed that the entry points contained various errors in the representation of Dublin Core:
- Use of DC.Author rather than DC.Creator
- Incorrect format of date field
- Incorrect use of delimiters

Limitations of DC-dot
DC-dot has some limitations:
- It was not designed primarily as a validation tool
- It performs only basic validation
- It validates against a single set of rules

[Figure: The DC-dot Tool]
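
Two of the error types found in the survey can be detected mechanically. The following is a minimal sketch, not DC-dot's own logic: the function name check_conventions is hypothetical, and the date rule assumes that dates are expected in the date-only W3CDTF forms (YYYY, YYYY-MM or YYYY-MM-DD). The delimiter errors are not covered, because the poster does not say which delimiters were misused.

```python
import re

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher", "contributor",
    "date", "type", "format", "identifier", "source", "language",
    "relation", "coverage", "rights",
}

# Assumption: dates should be YYYY, YYYY-MM or YYYY-MM-DD (date-only W3CDTF).
W3CDTF_DATE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")


def check_conventions(records):
    """Return warning strings for a list of (name, content) pairs."""
    warnings = []
    for name, content in records:
        prefix, _, rest = name.partition(".")
        if prefix.upper() != "DC":
            continue  # not Dublin Core, so not checked here
        element = rest.split(".")[0].lower()
        if element not in DC_ELEMENTS:
            warnings.append(f"{name}: not one of the 15 DC elements")
        elif element == "date" and not W3CDTF_DATE.match(content.strip()):
            warnings.append(
                f"{name}: date {content!r} is not YYYY, YYYY-MM or YYYY-MM-DD"
            )
    return warnings


if __name__ == "__main__":
    sample = [
        ("DC.Author", "Example, Author"),   # should be DC.Creator
        ("DC.Date", "28/09/2003"),          # not a W3CDTF date
        ("DC.Date", "2003-09-28"),          # accepted
    ]
    for warning in check_conventions(sample):
        print(warning)
```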

3 Using An RDF Validator

Use of An RDF Validator
An alternative approach was to make use of W3C's online Dublin Core to RDF XSLT transformation service and the RDF validator. This approach made use of several online services which were chained together:
- Tidy to convert the project home page to XHTML format
- Dublin Core to RDF XSLT transformation service to convert embedded Dublin Core elements to RDF/XML
- RDF validation service to validate the RDF/XML

Comments
This approach helped by providing a visual display of the Dublin Core metadata. It was noticed, for example, that one page contained an invalid identifier: [...] rather than [...]. However, since the RDF validation service has no understanding of the semantics of the Dublin Core metadata, this approach has its limitations.

[Figure: The RDF Validator Tool]
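
To make the middle step of the chain more concrete, the sketch below builds RDF/XML from name/content pairs using Python's standard xml.etree.ElementTree module. It is a stand-in for illustration only and does not reproduce the W3C transformation service; the function name dc_to_rdfxml and the example.org URLs are hypothetical. It also shows the limitation noted above: any identifier string, right or wrong, yields well-formed RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_to_rdfxml(page_url, records):
    """Serialise (name, content) pairs such as ("DC.title", "...") as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": page_url}
    )
    for name, content in records:
        # "DC.title" -> "title"; assumes every name carries the "DC." prefix.
        element = name.split(".", 1)[1].lower()
        ET.SubElement(description, f"{{{DC_NS}}}{element}").text = content
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    rdf_xml = dc_to_rdfxml(
        "http://example.org/project/",
        [
            ("DC.title", "An Example Project Home Page"),
            # A mistyped identifier still serialises and parses without complaint.
            ("DC.identifier", "http:/example.org/project"),
        ],
    )
    print(rdf_xml)
    ET.fromstring(rdf_xml)  # well-formed, although the identifier is wrong
```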

4 dcmeta: An XSLT Approach

Use of XSLT
We have employed XSLT to provide validation of Dublin Core metadata embedded in (X)HTML resources.

The dcmeta XSLT stylesheet:
- Creates a report on the embedded DC metadata
- Checks that general conventions for DC metadata are followed
- Checks the metadata against a specified "application profile" of the DC Metadata Element Set

The profile is a set of rules which specify:
- Permitted DC properties (e.g. only the 15 DC elements are allowed)
- Minimum/maximum permitted occurrences of a specified property (e.g. only one occurrence of DC.Title permitted)
- Permitted encoding schemes (e.g. DC.Subject properties should have the scheme "LCSH")
- Permitted values (e.g. DC.Publisher must have the value "UKOLN")

These rules are described in a secondary XML document read by the stylesheet.

[Figure: The dcmeta Tool]
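
The four kinds of profile rule listed above can be sketched in a few lines of Python. This is not the dcmeta stylesheet, which is written in XSLT and reads its rules from an XML document whose structure is not reproduced here; the PROFILE dictionary, its keys ("min", "max", "schemes", "values") and the function check_profile are hypothetical, chosen only to mirror the examples on the poster.

```python
from collections import Counter

# Hypothetical profile expressed as a Python dict. Only properties listed here
# are permitted; names are matched case-sensitively for simplicity.
PROFILE = {
    "DC.Title": {"min": 1, "max": 1},        # exactly one title
    "DC.Subject": {"schemes": {"LCSH"}},     # permitted encoding schemes
    "DC.Publisher": {"values": {"UKOLN"}},   # permitted values
}


def check_profile(records, profile):
    """Check (name, scheme, content) tuples against profile; return error strings."""
    errors = []
    counts = Counter(name for name, _, _ in records)
    for name, scheme, content in records:
        rule = profile.get(name)
        if rule is None:
            errors.append(f"{name}: property not permitted by the profile")
            continue
        if "schemes" in rule and scheme not in rule["schemes"]:
            errors.append(f"{name}: scheme {scheme!r} not permitted")
        if "values" in rule and content not in rule["values"]:
            errors.append(f"{name}: value {content!r} not permitted")
    for name, rule in profile.items():
        n = counts[name]
        if n < rule.get("min", 0):
            errors.append(f"{name}: {n} occurrence(s), at least {rule['min']} required")
        if "max" in rule and n > rule["max"]:
            errors.append(f"{name}: {n} occurrence(s), at most {rule['max']} permitted")
    return errors


if __name__ == "__main__":
    page = [
        ("DC.Title", None, "First title"),
        ("DC.Title", None, "Second title"),
        ("DC.Subject", None, "Metadata"),
        ("DC.Publisher", None, "Another Publisher"),
    ]
    for error in check_profile(page, PROFILE):
        print(error)
```

Keeping the rules in data rather than in the checking code is what lets one checker serve several services, each with its own profile.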

5 Conclusions

Deployment
The stylesheet can be deployed using any XSLT engine, e.g.:
- Using a JavaScript bookmarklet to apply the transformation in a browser with a built-in XSLT engine (e.g. IE/MSXML)
- As an online service using a server-side transformation
- Run from the command line

Summary
This poster summarises a number of approaches to validating Dublin Core metadata embedded in HTML resources. The poster reports on initial work in the development of an XSLT-based tool which can be used for validation of Dublin Core metadata.

Further Details
The stylesheet is available, together with details of the structure of the "profile" document, at [...]. For further information please contact Pete Johnston at the email address [...].

