
A Brief Foray into Semantic Web Technology and STEP

David Price
October 2003
david.price@eurostep.com

Abstract

The Semantic Web standards being developed within W3C open up interesting new possibilities for the implementation of STEP and other EXPRESS-based applications that go well beyond typical data exchange scenarios. These possibilities include "integrated" databases that are actually distributed across different servers, managing and querying Reference Data Libraries and EXPRESS schema information using the same technology, validating data against a large database without adding the data to the database, and using Web search engines to find information about product data. The W3C Semantic Web standards are based on work in the fields of ontologies and logic languages. This paper outlines an approach to mapping the elements of a STEP implementation into Semantic Web languages and briefly describes some of the resulting capabilities. To test some aspects of the approach, an implementation of the described EXPRESS/OWL mapping was developed and is available as open-source software.

The Semantic Web Vision and STEP

In some sense, the Semantic Web (SW) is a concept still being defined. It became widely known with the publication of the article "The Semantic Web" by Berners-Lee, Hendler and Lassila in Scientific American in May 2001. The fundamental goal is to enable the use of the Web to provide information in a way that is processable by computers rather than by humans. One simple view of it is that the goal is to replace HTML with other languages that provide more "semantics" for the available data. Those semantics are being defined by technologies derived from the fields of knowledge representation, logic and artificial intelligence, and the underlying transport mechanism is XML. Much has happened since that May 2001 article; a key element is the near completion of a new suite of standards under the name OWL, the Web Ontology Language.

Models, called ontologies in the SW world, are the basis for defining the semantics. Ontologies are not so different from the EXPRESS schemas used in the STEP community today. They define classes, properties of those classes, and constraints that play the same roles as EXPRESS entity types, attributes and rules. Data instances conforming to the ontology can be represented in the same manner as entity type instances conforming to the EXPRESS schema. Additionally, the SW languages allow the definition of logical relationships against which reasoning can occur, something the EXPRESS language does not support. Beyond the EXPRESS-based aspects of a STEP implementation, reference data libraries have become important as well. In the SW world, there are two options for representing them: as data instances or as an ontology in their own right.

Because of the similar nature of schemas and ontologies and because of the spread of STEP applications into domains beyond design and geometry, SW technology could play an important role in STEP implementation. The remainder of this paper briefly explores one way that might be accomplished by showing how elements of a typical STEP implementation can be approached using SW languages and tools. For the purposes of this paper, those elements are the following:

  • an EXPRESS schema,
  • one or more Reference Data Libraries,
  • product data based on the EXPRESS schema and classified using the Reference Data,
  • a data validation capability checking the product data against the schema,
  • queries and manipulation over the EXPRESS schema, Reference Data Library and product data.

These are illustrated in Figure 1.

Figure 1 - Traditional STEP Implementation

W3C SW Standards

The W3C standards related to the Semantic Web are described in the OWL Web Ontology Language Overview as follows:

'OWL has been designed to meet this need for a Web Ontology Language. OWL is part of the growing stack of W3C recommendations related to the Semantic Web.
  • XML provides a surface syntax for structured documents, but imposes no semantic constraints on the meaning of these documents.
  • XML Schema is a language for restricting the structure of XML documents.
  • RDF is a datamodel for objects ("resources") and relations between them, provides a simple semantics for this datamodel, and these datamodels can be represented in an XML syntax.
  • RDF Schema is a vocabulary for describing properties and classes of RDF resources, with a semantics for generalization-hierarchies of such properties and classes.
  • OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.'

The focus of this paper is applying the advanced capabilities the Semantic Web technology brings to STEP implementations. Therefore, a knowledge of XML and XML Schema is assumed, and the paper focuses on RDF, RDF Schema and OWL.

The Resource Description Framework (RDF) is, in concept, quite simple. Triples based on the simple Resource-Property-Value structure are used to associate descriptions with Web resources. The original intent of RDF was to support areas such as applying various ratings to Web sites (e.g. parental guidance ratings). The Resources in RDF are identified using Uniform Resource Identifiers (URIs). A Web address, technically a Uniform Resource Locator, is one kind of URI. The Value in an RDF triple can also be a URI, allowing one Web resource to be used to describe another.
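As an illustration of the Resource-Property-Value structure, the following is a minimal sketch of an RDF/XML document that makes two statements about one Web resource. The example.org addresses and the ex:rating and ex:ratedBy properties are invented for this sketch and do not come from any published vocabulary.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/ratings#">
  <!-- Resource: the Web page being described (hypothetical address) -->
  <rdf:Description rdf:about="http://example.org/index.html">
    <!-- Property ex:rating (hypothetical) with a literal Value -->
    <ex:rating>general audience</ex:rating>
    <!-- Property ex:ratedBy (hypothetical) whose Value is itself a URI,
         so one Web resource is used to describe another -->
    <ex:ratedBy rdf:resource="http://example.org/agencies/board"/>
  </rdf:Description>
</rdf:RDF>
```

Each child element of rdf:Description contributes one triple, so this document yields two triples that share the same Resource.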

The RDF Vocabulary Description Language (RDF Schema) builds on RDF, adding the capability to define a classification scheme for Web resources that includes properties of the classes themselves.
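A small sketch of what such a classification scheme might look like in RDF Schema follows. The class and property names (Document, Specification, title) and the example.org base address are assumptions made only for illustration.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://example.org/schema">
  <!-- Two hypothetical classes arranged in a generalization hierarchy -->
  <rdfs:Class rdf:ID="Document"/>
  <rdfs:Class rdf:ID="Specification">
    <rdfs:subClassOf rdf:resource="#Document"/>
  </rdfs:Class>

  <!-- A hypothetical property that applies to members of Document
       and takes a literal value -->
  <rdf:Property rdf:ID="title">
    <rdfs:domain rdf:resource="#Document"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
  </rdf:Property>
</rdf:RDF>
```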

The Web Ontology Language (OWL) builds on RDF Schema, adding advanced knowledge representation and reasoning capabilities. OWL is based on earlier work in this area by the DARPA Agent Markup Language Program on a language called DAML+OIL. Because OWL is built on RDF Schema, which is built on RDF, an OWL ontology contains markup from all three languages. OWL also draws on XML Schema for some datatypes, and so XML Schema markup appears as well. It should be noted that although RDF Schema and OWL extend RDF, an OWL ontology is a valid RDF document and an RDF document is a valid OWL document (strictly, a valid OWL Full document). It is the interpretation of the OWL constructs by parsers, validators and reasoning tools that adds capability.

As of September 2003, OWL is a W3C Candidate Recommendation. That means it is relatively new, and full implementations of all its capabilities are not yet available as commercial products. However, enough tools exist to build applications testing all the elements of a typical STEP implementation that were listed earlier in this paper.

Schemas and OWL

EXPRESS schemas define a domain. Ontologies define a domain. Therefore, mapping EXPRESS schemas into OWL ontologies should be possible. A similar exercise is already underway in the Object Management Group where an RFP for a UML/OWL capability has been issued. The mapping described in this paper is simple and does not try to address every intricacy of EXPRESS. The purpose is to encourage more work towards proving that SW technologies can be applied to STEP.

Table 1 lists EXPRESS concepts and how they are mapped into OWL in the examples contained in this paper.

Table 1 - EXPRESS to OWL mapping

EXPRESS concept | OWL concept
schema | Ontology
entity type | Class
supertype/subtype graph | subClassOf
select type | Class and subClassOf
explicit attribute of type simple data type | DatatypeProperty
explicit attribute of type entity or type (except enumerations) | ObjectProperty
string, integer, boolean, number/real data types | XML Schema string, integer, boolean, double data types
Entity name and Entity.Attribute name | Identifiers for OWL representations of EXPRESS items

An EXPRESS entity type such as:

ENTITY Product;
END_ENTITY;

appears as:

<owl:Class rdf:ID="Product"/>
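To show how several rows of Table 1 might combine, the following sketch maps a small hypothetical EXPRESS schema (given in the leading comment) into OWL. It is not the output of the exff implementation. The schema, the Document entity, the described_by attribute, the example.org base address and the use of rdfs:domain and rdfs:range are all assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical EXPRESS input:
     SCHEMA example_schema;
     ENTITY Document;
     END_ENTITY;
     ENTITY Product;
       id   : STRING;
       name : STRING;
     END_ENTITY;
     ENTITY Requirement SUBTYPE OF (Product);
       described_by : Document;
     END_ENTITY;
     END_SCHEMA;
-->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/example-schema">
  <!-- schema maps to Ontology -->
  <owl:Ontology rdf:about=""/>

  <!-- entity types map to Classes; the subtype link maps to subClassOf -->
  <owl:Class rdf:ID="Document"/>
  <owl:Class rdf:ID="Product"/>
  <owl:Class rdf:ID="Requirement">
    <rdfs:subClassOf rdf:resource="#Product"/>
  </owl:Class>

  <!-- attributes of simple data type map to DatatypeProperty,
       identified as Entity.Attribute -->
  <owl:DatatypeProperty rdf:ID="Product.id">
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="Product.name">
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>

  <!-- an attribute whose type is an entity maps to ObjectProperty -->
  <owl:ObjectProperty rdf:ID="Requirement.described_by">
    <rdfs:domain rdf:resource="#Requirement"/>
    <rdfs:range rdf:resource="#Document"/>
  </owl:ObjectProperty>
</rdf:RDF>
```

Naming properties as Entity.Attribute keeps identifiers unique when two entity types declare attributes with the same name.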

Table 2 shows a very simple example EXPRESS schema and the resulting OWL ontology.

Instance data and OWL

Instance data that represents EXPRESS entity instances, and that is not reference data, can be represented in an OWL document using a concept called an individual. The individuals can be within the same document as the ontology representing the EXPRESS schema or in other XML documents. There are two ways in which an individual appears in OWL documents. It can be represented using elements from the OWL namespace (see Example 1), or it may be represented using markup where the element names are derived from the names of the OWL Classes, which were in turn derived from the EXPRESS entity type names (see Example 2). For example, an element named "Requirement" can exist.

EXAMPLE 1 - An entity type instance as an individual using elements from OWL namespace - rdf:type

<owl:Thing rdf:ID="instance001"/>

<owl:Thing rdf:about="#instance001">
  <rdf:type rdf:resource="#Requirement"/>
  <Product.id rdf:datatype="&xsd;string">R01</Product.id>
  <Product.name rdf:datatype="&xsd;string">Number 1 requirement</Product.name>
</owl:Thing>

EXAMPLE 2 - An entity type instance as an individual using schema-specific elements

<Requirement rdf:ID="instance001" >
  <Product.id rdf:datatype="&xsd;string">R01</Product.id>
  <Product.name rdf:datatype="&xsd;string">Number 1 requirement</Product.name>
</Requirement>

From the point of view of an RDF or OWL parser, these two representations result in exactly the same triples being defined - there is no semantic difference.

Example 3 is a complete OWL document containing the representation of an EXPRESS schema and corresponding entity instances.

Reference data libraries and OWL

Reference data libraries (RDLs) have become important in several recently developed STEP standards. They are also at the core of the EPISTLE approach to data integration. At the heart of an RDL is a class hierarchy with properties related to the classes. There is little conceptual difference between these RDL class hierarchies, an EXPRESS schema and an OWL ontology. For these class hierarchies, the mapping into an OWL Ontology is therefore straightforward. Example 4 contains a small part of the EPISTLE Reference Data Library showing one approach to how this might be accomplished for a class called "total volume". Associating any OWL individual with a class from the representation of an RDL is accomplished with the use of the rdf:type element. An OWL individual may be a member of any number of classes.
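The classification mechanism can be sketched as follows. This is not the paper's Example 4 and not actual EPISTLE content: the two-class hierarchy, the example.org addresses, the instance identifier and the Property class from an assumed EXPRESS-derived ontology are all hypothetical.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/rdl">
  <!-- Hypothetical RDL fragment: a small class hierarchy -->
  <owl:Class rdf:ID="volume"/>
  <owl:Class rdf:ID="total_volume">
    <rdfs:label>total volume</rdfs:label>
    <rdfs:subClassOf rdf:resource="#volume"/>
  </owl:Class>

  <!-- A hypothetical individual classified by the RDL class via rdf:type -->
  <owl:Thing rdf:about="http://example.org/data#instance002">
    <rdf:type rdf:resource="#total_volume"/>
    <!-- A second rdf:type: the same individual is also a member of a
         class from an assumed EXPRESS-derived ontology -->
    <rdf:type rdf:resource="http://example.org/example-schema#Property"/>
  </owl:Thing>
</rdf:RDF>
```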

The OWL representation of the RDL need not be in the same OWL document as the individuals that use it. OWL provides a mechanism to import one ontology into another. In this way, an RDL can be published on the Web using OWL and used by any OWL or RDF application with access to the Internet. There is no mixing of classes or properties between representations of EXPRESS schemas and representations of different RDLs, as each has its own unique URI. However, it is possible to add knowledge about a particular resource. For example, if there was an ISO standard RDL for PLCS and an organization had local extensions to that RDL, an OWL processor would combine the knowledge in both RDLs when validating, making inferences or reasoning about the instance data that uses the two RDLs. The knowledge about resources represented using OWL is additive.
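A minimal sketch of a local extension that imports a published RDL is shown below. The two example.org ontology addresses and the pump and bilge_pump classes are invented for illustration; only owl:imports, rdfs:subClassOf and rdfs:comment are standard vocabulary.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/local-rdl">
  <!-- The local RDL imports the (hypothetical) standard RDL by its URI -->
  <owl:Ontology rdf:about="">
    <owl:imports rdf:resource="http://example.org/standard-rdl"/>
  </owl:Ontology>

  <!-- Local extension: a new class specializing a standard class -->
  <owl:Class rdf:ID="bilge_pump">
    <rdfs:subClassOf rdf:resource="http://example.org/standard-rdl#pump"/>
  </owl:Class>

  <!-- Additive knowledge: a further statement about a class that the
       standard RDL already defines, referenced here with rdf:about -->
  <owl:Class rdf:about="http://example.org/standard-rdl#pump">
    <rdfs:comment>Locally, pumps are inspected annually.</rdfs:comment>
  </owl:Class>
</rdf:RDF>
```

Because the local class keeps its own URI under the local base address, it cannot collide with a class of the same name in the standard RDL.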

Figure 2 shows how Semantic Web technologies can be overlaid onto the traditional STEP implementation architecture. This does not, however, take advantage of the added capabilities that conversion to OWL enables.

Figure 2 - Using SW Technology for Traditional STEP Implementation

Logic-based capabilities

The logic built into OWL is based on Description Logics, a restricted subset of first-order predicate logic. Practically, what this means is that a layer of reasoning can be applied over the ontologies defined using the simpler OWL concepts. With this reasoning, it is possible to check the consistency of ontologies, define relationships like "is equivalent to" between classes in different ontologies, check that instances are valid with respect to an ontology and perform queries over a set of ontologies. In many cases, the query mechanisms are not that different from the SQL queries with which many people are familiar.
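Two of these logic-based constructs can be sketched in a few lines: an equivalence between classes in different ontologies and a cardinality constraint that a reasoner can check. The ontology addresses and the Item class are hypothetical, and the Product.id property is assumed to be defined elsewhere.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/example-schema#Product">
    <!-- "is equivalent to" a class defined in another (hypothetical) ontology -->
    <owl:equivalentClass rdf:resource="http://example.org/other-schema#Item"/>

    <!-- Every Product has exactly one value for Product.id -->
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://example.org/example-schema#Product.id"/>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
```

OWL reasoning is open-world. A reasoner would report a Product with two different id values as inconsistent, but would not by itself treat a Product with no recorded id as an error. That is worth keeping in mind when comparing OWL validation with EXPRESS-based checking.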

Based on the logic capabilities of OWL, it is clear that the basic capabilities required for STEP implementation are built into OWL-conforming software. In fact, these capabilities can be distributed around the Web or held locally, providing much more flexibility than a typical STEP implementation. In this distributed environment, finding the appropriate ontologies is important. There are several ways this can happen; a few are listed here:

  1. a Web crawler can find ontologies and then your Web search returns them to you
  2. if an interesting ontology is available on the Web and known, use the Web address as the URI in your OWL document and the software will import it from the Web site
  3. there are repositories for publishing ontologies, OASIS for example is working in this area

Figure 3 shows one approach for taking advantage of the Semantic Web technologies in a STEP implementation rather than using a traditional approach. In this example, an MoD system can generate data about a Ship that is logically integrated with the pre-existing data, adding to the body of knowledge about the Ship.

Figure 3 - Taking Advantage of SW Technology for STEP Implementation

Issues

There are several issues with respect to the use of SW technologies for STEP implementation. A brief list of some follows.

  • security - some organizations are particularly concerned about using ontologies over the Internet. Their concerns include questions about certification and accuracy of the data and how to guarantee that the data has not been tampered with. These issues are being addressed within W3C and industry using a layered approach for building trust in a Web resource. The Semantic Web Advanced Development for Europe project has produced a paper aimed at a Framework for Security and Trust Standards.
  • Semantic Web and Web Services - the relationship between these two capabilities and when to apply each is not clear to all at this time. Berners-Lee's WWW2003 keynote talk on Web Services - Semantic Web discusses their relationship.
  • yet another language - unless OWL-based tools and products exist, OWL is yet-another-XML-format. Some suggest that adopting XML Schema and moving forward with implementations is the best approach today. As part of the W3C standardization process, implementation experience is required. For the OWL Candidate Recommendation, evidence of several implementations was provided.
  • complexity - the OWL/RDF syntax is not designed for human readability. It's certainly possible to design XML DTDs and XML Schemas that result in XML documents that are more human readable. Recognizing this concern, more human-friendly representations like N3 can be used to help people begin understanding the core concepts.

Conclusions

This paper has shown some of the technical details of a Semantic Web-based implementation of STEP/EXPRESS. No major technical barriers were identified during the development of this paper. The approach to addressing each element of a STEP implementation is summarized in Table 3. The question remains whether to choose these technologies for a particular application and in which scenarios they bring significant benefit.

Table 3 - Implementing STEP using SW Technology

STEP Implementation Element | SW Approach
an EXPRESS schema | an OWL Ontology
one or more Reference Data Libraries | an OWL Ontology for each RDL
product data based on the EXPRESS schema and classified using the Reference Data | OWL Individuals
a data validation capability checking the product data against the schema | OWL Validator
queries and manipulation over the EXPRESS schema, Reference Data Library and product data | RDF- and OWL-based toolkit

STEP Application Protocols (APs) that are designed purely for data exchange may find that the capabilities resulting from the application of SW technologies are not of significant benefit. However, for STEP applications that are longer-lived, such as large databases or integration scenarios, the SW technologies do seem to offer significant capabilities, such as the Web publication of RDLs and their use as ontologies in their own right. By using SW technologies, the EXPRESS schemas and RDLs that make up the "semantics" against which instance data is created can be managed in a single language. The EXPRESS schemas and RDLs can also be made available as Web sites that software applications can process. These same Web sites can also be queried by users using Web browsers.

ISO 15926 and EPISTLE are designed to address data integration requirements. EPISTLE EXPRESS schemas contain no domain-specific semantics. All semantics are held in RDLs. Applying SW technologies in these cases should also provide significant benefit to implementors and users alike. However, more detailed analysis than is covered in this paper should be performed to understand how the latest EPISTLE approach and SW technologies relate.

Notes on the implementation

The EXPRESS-to-OWL mapping described in this paper was implemented as part of a contribution to a project called EXPRESS for Free, or exff. The concept underlying exff is to map the ISO EXPRESS language into OMG's UML language providing access to sophisticated software engineering capabilities. The EXPRESS-to-OWL capability is therefore implemented in three steps:

  1. parse the EXPRESS schema generating an XML representation of the EXPRESS
  2. transform the XML representation of EXPRESS into the corresponding OMG UML XMI file
  3. transform the OMG UML XMI file into the corresponding OWL document

All the components of the implementation are open-source software as part of exff, with the exception of eep, which is freeware from Eurostep. eep is used to parse the EXPRESS and generate the XML representation of the schema. XSLT is then used twice: first to transform that XML into OMG UML XMI format, and then to transform the OMG UML XMI document into an OWL XML document.
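The flavor of such an XSLT transformation can be sketched as below. This is not one of the exff stylesheets. To stay short, it skips the intermediate UML XMI step and reads an invented input vocabulary, a schema element containing entity elements with name and optional supertype attributes, which is neither the eep output format nor XMI.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input shape:
       <schema name="example_schema">
         <entity name="Product"/>
         <entity name="Requirement" supertype="Product"/>
       </schema> -->

  <!-- The schema becomes the ontology document -->
  <xsl:template match="/schema">
    <rdf:RDF>
      <owl:Ontology rdf:about=""/>
      <xsl:apply-templates select="entity"/>
    </rdf:RDF>
  </xsl:template>

  <!-- Each entity type becomes an owl:Class; a declared supertype
       becomes an rdfs:subClassOf reference -->
  <xsl:template match="entity">
    <owl:Class rdf:ID="{@name}">
      <xsl:if test="@supertype">
        <rdfs:subClassOf rdf:resource="#{@supertype}"/>
      </xsl:if>
    </owl:Class>
  </xsl:template>
</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to apply a stylesheet of this kind. Handling attributes, select types and multiple supertypes would need further templates along the lines of Table 1.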

The resulting OWL documents were validated using the BBN Owl Validator and the University of Manchester and University of Karlsruhe OWL Ontology Validator.

Links


W3C - The World Wide Web Consortium

RDF - Resource Description Framework (RDF) Model and Syntax Specification

OWL Web Ontology Language Overview

XMLS Datatypes - XML Schema Part 2: Datatypes

exff - EXPRESS for Free open-source toolkit

eep - Eurostep EXPRESS Parser found under Products & Services