A Brief Introduction to EXPRESS-Driven Data as HDF5

This document introduces the concept of the EXPRESS-Driven Data as HDF5 project.

1 Introduction

The purpose of this project is to enable the use of the HDF5 format for representing large-volume datasets whose information models are written in the ISO EXPRESS language. The Hierarchical Data Format (HDF) was developed at the United States National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. HDF5 is Version 5 of HDF (see "What is HDF5?" and "Introduction to HDF5" for an introduction).

The architecture of HDF5 separates the definition of the file structures from the APIs used to manipulate those structures. This mapping of EXPRESS-driven data into HDF5 is based on the file structures only. No EXPRESS-specific API is specified. The HDF5 API itself is used to manipulate the EXPRESS-driven data.
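To illustrate the point that no EXPRESS-specific API is needed, the following is a minimal sketch using the h5py Python binding to the HDF5 library. The group name "example_schema", the dataset name "point", and the compound layout with fields x, y and z are hypothetical and chosen only for illustration; they are not the mapping defined by this project. The sketch assumes that h5py and NumPy are installed.

```python
import h5py
import numpy as np

# Hypothetical record layout standing in for an EXPRESS entity such as
#   ENTITY point; x, y, z : REAL; END_ENTITY;
# This layout is an assumption for illustration, not the project's mapping.
POINT_DTYPE = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8")])


def write_and_read_points():
    """Store two point records and read them back with plain HDF5 calls."""
    # The "core" driver with backing_store=False keeps the file in memory,
    # so nothing is written to disk.
    with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
        group = f.create_group("example_schema")  # hypothetical group name
        points = np.array(
            [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)], dtype=POINT_DTYPE
        )
        group.create_dataset("point", data=points)  # hypothetical dataset name

        # Reading uses the same generic API: address the dataset by path
        # and pull its contents into a NumPy structured array.
        stored = f["example_schema/point"][...]
        return stored["x"].tolist(), stored.shape


if __name__ == "__main__":
    xs, shape = write_and_read_points()
    print(xs, shape)
```

Every call in the sketch (File, create_group, create_dataset, and path-based indexing) is part of the general-purpose HDF5 binding. Any tool that can read HDF5 files could therefore inspect such data without knowing anything about EXPRESS.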

It is expected that the results of this project will be standardized in ISO Technical Committee 184, Subcommittee 4 (Industrial Data). The resulting standard is expected to be one of the ISO 10303 "Product data representation and exchange" series of standards, known as STEP, the Standard for the Exchange of Product Data. The EXPRESS language is standardized as ISO 10303-11:2004.

Other ISO 10303 standards also map EXPRESS into various implementation methods. This initial specification does not address alignment with those standards in detail. However, an attempt has been made to keep unnecessary EXPRESS-specific aspects out of the HDF5 file. A simple example is the encoding of HDF5 datasets for "Type" and "Objects" rather than "Entity Type" and "Entity Instance". This may, for example, allow the same HDF5 file to contain data that is described by both an EXPRESS schema and a related UML static class diagram.