HDF Product Designer User Guide

HDF Product Designer (HPD) helps data producers design conventional HDF5 products and generate consistently interoperable data products by applying the best practices and standards of their data user communities, where these exist. Conventions are defined using the CLIPS expert system, and designs can be reused across product suites.

Key goals this application strives to fulfill:

  • Facilitate creation of interoperable and standards-compliant data products in HDF5 as early as possible in the project development process.
  • Support design on Windows, Mac OS X, and Linux computing platforms without requiring a full stack of development tools and libraries to be installed.
  • Easy and intuitive editing (create, update, move, copy, delete) of HDF5 objects.
  • Collaborative approach to product design (project/team/organization-wide).
  • Incorporation of best practices and standards from targeted data user communities.
  • Integration of compliance and interoperability tests into the design workflow.
  • Content import from existing files.
  • Export of designs as HDF5 files, HDF5/JSON, or as source code in several programming languages.

The Hierarchical Data Format (HDF5) provides a flexible container that supports groups and datasets, each of which can have attributes. In many ways, an HDF5 file resembles a directory structure held inside a single file, with groups in the role of directories and datasets in the role of files. As with directory structures, the same data can be structured and annotated in many ways. This flexibility lets HDF5 users arrange data in ways that make sense to them. However, it can also make data difficult to share, because users and tools must understand the structure and properties of the data in order to use it.
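
To make the sharing problem concrete, the sketch below models an HDF5-style hierarchy with plain Python dictionaries rather than a real HDF5 library. The two layouts, the path and attribute names, and the `read_temperature` helper are all hypothetical and are not part of HPD or HDF5. The point is only that a reader written for one arrangement fails on a different arrangement of the same values.

```python
# Plain-dict stand-in for an HDF5 hierarchy (an assumption for illustration;
# this is not an HDF5 API). Every node has attributes ("attrs") and either
# child links ("members", making it a group) or values ("data", a dataset).

def group(members, **attrs):
    return {"attrs": attrs, "members": members}

def dataset(data, **attrs):
    return {"attrs": attrs, "data": data}

# Layout A: dataset under /measurements, units stored in a "units" attribute.
layout_a = group({
    "measurements": group({
        "temperature": dataset([280.1, 281.4], units="K"),
    }),
})

# Layout B: the same values, but a different path and attribute name.
layout_b = group({
    "temp": dataset([280.1, 281.4], unit_of_measure="K"),
})

def lookup(root, path):
    """Follow a slash-separated path from the root group to a node."""
    node = root
    for name in path.strip("/").split("/"):
        node = node["members"][name]  # KeyError if the link does not exist
    return node

def read_temperature(root):
    """Hypothetical reader that only understands layout A."""
    node = lookup(root, "/measurements/temperature")
    return node["data"], node["attrs"]["units"]

print(read_temperature(layout_a))
try:
    read_temperature(layout_b)
except KeyError as missing:
    print("layout B cannot be read without knowing its structure:", missing)
```

Both layouts hold identical values, yet the reader works only on the one it was written for. A shared convention removes this guesswork by fixing the paths and attribute names in advance.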

Many communities have successfully addressed this problem by creating conventional structures and annotations for data in HDF5. This approach depends on data files (e.g., products) that carefully follow these conventions. In some cases, designing and writing those files can be challenging, or the user creating the product may be driven by local needs that lead to deviations from the conventions. Unfortunately, even small deviations can cause problems for downstream tools and future users.

What is an HDF5 product in the context of this software? An HDF5 product is the content that should exist in a single HDF5 file. This content is defined by the HDF5 objects (groups, attributes, datasets), their names, the hierarchies they create (links and references), and attribute values. Dataset values are typically not part of this content (unless they qualify as metadata), so this software cannot be used as a data server. Once completed, an HDF5 product is replicated in many files (commonly tens of thousands or more), each filled with real data.
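
The definition above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not HPD's internal representation or its HDF5/JSON export: the class names `GroupDesign` and `DatasetDesign`, the helpers `object_paths` and `instantiate`, and the example object names are all hypothetical. The design records objects, names, hierarchy, and attribute values, while dataset values are supplied separately for each file.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetDesign:
    """Describes a dataset's type and shape; holds no data values."""
    dtype: str
    shape: tuple
    attrs: dict = field(default_factory=dict)

@dataclass
class GroupDesign:
    """A group: attributes plus named links to member objects."""
    attrs: dict = field(default_factory=dict)
    members: dict = field(default_factory=dict)  # link name -> design object

def object_paths(group, prefix=""):
    """Yield (path, object) for every object reachable from a group."""
    for name, obj in group.members.items():
        path = f"{prefix}/{name}"
        yield path, obj
        if isinstance(obj, GroupDesign):
            yield from object_paths(obj, path)

def instantiate(design, values):
    """Pair each designed dataset path with the values for one file.

    Raises ValueError unless the values cover exactly the designed datasets.
    """
    dataset_paths = {
        path for path, obj in object_paths(design)
        if isinstance(obj, DatasetDesign)
    }
    if set(values) != dataset_paths:
        raise ValueError(f"values must cover exactly {sorted(dataset_paths)}")
    return {path: values[path] for path in sorted(dataset_paths)}

# Hypothetical product: one group, two datasets, attribute values only.
design = GroupDesign(
    attrs={"title": "Example product"},
    members={
        "geolocation": GroupDesign(members={
            "latitude": DatasetDesign("float32", (180,), {"units": "degrees_north"}),
        }),
        "temperature": DatasetDesign("float32", (180, 360), {"units": "K"}),
    },
)

# The same design is reused for every file; only the values differ.
# The strings are placeholders standing in for real arrays.
for day in ("day1", "day2"):
    filled = instantiate(design, {
        "/geolocation/latitude": f"<{day} latitudes>",
        "/temperature": f"<{day} temperatures>",
    })
    print(day, filled)
```

Keeping values out of the design is what allows one product definition to be stamped into many files without change.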