Blog

A few years ago, I was looking for a data format with low-latency block and stream support. Protocol Buffers offered streams but lacked indexed block access. Soon, I realized I was looking for a container with file-system-like properties. When I examined HDF5, I found it was very close to what I needed to store massive financial engineering datasets...

The topic of software citation has been discussed in many forums recently, and several major discovery repositories (e.g., Zenodo and DataCite) support metadata for software in addition to datasets and other resource types. HDF5 straddles the boundary between the dataset and software worlds. It is most commonly thought of and referred to as a data format but, in any case, data written in the HDF formats cannot be read without HDF software. So the answer to the question "Is it a format or is it software?" is clearly both....