From HDF5 Datasets to Apache Spark RDDs

… HDF5 and Spark: Balancing the workload among tasks is a concern in any parallel environment. However, that does not mean that all datasets have to be the same size. HDF5 can help with partial I/O: instead of reading entire datasets, one could read just hyperslabs or other selections. Sampling is…
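
As a rough illustration of the partial I/O idea, the sketch below splits the rows of one large dataset into contiguous ranges of nearly equal size, one per task. The function name `hyperslab_ranges` and the row and task counts are hypothetical, chosen only for this example; they do not come from the post.

```python
def hyperslab_ranges(n_rows, n_tasks):
    """Split n_rows into n_tasks contiguous (start, stop) row ranges.

    Range sizes differ by at most one row, so no task gets noticeably
    more work than another. The name and signature are illustrative.
    """
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    base, extra = divmod(n_rows, n_tasks)
    ranges = []
    start = 0
    for i in range(n_tasks):
        # The first `extra` tasks each take one additional row.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Example: 10 rows shared among 3 tasks.
print(hyperslab_ranges(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

Each `(start, stop)` pair could then be handed to a separate task that opens the file and reads only its slice, for instance with h5py's `dataset[start:stop]` slicing, assuming h5py is the reader in use.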

Welcome to our blog

Welcome, again, to the new HDF Blog. Let this be the beginning of a lively and informative dialogue.

The HDF Group – who we are

The HDF Group’s mission is to provide high-quality software for managing large, complex data, to provide outstanding services for users of these technologies, and to ensure effective management of data throughout the data life cycle.
