Python Tag

Anthony Scopatz, Assistant Professor at the University of South Carolina, HDF guest blogger "Python is great and its ecosystem for scientific computing is world class. HDF5 is amazing and is rightly the gold standard for persistence for scientific data. Many people use HDF5 from Python, and this number is only growing due to pandas’ HDFStore. However, using HDF5 from Python has at least one more knot than it needs to.  Let’s change that." Almost immediately when going to use HDF5 from Python you are faced with a choice between two fantastic packages with overlapping capabilities: h5py and PyTables.  h5py wraps the HDF5 API more closely using autogenerated Cython.  PyTables, while also wrapping HDF5, focuses more on a Table data structure and adds...

David Dotson, doctoral student, Center for Biological Physics, Arizona State University; HDF Guest Blogger

Recently I had the pleasure of meeting Anthony Scopatz for the first time at SciPy 2015, and we talked shop. I was interested in his opinions on MDSynthesis, a Python package our lab has designed to help manage the complexity of raw and derived data sets from molecular dynamics simulations, about which I was

Mohamad Chaarawi, The HDF Group

Second in a series: Parallel HDF5

NERSC’s Cray Sonexion system provides data storage for its Mendel scientific computing cluster.

In my previous blog post, I discussed the need for parallel I/O and a few paradigms for doing parallel I/O from applications. HDF5 is an I/O middleware library that supports (or will support in the near future) most of the I/O paradigms we talked about.

In this blog post I will discuss how to use HDF5 to implement some of the parallel I/O methods and some of the ongoing research to support new I/O paradigms. I will not discuss pros and cons of each method since we discussed those in the previous blog post.

But before getting on with how HDF5 supports parallel I/O, let’s address a question that comes up often, which is,

“Why do I need Parallel HDF5 when the MPI standard already provides an interface for doing I/O?”

John Readey, The HDF Group   We’ve recently announced a new viewer application for HDF5 files: HDF Compass. In this blog post we’ll explore the motivations for providing this tool, review its features, and speculate a bit about future direction for Compass. HDF Compass is a desktop viewer application for HDF5 and other file formats. A free and open source software product, it runs on Mac OS X, Windows, and Linux.   Compass was initially developed by Andrew Collette, a Research Scientist with IMPACT (Institute for Modeling Plasma, Atmospheres and Cosmic Dust).  He has decided to work with The HDF Group to further the development of Compass. Andrew has written a very interesting blog that goes into some of the background of Compass and...

* With Python’s Help

Gerd Heber, The HDF Group

Before the recent release of our PyHexad Excel add-in for HDF5[1], the title might have sounded like the slogan of a global coffee and baked goods chain. That was then. Today, it is an expression of hope for the spreadsheet users who run this country and who either felt neglected by the HDF5 community or who might suffer from a medical condition known as data-bulging workbook stress disorder. In this article, I would like to give you a quick overview of the novel PyHexad therapy and invite you to get involved (after consulting with your doctor).

To access the data in HDF5 files from Excel is a frontrunner among the all-time TOP 10 most frequently asked for features. A spreadsheet tool might be a convenient window into, and user interface for, certain data stored in HDF5 files. Such a tool could help overcome Excel storage and performance limitations, and allow data to be freely “shuttled” between worksheets and HDF5 data containers. PyHexad ([4],[5],[6],[7]) is an attempt to further explore this concept.  

HDF Group has just announced “HDF Server” - a freely available service that enables remote access to HDF5 content using a RESTful API. In our scenario, using HDF Server, we upload our Monopoly simulation results to the server and then interested parties can make requests for any desired content to the server - no file size issues, no downloading entire files...