HDFql – the new HDF tool that speaks SQL

Rick, HDFql team, HDF guest blogger

HDFql (Hierarchical Data Format query language) was recently released to enable users to handle HDF5 files with a language as easy and powerful as SQL. 

By providing a simpler, cleaner, and faster interface for HDF across C/C++/Java/Python/C#, HDFql aims to ease scientific computing, big data management, and real-time analytics. As the author of HDFql, Rick is collaborating with The HDF Group by integrating HDFql with tools such as HDF Compass, while continuously improving HDFql to meet user needs.

Introducing HDFql

If you’re handling HDF files on a regular basis, chances are you’ve had your (un)fair share of programming headaches. Sure, you might have gotten used to the hassle, but navigating the current APIs probably feels a tad like filing expense reports: rarely a complete pleasure!

If you’re new to HDF, you might seek to avoid the format altogether. Even trained users have been known to occasionally scout for alternatives. One doesn’t have to have a limited tolerance for unnecessary complexity to get queasy around these APIs – one simply needs a penchant for clean and simple data management.

This is what we heard from scientists and data veterans when asked about HDF. It’s what challenged our own synapses and inspired us to create HDFql. Because on the flip-side, we also heard something else:

  • HDF has proven immensely valuable in research and science
  • the data format pushes the boundaries on what is achievable with large and complex datasets
  • and it provides an edge on speed and fast access which is critical in the big data / advanced analytics arena

With an aspiration of becoming the de facto language for HDF, we hope that HDFql will play a vital role in the future of HDF data management by:

  • Enabling current users to arrive at (scientific) insights faster via cleaner data handling experiences
  • Inspiring prospective users to adopt the powerful data format HDF by removing current roadblocks
  • Perhaps even grabbing a few HDF challengers or dissenters along the way…

The HDF Group welcomes new CEO Dave Pearah

Pearah joins The HDF Group as new Chief Executive Officer

Champaign, IL —  The HDF Group today announced that its Board of Directors has appointed David Pearah as its new Chief Executive Officer. The HDF Group is a software company dedicated to creating high performance computing technology to address many of today’s Big Data challenges.

Pearah replaces Mike Folk upon his retirement after ten years as company President and Board Chair. Folk will remain a member of the Board of Directors, and Pearah will become the company’s Chairman of the Board of Directors.

Pearah said, “I am honored to have been selected as The HDF Group’s next CEO. It is a privilege to be part of an organization with a nearly 30-year history of delivering innovative technology to meet the Big Data demands of commercial industry, scientific research and governmental clients.”

Industry leaders in fields from aerospace and biomedicine to finance join the company’s client list.  In addition, government entities such as the Department of Energy and NASA, numerous research facilities, and scientists in disciplines from climate study to astrophysics depend on HDF technologies.

Pearah continued, “We are an organization led by a mission to make a positive impact on everyone we engage, whether they are individuals using our open-source software, or organizations who rely on our talented team of scientists and engineers as trusted partners. I will do my best to serve the HDF community by enabling our team to fulfill their passion to make a difference.  We’ve just delivered a major release of HDF5 with many additional powerful features, and we’re very excited about several innovative new products that we’ll soon be making available to our user community.”

“Dave is clearly the leader for HDF’s future, and …

Easy access to the NASA HDF products via OPeNDAP’s Hyrax

MuQun (Kent) Yang, The HDF Group

Many NASA HDF and HDF5 data products can be visualized via the Hyrax OPeNDAP server through Hyrax’s HDF4 and HDF5 handlers.  Now we’ve enhanced the HDF5 OPeNDAP handler so that SMAP level 1, level 3 and level 4 products can be displayed properly using popular visualization tools.

Organizations in both the public and private sectors use HDF to meet long term, mission-critical data management needs. For example, NASA’s Earth Observing System, the primary data repository for understanding global climate change, uses HDF.  Over the lifetime of the project, which began in 1999, NASA has stored 15 petabytes of satellite data in HDF which will be accessible by NASA data centers and NASA HDF end users for many years to come.

In a previous blog, we discussed the concept of using the Hyrax OPeNDAP web server to serve NASA HDF4 and HDF5 products. Each year, The HDF Group has enhanced the HDF4 and HDF5 handlers that work within the Hyrax OPeNDAP framework to support all sorts of NASA HDF data products, making them interoperable with popular Earth Science tools such as NASA’s Panoply and UCAR’s IDV. The Hyrax HDF4 and HDF5 handlers make data products display properly using popular visualization tools.

ESIP Summer Meeting – HDF Workshop and Town Hall

Lindsay Powers, The HDF Group

Please join us to learn about new HDF tools, projects and perspectives.

The HDF Group will be hosting a one-day workshop at the upcoming Federation of Earth Science Information Partners (ESIP) Summer Meeting in Asilomar, CA on Tuesday, July 14th.

There will also be an HDF Town Hall meeting on Wednesday afternoon, July 15th.

Please join us for any and all of the events.  If you are unable to join us in person, you may participate through remote access. Remote access details will be made available through the ESIP meeting website. Questions? Contact Lindsay at lpowers@hdfgroup.org.

The agenda for the July 14 HDF Group workshop:  Continue reading

Letter to the HDF User Community

Lindsay Powers – The HDF Group

The HDF Group provides free, open-source software that is widely used in government, academia and industry. The goal of The HDF Group is to ensure the sustainable development of HDF (Hierarchical Data Format) technologies and the ongoing accessibility of HDF-stored data because users and organizations have mission-critical systems and archives relying on these technologies. These users and organizations are a critical element of the HDF community and an important source of new and innovative uses of, and sustainability for, the HDF platforms, libraries and tools.

We want to create a sustainability model for the open access platforms and libraries that can serve these diverse communities in the future use and preservation of their data. As a step towards engaging this community, we are seeking partners for a National Science Foundation Research Coordination Network (RCN).

The National Science Foundation supports RCNs in order to foster collaboration and communication among scientists and technologists in the areas of research coordination, education and training, collaborative technologies, and standards development. Our vision of this RCN is to develop a core community of experienced and dedicated HDF users to:

  1. Foster education and training of new and existing users through development of teaching modules, workshops and other mechanisms for sharing knowledge and experience,
  2. Provide a forum for sharing tools and techniques related to HDF technologies,
  3. Convene diverse users to foster interdisciplinary collaboration, and
  4. Formalize a community of committed HDF users invested in the sustainability of HDF products.

Worried about your unlimited data plan bills? Cut them with OPeNDAP

Joe Lee, The HDF Group

Sprint has recently hit the airwaves with a promotion claiming that they will cut your data bill in half.  But there’s no free lunch in this connected world we live in. Unlimited data plans always come with a steep price tag.

While the internet has been around a while, there has recently been an explosion of data – email, the World Wide Web, social media, cloud computing, mobile apps for everything, and Big Data. At the same time, the overall global population of people using the internet has skyrocketed, as has the “Internet of Things.” Getting around can be a challenge.

The overcrowded and congested internet will continue to throw more data at us. Consequently, getting the right amount of the right data can also be a great challenge. When data is delivered over the internet, getting the right amount also helps ensure that your delivery time is dramatically shortened and your delivery costs are minimized.