The HDF Group's Call the Doctor

Linked chunks in HSDS and a new HSLS feature - John Readey

December 12, 2023 The HDF Group
Show Notes

In this episode of "Call the Doctor," The HDF Group's John Readey explores linked data sets in HSDS (Highly Scalable Data Service). Using a Python notebook running on AWS, he walks through examples with data from the National Renewable Energy Lab, which makes substantial HDF5 and HSDS data freely accessible. John covers domain information, data set details, and how to read and analyze chunks. He then looks at the chunk layout in detail, discussing the file URI, offset, and size recorded for each chunk. Comparing HSDS with direct S3 access through the HDF5 library shows a difference in performance, which he attributes to the sequential nature of the HDF5 library's requests. John concludes by demonstrating a new hsls feature for querying specific data sets.
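
As a rough illustration of the chunk layout described above, the sketch below treats a linked chunk as a (file URI, offset, size) record and reads only that byte range. It is not HSDS or h5pyd code: the `ChunkRef` record, the helper names, and the use of a local temporary file in place of an S3 object are all assumptions made for the example. The concurrent reader is included only to contrast with one-at-a-time requests; it makes no claim about how HSDS schedules its reads.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple


class ChunkRef(NamedTuple):
    """Hypothetical record describing where one chunk's bytes live."""
    file_uri: str  # a local path here, standing in for an S3 URI
    offset: int    # byte offset of the chunk within the file
    size: int      # number of bytes the chunk occupies


def read_chunk(ref: ChunkRef) -> bytes:
    """Read exactly the byte range that a chunk reference points at."""
    with open(ref.file_uri, "rb") as f:
        f.seek(ref.offset)
        return f.read(ref.size)


def read_chunks_concurrently(refs, max_workers=4):
    """Issue the range reads in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_chunk, refs))


def make_demo_file() -> str:
    """Write a throwaway file: 16 header bytes, then three 8-byte 'chunks'."""
    payload = b"H" * 16 + b"A" * 8 + b"B" * 8 + b"C" * 8
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return path


if __name__ == "__main__":
    path = make_demo_file()
    try:
        refs = [ChunkRef(path, 16 + 8 * i, 8) for i in range(3)]
        for ref, data in zip(refs, read_chunks_concurrently(refs)):
            print(ref.offset, ref.size, data)
    finally:
        os.remove(path)
```

Against real object storage, the same idea would map each record to a ranged read of the object rather than a seek in a local file.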

You can also watch this session online.

Call the Doctor is a series of weekly, unscripted, live events! The HDF Group’s staff members will answer attendee questions and, for example, go over the previous week’s HDF Forum posts. The HDF Clinics are free sessions intended to help users tackle real-world HDF problems from a common cold to severe headaches and offer relief where that’s possible. As time permits, we will include how-tos, offer advice on tool usage, review your code samples, teach you survival in the documentation jungle, and discuss what’s new or just around the corner in the land of HDF.

Join us every Tuesday at 12:20 p.m. Central (US/Canada) on Zoom!