Re: HDF5 and Hadoop
Hi Andrew,

There has been some recent work in the Tika [1] project on NetCDF4 [2] and HDF4/5 [3] files, extracting metadata and text content from them. Though this doesn't directly answer your question below, it might be worth looking at how to marry Tika and Hadoop in that regard.



[1] http://lucene.apache.org/tika/
[2] http://issues.apache.org/jira/browse/TIKA-400
[3] https://issues.apache.org/jira/browse/TIKA-399
On 5/3/10 10:36 AM, "Andrew Nguyen" <[EMAIL PROTECTED]> wrote:

Does anyone know of any existing work integrating HDF5 (http://www.hdfgroup.org/HDF5/whatishdf5.html) with Hadoop?

I don't know much about HDF5, but it was recently brought to my attention as a way to store high-density scientific data. Since I've confirmed that using Hadoop dramatically speeds up our analysis, it seems like marrying the two might have some benefits.

I've done some searches on Google, but they don't turn up much.



Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
WWW:   http://sunset.usc.edu/~mattmann/
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA