Hadoop >> mail # user >> HDF5 and Hadoop


Re: HDF5 and Hadoop
Hi Andrew,

There has been some recent work in the Tika [1] project on NetCDF4 [2] and HDF4/5 [3], extracting metadata and text content from those files. Though this doesn't directly answer your question below, it might be worth looking at how to marry Tika and Hadoop in that regard.

HTH!

Cheers,
Chris

[1] http://lucene.apache.org/tika/
[2] http://issues.apache.org/jira/browse/TIKA-400
[3] https://issues.apache.org/jira/browse/TIKA-399

On 5/3/10 10:36 AM, "Andrew Nguyen" <[EMAIL PROTECTED]> wrote:

Does anyone know of any existing work integrating HDF5 (http://www.hdfgroup.org/HDF5/whatishdf5.html) with Hadoop?

I don't know much about HDF5, but it was recently brought to my attention as a way to store high-density scientific data. Since I've confirmed that Hadoop dramatically speeds up our analysis, it seems like marrying the two might have some benefits.

I've done some searches on Google, and they don't turn up much.

Thanks!

--Andrew

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [EMAIL PROTECTED]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
