HDFS >> mail # user >> What's the best way to compress a folder in hadoop?


Félix López 2012-06-29, 07:37
madhu phatak 2012-06-29, 09:18
Félix López 2012-06-29, 11:31
RE: What's the best way to compress a folder in hadoop?
Félix, I think you are looking for Hadoop archive (HAR) files: http://hadoop.apache.org/mapreduce/docs/r0.21.0/hadoop_archives.html
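
For reference, here is a minimal sketch of how the `hadoop archive` command line described on that page could be assembled. The helper name `build_har_command` and the example paths are hypothetical; only the `-archiveName` and `-p` options and the source/destination ordering come from the Hadoop Archives guide. Note that, as far as I know, a HAR packs many files into one archive but does not compress them, so it may only partly answer the question.

```python
import shlex


def build_har_command(archive_name, parent, sources, dest):
    """Return the argv list for a `hadoop archive` invocation.

    archive_name: name of the archive; the guide expects a .har extension
    parent:       parent path that the source paths are relative to
    sources:      one or more source paths, relative to `parent`
    dest:         destination directory that will hold the archive
    """
    if not archive_name.endswith(".har"):
        raise ValueError("archive name must end in .har")
    if not sources:
        raise ValueError("at least one source path is required")
    return ["hadoop", "archive", "-archiveName", archive_name,
            "-p", parent, *sources, dest]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_har_command("texts.har", "/user/felix", ["texts"],
                            "/user/felix/archives")
    # Print the command instead of running it; pass `cmd` to
    # subprocess.run on a machine where the hadoop CLI is installed.
    print(shlex.join(cmd))
```

According to the same guide, archive creation runs as a MapReduce job, so the work is distributed across the cluster rather than done on a single machine.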

From: Félix López [mailto:[EMAIL PROTECTED]]
Sent: Friday, June 29, 2012 13:31
To: [EMAIL PROTECTED]
Subject: Re: What's the best way to compress a folder in hadoop?

Thanks, I've read it, but that discussion is about compressing files. I'm talking about compressing folders that contain subfolders and files.
In fact, I have a MapReduce task for compressing a folder:
https://github.com/flopezluis/testing-hadoop
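
To make the difference between compressing a file and compressing a folder concrete, here is a minimal sketch that zips a directory tree, subfolders included. It is an assumption-laden illustration, not the code from the repository above: it works on the local filesystem rather than HDFS, runs on a single machine, and the function name `zip_tree` is hypothetical.

```python
import os
import zipfile


def zip_tree(src_dir, zip_path):
    """Zip every file under src_dir, keeping paths relative to src_dir.

    Returns the number of files written. zip_path should live outside
    src_dir, otherwise the archive would try to include itself.
    """
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # os.walk descends into subfolders, which is what separates
        # "compress a folder" from "compress a file".
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                zf.write(full, os.path.relpath(full, src_dir))
                count += 1
    return count
```

A distributed version would have to split this walk across tasks, which is the part a MapReduce job takes care of.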
2012/6/29 madhu phatak <[EMAIL PROTECTED]>
Hi,
Please refer to the discussion here: http://stackoverflow.com/questions/7153087/hadoop-compress-file-in-hdfs

On Fri, Jun 29, 2012 at 1:07 PM, Félix López <[EMAIL PROTECTED]> wrote:
The folder contains files with text and other folders with text files. The text is not key/value, it's just text. Something like this:
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dumm...

I'm thinking about creating a new HDFS command to zip files in Hadoop, but I'm not sure whether Hadoop would distribute the execution. If it doesn't, the command could take a long time and consume a lot of CPU.

Any ideas?

Thanks

--
https://github.com/zinnia-phatak-dev/Nectar

--
http://www.linkedin.com/in/flopezluis

It's easier to ask forgiveness than it is to get permission
".....it doesn't matter how many times you fail. It doesn't matter how many times you almost get it right. No one is going to know or care about your failures, and neither should you. All you have to do is learn from them and those around you because...All that matters in business is that you get it right once. Then everyone can tell you how lucky you are."
--Mark Cuban

"Always be the worst guy in every band you're in. If you're the best guy there, you need to be in a different band. And I think that works for almost everything that's out there as well." Pat Metheny
