Re: suggest Best way to upload xml files to HDFS

Could you kindly list the pros and cons of MultiFileInputFormat,
CombineFileInputFormat, and SequenceFileInputFormat?

Thanks in advance.


On Fri, Jul 13, 2012 at 10:15 AM, Bejoy KS <[EMAIL PROTECTED]> wrote:

> Hi Manoj
> If you are looking for a scheduler and workflow manager to carry out
> this task, you can have a look at Oozie.
> If your XML files are small (smaller than the HDFS block size), then it is
> definitely better practice to combine them into larger files.
> Combining them into SequenceFiles should work well.
> Regards
> Bejoy KS
> Sent from handheld, please excuse typos.
> ------------------------------
> *From: * Manoj Babu <[EMAIL PROTECTED]>
> *Date: *Fri, 13 Jul 2012 08:59:51 +0530
> *Subject: *suggest Best way to upload xml files to HDFS
> Hi,
> I need to upload large XML files daily. Right now I have a small
> program that reads all the files from a local folder and writes them to
> HDFS as a single file. Is this the right way?
> If there are any best practices or a more optimized way to achieve this,
> kindly let me know.
> Thanks in advance!
> Cheers!
> Manoj.
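
To illustrate the block-size advice quoted above, here is a rough sketch in Python (not Hadoop code) that estimates how many input splits, and therefore map tasks, a job would see when small files are stored separately versus combined. It assumes the common file-based input format behaviour where splits never span files and each file yields about one split per block. The 64 MB block size and the 1000 files of 2 MB each are hypothetical figures, and `estimate_splits` is an illustrative helper, not a Hadoop API.

```python
import math


def estimate_splits(file_sizes, block_size):
    """Rough count of input splits (and so map tasks) for a set of files.

    Assumes each non-empty file yields ceil(size / block_size) splits
    and that a split never spans two files. Real input formats may
    differ slightly (for example, by merging a small trailing split).
    """
    return sum(math.ceil(size / block_size) for size in file_sizes if size > 0)


MB = 1024 * 1024
block_size = 64 * MB            # assumed; use your cluster's configured HDFS block size
small_files = [2 * MB] * 1000   # hypothetical: 1000 XML files of 2 MB each

separate = estimate_splits(small_files, block_size)
combined = estimate_splits([sum(small_files)], block_size)

print("splits if uploaded separately:", separate)
print("splits if combined into one file:", combined)
```

Under these assumptions the separate upload gives 1000 splits and the combined file gives 32. The second figure only holds if the combined file is splittable, which is one reason a SequenceFile is a reasonable container for many small XML documents.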