MapReduce, mail # user - suggest Best way to upload xml files to HDFS


Re: suggest Best way to upload xml files to HDFS
Manoj Babu 2012-07-13, 05:59
Hi,

Could you kindly provide the pros and cons of the MultiFile, CombineFile,
and SequenceFile input formats?

Thanks in Advance.

Cheers!
Manoj.

On Fri, Jul 13, 2012 at 10:15 AM, Bejoy KS <[EMAIL PROTECTED]> wrote:

> Hi Manoj
>
> If you are looking for a scheduler and workflow manager to carry out
> this task, you can have a look at Oozie.
>
> If your XML files are small (smaller than the HDFS block size), then it
> is definitely better practice to combine them into larger files.
> Combining them into Sequence Files should work well.
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
> ------------------------------
> *From: * Manoj Babu <[EMAIL PROTECTED]>
> *Date: *Fri, 13 Jul 2012 08:59:51 +0530
> *To: *<[EMAIL PROTECTED]>
> *ReplyTo: * [EMAIL PROTECTED]
> *Subject: *suggest Best way to upload xml files to HDFS
>
> Hi,
>
> I need to upload large XML files daily. Right now I have a small
> program that reads all the files from a local folder and writes them to
> HDFS as a single file. Is this the right way?
> If there are any best practices or an optimized way to achieve this,
> kindly let me know.
>
> Thanks in advance!
>
> Cheers!
> Manoj.
>
>
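The advice quoted above (combine small XML files into one larger key/value file) can be illustrated with a rough sketch. This is not Hadoop's SequenceFile format or API. It is a simplified, standard-library-only stand-in that shows the same idea: one record per input file, with the file name as the key and the raw XML bytes as the value. The record layout and the names `pack_files` and `iter_records` are assumptions made up for this example.

```python
import struct
from pathlib import Path
from typing import Iterable, Iterator, Tuple

# Hypothetical record layout (NOT the Hadoop SequenceFile format):
# 4-byte big-endian key length, 4-byte big-endian value length,
# then the key bytes, then the value bytes.
_HEADER = struct.Struct(">II")


def pack_files(paths: Iterable[Path], out_path: Path) -> int:
    """Append each input file to out_path as one (file name, contents) record.

    Returns the number of records written.
    """
    count = 0
    with open(out_path, "wb") as out:
        for path in paths:
            key = path.name.encode("utf-8")   # key: the original file name
            value = path.read_bytes()         # value: the raw XML bytes
            out.write(_HEADER.pack(len(key), len(value)))
            out.write(key)
            out.write(value)
            count += 1
    return count


def iter_records(packed_path: Path) -> Iterator[Tuple[str, bytes]]:
    """Yield (file name, contents) pairs back out of a packed file."""
    with open(packed_path, "rb") as src:
        while True:
            header = src.read(_HEADER.size)
            if not header:
                return  # clean end of file
            if len(header) < _HEADER.size:
                raise ValueError("truncated record header")
            key_len, value_len = _HEADER.unpack(header)
            key = src.read(key_len).decode("utf-8")
            value = src.read(value_len)
            if len(value) < value_len:
                raise ValueError("truncated record value")
            yield key, value
```

Calling `pack_files(sorted(Path("incoming").glob("*.xml")), Path("daily.bin"))` (both paths are placeholders) would produce a single larger file that could then be uploaded in one step. In a real cluster, Hadoop's own SequenceFile writer would take the place of this hand-rolled layout.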