Re: HDFS File Appending
HDFS does not support appending, I think. I'm not sure about Pig, but if you
are using Hadoop directly you can zip the files and use the zip as the input
to the jobs.
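
A rough sketch of the zipping idea, using only java.util.zip from the JDK. The class and method names (SmallFileZipper, zip, entryNames) are made up for this illustration and are not part of Hadoop or Pig, and whether a job can read the resulting archive directly depends on the input format you use:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not a Hadoop or Pig API.
public class SmallFileZipper {

    // Packs each (entry name -> content) pair into one in-memory zip archive.
    public static byte[] zip(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(file.getKey()));
                zos.write(file.getValue());
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        // The archive is complete once the ZipOutputStream has been closed.
        return buffer.toByteArray();
    }

    // Lists entry names in archive order, to check what was packed.
    public static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zis.getNextEntry(); entry != null; entry = zis.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }
}
```

In a real pipeline the archive would be written to a file and pushed to HDFS as one object instead of being kept in memory.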

On Fri, Jun 17, 2011 at 6:56 AM, Xiaobo Gu <[EMAIL PROTECTED]> wrote:

> Please refer to FileUtil.copyMerge.
>
> On Fri, Jun 17, 2011 at 8:33 AM, jagaran das <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > We have a requirement where
> >
> > There would be a huge number of small files to be pushed to HDFS, and then
> > we use Pig to do the analysis.
> > To get around the classic "small files" issue, we merge the files and push
> > a bigger file into HDFS.
> > But we are losing time in this merging step of our pipeline.
> >
> > But if we could append directly to an existing file in HDFS, we could save
> > this "merging files" time.
> >
> > Can you please suggest whether there is a newer stable version of Hadoop
> > that we can use for appending?
> >
> > Thanks and Regards,
> > Jagaran
>
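
For reference, the merge step described in the original question can be sketched as a plain stream concatenation. This is only a minimal illustration using the JDK (Java 9+ for InputStream.transferTo); SmallFileMerger and merge are hypothetical names, not Hadoop APIs. The sketch writes to any OutputStream, so in principle the target could be a stream opened on HDFS instead of a local file, but that is an assumption it does not exercise:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical helper for illustration only; not a Hadoop API.
public class SmallFileMerger {

    // Copies every part, in list order, into a single output stream
    // and returns the total number of bytes written.
    public static long merge(List<? extends InputStream> parts, OutputStream out) {
        long total = 0;
        try {
            for (InputStream part : parts) {
                // Close each small input as soon as it has been copied.
                try (InputStream in = part) {
                    total += in.transferTo(out);
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return total;
    }
}
```

The output stream is left open so the caller decides when the merged file is complete.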