MapReduce >> mail # user >> Re: How to update a file which is in HDFS


Re: How to update a file which is in HDFS
I totally agree, Harsh. It was just to avoid any misinterpretation :). I
have seen quite a few discussions as well that talk about these issues.

I would strongly recommend switching away from 1.x if append is desired.

Warm Regards,
Tariq
cloudfront.blogspot.com
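For reference, the config flag Harsh mentions below is, if I remember right, `dfs.support.append` in hdfs-site.xml; on the 1.x line its own description warns against relying on it. A sketch of what that entry looks like (property name and wording recalled from memory, so treat it as an assumption and check your distribution's hdfs-default.xml):

```xml
<!-- hdfs-site.xml, Hadoop 1.x -- illustrative, verify against hdfs-default.xml -->
<property>
  <name>dfs.support.append</name>
  <value>false</value>
  <description>Enables durable sync/append support. On the 1.x branch the
  implementation has known issues and enabling it is not recommended.</description>
</property>
```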
On Sat, Jul 6, 2013 at 7:29 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> The append in 1.x is very broken. You'll run into very weird states,
> and we officially do not support it (we even call it out as broken in
> the config). I wouldn't recommend using it even if a simple test
> appears to work.
>
> On Sat, Jul 6, 2013 at 6:27 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
> > @Robin East : Thank you for keeping me updated. I was on 1.0.3 when I
> > had tried append last time, and it was not working despite the fact
> > that the API had it. I tried it with 1.1.2 and it seems to work fine.
> >
> > @Manickam : Apologies for the incorrect info. The latest stable
> > release (1.1.2) supports append. But you should consider what Harsh
> > has said.
> >
> > Warm Regards,
> > Tariq
> > cloudfront.blogspot.com
> >
> >
> > On Fri, Jul 5, 2013 at 4:24 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >>
> >> If it is 1k new records at the "end of the file", then you may extract
> >> them out and append to the existing file in HDFS. I'd recommend using
> >> HDFS from Apache Hadoop 2.x for this purpose.
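A minimal sketch of the extract-the-delta idea Harsh describes here, assuming newline-delimited records, that the old record count is known, and that new records are strictly appended at the end (the function and file names are illustrative, not from the thread):

```python
# Sketch: keep only the records past the old count, so that just the
# delta -- not the whole file -- has to be shipped to HDFS.

def extract_delta(new_file, old_record_count, delta_file):
    """Write records beyond old_record_count from new_file into delta_file."""
    with open(new_file) as src, open(delta_file, "w") as dst:
        for i, line in enumerate(src):
            if i >= old_record_count:
                dst.write(line)

# On Hadoop 2.x the resulting delta file can then be appended to the
# existing HDFS file from the shell, e.g.:
#   hdfs dfs -appendToFile delta.txt /data/records.txt
```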
> >>
> >> On Fri, Jul 5, 2013 at 4:22 PM, Manickam P <[EMAIL PROTECTED]> wrote:
> >> > Hi,
> >> >
> >> > Let me explain the question clearly. I have a file which has one
> >> > million records, and I moved it into my Hadoop cluster.
> >> > After one month I got a new file which has the same one million
> >> > records plus 1000 new records added at the end of the file.
> >> > Here I just want to move the 1000 new records alone into HDFS instead
> >> > of overwriting the entire file.
> >> >
> >> > Can I use HBase for this scenario? I don't have a clear idea about
> >> > HBase. Just asking.
> >> >
> >> >
> >> >
> >> >
> >> > Thanks,
> >> > Manickam P
> >> >
> >> >
> >> >> From: [EMAIL PROTECTED]
> >> >> Date: Fri, 5 Jul 2013 16:13:16 +0530
> >> >> Subject: Re: How to update a file which is in HDFS
> >> >> To: [EMAIL PROTECTED]
> >> >>
> >> >> The answer to the "delta" part is that HDFS does not presently
> >> >> support random writes. You cannot alter a closed file for anything
> >> >> other than appending at the end, which I doubt will help you if you
> >> >> are also receiving updates (it isn't clear from your question what
> >> >> this added data really is).
> >> >>
> >> >> HBase sounds like something that may solve your requirement though,
> >> >> depending on how much of your read/write load is random. You could
> >> >> consider it.
> >> >>
> >> >> P.s. HBase too doesn't use the append() APIs today (and doesn't need
> >> >> them either). AFAIK, only Flume makes use of it, if you allow it to.
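To make the contrast Harsh draws concrete: HBase stores each record under a row key, so an update is just another put on that key, with no file rewrite. A toy illustration of that upsert model (a plain Python dict, deliberately NOT the HBase API):

```python
# Toy model of HBase-style upserts: a record lives under a row key, and
# updating it is just another put on the same key -- unlike an HDFS file,
# which would have to be rewritten (or at best appended to) as a whole.

table = {}  # row_key -> record

def put(row_key, record):
    """Insert or overwrite the record stored under row_key, in place."""
    table[row_key] = record

put("rec-000001", {"amount": 10})
put("rec-000001", {"amount": 25})  # random-access update, cheap
```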
> >> >>
> >> >> On Thu, Jul 4, 2013 at 5:17 PM, Mohammad Tariq <[EMAIL PROTECTED]>
> >> >> wrote:
> >> >> > Hello Manickam,
> >> >> >
> >> >> > Append is currently not possible.
> >> >> >
> >> >> > Warm Regards,
> >> >> > Tariq
> >> >> > cloudfront.blogspot.com
> >> >> >
> >> >> >
> >> >> > On Thu, Jul 4, 2013 at 4:40 PM, Manickam P <[EMAIL PROTECTED]>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> I have moved my input file into the HDFS location in the cluster
> >> >> >> setup. Now I have got a new set of files which has some new
> >> >> >> records along with the old ones.
> >> >> >> I want to move the delta part alone into HDFS because it will take
> >> >> >> more time to move the whole file from my local machine to the HDFS
> >> >> >> location.
> >> >> >> Is it possible, or do I need to move the entire file into HDFS
> >> >> >> again?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Manickam P
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Harsh J
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>