RE: How to update a file which is in HDFS
Manickam P 2013-07-05, 10:52
Let me explain the question clearly. I have a file with one million records, which I moved into my Hadoop cluster. A month later I received a new file that contains the same one million records plus 1000 new records added at the end of the file. I want to move only those 1000 records into HDFS instead of overwriting the entire file.
Can I use HBase for this scenario? I don't have a clear idea about HBase; just asking.
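To make the "delta" concrete, here is a rough local sketch of what I mean by taking only the new records. It assumes the new file is byte-for-byte the old file with records appended at the end, and that the old file's size in bytes is known (for example from the size column of `hadoop fs -ls`). The function name `extract_delta` and its parameters are made up for illustration; nothing here talks to HDFS.

```python
import os


def extract_delta(old_size, new_path, delta_path, chunk_size=1 << 20):
    """Copy the bytes of new_path that lie beyond old_size into delta_path.

    Assumption: new_path starts with the exact contents of the old file,
    so everything after byte offset old_size is new data.
    Returns the number of bytes written to delta_path.
    """
    new_size = os.path.getsize(new_path)
    if new_size < old_size:
        # The new file cannot be "old file + appended records" in this case.
        raise ValueError("new file is smaller than the old one")

    with open(new_path, "rb") as src, open(delta_path, "wb") as dst:
        src.seek(old_size)  # skip the part that is already in HDFS
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
    return new_size - old_size
```

The resulting delta file is small, so it could then be uploaded on its own (for example with `hadoop fs -put` as a second file in the same HDFS directory) rather than re-sending the full file. This only works if the old records are truly unchanged; it does not handle updates to existing records.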
> From: [EMAIL PROTECTED]
> Date: Fri, 5 Jul 2013 16:13:16 +0530
> Subject: Re: How to update a file which is in HDFS
> To: [EMAIL PROTECTED]
> The answer to the "delta" part is more that HDFS does not presently
> support random writes. You cannot alter a closed file for anything
> other than appending at the end, which I doubt will help you if you
> are also receiving updates (it isn't clear from your question what
> this added data really is).
> HBase sounds like something that may solve your requirement though,
> depending on how much of your read/write load is random. You could
> consider it.
> P.S. HBase too doesn't use the append() APIs today (and doesn't need
> them either). AFAIK, only Flume makes use of them, if you allow it to.
> On Thu, Jul 4, 2013 at 5:17 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
> > Hello Manickam,
> > Append is currently not possible.
> > Warm Regards,
> > Tariq
> > cloudfront.blogspot.com
> > On Thu, Jul 4, 2013 at 4:40 PM, Manickam P <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >> I have moved my input file into an HDFS location on the cluster.
> >> Now I have a new file that contains some new records along with the old
> >> ones.
> >> I want to move only the delta into HDFS, because moving the whole file
> >> from my local machine to HDFS will take more time.
> >> Is that possible, or do I need to move the entire file into HDFS again?
> >> Thanks,
> >> Manickam P
> Harsh J