Hadoop, mail # user - How to update a file which is in HDFS


Manickam P 2013-07-04, 11:10
Mohammad Tariq 2013-07-04, 11:47
Re: How to update a file which is in HDFS
Mohammad Tariq 2013-07-05, 00:50
The current stable release doesn't support append, not even through the
API. If you really need it, you have to switch to Hadoop 2.x.
See this JIRA <https://issues.apache.org/jira/browse/HADOOP-8230>.

Warm Regards,
Tariq
cloudfront.blogspot.com
On Fri, Jul 5, 2013 at 3:05 AM, John Lilley <[EMAIL PROTECTED]> wrote:

> Manickam,
>
> HDFS supports append; it is the command-line client that does not.
>
> You can write a Java application that opens an HDFS-based file for append,
> and use that instead of the hadoop command line.
>
> However, this doesn’t completely answer your original question: “How do I
> move only the delta part”?  This can be more complex than simply doing an
> append.  Have records in the original file changed in addition to new
> records becoming available?  If that is the case, you will need to
> completely rewrite the file, as there is no overwriting of existing file
> sections, even directly using HDFS.  There are clever strategies for
> working around this, like splitting the file into multiple parts on HDFS so
> that the overwrite can proceed in parallel on the cluster; however, that
> may be more work than you are looking for.  Even if the delta is limited to
> new records, the problem may not be trivial.  How do you know which records
> are new?  Are all of the new records at the end of the file?  Or can they be
> anywhere in the file?  If the latter, you will need more complex logic.
>
> John
>
> From: Mohammad Tariq [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, July 04, 2013 5:47 AM
> To: [EMAIL PROTECTED]
> Subject: Re: How to update a file which is in HDFS
>
> Hello Manickam,
>
>         Append is currently not possible.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
> On Thu, Jul 4, 2013 at 4:40 PM, Manickam P <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I have moved my input file into the HDFS location in the cluster setup.
> Now I have a new file which has some new records along with the old
> ones.
> I want to move only the delta part into HDFS, because it will take more
> time to move the whole file from my local machine to the HDFS location.
> Is this possible, or do I need to move the entire file into HDFS again?
>
> Thanks,
> Manickam P
>
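
For the simplest case John describes (every new record sits at the end of
the file, and the copy already uploaded is an exact byte-for-byte prefix of
the new local file), extracting the delta could look roughly like the sketch
below. The class name DeltaCopy, the method copyDelta, and the sample record
data are hypothetical, and the code uses only the Java standard library. It
does not talk to HDFS: on Hadoop 2.x the output stream would presumably be
the one returned by FileSystem.append(Path), and the uploaded length could
be read from FileStatus.getLen(), but neither call is exercised here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: copies only the tail of a local file that has not
// been uploaded yet. Assumes the uploaded copy is an exact prefix of newFile.
class DeltaCopy {

    /**
     * Writes the bytes of newFile located after the first uploadedLength
     * bytes to out, and returns how many bytes were written.
     */
    static long copyDelta(Path newFile, long uploadedLength, OutputStream out)
            throws IOException {
        long size = Files.size(newFile);
        if (uploadedLength < 0 || uploadedLength > size) {
            throw new IllegalArgumentException(
                "uploaded length " + uploadedLength
                + " is outside the new file (" + size + " bytes)");
        }
        long copied = 0;
        try (FileChannel channel = FileChannel.open(newFile, StandardOpenOption.READ)) {
            // Jump past the part that is assumed to be on the cluster already.
            channel.position(uploadedLength);
            InputStream in = Channels.newInputStream(channel);
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                copied += n;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: two old records followed by one new record.
        Path newFile = Files.createTempFile("records", ".txt");
        Files.write(newFile, "old1\nold2\nnew3\n".getBytes(StandardCharsets.UTF_8));

        // In real use this stream would be an append stream on the HDFS file.
        ByteArrayOutputStream delta = new ByteArrayOutputStream();
        long copied = copyDelta(newFile, 10, delta); // first 10 bytes already uploaded
        System.out.println("copied " + copied + " delta bytes");
        Files.delete(newFile);
    }
}
```

If old records can change, or new records can appear anywhere in the file,
this prefix assumption no longer holds and the file has to be rewritten, as
noted above.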
Harsh J 2013-07-06, 01:59