Re: file manipulation
MapReduce (and hence Pig) does not support file append.  This is because MapReduce tasks may be run more than once, either after a failure or because of speculative execution, and each extra run would append the same data again.  Also, if the job failed, it would have no way to remove the data it had already appended.
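
To make the duplicate-append problem concrete, here is a rough sketch in plain Python (not Pig or Hadoop code). It simulates one task running twice, first appending straight to a shared file and then writing a private per-attempt file that is renamed into place. The function names, file names, and the `part-00000` naming are made up for illustration, and the rename step is only a loose simplification of how job output gets committed; the real machinery is more involved.

```python
import os
import tempfile


def append_task(path, records):
    """Unsafe: every attempt appends directly to the shared file."""
    with open(path, "a") as f:
        for r in records:
            f.write(r + "\n")


def committed_task(out_dir, task_id, attempt, records):
    """Safer: each attempt writes a private file, then renames it into place."""
    tmp = os.path.join(out_dir, "_tmp_%s_%d" % (task_id, attempt))
    with open(tmp, "w") as f:
        for r in records:
            f.write(r + "\n")
    # os.replace overwrites any earlier result for this task,
    # so a second attempt still leaves exactly one copy.
    os.replace(tmp, os.path.join(out_dir, "part-" + task_id))


def read_lines(path):
    with open(path) as f:
        return f.read().splitlines()


def demo():
    records = ["a", "b"]
    with tempfile.TemporaryDirectory() as d:
        # Case 1: append to an existing file; the task runs twice.
        shared = os.path.join(d, "existing.txt")
        with open(shared, "w") as f:
            f.write("old\n")
        for _ in range(2):  # simulated retry / speculative duplicate
            append_task(shared, records)
        appended = read_lines(shared)

        # Case 2: write to a fresh output directory; the task runs twice.
        out_dir = os.path.join(d, "out")
        os.mkdir(out_dir)
        for attempt in range(2):
            committed_task(out_dir, "00000", attempt, records)
        committed = read_lines(os.path.join(out_dir, "part-00000"))
        leftover = sorted(os.listdir(out_dir))
    return appended, committed, leftover
```

Calling `demo()` shows the appended file holding the new records twice, while the fresh output directory ends up with a single copy. That is roughly why jobs write to a new output location rather than appending to an existing file.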

As for updating your data, what kind of updates do you want to make?  Stores like HBase (which can be accessed from Pig) support updates, but whether one is a good fit depends on your use case.


On Jun 1, 2012, at 11:54 AM, Michael G. wrote:

> Hi all,
> I'm new to Pig and Hadoop.
> Can you tell me how I can:
> 1. append to an existing file on HDFS with Pig
> 2. update a file with Pig, if that is possible
> Thanks.
> --
> -- Michael G. --