Re: file manipulation
Alan Gates 2012-06-03, 02:58
MapReduce (and hence Pig) does not support appending to files. In MapReduce, a task may be run more than once, either because it is retried after a failure or because of speculative execution, and each run would append its output again, producing duplicates. Also, if the job fails, it would not be able to remove the data it had already appended.
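To make the duplicate-append problem concrete, here is a small local simulation in plain Python. It is only a sketch and not Hadoop or Pig code: the function names, the file names, and the "two attempts of the same task" setup are all assumptions made for illustration. It contrasts appending directly to a shared file with having each attempt write its own private output and then promoting exactly one of them by rename, which is roughly the idea behind how MapReduce commits task output.

```python
import os
import tempfile


def run_attempt_append(target_path, records):
    """Unsafe variant: every attempt appends straight to the shared file."""
    with open(target_path, "a") as f:
        for record in records:
            f.write(record + "\n")


def run_attempt_isolated(work_dir, attempt_id, records):
    """Safe variant: each attempt writes a private file and returns its path."""
    path = os.path.join(work_dir, f"attempt_{attempt_id}")
    with open(path, "w") as f:
        for record in records:
            f.write(record + "\n")
    return path


def commit(attempt_path, final_path):
    """Promote one attempt's output to the final location with a rename."""
    os.replace(attempt_path, final_path)


def read_lines(path):
    with open(path) as f:
        return f.read().splitlines()


def demo():
    records = ["a", "b"]  # hypothetical output of a single task
    with tempfile.TemporaryDirectory() as work_dir:
        # The same task runs twice (a retry or a speculative duplicate).
        shared = os.path.join(work_dir, "shared.txt")
        for _ in range(2):
            run_attempt_append(shared, records)
        appended = read_lines(shared)

        # Same two attempts, but only one is committed; the other is discarded.
        attempts = [run_attempt_isolated(work_dir, i, records) for i in range(2)]
        final = os.path.join(work_dir, "part-00000")
        commit(attempts[0], final)
        os.remove(attempts[1])
        committed = read_lines(final)
    return appended, committed


if __name__ == "__main__":
    appended, committed = demo()
    print("append:", appended)
    print("commit:", committed)
```

In the append variant the records show up once per attempt, while in the commit variant they show up once no matter how many attempts ran. Discarding an uncommitted attempt is just a file delete, whereas there is no equally simple way to take back data that has already been appended to an existing file.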
As far as updating your data, what kind of updates do you want to do? Stores like HBase (which can be accessed from Pig) support updates. But whether this is a good fit depends on your use case.
On Jun 1, 2012, at 11:54 AM, Michael G. wrote:
> Hi all
> I'm new to Pig and Hadoop.
> Can you tell me how I can:
> 1. append to an existing file on HDFS with Pig
> 2. update a file with Pig, if that is possible.
> -- Michael G. --