Pig >> mail # user >> file manipulation


Michael G. 2012-06-01, 18:54
Jonathan Coveney 2012-06-01, 18:57
Re: file manipulation
MapReduce (and hence Pig) does not support file append. This is because MapReduce tasks may be run multiple times, either after a failure or because of speculative execution, and that would result in duplicate appends. Also, if the job fails, it would not be able to remove the data it had already appended.
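
To make the duplicate-append problem concrete, here is a minimal sketch in Python. It is a toy model, not Hadoop or Pig code: a "file" is just a list of lines, and the function names `run_with_append` and `run_with_commit` are made up for illustration. It contrasts task attempts that append straight to a shared file with attempts that each write to their own scratch output, of which exactly one is kept.

```python
# Toy model, not Hadoop code: a "file" is a list of lines.

def run_with_append(existing, task_output, attempts):
    """Each task attempt appends directly to the shared file."""
    result = list(existing)
    for _ in range(attempts):
        result.extend(task_output)  # every attempt appends again
    return result

def run_with_commit(existing, task_output, attempts):
    """Each attempt writes to its own scratch output; one is committed.

    Assumes attempts >= 1.
    """
    scratch = {}
    for attempt_id in range(attempts):
        scratch[attempt_id] = list(task_output)  # isolated per attempt
    committed = scratch[0]  # exactly one attempt's output is kept
    return list(existing) + committed

existing = ["a", "b"]
new_rows = ["c"]

# Two attempts model a retry or a speculative duplicate of the same task.
appended = run_with_append(existing, new_rows, attempts=2)
committed = run_with_commit(existing, new_rows, attempts=2)
print(appended)   # ['a', 'b', 'c', 'c']
print(committed)  # ['a', 'b', 'c']
```

With direct appends the row "c" lands twice; with per-attempt output and a single commit it lands once, no matter how many attempts ran.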

As for updating your data, what kind of updates do you want to do? Stores like HBase (which can be accessed from Pig) support updates, but whether such a store is a good fit depends on your use case.
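
As a rough illustration of why a keyed store can suit updates, the sketch below uses a plain Python dict as a stand-in for a table. This is an assumption-laden toy, not the HBase API: `apply_updates` and the row keys are hypothetical. The point it shows is that a write addressed by key replaces the current value instead of adding a second copy, so repeating the same batch leaves readers with the same result.

```python
# Toy model of a keyed store; a dict stands in for a table.
# This is not HBase client code.

def apply_updates(table, updates, attempts=1):
    """Write each (row_key, value) pair; repeat the whole batch per attempt."""
    result = dict(table)
    for _ in range(attempts):
        for row_key, value in updates:
            result[row_key] = value  # overwrite by key, never duplicate
    return result

table = {"user1": "old", "user2": "x"}
updates = [("user1", "new"), ("user3", "y")]

once = apply_updates(table, updates, attempts=1)
twice = apply_updates(table, updates, attempts=2)
print(once)           # {'user1': 'new', 'user2': 'x', 'user3': 'y'}
print(once == twice)  # True
```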

Alan.

On Jun 1, 2012, at 11:54 AM, Michael G. wrote:

> Hi all,
> I'm new to Pig and Hadoop.
> Can you tell me how I can:
> 1. append to an existing file on HDFS with Pig
> 2. update a file with Pig, if that is possible.
>
> 10x.
>
> --
> -- Michael G. --
Jagat 2012-06-03, 04:05
Michael G. 2012-06-03, 06:11