NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Pig >> mail # user >> newbie question about a basic script


newbie question about a basic script
Hi to all,

I am a newbie and am just testing small scripts for training.

My question is about the result of the script below, run in local mode:

grunt> cat nested.txt
{(8,9),(0,1)},{(8,9),(1,1)}
{(2,3),(4,5)},{(2,3),(4,5)}
{(6,7),(3,7)},{(2,2),(3,7)}
grunt> A = LOAD 'nested.txt' AS
(B1:bag{T1:tuple(t1:int,t2:int)},B2:bag{T2:tuple(f1:int,f2:int)});
grunt> DUMP A;
({(8,9),(0,1)},)
({(2,3),(4,5)},)
({(6,7),(3,7)},)

Why is B2 not displayed?

When I run the same script with PigPen, B2 is displayed, but this
time I get only one result instead of three. You can find the
screenshot in the attachment.
When I use the grunt shell, all the messages below are printed before
the result is displayed, and it takes too much time.
Should I pass a parameter to pig -x local to avoid this, or did I make
a mistake in my installation?

Thanks in advance.
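
A note on the empty B2 column above, offered as a likely explanation rather than a confirmed diagnosis: when LOAD is given no USING clause, Pig falls back to PigStorage, whose default field delimiter is a tab. In nested.txt the two bags are separated by a comma, so each line is probably read as a single field, which fills B1 and leaves B2 null. Assuming that is the cause, one way to check is to rewrite the file with a tab between the top-level bags and load it again with the same schema. The sketch below does the rewrite in Python; the function name retab_bags and the sample lines are hypothetical and only for illustration.

```python
def retab_bags(line: str) -> str:
    """Replace commas that sit outside any {...} or (...) with a tab.

    Commas inside bags and tuples are left untouched, so only the
    separator between the top-level fields changes.
    """
    out = []
    depth = 0  # current nesting level of braces and parentheses
    for ch in line:
        if ch in "{(":
            depth += 1
        elif ch in "})":
            depth -= 1
        if ch == "," and depth == 0:
            out.append("\t")  # top-level separator becomes a tab
        else:
            out.append(ch)
    return "".join(out)


# Sample lines copied from the nested.txt shown in the question.
sample = [
    "{(8,9),(0,1)},{(8,9),(1,1)}",
    "{(2,3),(4,5)},{(2,3),(4,5)}",
    "{(6,7),(3,7)},{(2,2),(3,7)}",
]

for row in sample:
    # repr() makes the inserted tab visible as \t
    print(repr(retab_bags(row)))
```

If the converted lines are saved to a new file and loaded with the same AS clause, DUMP A should show both bags, provided the delimiter really is the problem.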

grunt> A = LOAD 'nested.txt' AS
(B1:bag{T1:tuple(t1:int,t2:int)},B2:bag{T2:tuple(f1:int,f2:int)});
grunt> DUMP A;
2011-04-29 15:37:44,954 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
script: UNKNOWN
2011-04-29 15:37:44,954 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-04-29 15:37:44,955 [main] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:44,959 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name:
A:
Store(file:/tmp/temp643030084/tmp-1663465556:org.apache.pig.impl.io.InterStorage) - scope-48 Operator Key: scope-48)
2011-04-29 15:37:44,959 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler
- File concatenation threshold: 100 optimistic? false
2011-04-29 15:37:44,960 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-04-29 15:37:44,960 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-04-29 15:37:44,961 [main] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:44,964 [main] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:44,966 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are
added to the job
2011-04-29 15:37:44,966 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-04-29 15:37:46,270 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-04-29 15:37:46,273 [main] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,275 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-04-29 15:37:46,295 [Thread-57] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,300 [Thread-57] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,308 [Thread-57] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input
paths to process : 1
2011-04-29 15:37:46,308 [Thread-57] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths to process : 1
2011-04-29 15:37:46,308 [Thread-57] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths (combined) to process : 1
2011-04-29 15:37:46,402 [Thread-66] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,407 [Thread-66] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input
paths to process : 1
2011-04-29 15:37:46,407 [Thread-66] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths to process : 1
2011-04-29 15:37:46,407 [Thread-66] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths (combined) to process : 1
2011-04-29 15:37:46,442 [Thread-66] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,446 [Thread-66] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,449 [Thread-66] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,452 [Thread-66] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,486 [Thread-66] INFO
org.apache.hadoop.mapred.TaskRunner - Task:attempt_local_0005_m_000000_0
is done. And is in the process of commiting
2011-04-29 15:37:46,486 [Thread-66] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-04-29 15:37:46,489 [Thread-66] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2011-04-29 15:37:46,489 [Thread-66] INFO
org.apache.hadoop.mapred.TaskRunner - Task attempt_local_0005_m_000000_0
is allowed to commit now
2011-04-29 15:37:46,489 [Thread-66] INFO
org.a
Richard Ding 2011-04-29, 18:26