MapReduce >> mail # user >> Wordcount Hadoop pipes C++ Running issue


Massimo Simoniello 2014-01-08, 11:10
Re: Wordcount Hadoop pipes C++ Running issue
Hello,

Do you have a proper MR cluster configured? Does your
hadoop-1.2.1/conf/mapred-site.xml point mapred.job.tracker to a
specific hostname and port, and are a JobTracker and TaskTracker running?
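
For reference, a minimal mapred-site.xml for a single-node (pseudo-distributed) setup might look like the sketch below; localhost:9001 is only the conventional Hadoop 1.x default, so adjust the host and port to your environment:

```xml
<configuration>
  <!-- Point jobs at a real JobTracker instead of the LocalJobRunner -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```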

I believe your error occurs because the Pipes app is running under the
default LocalJobRunner execution mode rather than on a real cluster.
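
Also note that `bin/hadoop pipes` expects the `-program` executable (and the input) to be on HDFS, not the local filesystem. A rough sketch, with illustrative paths, assuming the binary was compiled locally as `wordcount`:

```shell
# Upload the compiled Pipes binary and the input directory to HDFS
# (paths are illustrative; adjust to your own HDFS home directory)
hadoop-1.2.1/bin/hadoop fs -put wordcount /user/hduser/wordcount
hadoop-1.2.1/bin/hadoop fs -put input /user/hduser/input

# Run the Pipes job against the cluster
hadoop-1.2.1/bin/hadoop pipes \
  -D hadoop.pipes.java.recordreader=true \
  -D hadoop.pipes.java.recordwriter=true \
  -input /user/hduser/input \
  -output /user/hduser/output \
  -program /user/hduser/wordcount
```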

On Wed, Jan 8, 2014 at 4:40 PM, Massimo Simoniello
<[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am trying to run the WordCount example in C++, following this link,
> which describes how to run the WordCount program with Hadoop Pipes.
>
> So I have this code in the file wordcount.cpp:
>
> #include <algorithm>
> #include <limits>
> #include <string>
> #include <vector>    // <--- needed for vector<string> in map()
>
> #include <stdint.h>  // <--- to prevent uint64_t errors!
>
> #include "Pipes.hh"
> #include "TemplateFactory.hh"
> #include "StringUtils.hh"
>
> using namespace std;
>
> class WordCountMapper : public HadoopPipes::Mapper {
> public:
>   // constructor: does nothing
>   WordCountMapper( HadoopPipes::TaskContext& context ) {
>   }
>
>   // map function: receives a line, outputs (word,"1")
>   // to reducer.
>   void map( HadoopPipes::MapContext& context ) {
>     //--- get line of text ---
>     string line = context.getInputValue();
>
>     //--- split it into words ---
>     vector< string > words = HadoopUtils::splitString( line, " " );
>
>     //--- emit each word tuple (word, "1" ) ---
>     for ( unsigned int i=0; i < words.size(); i++ ) {
>       context.emit( words[i], HadoopUtils::toString( 1 ) );
>     }
>   }
> };
>
> class WordCountReducer : public HadoopPipes::Reducer {
> public:
>   // constructor: does nothing
>   WordCountReducer(HadoopPipes::TaskContext& context) {
>   }
>
>   // reduce function
>   void reduce( HadoopPipes::ReduceContext& context ) {
>     int count = 0;
>
>     //--- get all tuples with the same key, and count their numbers ---
>     while ( context.nextValue() ) {
>       count += HadoopUtils::toInt( context.getInputValue() );
>     }
>
>     //--- emit (word, count) ---
>     context.emit(context.getInputKey(), HadoopUtils::toString( count ));
>   }
> };
>
> int main(int argc, char *argv[]) {
>   return
> HadoopPipes::runTask(HadoopPipes::TemplateFactory<WordCountMapper,WordCountReducer>()
> );
> }
>
> I have this Makefile:
>
> CC = g++
> HADOOP_INSTALL = /home/hduser/Scrivania/hadoop-1.2.1
> PLATFORM = Linux-amd64-64
> CPPFLAGS = -m64 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include/hadoop/
>
> wordcount: wordcount.cpp
>     $(CC) $(CPPFLAGS) $< -Wall -lssl -lcrypto
> -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils
> -lpthread -g -O2 -o $@
>
> The compilation works fine, but when I try to run my program as follow:
>
> $ hadoop-1.2.1/bin/hadoop pipes -D hadoop.pipes.java.recordreader=true \
> -D hadoop.pipes.java.recordwriter=true -input input -output output -program
> wordcount
>
> I have this result:
>
> INFO util.NativeCodeLoader: Loaded the native-hadoop library
> WARN mapred.JobClient: No job jar file set.  User classes may not be found.
> See JobConf(Class) or JobConf#setJar(String).
> WARN snappy.LoadSnappy: Snappy native library not loaded
> INFO mapred.FileInputFormat: Total input paths to process : 4
> INFO filecache.TrackerDistributedCacheManager: Creating filewordcount in
> /tmp/hadoop-hduser/mapred/local/archive/8648114132384070327_893673541_1470671038-work--6818354830621303575
> with rwxr-xr-x
> INFO filecache.TrackerDistributedCacheManager: Cached wordcount as
> /tmp/hadoop-hduser/mapred/local/archive/8648114132384070327_893673541_1470671038/filewordcount
> INFO filecache.TrackerDistributedCacheManager: Cached wordcount as
> /tmp/hadoop-hduser/mapred/local/archive/8648114132384070327_893673541_1470671038/filewordcount
> INFO mapred.JobClient: Running job: job_local2050700100_0001
> INFO mapred.LocalJobRunner: Waiting for map tasks
> INFO mapred.LocalJobRunner: Starting task:
> attempt_local2050700100_0001_m_000000_0
> INFO util.ProcessTree: setsid exited with exit code 0
> INFO mapred.Task:  Using ResourceCalculatorPlugin :

Harsh J