Hadoop, mail # user - RecordReader and non thread safe JNI libraries


RecordReader and non thread safe JNI libraries
Saptarshi Guha 2009-03-02, 04:07
Hello,
My RecordReader subclass reads from an object X. To parse this object and
emit records, I need to use a C library through a JNI wrapper.

public boolean next(LongWritable key, BytesWritable value) throws IOException {
   if (leftover == 0) return false;
   long wi = pos + split.getStart();
   key.set(wi);
   value.readFields(X.at(wi));
   pos++; leftover--;
   return true;
}

X.at uses the JNI library to read record number wi.

My question is: who runs this code?
1) For a given job, does one instance run on each tasktracker, reading
records and feeding them to the mappers on its machine?
Or,
2) since I have mapred.tasktracker.map.tasks.maximum == 7, does each JVM
launched have its own RecordReader feeding records to the maps that
JVM is running?

If it's either (1) or (2), I guess I'm safe from threading issues.
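
If neither holds and several threads in one JVM could reach X.at at the
same time, one fallback I can think of is to serialize every call behind a
single lock. Below is a minimal sketch of that idea, not my actual code:
the nested class X is a pure-Java stand-in for the JNI wrapper, and
NATIVE_LOCK, readRecord and the overlap counter are names made up for
illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedNativeAccess {

    // Hypothetical stand-in for the JNI wrapper. The real X.at would call
    // into the C library; this stub only records whether two calls overlap.
    public static final class X {
        public static final AtomicInteger inFlight = new AtomicInteger();
        public static volatile boolean overlapped = false;

        public static byte[] at(long recordNumber) {
            if (inFlight.incrementAndGet() > 1) {
                overlapped = true; // two threads were inside at() at once
            }
            try {
                return Long.toString(recordNumber).getBytes(StandardCharsets.UTF_8);
            } finally {
                inFlight.decrementAndGet();
            }
        }
    }

    // One lock shared by every caller in this JVM (assumed name).
    private static final Object NATIVE_LOCK = new Object();

    // All reads go through here, so at most one thread is inside X.at.
    public static byte[] readRecord(long recordNumber) {
        synchronized (NATIVE_LOCK) {
            return X.at(recordNumber);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Seven threads, mirroring the seven map slots mentioned above.
        Thread[] readers = new Thread[7];
        for (int t = 0; t < readers.length; t++) {
            final long base = t * 1000L;
            readers[t] = new Thread(() -> {
                for (long i = 0; i < 1000; i++) {
                    readRecord(base + i);
                }
            });
            readers[t].start();
        }
        for (Thread r : readers) {
            r.join();
        }
        System.out.println("overlapping native calls: " + X.overlapped);
    }
}
```

A Java lock like this only protects callers inside one JVM. If each map
task runs in its own JVM, the lock is not needed, but it costs little to
keep as a safeguard.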

Please correct me if I'm totally wrong.
Regards

Saptarshi Guha