RecordReader and non thread safe JNI libraries
My RecordReader subclass reads from object X. To parse this object and
emit records, I need to use a C library through a JNI wrapper.

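public boolean next(LongWritable key, BytesWritable value) throws IOException {
   // No records left in this split.
   if (leftover == 0) return false;
   // Absolute record number: offset within the split plus the split's start.
   long wi = pos + split.getStart();
   value.readFields(X.at(wi));
   pos++; leftover--;
   return true;
}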

X.at uses the JNI library to read record number wi.

My question is: who is running this?
1) For a given job, is one instance of this running on each
tasktracker? reading records and feeding to the mappers on its
2) As I have mapred.tasktracker.map.tasks.maximum == 7, does each JVM
launched have one RecordReader running, feeding records to the maps that
JVM is running?

If it's either (1) or (2), I guess I'm safe from threading issues.
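
If neither holds and several readers could end up in the same JVM, the
fallback I can think of is to serialize every call into the native library
behind one JVM-wide lock. A minimal sketch of that idea, not what I am
running today: GuardedStore is a hypothetical stand-in for X, and nativeAt
is a placeholder for the real JNI call (here it only encodes the record
number so the example runs without the C library).

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for X: at() would normally call into the
// non-thread-safe C library through JNI.
final class GuardedStore {
    // Static lock: one per JVM (per class loader), matching the fact that a
    // native library is loaded once per JVM.
    private static final Object NATIVE_LOCK = new Object();

    // Demo-only bookkeeping to detect two threads inside the native section.
    private static final AtomicInteger inside = new AtomicInteger();
    private static volatile boolean overlapSeen = false;

    // Placeholder for the real JNI call (assumption): returns the record
    // number encoded as 8 big-endian bytes.
    private static byte[] nativeAt(long recordNumber) {
        if (inside.incrementAndGet() > 1) {
            overlapSeen = true;
        }
        byte[] out = ByteBuffer.allocate(8).putLong(recordNumber).array();
        inside.decrementAndGet();
        return out;
    }

    // All callers go through this method, so at most one thread at a time
    // is inside the native code.
    static byte[] at(long recordNumber) {
        synchronized (NATIVE_LOCK) {
            return nativeAt(recordNumber);
        }
    }

    static boolean overlapSeen() {
        return overlapSeen;
    }
}
```

The lock only protects threads within one JVM. Separate JVMs each load
their own copy of the library, so they would not need it.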

Please correct me if I'm totally wrong.

Saptarshi Guha