HDFS >> mail # user >> Mapper outputs an empty file


Re: Mapper outputs an empty file
The lack of conditional logic suggests that an empty file should never
occur for a _successful_ parse.

So the question boils down to whether the parse succeeds. What exactly is
your RecordReader/InputFormat here? TextInputFormat reads its input line
by line and is not suited to parsing whole XML documents. Your code relies
on exactly that: it assumes a single KV pair passed to the mapper contains
the whole document to run the parser on.
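
To illustrate the point about line-by-line input, here is a minimal
standalone sketch (no Hadoop involved). It assumes the mapper receives one
line of the sample file per call, and it uses the JDK's built-in DOM parser
as a stand-in for JDOM's SAXBuilder. The class and method names
(LineVsDocumentDemo, parseChunk, perLine) are made up for this example, and
the empty catch block is only a guess at what the elided catch statements
might do.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: compares parsing the sample XML one line at a
// time with parsing it as a single document.
class LineVsDocumentDemo {

    static final String SAMPLE = String.join("\n",
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
        "<company>",
        "<staff>",
        "<firstname>yong</firstname>",
        "<lastname>mook kim</lastname>",
        "<nickname>mkyong</nickname>",
        "<salary>100000</salary>",
        "</staff>",
        "<staff>",
        "<firstname>low121</firstname>",
        "<lastname>yin fong1</lastname>",
        "<nickname>fong fong1</nickname>",
        "<salary>2000001</salary>",
        "</staff>",
        "</company>");

    // Mirrors the quoted map() body: parse the chunk, then build one CSV
    // record per <staff> element. A failed parse yields no records.
    static List<String> parseChunk(String chunk) {
        List<String> records = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            Element root = builder
                .parse(new InputSource(new StringReader(chunk)))
                .getDocumentElement();
            NodeList staff = root.getElementsByTagName("staff");
            for (int i = 0; i < staff.getLength(); i++) {
                Element node = (Element) staff.item(i);
                records.add(text(node, "firstname") + "," + text(node, "lastname")
                    + "," + text(node, "nickname") + "," + text(node, "salary"));
            }
        } catch (Exception e) {
            // Assumption: the exception is swallowed, so nothing is written.
        }
        return records;
    }

    // Text of the first descendant with the given tag, or null if absent.
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    // Simulates one map() call per input line.
    static List<String> perLine(String xml) {
        List<String> out = new ArrayList<>();
        for (String line : xml.split("\n")) {
            out.addAll(parseChunk(line));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("records, one line per call: " + perLine(SAMPLE).size());
        System.out.println("records, whole document:    " + parseChunk(SAMPLE).size());
    }
}
```

Fed one line at a time, every chunk either fails to parse (for example a
lone <company> or </staff>) or parses as a tiny document with no <staff>
children (for example <firstname>yong</firstname>), so no records come out.
Fed the whole document in one call, the same logic yields the two expected
records.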

If your catch blocks log the exceptions they catch, I suggest looking at
the Mapper's task logs to see the actual error.

On Fri, Nov 30, 2012 at 5:51 PM, dyuti a <[EMAIL PROTECTED]> wrote:
> Hi All,
> I am trying XML processing in Hadoop and used the code below inside the map
> method. It produces an empty output file (no reducer class is used). Is there
> anything wrong?
>
> //code used inside map method
> public void map(LongWritable key, Text value1, Context context)
>         throws IOException, InterruptedException {
>     String xmlString = value1.toString();
>     SAXBuilder builder = new SAXBuilder();
>     Reader in = new StringReader(xmlString);
>     String value = "";
>     try {
>         Document doc = builder.build(in);
>         Element rootNode = doc.getRootElement();
>         List<Element> list = rootNode.getChildren("staff");
>         for (int i = 0; i < list.size(); i++) {
>             Element node = (Element) list.get(i);
>             String tag1 = node.getChildText("firstname");
>             String tag2 = node.getChildText("lastname");
>             String tag3 = node.getChildText("nickname");
>             String tag4 = node.getChildText("salary");
>
>             value = tag1 + "," + tag2 + "," + tag3 + "," + tag4;
>             context.write(NullWritable.get(), new Text(value));
>         }
>     } followed by catch statements....................
>
> //xml input file
> <?xml version="1.0" encoding="UTF-8"?>
> <company>
>   <staff>
>     <firstname>yong</firstname>
>     <lastname>mook kim</lastname>
>     <nickname>mkyong</nickname>
>     <salary>100000</salary>
>   </staff>
>   <staff>
>     <firstname>low121</firstname>
>     <lastname>yin fong1</lastname>
>     <nickname>fong fong1</nickname>
>     <salary>2000001</salary>
>   </staff>
> </company>
>
> Thanks for your help!
>
> Regards,
> dti
>

--
Harsh J