Mapper outputs an empty file
dyuti a 2012-11-30, 12:21
Hi All, I am trying XML processing in Hadoop and used the code below inside the map method. The job produces an empty output file (no reducer class is used). Is there anything wrong?
//code used inside map method
public void map(LongWritable key, Text value1, Context context)
        throws IOException, InterruptedException {
    String xmlString = value1.toString();
    SAXBuilder builder = new SAXBuilder();
    Reader in = new StringReader(xmlString);
    String value = "";
    try {
        Document doc = builder.build(in);
        Element rootNode = doc.getRootElement();
        List<Element> list = rootNode.getChildren("staff");
        for (int i = 0; i < list.size(); i++) {
            Element node = (Element) list.get(i);
            String tag1 = node.getChildText("firstname");
            String tag2 = node.getChildText("lastname");
            String tag3 = node.getChildText("nickname");
            String tag4 = node.getChildText("salary");

            value = tag1 + "," + tag2 + "," + tag3 + "," + tag4;
            context.write(NullWritable.get(), new Text(value));
        }
    } followed by catch statements....................
//xml input file
<?xml version="1.0" encoding="UTF-8"?>
<company>
  <staff>
    <firstname>yong</firstname>
    <lastname>mook kim</lastname>
    <nickname>mkyong</nickname>
    <salary>100000</salary>
  </staff>
  <staff>
    <firstname>low121</firstname>
    <lastname>yin fong1</lastname>
    <nickname>fong fong1</nickname>
    <salary>2000001</salary>
  </staff>
</company>
Thanks for your help!
Regards, dti
Re: Mapper outputs an empty file
Bertrand Dechoux 2012-11-30, 13:06
You should write unit tests (MRUnit) and debug if that's not enough. I would assume that you are reading your file line by line. Each line is not valid XML on its own, so an exception is thrown and then caught, but without any logs or counters.
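To make the line-by-line assumption concrete, here is a minimal standalone sketch. It is not the poster's mapper: it uses the JDK's built-in DOM parser rather than JDOM's SAXBuilder, and it assumes the input file has one tag per line. The class name LineByLineParseDemo and the tally method are hypothetical names for illustration. Each line is parsed as if it were a complete document. Under these assumptions some lines fail outright, and the lines that do parse (a lone firstname element, for instance) have no staff elements under their root, so nothing would be written either way.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class LineByLineParseDemo {
    // Sample input from the original post, assumed to be one tag per line.
    static final String[] SAMPLE_LINES = {
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
        "<company>",
        "<staff>",
        "<firstname>yong</firstname>",
        "<lastname>mook kim</lastname>",
        "<nickname>mkyong</nickname>",
        "<salary>100000</salary>",
        "</staff>",
        "<staff>",
        "<firstname>low121</firstname>",
        "<lastname>yin fong1</lastname>",
        "<nickname>fong fong1</nickname>",
        "<salary>2000001</salary>",
        "</staff>",
        "</company>"
    };

    /**
     * Parses each line as its own document, the way a line-oriented input
     * would feed the mapper.
     * Returns {parseFailures, parsedButNoStaff, staffFound}.
     */
    static int[] tally(String[] lines) {
        int parseFailures = 0;
        int parsedButNoStaff = 0;
        int staffFound = 0;
        for (String line : lines) {
            try {
                DocumentBuilder builder =
                        DocumentBuilderFactory.newInstance().newDocumentBuilder();
                // DefaultHandler rethrows fatal errors without printing to stderr.
                builder.setErrorHandler(new DefaultHandler());
                Document doc = builder.parse(new InputSource(new StringReader(line)));
                // Counts <staff> elements below the root of this one-line document.
                int staff = doc.getDocumentElement()
                        .getElementsByTagName("staff").getLength();
                if (staff == 0) {
                    parsedButNoStaff++;
                } else {
                    staffFound += staff;
                }
            } catch (SAXException e) {
                // In a real mapper, this is where a log line or counter would go.
                parseFailures++;
            } catch (ParserConfigurationException | IOException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] {parseFailures, parsedButNoStaff, staffFound};
    }

    public static void main(String[] args) {
        int[] t = tally(SAMPLE_LINES);
        System.out.println("parse failures:      " + t[0]);
        System.out.println("parsed, no <staff>:  " + t[1]);
        System.out.println("<staff> found:       " + t[2]);
    }
}
```

Counting the failures instead of swallowing them is the point of the sketch: with a tally like this (or Hadoop counters in the real job) the empty output would be explained immediately.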
Regards
Bertrand
Re: Mapper outputs an empty file
Harsh J 2012-11-30, 12:57
The lack of conditional logic suggests that an empty file should never occur for a _successful_ parse.
So the question boils down to whether parsing succeeds. What exactly is your RecordReader/InputFormat here? TextInputFormat reads files line by line and is not suited to parsing whole XML documents, which is what your code relies on: it assumes that a single KV pair passed to the mapper contains the whole document to run the parser on.
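As a rough illustration of that point, the sketch below runs the same extraction on the whole sample document passed as one string, which is what the map code implicitly expects as its value. It is a standalone approximation only: it uses the JDK's DOM parser instead of JDOM, the names WholeDocumentParseDemo, toCsvRecords and childText are hypothetical, and how the whole document gets into a single value (the InputFormat/RecordReader side) is deliberately left out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WholeDocumentParseDemo {
    // The sample input from the original post as ONE value.
    static final String SAMPLE = String.join("\n",
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
        "<company>",
        "<staff>",
        "<firstname>yong</firstname>",
        "<lastname>mook kim</lastname>",
        "<nickname>mkyong</nickname>",
        "<salary>100000</salary>",
        "</staff>",
        "<staff>",
        "<firstname>low121</firstname>",
        "<lastname>yin fong1</lastname>",
        "<nickname>fong fong1</nickname>",
        "<salary>2000001</salary>",
        "</staff>",
        "</company>");

    /** Builds one comma-separated record per direct <staff> child of the root. */
    static List<String> toCsvRecords(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> records = new ArrayList<>();
            Node child = doc.getDocumentElement().getFirstChild();
            for (; child != null; child = child.getNextSibling()) {
                // Skip whitespace text nodes and any non-staff elements.
                if (child instanceof Element && "staff".equals(child.getNodeName())) {
                    Element staff = (Element) child;
                    records.add(childText(staff, "firstname") + ","
                            + childText(staff, "lastname") + ","
                            + childText(staff, "nickname") + ","
                            + childText(staff, "salary"));
                }
            }
            return records;
        } catch (SAXException e) {
            throw new IllegalArgumentException("not a well-formed XML document", e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Text of the first element with this name under parent, or null if absent. */
    static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        for (String record : toCsvRecords(SAMPLE)) {
            System.out.println(record);
        }
    }
}
```

With the whole document in one string, the sketch yields one record per staff element. The same strings fed one line at a time never reach the record-building step.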
If your catching logic is catching and logging exceptions, I suggest taking a look at the Mapper's task logs to see your actual error here.
-- Harsh J