MapReduce, mail # user - hadoop -Mapreduce


Re: Hadoop-MapReduce
Shekhar Sharma 2013-12-09, 16:53
It does work; I used it a long time ago.

By the way, if it is not working, write a custom input format and
implement your own record reader. That would be far easier than
breaking your head over someone else's code.

Break your problem into steps:

(1) First, the XML data is multiline, meaning that multiple lines make
up a single record. For you, a record might be:

<person>
  <fname>x</fname>
  <lname>y</lname>
</person>

(2) Implement a record reader that looks for the starting and ending
person tags (check how RecordReader.java is written).

(3) Once you have the contents between the starting and ending tags,
you can use an XML parser to parse the contents into a Java object and
form your own key-value pairs (custom key and custom value).
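
The steps above can be sketched roughly as follows. This is only a
minimal illustration in plain Java, not a Hadoop RecordReader: the class
name PersonRecordSketch and its methods are made up for this example, it
scans an in-memory string instead of an input split, and it assumes the
<person> tag carries no attributes. Using lname as the key and fname as
the value is also just an assumption.

```java
import java.io.StringReader;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class; none of these names come from Hadoop.
class PersonRecordSketch {
    static final String START_TAG = "<person>";
    static final String END_TAG = "</person>";

    // Step 2: collect every chunk from START_TAG through END_TAG.
    static List<String> extractRecords(String input) {
        List<String> records = new ArrayList<String>();
        int from = 0;
        while (true) {
            int start = input.indexOf(START_TAG, from);
            if (start < 0) {
                break; // no further records
            }
            int end = input.indexOf(END_TAG, start + START_TAG.length());
            if (end < 0) {
                break; // incomplete trailing record: skipped in this sketch
            }
            end += END_TAG.length();
            records.add(input.substring(start, end));
            from = end;
        }
        return records;
    }

    // Step 3: parse one record and build a key-value pair
    // (assumed layout: key = lname, value = fname).
    static Map.Entry<String, String> toKeyValue(String record) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(record)));
            Element person = doc.getDocumentElement();
            String fname = person.getElementsByTagName("fname")
                    .item(0).getTextContent().trim();
            String lname = person.getElementsByTagName("lname")
                    .item(0).getTextContent().trim();
            return new SimpleEntry<String, String>(lname, fname);
        } catch (Exception e) {
            // Covers parser errors and missing fname/lname elements.
            throw new IllegalArgumentException("Malformed person record", e);
        }
    }

    public static void main(String[] args) {
        String input = "<people>\n<person>\n  <fname>x</fname>\n"
                + "  <lname>y</lname>\n</person>\n</people>";
        for (String record : extractRecords(input)) {
            Map.Entry<String, String> kv = toKeyValue(record);
            System.out.println(kv.getKey() + "\t" + kv.getValue());
        }
    }
}
```

In a real job the same scan would sit inside the record reader's
next() method (old mapred API) and would also have to deal with records
that cross split boundaries, which this sketch ignores.
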
Hope you have enough pointers to write the code.
Regards,
Som Shekhar Sharma
+91-8197243810
On Mon, Dec 9, 2013 at 6:30 PM, Ranjini Rathinam <[EMAIL PROTECTED]> wrote:
> Hi Subroto Sanyal,
>
> The link provided about XML does not work. The class written,
> XmlContent, is not allowed in XmlInputFormat.
>
> I request your help: if someone has already coded this scenario, I need
> the working code.
>
> I have also written it using a SAX parser, but even though the jars are
> added to the classpath, I still get a NoClassFound exception.
>
> Please provide sample code for the same.
>
> Thanks in advance,
> Ranjini.R
>
> On Mon, Dec 9, 2013 at 12:34 PM, Ranjini Rathinam <[EMAIL PROTECTED]>
> wrote:
>>
>>
>>>> Hi,
>>>>
>>>> As suggested by the link below, I have used it for my program,
>>>>
>>>> but I am facing the issues below. Please help me fix these errors.
>>>>
>>>>
>>>> XmlReader.java:8: XmlReader.Map is not abstract and does not override
>>>> abstract method
>>>> map(org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
>>>> in org.apache.hadoop.mapred.Mapper
>>>>  public static class Map extends MapReduceBase implements Mapper
>>>> <LongWritable, Text, Text, Text> {
>>>>                ^
>>>> ./XmlInputFormat.java:16: XmlInputFormat.XmlRecordReader is not abstract
>>>> and does not override abstract method
>>>> next(java.lang.Object,java.lang.Object) in
>>>> org.apache.hadoop.mapred.RecordReader
>>>> public class XmlRecordReader implements RecordReader {
>>>>        ^
>>>> Note: XmlReader.java uses unchecked or unsafe operations.
>>>> Note: Recompile with -Xlint:unchecked for details.
>>>> 2 errors
>>>>
>>>>
>>>> I am using Hadoop version 0.20 and Java 1.6.
>>>>
>>>> Please suggest.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>> Ranjini. R
>>>> On Mon, Dec 9, 2013 at 11:08 AM, Ranjini Rathinam
>>>> <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Subroto <[EMAIL PROTECTED]>
>>>>> Date: Fri, Dec 6, 2013 at 4:42 PM
>>>>> Subject: Re: Hadoop-MapReduce
>>>>> To: [EMAIL PROTECTED]
>>>>>
>>>>>
>>>>> Hi Ranjini,
>>>>>
>>>>> A good example to look into :
>>>>> http://www.undercloud.org/?p=408
>>>>>
>>>>> Cheers,
>>>>> Subroto Sanyal
>>>>>
>>>>> On Dec 6, 2013, at 12:02 PM, Ranjini Rathinam wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> How can I read an XML file via MapReduce and load it into HBase
>>>>> and Hive using Java?
>>>>>
>>>>> Please provide sample code.
>>>>>
>>>>> I am using Hadoop version 0.20 and Java 1.6. Which parser version
>>>>> should be used?
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Ranjini
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>