Pig, mail # user - Unit test UDF


Re: Unit test UDF
Mohit Anchlia 2012-04-24, 18:38
I am still having difficulty converting this line from a file into a tuple.

1333477861077/home/hadoop/pigtest/./formml_dat/999000093_return.xml
04/03/12 11:36:25 {(ST:NC),(ZIP:28613),(CITY:Xxxxxxx),(NAM2:Xxxxx X &xxx;
Xxxxx X Xxxxxx)} {(OCCUP:xxxxxxx xxxxx),(AGE:55),(MARITAL:Married)}

I looked at:

    static public Tuple loadTuple(Tuple t, String[] input) throws ExecException {
        for (int i = 0; i < input.length; i++) {
            t.set(i, input[i]);
        }
        return t;
    }
but now my questions are:
1. How do I break it into an array of Strings?
2. Are the first 2 fields also tuples?
3. Do I just pass the bag in the input string?

If someone could help me break down the above line so that I can call
loadTuple, that would be helpful. It would also help me understand what the
above line is made up of.
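
As a rough illustration of question 1, the record can be split in two stages: first on tabs into top-level fields, then each bag's text into its inner tuples. The sketch below uses only the JDK. `RecordSplitter`, `splitFields`, and `splitBag` are hypothetical names for this example and are not part of Pig. The tab positions in the sample line are an assumption based on the earlier statement in this thread that the record is tab-delimited, and the bags are shortened for readability. Because loadTuple stores each array element as a plain String, a UDF that expects real bags would presumably still need the bag text converted into DataBag objects, for example along the lines of the BagFactory/TupleFactory snippet quoted further down.

```java
import java.util.ArrayList;
import java.util.List;

class RecordSplitter {

    /** Splits one tab-delimited record into its top-level fields. */
    static String[] splitFields(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        return line.split("\t", -1);
    }

    /**
     * Splits bag text such as "{(A:1),(B:2)}" into the contents of each
     * inner tuple, here "A:1" and "B:2". Assumes parentheses are balanced
     * and do not occur unpaired inside the values.
     */
    static List<String> splitBag(String bagText) {
        String s = bagText.trim();
        if (!s.startsWith("{") || !s.endsWith("}")) {
            throw new IllegalArgumentException("not a bag: " + bagText);
        }
        List<String> tuples = new ArrayList<>();
        int depth = 0;   // current parenthesis nesting level
        int start = -1;  // index just after the opening '(' of the current tuple
        for (int i = 1; i < s.length() - 1; i++) {
            char c = s.charAt(i);
            if (c == '(') {
                if (depth == 0) {
                    start = i + 1;
                }
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    tuples.add(s.substring(start, i));
                }
            }
        }
        return tuples;
    }

    public static void main(String[] args) {
        // Assumed layout: id, path, timestamp, bag, bag (tab-separated).
        String line = "1333477861077\t"
                + "/home/hadoop/pigtest/./formml_dat/999000093_return.xml\t"
                + "04/03/12 11:36:25\t"
                + "{(ST:NC),(ZIP:28613)}\t"
                + "{(AGE:55),(MARITAL:Married)}";
        String[] fields = splitFields(line);
        System.out.println(fields.length + " fields");
        for (String inner : splitBag(fields[3])) {
            System.out.println(inner);
        }
    }
}
```

Under this reading, the first three fields are plain scalar values rather than tuples, and only the last two fields are bags of single-field tuples.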

On Fri, Apr 20, 2012 at 9:43 PM, Russell Jurney <[EMAIL PROTECTED]>wrote:

> The unit tests for TOP should be helpful?
>
> Russell Jurney http://datasyndrome.com
>
> On Apr 20, 2012, at 6:40 PM, Thejas Nair <[EMAIL PROTECTED]> wrote:
>
> > Though not exactly what you are asking for, there is a
> > getTuplesFromConstantTupleStrings function in
> > test/org/apache/pig/test/Util.java that converts string representations of
> > tuples to tuple objects. It is an easier and more maintainable way of
> > creating tuples in test cases.
> >
> > For example -
> >    List<Tuple> expectedRes =
> >            Util.getTuplesFromConstantTupleStrings(
> >                    new String[] {
> >                            "(10,20,30,40L)",
> >                            "(11,21,31,41L)",
> >                    });
> >
> > But it is not exposed as a public interface right now. It makes sense to
> > make it part of a public interface.
> >
> > -Thejas
> >
> >
> > On 4/20/12 7:48 AM, Mohit Anchlia wrote:
> >> Thanks for your response. Yes, I am using those in my UDF eval function.
> >> Actually, my question was about how to build the tuple. Is there a
> >> utility method that would let me build my tuple with the following record
> >> type? I need to populate the tuple in the format below so that I can pass
> >> it in the unit test. It's tab-delimited and also has bags.
> >>
> >> 1333477861077/home/hadoop/pigtest/./formml_dat/999000093_return.xml
> >> 04/03/12 11:36:25 {(ST:NC),(ZIP:28613),(CITY:Xxxxxxx),(NAM2:Xxxxx X&xxx;
> >> Xxxxx X Xxxxxx)} {(OCCUP:xxxxxxx xxxxx),(AGE:55),(MARITAL:Married
> >>
> >> On Thu, Apr 19, 2012 at 6:44 PM, Dmitriy Ryaboy<[EMAIL PROTECTED]>
>  wrote:
> >>
> >>> Something like this (not tested):
> >>>
> >>> List<Tuple>  bagtuples = Lists.newArrayList();
> >>>
> >>> // populate inner tuples, then...
> >>>
> >>> DataBag myBag = BagFactory.getInstance().newBag(bagtuples);
> >>> Tuple t = TupleFactory.getInstance().newTuple(myBag);
> >>>
> >>> D
> >>>
> >>>
> >>> On Thu, Apr 19, 2012 at 5:51 PM, Mohit Anchlia<[EMAIL PROTECTED]>
> >>> wrote:
> >>>> Thanks! I am trying to figure out how to create a Tuple object that also
> >>>> has bags in it. I have a record like this that I want to pass to a UDF as a
> >>>> tuple. Any info would be very helpful.
> >>>>
> >>>>
> >>>> 1333477861077/home/hadoop/pigtest/./formml_dat/999000093_return.xml
> >>>> 04/03/12 11:36:25 {(ST:NC),(ZIP:28613),(CITY:Xxxxxxx),(NAM2:Xxxxx
> X&xxx;
> >>>> Xxxxx X Xxxxxx)} {(OCCUP:xxxxxxx xxxxx),(AGE:55),(MARITAL:Married)}
> >>>>
> >>>>
> >>>> On Thu, Apr 19, 2012 at 5:16 PM, Dmitriy Ryaboy<[EMAIL PROTECTED]>
> >>> wrote:
> >>>>
> >>>>> Hi Mohit,
> >>>>> We just write standard Java unit tests for Pig UDFs. You can see a ton
> >>>>> of them here:
> >>>>>
> >>>
> https://github.com/apache/pig/blob/trunk/test/org/apache/pig/test/TestStringUDFs.java
> >>>>>
> >>>>> Does that help?
> >>>>>
> >>>>> D
> >>>>>
> >>>>> On Thu, Apr 19, 2012 at 5:05 PM, Mohit Anchlia<
> [EMAIL PROTECTED]>
> >>>>> wrote:
> >>>>>> Is there a way I can just unit test my pig UDF? What's the best way
> to
> >>>>> unit
> >>>>>> test in pig. I saw pigunittest but couldn't find a way to unit test