Pig >> mail # user >> how to get input schema in UDF

RE: how to get input schema in UDF
Thanks, Robert.

However, I'm still not clear on how to get the original fields of the tuple inside the bag. The following is my code to generate the schema:

public Schema outputSchema(Schema input) {
    try {
      Schema.FieldSchema counter = new Schema.FieldSchema("counter", DataType.INTEGER);
      // here is my question, how do I get fields out of the original tuple inside the bag?
      // If I use the following line, I only get the BAG, not the tuple.
      Schema tupleSchema = new Schema(input.getFields());
      // After I get the original fields from the tuple, I can add the counter here

      Schema.FieldSchema tupleFs;
      tupleFs = new Schema.FieldSchema("with_counter", tupleSchema, DataType.TUPLE);

      Schema bagSchema = new Schema(tupleFs);
      return new Schema(new Schema.FieldSchema("row_counter",
                                                bagSchema, DataType.BAG));
    } catch (Exception e) {
      // Returning null tells Pig the output schema is unknown
      return null;
    }
}
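One possible way to reach the fields of the tuple inside the bag is to walk down the nested schemas: the UDF's input schema has the bag as its first field, the bag's schema has the tuple as its first field, and the tuple's schema holds the original fields. The sketch below assumes the bag is the first (and only) argument of the UDF and that its schema is known at compile time. The class name RowCounterSchema and the method withCounter are hypothetical names used only for illustration; the aliases "counter", "with_counter", and "row_counter" are taken from the code above.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.pig.data.DataType;
import org.apache.pig.impl.logicalLayer.FrontendException;
import org.apache.pig.impl.logicalLayer.schema.Schema;

// Hypothetical helper class; not part of Pig.
public class RowCounterSchema {

    // Builds {(original fields..., counter: int)} from an input schema whose
    // first field is assumed to be a bag of tuples.
    public static Schema withCounter(Schema input) throws FrontendException {
        // Level 1: the UDF argument list; field 0 is assumed to be the bag.
        Schema.FieldSchema bagFs = input.getField(0);
        if (bagFs.type != DataType.BAG || bagFs.schema == null) {
            return null; // schema unknown or not a bag
        }

        // Level 2: the bag's schema; its single field is the tuple.
        Schema.FieldSchema innerTupleFs = bagFs.schema.getField(0);
        if (innerTupleFs.schema == null) {
            return null; // tuple fields unknown
        }

        // Level 3: copy the tuple's original fields, then append the counter.
        List<Schema.FieldSchema> fields =
                new ArrayList<Schema.FieldSchema>(innerTupleFs.schema.getFields());
        fields.add(new Schema.FieldSchema("counter", DataType.INTEGER));

        // Wrap the fields back up: tuple, then bag, then the output schema.
        Schema tupleSchema = new Schema(fields);
        Schema.FieldSchema tupleFs =
                new Schema.FieldSchema("with_counter", tupleSchema, DataType.TUPLE);
        Schema bagSchema = new Schema(tupleFs);
        return new Schema(new Schema.FieldSchema("row_counter", bagSchema, DataType.BAG));
    }
}
```

Inside the UDF, outputSchema(Schema input) could call RowCounterSchema.withCounter(input) within its try block and keep returning null from the catch block. Because the original fields are copied rather than listed by name, the same code should cover a bag with extra fields such as gender.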


-----Original Message-----
From: Robert Yerex [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 13, 2012 4:15 PM
Subject: Re: how to get input schema in UDF

Chapter 10 in Alan Gates' excellent book "Programming Pig" discusses this issue.

Robert Yerex
Data Scientist
Civitas Learning

On Mon, Aug 13, 2012 at 3:43 PM, Danfeng Li <[EMAIL PROTECTED]> wrote:

> I have a bag, e.g. A: {(name: chararray,age: int)}. I wrote a UDF
> which adds one more field to the tuple inside the bag, e.g. B: {(name:
> chararray,age: int, rank:int)}. The number of fields in the original
> bag is not fixed, e.g. I can have one more field such as gender:int.
> In my UDF, in order to generate the correct output schema, I need to
> get the input schema first. I tried to find some examples but
> couldn't. Could someone show me how to do it?
> Thanks.
> Dan