Pig >> mail # user >> schema definition and subschema


Keren Ouaknine 2013-08-08, 21:42
Re: schema definition and subschema
Hi Keren,

Hope this isn't too late.

>> I am wondering why is LogicalFieldSchema containing a LogicalSchema member?

That's for nested tuple fields. For example, consider "( i:int,
t:tuple(j:int) )". The field t:tuple needs to contain a list of field
schemas, so you need a LogicalSchema. Here is how you can verify it.

1) Debug Pig's main in Eclipse.
2) Set a breakpoint in the LogicalFieldSchema constructor.
3) Run "a = load '/dev/null' as (i:int, t:tuple(j:int));" in the Grunt shell.
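
If it helps to see the nesting without a debugger, here is a minimal standalone Java sketch of the same idea. It does not use Pig's classes: SchemaSketch, FieldSketch, the type codes, and the uid values are all made up for illustration, and only the four members (alias, type, uid, schema) come from the question. It builds the schema for "(i:int, t:tuple(j:int))" and prints it.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT Pig's classes.
class SchemaSketch {
    // Placeholder type codes (hypothetical values, not Pig's DataType constants).
    static final byte INT = 1;
    static final byte TUPLE = 2;

    // Mirrors the members listed in the question: alias, type, uid, schema.
    static class FieldSketch {
        final String alias;
        final byte type;
        final long uid;
        final SchemaSketch schema; // null for simple types, set for nested ones

        FieldSketch(String alias, byte type, long uid, SchemaSketch schema) {
            this.alias = alias;
            this.type = type;
            this.uid = uid;
            this.schema = schema;
        }

        @Override
        public String toString() {
            String typeName = (type == TUPLE) ? "tuple" : "int";
            // A nested field prints its own inner schema recursively.
            return alias + ":" + typeName + (schema == null ? "" : schema.toString());
        }
    }

    // A schema is just an ordered list of field schemas.
    private final List<FieldSketch> fields = new ArrayList<>();

    SchemaSketch add(FieldSketch field) {
        fields.add(field);
        return this;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("(");
        for (int k = 0; k < fields.size(); k++) {
            if (k > 0) {
                sb.append(", ");
            }
            sb.append(fields.get(k));
        }
        return sb.append(")").toString();
    }

    // Builds the schema for "(i:int, t:tuple(j:int))"; uids are arbitrary.
    static SchemaSketch example() {
        SchemaSketch inner = new SchemaSketch()
            .add(new FieldSketch("j", INT, 3L, null));
        return new SchemaSketch()
            .add(new FieldSketch("i", INT, 1L, null))
            .add(new FieldSketch("t", TUPLE, 2L, inner));
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints (i:int, t:tuple(j:int))
    }
}
```

The point is that the field t carries its own inner schema, while i and j leave that member null.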

Thanks,
Cheolsoo
On Thu, Aug 8, 2013 at 2:42 PM, Keren Ouaknine <[EMAIL PROTECTED]> wrote:

> Hi,
>
> A schema in Pig (LogicalSchema.java) is defined as an array list of
> LogicalFieldSchema whose class members are:
> - String alias
> - byte type
> - long uid
> - LogicalSchema schema
>
> I am wondering why is LogicalFieldSchema containing a LogicalSchema member?
> My guess so far is that perhaps there's a subschema used by some operators?
> I tried to figure out which operators might be using it and categorized the
> main ones as follows:
>
> ==> SCHEMA IS DEFINED BY INPUT SCHEMA ONLY
> LOAD
> DISTINCT
> FILTER
> ORDER BY
> SPLIT
>
> ==> SCHEMA IS DEFINED BY THE LIST OF "AS" IN THE FOREACH STATEMENT
> FOREACH
>
> ==> IF SCHEMA CAN BE DEFINED (SAME LENGTH AND CASTABLE) OR UNKNOWN SCHEMA
> UNION
>
> ==> SCHEMA IS DEFINED BY THE CONCATENATION OF THE TWO INPUT SCHEMAS (+
> ADDING THE ALIAS TO THE FIELD NAME x ==> A::x)
> JOIN
> *Are the two inputs here considered subschemas?*
>
> ==> SCHEMA: (key_to_order_by, bag)
> GROUP
>
> Thanks,
> Keren
>
> --
> Keren Ouaknine
> Web: www.kereno.com
>