Re: Declaring schema for unknown number of columns
Hi Jinyuan,

Since I don't know how many columns I will have, I do something like this.

six_month_and_variable_month_sales_2 = FOREACH six_month_and_variable_month_sales
  GENERATE $0 AS ed_style_id,
    $1 AS sale_start_month,
    $2 AS sale_month_1,
    $3 AS sale_month_2,
    $4 AS sale_month_3,
    $5 AS sale_month_4,
    $6 AS sale_month_5,
    $7 AS sale_month_6,
    $8 ..;

I still get the same error when I try to join on this relation.
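
One possible cause, stated as an assumption rather than a confirmed diagnosis: the trailing `$8 ..` range has no known end when the input has no schema, so Pig may still be unable to assign a complete schema to the result. A minimal sketch that projects only the eight known positions is below. The LOAD line, its path, its delimiter, and the name six_month_sales_fixed are hypothetical and not from the original script.

```pig
-- Hypothetical stand-in for the original LOAD; path and delimiter are assumptions.
six_month_and_variable_month_sales = LOAD 'sales_data' USING PigStorage('\t');

-- Project only the eight known positions so that every output field is named.
-- The variable-length tail ($8 onward) is dropped in this sketch.
six_month_sales_fixed = FOREACH six_month_and_variable_month_sales
  GENERATE $0 AS ed_style_id,
    $1 AS sale_start_month,
    $2 AS sale_month_1,
    $3 AS sale_month_2,
    $4 AS sale_month_3,
    $5 AS sale_month_4,
    $6 AS sale_month_5,
    $7 AS sale_month_6;

-- Check whether Pig now reports a schema for the relation before joining on it.
DESCRIBE six_month_sales_fixed;
```

The projected fields stay untyped (bytearray) unless explicit casts are added, and the columns past the eighth would have to be handled separately.
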
On Mon, Jan 7, 2013 at 2:27 PM, Jinyuan Zhou <[EMAIL PROTECTED]> wrote:

> If you can load the data but the join needs a complete schema, try a
> FOREACH ... GENERATE statement that projects the original relation into
> one in which every field has a defined schema.
>
> On Mon, Jan 7, 2013 at 2:19 PM, Chan, Tim <[EMAIL PROTECTED]> wrote:
>
> > Is it possible to declare a schema when doing a LOAD for data in which
> > you do not know the total number of columns?
> >
> > For instance, I know the data contains 6 or more columns. These columns
> > are of the same data type.
> >
> > I basically want to join this data with another data set, but I was
> > getting the following error:
> >
> > ERROR 1109: Input (six_month_and_variable_month_sales) on which outer
> > join is desired should have a valid schema
> >
>
>
>
> --
> -- Jinyuan (Jack) Zhou
>