Sorry, it looks like my suggestion won't help unless you can specify the
schema in the original LOAD statement. If the number of fields is ONLY
available at runtime, but every row has the same number of fields and you
know the position of the join key, then I have an ugly approach.

First, sample the first line to get the number of fields. Then write a UDF
that takes all fields of the data. Pass the number to the UDF and override
the method public Schema outputSchema(Schema input) to output a complete
schema. Your exec method would return a tuple of the same length as the
input tuple, with each item converted to the datatype you know. The
resulting relation should then have a valid schema.

What I don't know is how to pass the number to the UDF efficiently. I hope
someone has a better suggestion.
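
For the first step (sampling the first line to get the number of fields),
here is a rough sketch in plain Java, outside of Pig. It assumes
tab-delimited input, which is the PigStorage default, and it borrows the
column names from Tim's script quoted below. The class FirstLineSampler
and its methods are made-up names for illustration, not part of Pig.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Pig API): derives the field count and a list of
// column names from the first line of the input.
final class FirstLineSampler {

    // Counts the fields on one line. Assumes a single-character delimiter
    // and that every row has the same number of fields as this one.
    public static int countFields(String firstLine, char delimiter) {
        int count = 1;
        for (int i = 0; i < firstLine.length(); i++) {
            if (firstLine.charAt(i) == delimiter) {
                count++;
            }
        }
        return count;
    }

    // Builds column names following the quoted script: two fixed columns,
    // then sale_month_1 .. sale_month_N for the remaining positions.
    public static List<String> fieldNames(int fieldCount) {
        if (fieldCount < 2) {
            throw new IllegalArgumentException("expected at least 2 fields, got " + fieldCount);
        }
        List<String> names = new ArrayList<>();
        names.add("ed_style_id");
        names.add("sale_start_month");
        for (int i = 1; i <= fieldCount - 2; i++) {
            names.add("sale_month_" + i);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // A string stands in for the real input file here; the values are
        // invented sample data.
        String sample = "A1\t2012-07\t10\t12\t9\t14\t11\t8\t7\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        String firstLine = reader.readLine();
        if (firstLine == null) {
            throw new IOException("input is empty, cannot sample a line");
        }
        int n = countFields(firstLine, '\t');
        System.out.println(n + " fields: " + String.join(", ", fieldNames(n)));
    }
}
```

The count and names produced this way are what outputSchema would need in
order to build the complete schema. Getting them into the UDF is the part
I have not solved.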
On Mon, Jan 7, 2013 at 5:48 PM, Chan, Tim <[EMAIL PROTECTED]> wrote:
> Hi Jinyuan,
> Since I don't know how many columns I will have, I do something like this.
> six_month_and_variable_month_sales_2 = FOREACH
> GENERATE $0 AS ed_style_id,
> $1 AS sale_start_month,
> $2 AS sale_month_1,
> $3 AS sale_month_2,
> $4 AS sale_month_3,
> $5 AS sale_month_4,
> $6 AS sale_month_5,
> $7 AS sale_month_6,
> $8 ..;
> I still get the same error when I try to join on this relation.
> On Mon, Jan 7, 2013 at 2:27 PM, Jinyuan Zhou <[EMAIL PROTECTED]> wrote:
> > If you can load it but the join operation needs the complete schema, then
> > you can try a GENERATE statement that projects your original relation
> > into one for which you can define a schema for all fields.
> > On Mon, Jan 7, 2013 at 2:19 PM, Chan, Tim <[EMAIL PROTECTED]> wrote:
> > > Is it possible to declare a schema when doing a LOAD for data in which
> > > you do not know the total number of columns?
> > >
> > > For instance, I know the data contains 6 or more columns. These columns
> > > are of the same data type.
> > >
> > > I basically want to join this data with another data set, but I was
> > > getting the following error:
> > >
> > > ERROR 1109: Input (six_month_and_variable_month_sales) on which outer
> > > join is desired should have a valid schema
> > >
> > --
> > -- Jinyuan (Jack) Zhou
-- Jinyuan (Jack) Zhou