Chan, Tim 2012-12-08, 18:48
After many joins, my relation's schema becomes very verbose.
For example:
e::d::c::b::a::column1:bytearray, e::d::c::b::a::column2:bytearray
Is there a way to simplify the schema back to:
column1:bytearray, column2:bytearray
I seem to be able to achieve this by doing a STORE then LOAD, but this doesn't seem very efficient.
-
Re: Simplifying schema?
Jesse Jackson 2012-12-08, 19:28
Good afternoon Tim, what I've done is add something like this as my next command:
NewBag = foreach oldBag generate e::d::c::b::a::column1 as column1, e::d::c::b::a::column2 as column2;
Then you're down to more manageable names.
-JJ
On Sat, Dec 8, 2012 at 1:48 PM, Chan, Tim <[EMAIL PROTECTED]> wrote:
> After many joins, my relation's schema becomes very verbose.
>
> For example:
>
> e::d::c::b::a::column1:bytearray, e::d::c::b::a::column2:bytearray
>
> Is there a way to simplify the schema back to:
>
> column1:bytearray, column2:bytearray
>
> I seem to be able to achieve this by doing a STORE then LOAD, but this
> doesn't seem very efficient.
-
Re: Simplifying schema?
Lauren Blau 2012-12-10, 13:47
Yeah, this is really annoying. I'd love to see an option to automatically strip these 'parentage' values for unique names.
On Sat, Dec 8, 2012 at 1:48 PM, Chan, Tim <[EMAIL PROTECTED]> wrote:
> After many joins, my relation's schema becomes very verbose.
>
> For example:
>
> e::d::c::b::a::column1:bytearray, e::d::c::b::a::column2:bytearray
>
> Is there a way to simplify the schema back to:
>
> column1:bytearray, column2:bytearray
>
> I seem to be able to achieve this by doing a STORE then LOAD, but this
> doesn't seem very efficient.
-
Re: Simplifying schema?
Aaron Zimmerman 2012-12-10, 13:52
You can still reference the columns by their short names (column1, column2). The only time you need the fully qualified name is when column names are duplicated. In that case, the duplicates have different qualifier strings, because they came from differently named relations. That can happen if you load from more than one data source, or if you do a self join.
On 12/10/12 7:47 AM, "Lauren Blau" <[EMAIL PROTECTED]> wrote:
>Yeah, this is really annoying. I'd love to see an option to
>automatically strip these 'parentage' values for unique names.
>
>On Sat, Dec 8, 2012 at 1:48 PM, Chan, Tim <[EMAIL PROTECTED]> wrote:
>> After many joins, my relation's schema becomes very verbose.
>>
>> For example:
>>
>> e::d::c::b::a::column1:bytearray, e::d::c::b::a::column2:bytearray
>>
>> Is there a way to simplify the schema back to:
>>
>> column1:bytearray, column2:bytearray
>>
>> I seem to be able to achieve this by doing a STORE then LOAD, but this
>> doesn't seem very efficient.
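
The rule described in the reply above (short names work while they are unique, prefixes are needed only for duplicates) can be sketched in Pig Latin as follows. This is a minimal illustration, not taken from the thread: the relation names, file paths, and fields (users, orders, users.tsv, orders.tsv, uid, name, total) are all hypothetical, and the inputs are assumed to be tab-separated files readable by the default loader.

```pig
-- Hypothetical inputs: paths and field names are assumptions for illustration only.
users  = LOAD 'users.tsv'  AS (uid:int, name:chararray);
orders = LOAD 'orders.tsv' AS (uid:int, total:double);

-- After the join, fields carry their source relation as a prefix
-- (users::uid, users::name, orders::uid, orders::total).
joined = JOIN users BY uid, orders BY uid;

-- name and total are unique across both inputs, so the short names resolve.
-- uid appears in both inputs, so it must be qualified with users:: or orders::.
-- The AS clauses give every output field an explicit short name.
slim = FOREACH joined GENERATE users::uid AS uid, name AS name, total AS total;

-- Inspect the resulting schema to confirm the prefixes are gone.
DESCRIBE slim;
```

Running DESCRIBE on joined and then on slim is a cheap way to check which names are still prefixed before building further joins on top of the relation.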