Yeah. Database design is always subjective, so everybody has an opinion
about it. But if you're just starting out, I would recommend you follow
the same rules you would in a traditional relational database system.
So two different datasets would mean two different tables, in both Hive and
a relational database.
Start there anyway and get your feet wet. :)
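
Here's a rough sketch of what I mean. It uses Python's built-in sqlite3 as a stand-in for a traditional relational database, so it is not Hive itself, and the table names schema1 and schema2 are just names I made up for illustration. The rows are the sample rows from your two CSV files.

```python
import sqlite3

# In-memory SQLite stands in for a relational database here; this is not Hive.
conn = sqlite3.connect(":memory:")

# One table per dataset, mirroring the columns of Schema1.csv and Schema2.csv.
# Table names are hypothetical.
conn.execute("CREATE TABLE schema1 (name TEXT, address TEXT, phone TEXT)")
conn.execute(
    "CREATE TABLE schema2 "
    "(id INTEGER, name TEXT, address TEXT, gender TEXT, phone TEXT)"
)

# The sample rows from the two CSV files in the question.
conn.execute(
    "INSERT INTO schema1 VALUES (?, ?, ?)",
    ("Chris", "address1", "999-999-9999"),
)
conn.execute(
    "INSERT INTO schema2 VALUES (?, ?, ?, ?, ?)",
    (13, "Tom", "address2", "male", "888-888-8888"),
)

# Query both tables together on the columns they share.
# UNION ALL stacks the rows; a JOIN would instead match rows on a common key.
rows = conn.execute(
    """
    SELECT name, address, phone FROM schema1
    UNION ALL
    SELECT name, address, phone FROM schema2
    ORDER BY name
    """
).fetchall()

for row in rows:
    print(row)
```

The same idea should carry over to Hive: one table per file layout, then combine them in the query. Whether you want a union or a join depends on whether the two files describe the same people or different ones.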
On Wed, Aug 21, 2013 at 7:24 AM, Chris Driscol <[EMAIL PROTECTED]> wrote:
> Hi -
> I just started to get my feet wet with Hive and have a question that I
> have not been able to find an answer to.
> Suppose I have 2 CSV files:
> >cat Schema1.csv
> Name, Address, Phone
> Chris, address1, 999-999-9999
> >cat Schema2.csv
> Id, Name, Address, Gender, Phone
> 13, Tom, address2, male, 888-888-8888
> I put these two files into Hadoop and want to be able to query these 2
> different schemas via Hive.
> Do I need to create two tables in Hive to represent both schemas and use a
> join? Or is there a better way that can handle these two different schemas?
> Please reply back with any other specific questions, I realize this is
> somewhat open-ended. Thanks!