Hive, mail # user - Merging different HDFS file for HIVE


Re: Merging different HDFS file for HIVE
Nitin Pawar 2013-07-26, 12:30
Option 1) Use Pig or Oozie: write a workflow that joins the files into a
single file.
Option 2) Create a temp table for each of the different files, join them
into a single table, and then delete the temp tables.
Option 3) Don't do anything; change your queries to read from the three
different files when they need fields from different files.
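
A rough sketch of Option 2, for illustration only. It uses Python's built-in
sqlite3 module as a stand-in for Hive so that it runs anywhere; the table
names (feed1, feed2, feed3, final_table), the column names, and the sample
values are all made up, and keeping emp_id in the final table is an
assumption. The inner joins mean an EMP ID reaches the final table only if
it is present in all three feeds. In Hive the same idea would be an
INSERT ... SELECT over three staging tables, with syntax adjusted as needed.

```python
import sqlite3

# In-memory database standing in for Hive (assumption: illustration only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One staging ("temp") table per feed file, plus the final table.
# All table and column names here are hypothetical.
cur.executescript("""
CREATE TABLE feed1 (emp_id INTEGER, field1 TEXT, field2 TEXT, field6 TEXT, field9 TEXT);
CREATE TABLE feed2 (emp_id INTEGER, field5 TEXT, field7 TEXT, field10 TEXT);
CREATE TABLE feed3 (emp_id INTEGER, field3 TEXT, field4 TEXT, field8 TEXT);
CREATE TABLE final_table (
    emp_id INTEGER,
    field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT, field5 TEXT,
    field6 TEXT, field7 TEXT, field8 TEXT, field9 TEXT, field10 TEXT
);
""")

# Made-up sample data: emp 1 has all three feeds, emp 2 is missing feed 3.
cur.executemany("INSERT INTO feed1 VALUES (?, ?, ?, ?, ?)",
                [(1, "a1", "a2", "a6", "a9"), (2, "x1", "x2", "x6", "x9")])
cur.executemany("INSERT INTO feed2 VALUES (?, ?, ?, ?)",
                [(1, "b5", "b7", "b10"), (2, "y5", "y7", "y10")])
cur.executemany("INSERT INTO feed3 VALUES (?, ?, ?, ?)",
                [(1, "c3", "c4", "c8")])

# Inner joins keep only the emp_ids that appear in all three feeds.
cur.execute("""
INSERT INTO final_table
SELECT f1.emp_id,
       f1.field1, f1.field2, f3.field3, f3.field4, f2.field5,
       f1.field6, f2.field7, f3.field8, f1.field9, f2.field10
FROM feed1 f1
JOIN feed2 f2 ON f1.emp_id = f2.emp_id
JOIN feed3 f3 ON f1.emp_id = f3.emp_id
""")

# Delete the temp tables once the final table is loaded.
cur.executescript("DROP TABLE feed1; DROP TABLE feed2; DROP TABLE feed3;")

rows = cur.execute("SELECT * FROM final_table ORDER BY emp_id").fetchall()
print(rows)
```

Note that dropping the staging tables right after the join also discards the
partial data for emp 2; if the missing feed can still arrive later, the
incomplete rows would have to be kept somewhere until then.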

Wait for others to give better suggestions :)
On Fri, Jul 26, 2013 at 4:22 PM, Ramasubramanian Narayanan <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> Please help me find a solution to the problem below. This scenario
> is applicable in banking, at least.
>
> I have a HIVE table with the below structure...
>
> Hive Table:
> Field 1
> ...
> Field 10
>
>
> For the above table, I will get the values for each feed in a different
> file. You can imagine that these files belong to the same branch and can
> arrive at any time. I have to load the table only if I get all 3 files
> for the same branch. (Assume that we have a common field in all the files
> to join on.)
>
> *Feed File 1:*
> EMP ID
> Field 1
> Field 2
> Field 6
> Field 9
>
> *Feed File 2:*
> EMP ID
> Field 5
> Field 7
> Field 10
>
> *Feed File 3:*
> EMP ID
> Field 3
> Field 4
> Field 8
>
> Now the question is:
> what is the best way to merge all these files into a single file
> so that it can be placed under the Hive table structure?
>
> regards,
> Rams
>

--
Nitin Pawar