Hive >> mail # user >> How to load csv data into HIVE


Sandeep Reddy P 2012-09-07, 14:18
Connell, Chuck 2012-09-07, 14:39
Sandeep Reddy P 2012-09-07, 14:41
Connell, Chuck 2012-09-07, 14:57
Mohammad Tariq 2012-09-07, 15:02
Sandeep Reddy P 2012-09-07, 15:07
praveenesh kumar 2012-09-08, 11:54
Connell, Chuck 2012-09-08, 12:18
Bejoy KS 2012-09-08, 12:33
praveenesh kumar 2012-09-08, 14:35

Re: How to load csv data into HIVE
Hello Sandeep,

  I would suggest writing a MapReduce job instead of the usual sequential
program to transform your files. It should be much faster on files of this
size, because the work is split across the cluster. Then use Hive to load
the data.

Regards,
    Mohammad Tariq
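
A minimal sketch of how the two suggestions in this thread could be combined, assuming every record sits on one line in the format of the sample data quoted below (quoted fields, comma-separated, with a trailing '|'). The names convert_line and main are hypothetical, not from the thread. The script reads records on stdin and writes tab-separated lines on stdout, so the same file could be run sequentially or used as the mapper of a map-only Hadoop Streaming job.

```python
#!/usr/bin/env python
import csv
import sys


def convert_line(line):
    """Turn one quoted CSV record ending in '|' into a tab-separated line.

    Returns None for blank lines so the caller can skip them.
    """
    line = line.rstrip("\r\n")
    # Assumption: '|' only appears as the end-of-record marker.
    if line.endswith("|"):
        line = line[:-1]
    if not line:
        return None
    # csv.reader handles the surrounding double quotes and empty fields.
    fields = next(csv.reader([line]))
    # A tab inside a field would corrupt the output, so replace it.
    return "\t".join(field.replace("\t", " ") for field in fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Stream line by line so a 12 GB file never has to fit in memory.
    for raw in stdin:
        converted = convert_line(raw)
        if converted is not None:
            stdout.write(converted + "\n")


if __name__ == "__main__":
    main()
```

The output should then be loadable into a Hive table declared with ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'. Whether the streaming job actually beats a sequential run depends on the cluster, so it is worth timing both on one file first.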

On Fri, Sep 7, 2012 at 8:11 PM, Sandeep Reddy P <[EMAIL PROTECTED]> wrote:

> Hi,
> I wrote a shell script to clean up the CSV data, but when I run it on a
> 12 GB CSV file it takes a long time. Would a Python script be faster?
>
>
> On Fri, Sep 7, 2012 at 10:39 AM, Connell, Chuck <[EMAIL PROTECTED]> wrote:
>
>> How about a Python script that changes it into plain tab-separated
>> text? So it would look like this…
>>
>> 174969274<tab>14-mar-2006<tab>3522876<tab><tab>14-mar-2006<tab>500000308<tab>65<tab>1<newline>
>> etc…
>>
>> Tab-separated with newlines is easy to read and works perfectly on import.
>>
>> Chuck Connell
>> Nuance R&D Data Team
>> Burlington, MA
>> 781-565-4611
>>
>> *From:* Sandeep Reddy P [mailto:[EMAIL PROTECTED]]
>> *Subject:* How to load csv data into HIVE
>>
>> Hi,
>> Here is the sample data
>> "174969274","14-mar-2006","3522876","","14-mar-2006","500000308","65","1"|
>> "174969275","19-jul-2006","3523154","","19-jul-2006","500000308","65","1"|
>> "174969276","31-dec-2005","3530333","","31-dec-2005","500000308","65","1"|
>> "174969277","14-apr-2005","3531470","","14-apr-2005","500000308","65","1"|
>>
>> How do I load this kind of data into HIVE?
>> I'm using a shell script to get rid of the double quotes and '|', but it is
>> taking a very long time to work through the CSV files, which are 12 GB each.
>> What is the best way to do this?
>>
>
>
>
> --
> Thanks,
> sandeep
>
>
Abhishek 2012-09-07, 14:31