Pig, mail # user - Re: Hive error when loading csv data.

Re: Hive error when loading csv data.
Ruslan Al-Fakikh 2012-06-27, 15:38
Hi,

You may try Cloudera's pseudo-distributed mode
https://ccp.cloudera.com/display/CDHDOC/CDH3+Deployment+in+Pseudo-Distributed+Mode
You may also try Cloudera's demo VM
https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM

Regards,
Ruslan Al-Fakikh

On Wed, Jun 27, 2012 at 4:39 PM, ramakanth reddy
<[EMAIL PROTECTED]> wrote:
> Hi
>
> Can anyone help me get started with Hadoop in single-node and cluster
> environments? Please send me some useful links.
>
> On Wed, Jun 27, 2012 at 4:50 PM, Subir S <[EMAIL PROTECTED]> wrote:
>
>> Pig has this CSVExcelStorage [1] and CSVLoader [2] as part of PiggyBank. It
>> may help.
>>
>> [1]
>>
>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
>> [2]
>>
>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html
>>
>> CCed pig user-list also.
>>
>>
>> On Wed, Jun 27, 2012 at 8:22 AM, Sandeep Reddy P <
>> [EMAIL PROTECTED]> wrote:
>>
>> > Thanks Michael, sorry I didn't get that at first. I'll try that and
>> > reply back.
>> >
>> > On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <[EMAIL PROTECTED]>
>> > wrote:
>> >
>> > > Sorry,
>> > > I was saying that you can write a python script that replaces the
>> > > delimiter with a | and ignores the commas within quotes.
>> > >
>> > >
>> > > Sent from a remote device. Please excuse any typos...
>> > >
>> > > Mike Segel
>> > >
>> > > > On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <[EMAIL PROTECTED]>
>> > > > wrote:
>> > >
>> > > > If I do that my data will be d|"abc|def"|abcd, so my problem is not
>> > > > solved.
>> > > >
>> > > >> On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <[EMAIL PROTECTED]>
>> > > >> wrote:
>> > > >
>> > > >> Yup. I just didn't add the quotes.
>> > > >>
>> > > >> Sent from a remote device. Please excuse any typos...
>> > > >>
>> > > >> Mike Segel
>> > > >>
>> > > >> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <[EMAIL PROTECTED]>
>> > > >> wrote:
>> > > >>
>> > > >>> Thanks for the reply.
>> > > >>> I didn't get that, Michael. My f2 should be "abc,def".
>> > > >>>
>> > > >>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <[EMAIL PROTECTED]>
>> > > >>> wrote:
>> > > >>>
>> > > >>>> Alternatively you could write a simple script to convert the csv
>> > > >>>> to a pipe-delimited file so that "abc,def" will be abc,def.
>> > > >>>>
>> > > >>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>> > > >>>>
>> > > >>>>> Hive's delimited-fields-format record reader does not handle
>> > > >>>>> quoted text that carries the same delimiter within it. Excel
>> > > >>>>> supports such records, so it reads them fine.
>> > > >>>>>
>> > > >>>>> You will need to create your table with a custom InputFormat
>> > > >>>>> class that can handle this (try using OpenCSV readers, they
>> > > >>>>> support this), instead of relying on Hive to do this for you.
>> > > >>>>> If you're successful in your approach, please also consider
>> > > >>>>> contributing something back to Hive/Pig to help others.
>> > > >>>>>
>> > > >>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
>> > > >>>>> <[EMAIL PROTECTED]> wrote:
>> > > >>>>>>
>> > > >>>>>> Hi all,
>> > > >>>>>> I have a csv file with 46 columns, but I'm getting an error
>> > > >>>>>> when I do some analysis on that data. For simplification I
>> > > >>>>>> have taken 3 columns, and now my csv is like:
>> > > >>>>>> c,zxy,xyz
>> > > >>>>>> d,"abc,def",abcd
>> > > >>>>>>
>> > > >>>>>> I have created a table for this data using:
>> > > >>>>>> hive> create table test3(
>> > > >>>>>>> f1 string,
>> > > >>>>>>> f2 string,
>> > > >>>>>>> f3 string)
>> > > >>>>>>> row format delimited
>> > > >>>>>>> fields terminated by ",";
>> > > >>>>>> OK
>> > > >>>>>> Time taken: 0.143 seconds
>> > > >>>>>> hive> load data local inpath '/home/training/a.csv'
>> > > >>>>>>> into table test3;
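The python-script approach suggested earlier in the thread can be sketched with Python's standard csv module, which parses quoted fields correctly. This is a sketch only; the function name and inline sample data are illustrative, not from the thread:

```python
import csv
import io

def csv_to_pipe(text):
    """Re-delimit CSV text with '|', keeping commas that appear
    inside quoted fields and dropping the surrounding quotes."""
    out = io.StringIO()
    reader = csv.reader(io.StringIO(text))
    # '|' does not occur in the sample fields, so QUOTE_MINIMAL
    # emits them unquoted.
    writer = csv.writer(out, delimiter='|', quoting=csv.QUOTE_MINIMAL,
                        lineterminator='\n')
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

print(csv_to_pipe('c,zxy,xyz\nd,"abc,def",abcd\n'))
# prints:
# c|zxy|xyz
# d|abc,def|abcd
```

The converted file can then be loaded into a Hive table declared with fields terminated by '|', as the thread suggests.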

Best Regards,
Ruslan Al-Fakikh
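Harsh's explanation of why Hive's delimited record reader breaks on this row can be seen directly by comparing a plain delimiter split with a quote-aware parser. Python is used here purely as an illustration; Hive's reader is Java, and OpenCSV is the quote-aware Java equivalent he mentions:

```python
import csv

line = 'd,"abc,def",abcd'

# A plain split on ',' -- roughly what Hive's delimited record
# reader does: the quoted field is broken into two pieces.
naive = line.split(',')

# A quote-aware CSV parser: three fields, quotes stripped.
parsed = next(csv.reader([line]))

print(naive)   # ['d', '"abc', 'def"', 'abcd']
print(parsed)  # ['d', 'abc,def', 'abcd']
```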