Re: Hive error when loading csv data.
Official Apache instructions for Hadoop 1.0 -
http://hadoop.apache.org/common/docs/r1.0.3/single_node_setup.html

If you want to try it out on a single node on Amazon EC2 -
Instructions for the HDP distro -
http://hortonworks.com/community/virtual-sandbox/

If you want a wizard-based guided install on a single node, you can use
HDP for that as well - http://hortonworks.com/download/

Thanks,
Thejas

On 6/27/12 8:38 AM, Ruslan Al-Fakikh wrote:
> Hi,
>
> You may try Cloudera's pseudo-distributed mode
> https://ccp.cloudera.com/display/CDHDOC/CDH3+Deployment+in+Pseudo-Distributed+Mode
> You may also try Cloudera's demo VM
> https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM
>
> Regards,
> Ruslan Al-Fakikh
>
> On Wed, Jun 27, 2012 at 4:39 PM, ramakanth reddy
> <[EMAIL PROTECTED]>  wrote:
>> Hi
>>
>> Can anyone help me get started with Hadoop in single-node and cluster
>> environments? Please send me some useful links.
>>
>> On Wed, Jun 27, 2012 at 4:50 PM, Subir S<[EMAIL PROTECTED]>  wrote:
>>
>>> Pig has CSVExcelStorage [1] and CSVLoader [2] as part of PiggyBank. They
>>> may help.
>>>
>>> [1]
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
>>> [2]
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html
>>>
>>> CCed pig user-list also.
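A minimal Pig Latin sketch of the CSVExcelStorage approach mentioned above,
assuming piggybank.jar from the Pig distribution sits in the working directory
and that the two sample rows shown later in this thread are saved as sample.csv
(the file names and the f1/f2/f3 field names are placeholders, not from the
original mail):

    REGISTER piggybank.jar;

    -- CSVExcelStorage understands Excel-style quoting, so "abc,def"
    -- is kept as a single field instead of being split on the inner comma.
    rows = LOAD 'sample.csv'
           USING org.apache.pig.piggybank.storage.CSVExcelStorage()
           AS (f1:chararray, f2:chararray, f3:chararray);

    -- Re-emit pipe-delimited output that Hive can load with a plain '|' terminator.
    STORE rows INTO 'sample_pipe' USING PigStorage('|');

The STORE step is optional; it just shows one way to hand the cleaned data back
to Hive as a pipe-delimited file.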
>>>
>>>
>>> On Wed, Jun 27, 2012 at 8:22 AM, Sandeep Reddy P<
>>> [EMAIL PROTECTED]>  wrote:
>>>
>>>> Thanks, Michael. Sorry I didn't get that sooner. I'll try that and reply
>>>> back.
>>>>
>>>> On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Sorry,
>>>>> I was saying that you can write a Python script that replaces the
>>>>> delimiter with a | and ignores the commas within quotes.
>>>>>
>>>>>
>>>>> Sent from a remote device. Please excuse any typos...
>>>>>
>>>>> Mike Segel
>>>>>
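A short sketch of the kind of Python script suggested here, assuming Python 2.7
and an input file named input.csv (the file names and the use of the standard
csv module are illustrative, not from the original mail):

    import csv

    # csv.reader honours the quotes, so "abc,def" comes back as one field;
    # csv.writer then re-emits each row with | as the delimiter.
    with open('input.csv', 'rb') as src, open('output.psv', 'wb') as dst:
        writer = csv.writer(dst, delimiter='|', lineterminator='\n')
        for row in csv.reader(src):
            writer.writerow(row)

On the sample row d,"abc,def",abcd this writes d|abc,def|abcd, which a plain |
field terminator can then split unambiguously.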
>>>>> On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> If I do that my data will be d|"abc|def"|abcd, so my problem is not
>>>>>> solved.
>>>>>>
>>>>>> On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Yup. I just didn't add the quotes.
>>>>>>>
>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>
>>>>>>> Mike Segel
>>>>>>>
>>>>>>> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> Thanks for the reply.
>>>>>>>> I didn't get that, Michael. My f2 should be "abc,def".
>>>>>>>>
>>>>>>>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>>> Alternatively, you could write a simple script to convert the CSV to a
>>>>>>>>> pipe-delimited file so that "abc,def" will be abc,def.
>>>>>>>>>
>>>>>>>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>>>>>>>>>
>>>>>>>>>> Hive's delimited-fields-format record reader does not handle quoted
>>>>>>>>>> text that carries the same delimiter within it. Excel supports such
>>>>>>>>>> records, so it reads them fine.
>>>>>>>>>>
>>>>>>>>>> You will need to create your table with a custom InputFormat class
>>>>>>>>>> that can handle this (try using OpenCSV readers, they support this),
>>>>>>>>>> instead of relying on Hive to do this for you. If you're successful in
>>>>>>>>>> your approach, please also consider contributing something back to
>>>>>>>>>> Hive/Pig to help others.
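A tiny sketch of the OpenCSV behaviour Harsh refers to, assuming OpenCSV 2.x
(the au.com.bytecode.opencsv package) is on the classpath; the class name and
the hard-coded sample line are purely illustrative:

    import au.com.bytecode.opencsv.CSVReader;
    import java.io.IOException;
    import java.io.StringReader;

    public class QuotedCsvDemo {
        public static void main(String[] args) throws IOException {
            // OpenCSV honours the quotes, so the inner comma does not split the field.
            CSVReader reader = new CSVReader(new StringReader("d,\"abc,def\",abcd"));
            String[] fields = reader.readNext();  // {"d", "abc,def", "abcd"}
            for (String field : fields) {
                System.out.println(field);
            }
            reader.close();
        }
    }

A custom InputFormat/RecordReader for Hive could apply the same parsing to each
input line and hand the fields on with an unambiguous delimiter.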
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
>>>>>>>>>> <[EMAIL PROTECTED]>  wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>> I have a csv file with 46 columns, but I'm getting an error when I do
>>>>>>>>>>> some analysis on that data type. For simplification I have taken 3
>>>>>>>>>>> columns, and now my csv is like
>>>>>>>>>>> c,zxy,xyz
>>>>>>>>>>> d,"abc,def",abcd
>>>>>>>>>>>
>>>>>>>>>>> I have created a table for this data using,