Re: Hive error when loading csv data.
Thejas Nair 2012-06-28, 18:49
More options -
Official apache instructions for 1.0 -
http://hadoop.apache.org/common/docs/r1.0.3/single_node_setup.html

If you want to try it out on single node on Amazon ec2-
Instructions for HDP distro -
http://hortonworks.com/community/virtual-sandbox/

If you want a wizard based guided install on single node, you can use
HDP for that as well -
http://hortonworks.com/download/

Thanks,
Thejas

On 6/27/12 8:38 AM, Ruslan Al-Fakikh wrote:
> Hi,
>
> You may try Cloudera's pseudo-distributed mode
> https://ccp.cloudera.com/display/CDHDOC/CDH3+Deployment+in+Pseudo-Distributed+Mode
> You may also try Cloudera's demo VM
> https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM
>
> Regards,
> Ruslan Al-Fakikh
>
> On Wed, Jun 27, 2012 at 4:39 PM, ramakanth reddy
> <[EMAIL PROTECTED]> wrote:
>> Hi
>>
>> Can any help me how to start working with hadoop in single Node and cluster
>> environment,please send me some useful links.
>>
>> On Wed, Jun 27, 2012 at 4:50 PM, Subir S <[EMAIL PROTECTED]> wrote:
>>
>>> Pig has this CSVExcelStorage [1] and CSVLoader [2] as part of PiggyBank. It
>>> may help.
>>>
>>> [1]
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
>>> [2]
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html
>>>
>>> CCed pig user-list also.
>>>
>>> On Wed, Jun 27, 2012 at 8:22 AM, Sandeep Reddy P <[EMAIL PROTECTED]> wrote:
>>>
>>>> Thanks Michael Sorry i didnt get that soon. I'll try that and reply you
>>>> back.
>>>>
>>>> On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Sorry,
>>>>> I was saying that you can write a python script that replaces the
>>>>> delimiter with a | and ignore the commas within quotes.
>>>>>
>>>>> Sent from a remote device. Please excuse any typos...
>>>>>
>>>>> Mike Segel
>>>>>
>>>>> On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> If i do that my data will be d|"abc|def"|abcd my problem is not solved
>>>>>>
>>>>>> On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Yup. I just didnt add the quotes.
>>>>>>>
>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>
>>>>>>> Mike Segel
>>>>>>>
>>>>>>> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> Thanks for the reply.
>>>>>>>> I didnt get that Michael. My f2 should be "abc,def"
>>>>>>>>
>>>>>>>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>>> Alternatively you could write a simple script to convert the csv to a
>>>>>>>>> pipe delimited file so that "abc,def" will be abc,def.
>>>>>>>>>
>>>>>>>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>>>>>>>>>
>>>>>>>>>> Hive's delimited-fields-format record reader does not handle quoted
>>>>>>>>>> text that carry the same delimiter within them. Excel supports such
>>>>>>>>>> records, so it reads it fine.
>>>>>>>>>>
>>>>>>>>>> You will need to create your table with a custom InputFormat class
>>>>>>>>>> that can handle this (Try using OpenCSV readers, they support this),
>>>>>>>>>> instead of relying on Hive to do this for you. If you're successful in
>>>>>>>>>> your approach, please also consider contributing something back to
>>>>>>>>>> Hive/Pig to help others.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
>>>>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>> I have a csv file with 46 columns but i'm getting error when i do some
>>>>>>>>>>> analysis on that data type. For simplification i have taken 3 columns
>>>>>>>>>>> and now my csv is like
>>>>>>>>>>> c,zxy,xyz
>>>>>>>>>>> d,"abc,def",abcd
>>>>>>>>>>>
>>>>>>>>>>> i have created table for this data using,
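
For reference, the conversion script suggested earlier in the thread (replace the comma delimiter with | while leaving commas inside quotes alone) might look roughly like the sketch below. It is only an illustration, not code posted by anyone on the list: the function name csv_to_pipe is made up here, it relies on Python's standard csv module to parse the quoted fields, and it assumes no field already contains a | or an embedded newline (it raises an error if one does, since a plain delimited Hive table could not read such a row anyway).

```python
import csv
import io


def csv_to_pipe(src, dst, delimiter="|"):
    """Rewrite comma-separated rows from src as delimiter-separated rows in dst.

    csv_to_pipe is a hypothetical helper name used for illustration only.
    """
    # csv.reader understands double-quoted fields, so "abc,def" is read
    # as the single value abc,def with the surrounding quotes removed.
    for row in csv.reader(src):
        for field in row:
            # Assumption: the data never contains the new delimiter or a
            # newline inside a field. Fail loudly rather than emit a row
            # that would be split incorrectly later.
            if delimiter in field or "\n" in field:
                raise ValueError("field cannot be written safely: %r" % field)
        dst.write(delimiter.join(row) + "\n")


# The two sample rows from the original question.
sample = 'c,zxy,xyz\nd,"abc,def",abcd\n'
out = io.StringIO()
csv_to_pipe(io.StringIO(sample), out)
print(out.getvalue(), end="")
```

With the converted file, the table would then be declared with '|' as the field terminator instead of ','. For real files, open the input with newline="" as the csv module documentation recommends.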