Hive >> mail # user >> Permissions preventing me from inserting data into table I have just created


Re: Permissions preventing me from inserting data into table I have just created
I solved the problem by setting hive.exec.scratchdir to a fully qualified
path; after that the umask trick worked. It turns out that Hive was creating
a different directory (on HDFS) than the one MapReduce was trying to write
into, which is why the umask had no effect. This remains a nasty workaround,
and I wish someone would say how to do this properly!
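
For anyone hitting the same thing, here is a minimal sketch of what "fully qualified" means for the scratch directory: the value names its file system with a scheme instead of being a bare path that each component may resolve differently. The class ScratchDirCheck and its methods are hypothetical helpers, not part of Hive, and hdfs://namenode:8020 is a placeholder for whatever file system your cluster uses. Whether a SET statement sent over JDBC is honoured for this property depends on your HiveServer2 setup, so treat that part as an assumption too.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not part of Hive): checks that a scratch-directory
// value names its file system explicitly, and qualifies it if it does not.
final class ScratchDirCheck {

    // True when the value carries a scheme such as hdfs:// or file:/.
    static boolean isFullyQualified(String dir) {
        try {
            return new URI(dir).getScheme() != null;
        } catch (URISyntaxException e) {
            return false; // not a parsable URI, so certainly not qualified
        }
    }

    // Prefixes a bare path with the given default file system URI.
    // defaultFs is an assumption, e.g. "hdfs://namenode:8020".
    static String qualify(String defaultFs, String dir) {
        if (isFullyQualified(dir)) {
            return dir;
        }
        String base = defaultFs.endsWith("/")
                ? defaultFs.substring(0, defaultFs.length() - 1)
                : defaultFs;
        String path = dir.startsWith("/") ? dir : "/" + dir;
        return base + path;
    }

    // Builds the text of a Hive SET statement for the scratch directory.
    static String setStatement(String defaultFs, String dir) {
        return "SET hive.exec.scratchdir=" + qualify(defaultFs, dir);
    }
}
```

The resulting string could be passed to java.sql.Statement.execute() on the Hive connection, or the qualified value could be placed in hive-site.xml instead.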

Quoting [EMAIL PROTECTED]:

> Thanks for the reply Tim. It is writable to all (permission 777). As  
> a side note, I have discovered now that the mapreduce task spawned  
> by the RCFileOutputDriver is setting mapred.output.dir to a folder  
> under file:// regardless of the fs.default.name. This might be
> expected behaviour, but I just wanted to note it.
>
> Quoting Tim Havens <[EMAIL PROTECTED]>:
>
>> make sure :/home/yaboulnaga/tmp/hive-scratch/ is writeable by your
>> processes.
>>
>>
>> On Mon, Nov 26, 2012 at 10:07 AM, <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> I'm using Cloudera's CDH4 with Hive 0.9 and Hive Server 2. I am trying to
>>> load data into Hive using the JDBC driver (the one distributed with
>>> Cloudera CDH4, "org.apache.hive.jdbc.HiveDriver"). I can create the
>>> staging table and LOAD LOCAL into it. However, when I try to insert data
>>> into a table with Columnar SerDe Stored As RCFILE, I get an error caused by
>>> file permissions. I don't think that the SerDe or the Stored As parameters
>>> have anything to do with the problem, but I mention them for completeness.
>>> The problem is that Hive creates a temporary file in its scratch folder
>>> (local) owned by hive:hive with permissions 755, then passes it as an input
>>> to a mapper running as the user mapred:mapred. The mapper then tries to
>>> create something inside the input folder (it could probably do this elsewhere),
>>> and the following exception is thrown:
>>>
>>> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException:
>>> Mkdirs failed to create file:/home/yaboulnaga/tmp/hive-scratch/hive_2012-11-26_10-46-44_887_2004468370569495405/_task_tmp.-ext-10002
>>>        at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
>>>        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
>>>        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:709)
>>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
>>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>>>        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
>>>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>>>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
>>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
>>>        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>>        at java.security.AccessController.doPrivileged(Native Method)
>>>        at javax.security.auth.Subject.doAs(Subject.java:396)
>>>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>>        at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>>
>>>
>>> As you might have noticed, I moved the scratch folder to a directory under
>>> my home dir so that I can give this directory 777 permissions. The idea was
>>> to use hive.files.umask.value of 0000 to cause subdirectories to inherit
>>> the same open permission (not the best workaround, but wouldn't hurt on my
>>> local machine). Unfortunately this didn't work even when I added a umask
>>> 0000 to /etc/init.d/hiveserver2. Can someone please tell me what's the

Best regards,
Younos Aboulnaga

Masters candidate
David Cheriton school of computer science
University of Waterloo
http://cs.uwaterloo.ca

E-Mail: [EMAIL PROTECTED]
Mobile: +1 (519) 497-5669