Re: AccessControlException in estimateNumberOfReducers
You should be able to work around this issue by explicitly setting the
number of reducers (the PARALLEL keyword on individual statements, or
default_parallel for the whole script).
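For example (the relation and grouping key below are only illustrative):

  grouped = GROUP data BY key PARALLEL 10;  -- fixed reducer count for this statement
  SET default_parallel 10;                  -- or a script-wide default

With an explicit value Pig should not need to run the input-size based
estimate at all.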
This is an unusual use case, but I don't see any harm in doing what you
suggest. Please feel free to open a JIRA and submit a patch.
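If you do, something along these lines could be a starting point. This is
only a sketch, not Pig's actual getPathLength() code; the helper name and
structure are made up, and it just shows the idea of skipping subtrees the
user cannot list while summing up the input size:

  import java.io.IOException;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.security.AccessControlException;

  // Illustrative only: sum the size of everything under 'status', treating
  // directories the current user cannot list as contributing 0 bytes.
  static long readableInputSize(FileSystem fs, FileStatus status) throws IOException {
      if (!status.isDir()) {
          return status.getLen();
      }
      FileStatus[] children;
      try {
          children = fs.listStatus(status.getPath());
      } catch (AccessControlException e) {
          // Unreadable directory: skip it instead of failing job setup.
          return 0;
      }
      long total = 0;
      if (children != null) {
          for (FileStatus child : children) {
              total += readableInputSize(fs, child);
          }
      }
      return total;
  }

The real change would of course go wherever JobControlCompiler walks the
input paths for getTotalInputFileSize().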

Thanks,
Thejas
On 11/21/11 7:45 PM, Adam Portley wrote:
> I'm running into an issue with Pig 0.9.1. My top-level data directory
> contains several files and directories with restricted permissions, and
> my LoadFunc and input format ignore these directories if the user does
> not have permission to read them. Unfortunately, Pig's execution engine
> still throws an exception.
>
> Example:
>
> $ hadoop fs -ls /data
> Found 2 items
> drwxr-xr-x - owner users 0 2011-11-16 06:47 /data/readable
> drwxr-x--- - owner secure 0 2011-11-16 06:48 /data/secure
>
> The /data/secure directory is readable only by users in the 'secure'
> group. Non-secure users encounter the following Pig exception even
> though the loader and input format do not touch secure data:
>
> REGISTER my-jar;
> data = LOAD '/data' USING myLoader();
> (do something..)
>
> Caused by: org.apache.hadoop.security.AccessControlException:
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=<removed>, access=READ_EXECUTE, inode="secure":owner:secure:rwxr-x---
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
>   at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:669)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:280)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getPathLength(JobControlCompiler.java:791)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getPathLength(JobControlCompiler.java:794)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getTotalInputFileSize(JobControlCompiler.java:779)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:739)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:587)
> ... 12 more
>
>
> I think Pig should probably catch this exception and ignore unreadable
> directories when estimating the number of reducers.
>
> Thanks,
> --Adam
>
>