Pig, mail # user - Pig and DistributedCache


Re: Pig and DistributedCache
Eugene Morozov 2013-02-20, 04:54
Rohini,

thanks a lot, I'll check the parameter.

On Wed, Feb 20, 2013 at 1:39 AM, Rohini Palaniswamy <[EMAIL PROTECTED]> wrote:

> Eugene,
>   As I said earlier, you can use a different dfs.umaskmode. Running Pig
> with -Ddfs.umaskmode=022 will give read access to everyone (755 instead of
> 700). But all files output by the Pig script will then have those
> permissions.
>
> A better approach would be to write the serialized file with more
> accessible permissions in the step below:
> 2. After that client side builds the filter, serialize it and move it to
> server side.
>
> Regards,
> Rohini
>
>
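A note on the umask arithmetic mentioned above: the following is a minimal sketch of how a umask turns a requested mode into the final permission bits, which is why 022 yields 755 where 077 yields 700. It is plain POSIX-style bit arithmetic, not Hadoop code; the helper names `apply_umask` and `to_symbolic` are made up for illustration, and the requested mode of 777 is an assumption that matches the directory case seen in this thread.

```python
def apply_umask(requested_mode: int, umask: int) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask & 0o777


def to_symbolic(mode: int) -> str:
    """Render nine permission bits as an rwxrwxrwx-style string."""
    flags = "rwxrwxrwx"
    return "".join(
        flag if mode & (1 << (8 - i)) else "-" for i, flag in enumerate(flags)
    )


# Assumption: the directory is requested with mode 777 before the umask applies.
for umask in (0o077, 0o022):
    final = apply_umask(0o777, umask)
    print(f"umask {umask:03o} -> {final:03o} {to_symbolic(final)}")
```

With umask 077 only the owner keeps any access; with 022 group and others keep read and execute, which is what lets a different user traverse the directory.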
> On Tue, Feb 19, 2013 at 4:26 AM, Eugene Morozov <[EMAIL PROTECTED]> wrote:
>
> > Rohini,
> >
> > Sorry for the confusion about users in my previous e-mails. Here is a
> > more thorough explanation of my issue.
> >
> > This is what I got when I tried to run it.
> >
> > The file was copied successfully by using "tmpfiles".
> > 2013-02-08 13:38:56,533 INFO org.apache.hadoop.hbase.filter.PrefixFuzzyRowFilterWithFile: File [/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/vagrant/.staging/job_201302081322_0001/files/pairs-tmp#pairs-tmp] has been found
> > 2013-02-08 13:38:56,539 ERROR org.apache.hadoop.hbase.filter.PrefixFuzzyRowFilterWithFile: Cannot read file: [/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/vagrant/.staging/job_201302081322_0001/files/pairs-tmp#pairs-tmp]
> > org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=EXECUTE, inode="/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/vagrant/.staging":vagrant:supergroup:drwx------
> >
> >
> > org.apache.hadoop.hbase.filter.PrefixFuzzyRowFilterWithFile is my own
> > filter; it just lives in an org.apache... package.
> >
> > 1. I have a user named vagrant, and this user runs the Pig script.
> > 2. After that, the client side builds the filter, serializes it, and moves
> > it to the server side.
> > 3. The RegionServer comes into play here: it deserializes the filter and
> > tries to use it while reading the table.
> > 4. The filter in turn tries to read the file, but since the RegionServer was
> > started under a system user called "hbase", the filter runs with that
> > user's credentials and cannot access the file, which was written by
> > another user.
> >
> > Any ideas of what to try?
> >
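A note on the exception quoted above: it reports access=EXECUTE on the .staging directory, not READ on the file itself. Under POSIX-style permission checks, a user needs execute permission on every ancestor directory before the file's own bits are even consulted. The sketch below is a simplified model of that check, not HDFS code: the `Inode` class, the `first_denied` helper, and the shortened paths are all hypothetical, and superuser bypass, ACLs, and the check on the root directory are ignored.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class Inode:
    owner: str
    group: str
    mode: int  # permission bits, e.g. 0o700


def class_bits(inode, user, groups):
    """Pick the owner, group, or other bits that apply to this user."""
    if user == inode.owner:
        return (inode.mode >> 6) & 7
    if inode.group in groups:
        return (inode.mode >> 3) & 7
    return inode.mode & 7


def first_denied(tree, path, user, groups=frozenset()):
    """Return (inode_path, access) for the first failing check, else None."""
    parts = [p for p in path.split("/") if p]
    current = ""
    for name in parts[:-1]:  # every ancestor directory needs EXECUTE
        current += "/" + name
        if not class_bits(tree[current], user, groups) & EXECUTE:
            return current, "EXECUTE"
    if not class_bits(tree[path], user, groups) & READ:
        return path, "READ"
    return None


# Hypothetical, shortened layout modeled on the log in this thread.
path = "/staging/vagrant/.staging/pairs-tmp"
tree = {
    "/staging": Inode("mapred", "supergroup", 0o755),
    "/staging/vagrant": Inode("vagrant", "supergroup", 0o755),
    "/staging/vagrant/.staging": Inode("vagrant", "supergroup", 0o700),
    path: Inode("vagrant", "supergroup", 0o644),
}

print(first_denied(tree, path, "hbase"))
print(first_denied(tree, path, "vagrant"))

# Opening up the directory (not only the file) is what unblocks the other user.
opened = dict(tree)
opened["/staging/vagrant/.staging"] = Inode("vagrant", "supergroup", 0o755)
print(first_denied(opened, path, "hbase"))
```

In this model the file is already world-readable (644), yet hbase is still stopped at the 700 directory. If the real cluster behaves the same way, more permissive bits on the file alone would not be enough while .staging stays drwx------.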
> > On Sun, Feb 17, 2013 at 8:22 AM, Rohini Palaniswamy <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Eugene,
> > >       Sorry. Missed your reply earlier.
> > >
> > > tmpfiles has been around for a while and will not be removed from Hadoop
> > > anytime soon, so don't worry about it. The Hadoop configurations have
> > > never been fully documented, and people look at the code to use them.
> > > Settings are usually deprecated for years before being removed.
> > >
> > > The file will be created with permissions based on the dfs.umaskmode
> > > setting (or fs.permissions.umask-mode in Hadoop 0.23/2.x), and the owner
> > > of the file will be the user who runs the Pig script. The map job will be
> > > launched as the same user by the Pig script. I don't understand what you
> > > mean by the user that runs the map task not having permissions. What kind
> > > of Hadoop authentication are you using such that the file is created as
> > > one user and the map job is launched as another?
> > >
> > > Regards,
> > > Rohini
> > >
> > >
> > > On Sun, Feb 10, 2013 at 10:26 PM, Eugene Morozov <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi, again.
> > > >
> > > > I've been able to successfully use the trick with DistributedCache and
> > > > "tmpfiles": during the run of my Pig script, the files are copied by
> > > > JobClient to job-cache.
> > > >
> > > > But here is the issue. The files are there, but they have permission 700,
> > > > and the user that runs the map task (I suppose it's hbase) doesn't have
> > > > permission to read them. The files belong to my current OS user.
> > >
Evgeny Morozov
Developer Grid Dynamics
Skype: morozov.evgeny
www.griddynamics.com
[EMAIL PROTECTED]