Pig >> mail # user >> Pig and DistributedCache


Re: Pig and DistributedCache
Rohini,

thanks a lot, I'll check the parameter.

On Wed, Feb 20, 2013 at 1:39 AM, Rohini Palaniswamy <[EMAIL PROTECTED]
> wrote:

> Eugene,
>   As I said earlier, you can use a different dfs.umaskmode. Running pig
> with -Ddfs.umaskmode=022 will give read access to all(755 instead of 700).
> But all the files output, by the pig script will have those permission.
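The 755-vs-700 numbers above are plain umask arithmetic: a directory's default mode (777; 666 for files) is ANDed with the complement of the umask. A minimal sketch of that arithmetic (the class and method names are mine, not from the thread):

```java
// Demonstrates how dfs.umaskmode turns default modes into the
// permissions discussed in this thread: umask 022 -> 755, umask 077 -> 700.
public class UmaskDemo {
    // Apply a umask to a base mode; both arguments are octal mode bits.
    static int apply(int baseMode, int umask) {
        return baseMode & ~umask;
    }

    public static void main(String[] args) {
        System.out.printf("dir,  umask 022: %o%n", apply(0777, 0022)); // 755
        System.out.printf("dir,  umask 077: %o%n", apply(0777, 0077)); // 700
        System.out.printf("file, umask 022: %o%n", apply(0666, 0022)); // 644
    }
}
```

Invoked as `pig -Ddfs.umaskmode=022 yourscript.pig` (script name hypothetical), every file the job writes picks up the looser mask, which is the side effect Rohini warns about.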
>
> A better approach would be to write the serialized file with more
> accessible permissions when you do the step below:
> 2. After that the client side builds the filter, serializes it and moves
> it to the server side.
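One way to follow that suggestion, sketched against the local filesystem with java.nio (the file name and contents are made up; on HDFS the analogous call is FileSystem.setPermission with an FsPermission, which needs the Hadoop client on the classpath):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

public class SerializeReadable {
    // Write the serialized filter bytes, then loosen the mode to rw-r--r--
    // so a process running as another user (e.g. hbase) can read the file.
    static Path writeReadable(byte[] serializedFilter) {
        try {
            Path p = Files.createTempFile("pairs-tmp", null);
            Files.write(p, serializedFilter);
            Files.setPosixFilePermissions(p,
                    PosixFilePermissions.fromString("rw-r--r--"));
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = writeReadable(new byte[] {42}); // placeholder filter bytes
        System.out.println(Files.getPosixFilePermissions(p));
        Files.deleteIfExists(p);
    }
}
```

Note that, as the AccessControlException later in the thread shows (access=EXECUTE on the .staging directory), the reading user also needs execute permission on every parent directory, so loosening the file alone may not be enough when it sits under a drwx------ staging directory.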
>
> Regards,
> Rohini
>
>
> On Tue, Feb 19, 2013 at 4:26 AM, Eugene Morozov
> <[EMAIL PROTECTED]>wrote:
>
> > Rohini,
> >
> > Sorry for the misleading information about these users in my previous
> > e-mails. Here is a more complete explanation of my issue.
> >
> > This is what I got when I tried to run it.
> >
> > File has been successfully copied by using "tmpfiles".
> > 2013-02-08 13:38:56,533 INFO
> > org.apache.hadoop.hbase.filter.PrefixFuzzyRowFilterWithFile: File
> > [/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/vagrant/.staging/job_201302081322_0001/files/pairs-tmp#pairs-tmp]
> > has been found
> > 2013-02-08 13:38:56,539 ERROR
> > org.apache.hadoop.hbase.filter.PrefixFuzzyRowFilterWithFile: Cannot read
> > file:
> > [/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/vagrant/.staging/job_201302081322_0001/files/pairs-tmp#pairs-tmp]
> > org.apache.hadoop.security.AccessControlException: Permission denied:
> > user=hbase, access=EXECUTE,
> > inode="/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/vagrant/.staging":vagrant:supergroup:drwx------
> >
> >
> > org.apache.hadoop.hbase.filter.PrefixFuzzyRowFilterWithFile is my
> > filter; it just lives in the org.apache... package.
> >
> > 1. I have a user vagrant, and this user runs the pig script.
> > 2. After that the client side builds the filter, serializes it and moves
> > it to the server side.
> > 3. The RegionServer starts playing here: it deserializes the filter and
> > tries to use it while reading the table.
> > 4. The filter in its turn tries to read the file, but since the
> > RegionServer has been started under a system user called "hbase", the
> > filter also has the corresponding authentication and cannot access the
> > file, which has been written by another user.
> >
> > Any ideas of what to try?
> >
> > On Sun, Feb 17, 2013 at 8:22 AM, Rohini Palaniswamy <
> > [EMAIL PROTECTED]
> > > wrote:
> >
> > > Hi Eugene,
> > >       Sorry. Missed your reply earlier.
> > >
> > >     tmpfiles has been around for a while and will not be removed from
> > > hadoop anytime soon, so don't worry about it. The hadoop configurations
> > > have never been fully documented, and people look at the code and use
> > > them. They are usually deprecated for years before being removed.
> > >
> > >   The file will be created with permissions based on the dfs.umaskmode
> > > setting (or fs.permissions.umask-mode in Hadoop 0.23/2.x), and the
> > > owner of the file will be the user who runs the pig script. The map job
> > > will be launched as the same user by the pig script. I don't understand
> > > what you mean by the user that runs the map task not having
> > > permissions. What kind of hadoop authentication are you doing such that
> > > the file is created as one user and the map job is launched as another
> > > user?
> > >
> > > Regards,
> > > Rohini
> > >
> > >
> > > On Sun, Feb 10, 2013 at 10:26 PM, Eugene Morozov
> > > <[EMAIL PROTECTED]>wrote:
> > >
> > > > Hi, again.
> > > >
> > > > I've been able to successfully use the trick with DistributedCache
> > > > and "tmpfiles" - during the run of my Pig script the files are copied
> > > > by the JobClient to the job cache.
> > > >
> > > > But here is the issue. The files are there, but they have permissions
> > > > 700, and the user that runs the map task (I suppose it's hbase)
> > > > doesn't have permission to read them. The files belong to my current
> > > > OS user.
> > >
Evgeny Morozov
Developer Grid Dynamics
Skype: morozov.evgeny
www.griddynamics.com
[EMAIL PROTECTED]