Hive >> mail # user >> Is there a mechanism similar to hadoop -archive in hive (add archive is not apparently)


Re: Is there a mechanism similar to hadoop -archive in hive (add archive is not apparently)
Good eyes, Ramki!  Thanks; using the "directory" name in place of the
filename appears to be working.  The script now loads using the "Attempt
two" form, i.e. hivetry/classifier_wf.py as the script path.

thanks again.

stephenb
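For reference, here is a sketch of the pattern that ended up working: add
the directory as a "file" resource, then reference the script with the
directory name as a prefix. Paths, table, and column names are the ones
from this thread; exact behavior may vary by Hive version.

```sql
-- Ship the whole directory as a resource.
hive> add file /opt/am/ver/1.0/hive/hivetry;

-- Reference the script relative to the added directory's name.
hive> from (select transform (aappname, qappname)
    >        using 'hivetry/classifier_wf.py'
    >        as (aappname2 string, qappname2 string)
    >      from eqx) o
    >   insert overwrite table c select o.aappname2, o.qappname2;
```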
2013/6/20 Ramki Palle <[EMAIL PROTECTED]>

> In *Attempt two*, aren't you supposed to use "hivetry" as the
> directory prefix?
>
> Maybe you should try giving the full path
> "/opt/am/ver/1.0/hive/hivetry/classifier_wf.py" and see if it works.
>
> Regards,
> Ramki.
>
>
> On Thu, Jun 20, 2013 at 9:28 AM, Stephen Boesch <[EMAIL PROTECTED]> wrote:
>
>>
>> Stephen:  would you be willing to share an example of specifying a
>> "directory" as the "add file" target?  I have not seen this work.
>>
>> I have attempted to use it as follows:
>>
>> *We will access a script within the "hivetry" directory located here:*
>> hive> ! ls -l  /opt/am/ver/1.0/hive/hivetry/classifier_wf.py;
>> -rwxrwxr-x 1 hadoop hadoop 11241 Jun 18 19:37
>> /opt/am/ver/1.0/hive/hivetry/classifier_wf.py
>>
>> *Add the directory  to hive:*
>> hive> add file /opt/am/ver/1.0/hive/hivetry;
>> Added resource: /opt/am/ver/1.0/hive/hivetry
>>
>> *Attempt to run transform query using that script:*
>> *Attempt one: use the script name unqualified:*
>>
>> hive> from (select transform (aappname, qappname)
>>     >        using 'classifier_wf.py'
>>     >        as (aappname2 string, qappname2 string)
>>     >      from eqx) o
>>     >   insert overwrite table c select o.aappname2, o.qappname2;
>>
>>
>> (Failed:   Caused by: java.io.IOException: Cannot run program "classifier_wf.py": java.io.IOException: error=2, No such file or directory)
>>
>>
>> *Attempt two: use the script name with the directory name prefix: *
>>
>> hive> from (select transform (aappname, qappname)
>>     >        using 'hive/classifier_wf.py'
>>     >        as (aappname2 string, qappname2 string)
>>     >      from eqx) o
>>     >   insert overwrite table c select o.aappname2, o.qappname2;
>>
>>
>> (Failed:   Caused by: java.io.IOException: Cannot run program "hive/classifier_wf.py": java.io.IOException: error=2, No such file or directory)
>>
>>
>>
>>
>>
>> 2013/6/20 Stephen Sprague <[EMAIL PROTECTED]>
>>
>>> Yeah, the archive isn't unpacked on the remote side. I think add
>>> archive is mostly used for finding java packages, since CLASSPATH will
>>> reference the archive (and as such there is no need to expand it).
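As an aside, the jar case mentioned above can be sketched as follows; the
jar path and UDF class name are hypothetical, not from this thread:

```sql
-- 'add jar' puts the jar on the session's classpath without unpacking it,
-- so classes inside it become usable directly.
hive> add jar /opt/am/udfs/my_udfs.jar;
hive> create temporary function classify as 'com.example.hive.ClassifyUDF';
```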
>>>
>>>
>>> On Thu, Jun 20, 2013 at 9:00 AM, Stephen Boesch <[EMAIL PROTECTED]> wrote:
>>>
>>>> Thanks for the tip on "add <file>" where <file> is a directory. I will
>>>> try that.
>>>>
>>>>
>>>> 2013/6/20 Stephen Sprague <[EMAIL PROTECTED]>
>>>>
>>>>> i personally only know of adding a .jar file via add archive, but my
>>>>> experience there is very limited.  i believe if you 'add file' and the
>>>>> file is a directory it'll recursively take everything underneath, but i
>>>>> know of nothing that inflates or untars things on the remote end
>>>>> automatically.
>>>>>
>>>>> i would 'add file' your python script and then, within that script,
>>>>> untar your tarball to get at your model data. it's just a matter of
>>>>> figuring out the path to that tarball, which is kinda up in the air
>>>>> when it's added via 'add file'.  Yeah, the "local downloads directory":
>>>>> what's the literal path is what i'd like to know. :)
>>>>>
>>>>>
>>>>> On Thu, Jun 20, 2013 at 8:37 AM, Stephen Boesch <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>>
>>>>>> @Stephen:  given that the 'relative' path for hive is resolved from a
>>>>>> local downloads directory on each tasktracker in the cluster, my
>>>>>> thought was that if the archive were actually being expanded, then
>>>>>> somedir/somefileinthearchive should work.  I will go ahead and test
>>>>>> this assumption.
>>>>>>
>>>>>> In the meantime, is there any facility available in hive for making
>>>>>> archived files available to hive jobs, such as add archive or hadoop
>>>>>> archives ("har")?
>>>>>>
>>>>>>
>>>>>> 2013/6/20 Stephen Sprague <[EMAIL PROTECTED]>
>>>>>>
>>>>>>> It would be interesting to run a little experiment and find out what
>>>>>>> the default PATH is on your data nodes.  How much of a pain