Hive, mail # user - Is there a mechanism similar to hadoop -archive in hive (add archive is not apparently)

Stephen Boesch 2013-06-20, 12:32
We have a few dozen files that need to be made available to all
mappers/reducers in the cluster while running Hive transformation steps.

It seems that "add archive" does not unpack the archive's entries and make
them available directly on the default file path, which is what we are
looking for.

To illustrate:

   add file modelfile.1;
   add file modelfile.2;
   add file modelfile.N;

Then our model, which is invoked during the transformation step, *does* have
correct access to its model files in the default path.

But those model files take a few *minutes* to load in total.

Instead, when we try:
   add archive modelArchive.tgz;

the problem is that the archive apparently does not get unpacked.

For example, I have an archive that contains shell scripts under a "hive"
directory stored inside it. I am *not* able to access hive/my-shell-script.sh
after adding the archive. Specifically, the following fails:

$ tar -tvf appm*.tar.gz | grep launch-quixey_to_xml
-rwxrwxr-x stephenb/stephenb    664 2013-06-18 17:46

from (select transform (aappname,qappname)
using 'hive/parse_qx.py' as (aappname2 string, qappname2 string) from
eqx ) o insert overwrite table c select o.aappname2, o.qappname2;

Cannot run program "hive/parse_qx.py": java.io.IOException: error=2,
No such file or directory
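
One way to narrow this down would be to run a small diagnostic script as the transform and have it report what the task's working directory actually contains. The following is only a minimal sketch, assuming Python is available on the task nodes; `list_working_dir`, `main`, and the depth limit are hypothetical names chosen for illustration and are not part of Hive. If the archive turns out to be unpacked under a directory named after the archive file itself (an assumption to verify, not something established above), the listing would show the scripts under that prefix rather than directly under "hive/".

```python
import os
import sys


def list_working_dir(root=".", max_depth=3):
    """Return sorted paths under root (relative to root), at most max_depth levels deep."""
    root = os.path.abspath(root)
    found = []
    # followlinks=True because cached resources may be exposed as symlinks (assumption).
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        rel_dir = os.path.relpath(dirpath, root)
        depth = 0 if rel_dir == "." else rel_dir.count(os.sep) + 1
        for name in dirnames + filenames:
            found.append(os.path.normpath(os.path.join(rel_dir, name)))
        if depth + 1 >= max_depth:
            dirnames[:] = []  # stop os.walk from descending any further
    return sorted(found)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Drain the input rows, then emit one working-directory path per output row."""
    for _ in stdin:
        pass
    for path in list_working_dir():
        stdout.write(path + "\n")
```

To use it, the script would end with a call to `main()`, be shipped with `add file`, and be named in the `using` clause with a single string output column. Every mapper prints its own listing, so a small input table keeps the output readable.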
Stephen Sprague 2013-06-20, 14:50
Stephen Boesch 2013-06-20, 15:37
Stephen Sprague 2013-06-20, 15:58
Stephen Boesch 2013-06-20, 16:00
Stephen Sprague 2013-06-20, 16:15
Stephen Boesch 2013-06-20, 16:28
Ramki Palle 2013-06-20, 16:56
Stephen Boesch 2013-06-20, 17:23
Stephen Boesch 2013-06-20, 15:57