Hive, mail # user - Best practice for automating jobs


Tom Brown 2013-01-10, 22:03
Re: Best practice for automating jobs
Qiang Wang 2013-01-11, 01:31
I believe the HWI (Hive Web Interface) can give you a hand.

https://github.com/anjuke/hwi

You can use the HWI to submit and run queries concurrently.
Partition management can be achieved by creating crontabs using the HWI.

It's simple and easy to use. Hope it helps.
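
For anyone not running HWI, a plain crontab entry that calls the hive
command-line client gives the same scheduled partition management. A minimal
sketch, assuming a hypothetical external table web_logs partitioned by a
string column dt, with data already landing in HDFS under
/data/web_logs/dt=YYYY-MM-DD (table, column, and paths are placeholders; note
that cron requires % to be escaped as \% and the entry is a single line):

  # m h dom mon dow  command
  5 0 * * *  d=$(date -d yesterday +\%Y-\%m-\%d); /usr/bin/hive -e "ALTER TABLE web_logs ADD IF NOT EXISTS PARTITION (dt='$d') LOCATION '/data/web_logs/dt=$d'" >> /var/log/hive_add_partition.log 2>&1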

Regards,
Qiang
2013/1/11 Tom Brown <[EMAIL PROTECTED]>

> All,
>
> I want to automate jobs against Hive (using an external table with
> ever growing partitions), and I'm running into a few challenges:
>
> Concurrency - If I run Hive as a thrift server, I can only safely run
> one job at a time. As such, it seems like my best bet will be to run
> it from the command line and set up a brand new instance for each job.
> That's quite a bit of hassle to solve a seemingly common problem, so
> I want to know if there are any accepted patterns or best practices
> for this?
>
> Partition management - New partitions will be added regularly. If I
> have to set up multiple instances of Hive for each (potentially)
> overlapping job, it will be difficult to keep track of the partitions
> that have been added. In the context of the preceding question, what
> is the best way to add metadata about new partitions?
>
> Thanks in advance!
>
> --Tom
>
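
To illustrate the "brand new instance for each job" approach from the quoted
message, here is a minimal shell sketch. The job files daily_rollup.hql and
error_report.hql, the web_logs table, and all paths are hypothetical
placeholders; each hive CLI invocation is its own process with its own
session, so the jobs do not contend for a single shared Thrift server:

  #!/bin/bash
  # Run the day's Hive jobs concurrently, one hive CLI process per job.
  DT=$(date -d yesterday +%Y-%m-%d)
  mkdir -p logs

  # Register the new partition once, up front, so every job sees it.
  hive -e "ALTER TABLE web_logs ADD IF NOT EXISTS PARTITION (dt='${DT}')
           LOCATION '/data/web_logs/dt=${DT}'"

  # Launch each job in the background as its own hive process.
  for job in daily_rollup.hql error_report.hql; do
      hive --hiveconf dt="${DT}" -f "${job}" \
          > "logs/${job%.hql}_${DT}.log" 2>&1 &
  done

  wait   # return only after all background hive jobs have finished

Inside the .hql files the date passed with --hiveconf can be read back via
variable substitution, e.g. WHERE dt = '${hiveconf:dt}'.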

Also in this thread:
  Tom Brown 2013-01-11, 02:55
  Qiang Wang 2013-01-11, 03:06
  Tom Brown 2013-01-11, 03:17
  Qiang Wang 2013-01-11, 03:22
  Sean McNamara 2013-01-10, 22:11
  Dean Wampler 2013-01-10, 22:30
  Alexander Alten-Lorenz 2013-01-11, 07:23
  Manish Malhotra 2013-01-11, 18:56
  Tom Brown 2013-01-11, 22:58