Hive user mailing list — Best practice for automating jobs

Re: Best practice for automating jobs
I believe HWI (the Hive Web Interface) can help here.

You can use HWI to submit and run queries concurrently.
Partition management can be handled by setting up cron jobs that
submit the maintenance queries through HWI.

It's simple and easy to use. Hope it helps.
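As a rough sketch of the cron-driven partition-management idea (the table name `web_logs`, partition column `dt`, and HDFS path are hypothetical, and this invokes the Hive CLI rather than HWI for brevity):

```shell
#!/bin/sh
# Hypothetical sketch: build the DDL that registers today's partition
# for an external table. Table name, partition column, and HDFS layout
# are all assumptions, not taken from the thread.
DT=$(date +%Y-%m-%d)
HQL="ALTER TABLE web_logs ADD IF NOT EXISTS PARTITION (dt='${DT}') LOCATION '/data/web_logs/dt=${DT}';"
echo "$HQL"

# A crontab entry running this daily at 00:15 could look like:
# 15 0 * * * /usr/bin/hive -e "ALTER TABLE web_logs ADD IF NOT EXISTS PARTITION (dt='$(date +\%Y-\%m-\%d)') LOCATION '/data/web_logs/dt=$(date +\%Y-\%m-\%d)'"
```

`ADD IF NOT EXISTS` makes the job safe to re-run, so overlapping or retried cron invocations don't fail on an already-registered partition.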

2013/1/11 Tom Brown <[EMAIL PROTECTED]>

> All,
> I want to automate jobs against Hive (using an external table with
> ever growing partitions), and I'm running into a few challenges:
> Concurrency - If I run Hive as a thrift server, I can only safely run
> one job at a time. As such, it seems like my best bet will be to run
> it from the command line and set up a brand new instance for each job.
> That's quite a bit of hassle to solve a seemingly common problem, so
> I want to know if there are any accepted patterns or best practices
> for this?
> Partition management - New partitions will be added regularly. If I
> have to set up multiple instances of Hive for each (potentially)
> overlapping job, it will be difficult to keep track of the partitions
> that have been added. In the context of the preceding question, what
> is the best way to add metadata about new partitions?
> Thanks in advance!
> --Tom