I'm a big user of HCatalog from Pig, but I wish the HCat code used the
RESTful HTTP interface instead of Thrift, because the version
incompatibilities in Thrift are frankly mind-boggling. We get around
them by shading the Thrift classes used by HCat.
Also, we have many scripts that are launched as part of larger
workflows from Oozie - nice, but again we fight with classpath issues
all the time. Oozie at least has some nice classpath isolation between
Hive and Pig actions.
On Wed, Dec 4, 2013 at 10:31 AM, John Omernik <[EMAIL PROTECTED]> wrote:
> This can be an interesting subject; I know orgs that are all over the map on
> this question. I'd be interested in hearing what you use, how it works for you,
> and what you wish you had in your interface.
> I'll start:
> We've used a number of things:
> - CLI for scheduled jobs. Pros: Solid running, fairly bug free. Cons: not
> suited to data analysis, clunky in that regard.
> - SQL Squirrel via JDBC: Pros: Supported platform. Some nice analysis
> features (keeping old results, sorting results once obtained, keeping
> old queries). Cons: Buggy with Hive, sometimes it just crashes for no reason,
> can be frustrating with lots of tabs, hard to extend and add little features
> for how you work (from my perspective).
> - Custom web-based tools: Pros: designed around how we interact with our
> data. Cons: no support, and it currently has memory leak issues, etc.
> - Apache Hue/Beeswax: Just starting to look into this now.
> I'd be curious about what you are using and the challenges/wins you've had.