Re: [Discuss] project chop up
Summary from hive-irc channel. Minor edits for spell check/grammar.

The last 10 lines are a summary of the key points.

[10:59:17] <ecapriolo> noland: et al. Do you want to talk about hive in
[11:01:06] smonchi [~
Quit: ... 'cause there is no patch for human stupidity ...
[11:10:04] <noland> ecapriolo: yeah that sounds good to me!
[11:10:22] <noland> I saw you created the jira but haven't had time to look
[11:10:32] <ecapriolo> So I found a few things
[11:10:49] <ecapriolo> In common there are one or two tests that actually
fork a process :)
[11:10:56] <ecapriolo> and use build.test.resources
[11:11:12] <ecapriolo> Some serde code uses some methods from ql in testing
[11:11:27] <ecapriolo> and shims really needs a separate hadoop test shim
[11:11:32] <ecapriolo> But that is all simple stuff
[11:11:47] <ecapriolo> The biggest problem is I do not know how to solve
shims with maven
[11:11:50] <ecapriolo> do you have any ideas
[11:11:52] <ecapriolo> ?
[11:13:00] <noland> That one is going to be a challenge. It might be that
in that section we have to drop down to ant
[11:14:44] <noland> Is it a requirement that we build both the .20 and .23
shims for a "package" as we do today?
[11:16:46] <ecapriolo> I was thinking we can do it like a JDBC driver
[11:16:59] <ecapriolo> We separate out the interface of shims
[11:17:22] <ecapriolo> And then at runtime we drop in a driver implementing
[11:17:34] Wertax [~[EMAIL PROTECTED]] has quit IRC: Remote host
closed the connection
[11:17:36] <ecapriolo> That or we could use maven's profile system
[11:18:09] <ecapriolo> It seems that everything else can actually link
against hadoop-0.20.2 as a provided dependency
[11:18:37] <noland> Yeah either would work. The driver method would
probably require us to use ant to build both the drivers?
[11:18:44] <noland> I am a fan of mvn profiles
[11:19:05] <ecapriolo> I was thinking we kinda separate the shim out into
its own project, not a module
[11:19:10] <ecapriolo> to achieve that JDBC thing
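
The JDBC-style idea discussed above (a version-neutral shim interface, with a version-specific implementation dropped in at runtime) might look roughly like the following minimal Java sketch. All names here (ShimSketch, ShimDriver, ShimRegistry, Shim20, Shim23) are hypothetical and are not taken from the Hive code base. The sketch registers drivers by hand; a real setup would more likely discover them from jars on the classpath, for example with java.util.ServiceLoader.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ShimSketch {

    /** Hypothetical version-neutral interface; other modules would compile only against this. */
    public interface ShimDriver {
        /** Returns true if this driver supports the given Hadoop version string. */
        boolean accepts(String hadoopVersion);

        /** Human-readable name, used here only to show which driver was picked. */
        String name();
    }

    /** Hypothetical registry, loosely modeled on how JDBC drivers are looked up. */
    public static final class ShimRegistry {
        private static final List<ShimDriver> DRIVERS = new CopyOnWriteArrayList<>();

        private ShimRegistry() {}

        public static void register(ShimDriver driver) {
            DRIVERS.add(driver);
        }

        /** Returns the first registered driver that accepts the version. */
        public static ShimDriver forVersion(String hadoopVersion) {
            for (ShimDriver driver : DRIVERS) {
                if (driver.accepts(hadoopVersion)) {
                    return driver;
                }
            }
            throw new IllegalStateException("No shim registered for Hadoop " + hadoopVersion);
        }
    }

    /** Assumed implementation for the 0.20 line; would live in its own jar. */
    public static final class Shim20 implements ShimDriver {
        @Override public boolean accepts(String v) { return v.startsWith("0.20"); }
        @Override public String name() { return "shim-0.20"; }
    }

    /** Assumed implementation for the 0.23 line; would live in its own jar. */
    public static final class Shim23 implements ShimDriver {
        @Override public boolean accepts(String v) { return v.startsWith("0.23"); }
        @Override public String name() { return "shim-0.23"; }
    }

    public static void main(String[] args) {
        ShimRegistry.register(new Shim20());
        ShimRegistry.register(new Shim23());
        // The caller only knows the interface; the concrete class is chosen at runtime.
        System.out.println(ShimRegistry.forVersion("0.20.2").name());
    }
}
```

With this shape, only the shim project itself would need both Hadoop versions at build time, which fits the remark that everything else can link against a single Hadoop version as a provided dependency.
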
[11:19:27] <ecapriolo> But I do not have a solution yet, I was looking to
farm that out to someone smart...like you :)
[11:19:33] <noland> :)
[11:19:47] <ecapriolo> All I know is that we need a test shim because
HadoopShim requires hadoop-test jars
[11:20:10] <ecapriolo> then the Mini stuff is only used in qtest anyway
[11:20:48] <ecapriolo> Is this something you want to help with? I was
thinking of spinning up a github
[11:20:50] <noland> I think that the separate projects would work and
perhaps nicely.
[11:21:01] <noland> Yeah I'd be interested in helping!
[11:21:17] <noland> But I am going on vacation starting next week for about
10 days
[11:21:27] <ecapriolo> Ah cool where are you going?
[11:21:37] <noland> Netherlands
[11:21:42] <noland> Biking around and such
[11:23:52] <noland> The one thing I was thinking about with regards to a
branch is keeping history. We'll want to keep history for the files but
AFAICT svn doesn't understand git mv.
[11:24:16] Wertax [~[EMAIL PROTECTED]] has joined #hive
[11:31:19] jeromatron [~[EMAIL PROTECTED]] has
quit IRC: Quit: My MacBook Pro has gone to sleep. ZZZzzz…
[11:35:49] <ecapriolo> noland: Right I do not plan to suggest that we will
do this in git
[11:36:11] <ecapriolo> I just see that we are going to have to hack stuff
up and it is not the type of work that lends itself well to branches.
[11:36:17] <noland> Ahh ok
[11:36:56] <ecapriolo> Once we come up with a solution for the shims, and
we have something that can reasonably build and test hive we can figure out
how to apply that to a branch/trunk
[11:36:58] <noland> yeah so just do a POC on github and then implement on
[11:37:05] <noland> cool
[11:37:29] <ecapriolo> Along the way we can probably find things that we
can do like that common test I found and other minor things
[11:37:41] <noland> sounds good
[11:37:50] <ecapriolo> Those we can likely just commit into the current
trunk and I will file issues for those now
[11:37:58] <noland> cool
[11:38:41] <ecapriolo> But yea man. I just can't take the project as it is
[11:38:51] <ecapriolo> in eclipse every time I touch a file it rebuilds
[11:38:53] <ecapriolo> It's like WTF
[11:39:09] <ecapriolo> Running one test takes like 3 minutes
[11:39:12] <ecapriolo> it's out of control
[11:39:23] <noland> LOL
[11:39:29] <noland> I agree 110%
[11:39:32] <ecapriolo> eclipse was not always like that I am not sure how
the hell it happened
[11:39:51] <noland> The eclipse sep thing is so harmful
[11:40:08] <noland> dep thing that is
[11:40:12] <ecapriolo> I mean command line ant was always bad, but you used
to be able to work in eclipse without having to rebuild everything every
[11:40:39] <noland> Yeah the first thing I do these days is disable the ant
[11:40:52] <ecapriolo> Ow... I did not really know that was a thing
[11:40:55] <noland> it starts compiling while you are still working and
blocks for minutes
[11:41:02] <ecapriolo> Right that is what I mean
[11:41:11] <ecapriolo> Everyone has like 10 hacks to work on the project
[11:41:14] <noland> yeah you can remove it in project…one sec
[11:41:17] <ecapriolo> perm gen
[11:41:20] <ecapriolo> ant builder
[11:41:32] <noland> project -> properties -> builders
[11:41:34] <ecapriolo> hive does not build offline anymore
[11:41:37] <noland> yeah
[11:41:47] <ecapriolo> I'm not sure when this stuff went bad, but it has
gotten really really bad
[11:42:09] <ecapriolo> Also what I plan on doing is stripping out
[11:42:25] <ecapriolo> like serde has all this thrift and avro stuff to
support custom formats
[11:42:30] <ecapriolo> that is going into its own module
[11:42:43] <ecapriolo> Going to rip out all the udfs except between and or.
[11:43:50] <noland> yeah it'd be nice to have those items in their own
modules so you can just build/test them when you want
[11:44:12] <ecapriolo> hbase zookeepe