Bigtop, mail # user - [VOTE] Bigtop 0.7.0 RC0


Re: [VOTE] Bigtop 0.7.0 RC0
Bruno Mahé 2013-10-28, 09:47
On 10/18/2013 09:54 PM, Roman Shaposhnik wrote:
> This is the seventh release for Apache Bigtop, version 0.7.0
>
> It fixes the following issues:
>    http://s.apache.org/Pkp
>
> *** Please download, test and vote by Fri 10/25 noon PST
>
> Note that we are voting upon the source (tag):
>     release-0.7.0-RC0
>
> Source and binary files:
>    https://repository.apache.org/content/repositories/orgapachebigtop-194/org/apache/bigtop/bigtop/0.7.0/
>
> Binary convenience artifacts:
>     http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.7.0/
>
> Documentation on how to install (just make sure to adjust the repos for 0.7.0):
>   https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+0.6.0
>
> Maven staging repo:
>     https://repository.apache.org/content/repositories/orgapachebigtop-194/
>
> The tag to be voted upon:
>     https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=commit;h=fb628180d289335dcf95641b44482fb680f11573
>
> Bigtop's KEYS file containing PGP keys we use to sign the release:
>     http://svn.apache.org/repos/asf/bigtop/dist/KEYS
>
> Thanks,
> Roman.
>
I am not voting yet since I still have some time, but so far I am
leaning toward a -1.

I am leaning toward a -1 because of
https://issues.apache.org/jira/browse/BIGTOP-1129 and my issues with Hue.
Other than that, everything I tested either just works out of the box or
is a nitpick.
But BIGTOP-1129 is what I would consider a blocker since it is part of
the basic use case of Apache Bigtop.

Things I tested:
* Apache Hadoop and some basic jobs
* Apache HBase and Phoenix. Just basic testing
* Apache Flume sending Apache Hadoop and Apache HBase logs to an
Elasticsearch instance and visualized through Kibana
* Apache Hue smoke tests
* Everything running on OpenJDK 6 on EC2 instances

Things I still want to test (or rather, things I hope I can test by
Tuesday evening):
* Apache Pig and DataFu
* Apache Solr
* Load more data into Phoenix

Things we could do better:
* As described on BIGTOP-1129, I could not stop datanode/namenode
through init scripts.
* We could provide some templates for Apache Hadoop. I wasted a few
hours just to get the pi job running. Thankfully we have the init script
for hdfs (which needs some tweaks for the staging directory) and
templates for the configuration files in our puppet modules
* I enabled short-circuit reads in Apache HBase. Not sure if I missed
something, but I got some
"org.apache.hadoop.security.AccessControlException: Can't continue with
getBlockLocalPathInfo() authorization" exceptions. From reading
http://www.spaggiari.org/index.php/hbase/how-to-activate-hbase-shortcircuit
it seems there are a few things we could do to make it work out of the box.
* Not sure what I did wrong, but although I could access the Hue UI,
most apps I tried were not working. Ex: all shells give me the error
"value 222 for UID is less than the minimum UID allowed (500)". And the
file browser gives me the error "Cannot access: /. Note: You are a Hue
admin but not a HDFS superuser (which is "hdfs").". Note that the first
user I created was a user named "ec2-user". Although it is not an HDFS
superuser, I would expect to have a working equivalent of what I can
browse with the "hdfs dfs -ls" command. Also, creating a Hue user named
"hdfs" yields the same result. Note that I did not have time to dig
further.
* Phoenix directly embeds Apache Hadoop, Apache HBase and Apache
Zookeeper jars. These jars should be symlinks.
* Phoenix required me to delete some old Apache Lucene jars from the
Apache Flume installation directory. From the output of the command "mvn
dependency:tree" on the Flume project, it appears these jars are only
needed for the ElasticSearch and MorphlineSolrSink plugins, but the Flume
documentation for both of these plugins explicitly asks users to provide
the jars of Apache Lucene and Apache Solr/ElasticSearch themselves (since
they may use a different version of Apache Lucene). So the dependency on
Apache Lucene by Apache Flume should probably be marked as "provided"
and we should probably provide some packages to manage these dependencies.
* I still need to figure out why my instance of Hue needs access to
google-analytics.com
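
To make the Phoenix packaging point above concrete, here is a minimal
sketch of how one might list jars that are shipped as real copies rather
than symlinks. It is only an illustration: the jar-name prefixes and the
function name find_embedded_jars are assumptions, not anything taken from
the Bigtop packages, and the directory to scan has to be supplied by
whoever runs it.

```python
import os

# Assumed jar-name prefixes for the projects that should be symlinked
# rather than copied; adjust to match the actual package contents.
BUNDLED_PREFIXES = ("hadoop-", "hbase-", "zookeeper-")


def find_embedded_jars(lib_dir, prefixes=BUNDLED_PREFIXES):
    """Return the sorted names of matching jars in lib_dir that are
    regular files (embedded copies) instead of symlinks."""
    embedded = []
    for name in os.listdir(lib_dir):
        # Only look at jars whose name starts with one of the prefixes.
        if not name.endswith(".jar") or not name.startswith(prefixes):
            continue
        path = os.path.join(lib_dir, name)
        # A symlinked jar is fine; a regular file is an embedded copy.
        if os.path.isfile(path) and not os.path.islink(path):
            embedded.append(name)
    return sorted(embedded)
```

Calling find_embedded_jars() on an installed component's jar directory
returns the file names that would be candidates for replacement by
symlinks; an empty list means every matching jar is already a symlink.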

Other than that, it was an enjoyable experience to use Apache Bigtop
0.7.0 RC0.
Doing SQL queries through Phoenix was pretty impressive and did not
require much work to set up.
Also, seeing Apache Hadoop and Apache HBase logs being shipped by Flume
to ElasticSearch and then being able to query events and create some
dynamic charts in Kibana was exciting!

Also, since I am about to test Apache Solr, is there an equivalent to
Kibana I can use for visualizing my indexed logs?

Thanks,
Bruno