Bigtop >> mail # user >> Getting Started Guide? (and some installation issues)


Re: Getting Started Guide? (and some installation issues)
Also, a good strategy with Bigtop is to "start small": pick your use case, such as

- I want to run Bigtop's smoke tests against my Hadoop cluster for Mahout, Hive and Pig to ensure that everything is working properly

Or

- I want to create custom rpm installs for my own hadoop and hive distributions

Or

- I want a VM for my developers with Hadoop, MR2 and the latest HBase version.

Generally I've found that there is usually support for specific problems on this mailing list. Sometimes you have to ask twice though because people get busy :)...

> On Nov 21, 2013, at 2:37 AM, Bruno Mahé <[EMAIL PROTECTED]> wrote:
> See inline.
>
>> On 11/20/2013 06:10 PM, Sean Mackrory wrote:
>> >> Is there a ‘getting started’ guide?
>>
>> Beyond just installation, most of our documentation is very
>> developer-centric, I'm afraid. What there is can be found on our wiki:
>> https://cwiki.apache.org/confluence/display/BIGTOP/Index
>>
>> >> Something that will describe the filesystem and configuration file
>> conventions?
>>
>> Bigtop is a distribution of other open-source projects, so there is no
>> single configuration system. The file conventions will vary from project
>> to project, however Bigtop does not modify much about how the
>> configuration files work, so I would refer you to the upstream projects
>> for details of their configuration files (e.g. http://hadoop.apache.org,
>> http://hbase.apache.org)
>
>
> I would also like to point out that while it's true that each project has its own way of being configured, Apache Bigtop packages projects in ways which should be familiar to GNU/Linux users and sysadmins.
> For instance, service scripts are provided and available through the usual means, all the config files are in /etc/<project>/, all the logs are in /var/log/<project>/...
> Most projects' files can also be found in /usr/lib/<project>/
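
A quick way to check these conventions on a given machine is sketched below. The function name show_layout is made up for illustration, and the three paths are simply the ones mentioned above; which directories actually exist depends on the packages installed.

```shell
#!/bin/sh
# Sketch: report which of the conventional Bigtop directories exist
# for a given project. "show_layout" is a hypothetical helper name.
show_layout() {
  project=$1
  for dir in "/etc/$project" "/var/log/$project" "/usr/lib/$project"; do
    if [ -d "$dir" ]; then
      echo "$dir: present"
    else
      echo "$dir: missing"
    fi
  done
}

# Example: inspect the layout for the hadoop packages.
show_layout hadoop
```

Running it with another component name (for instance hbase or hive) should show the same three-directory pattern if that component is installed.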
>
>
>
>
>> >> In particular the existence of these conf.empty directories is
>> confusing.
>>
>> The conf.dist and conf.empty directories provide some default or
>> template configuration files. You should create a directory at the same
>> level for your own configuration. Perhaps "conf.steven". There is a
>> symlink for each component at /etc/<component>/conf. This symlink,
>> through a system called "alternatives", eventually points to the
>> currently active configuration for that component. Once you have
>> modified the configuration to suit your needs, you can make it the
>> active configuration using the alternatives command. See here for its
>> documentation:
>> http://linux.about.com/library/cmd/blcmdl8_alternatives.htm. For
>> example, if you look at the /etc/hadoop/conf symlink, you will probably
>> find that it points to /etc/alternatives/hadoop-conf. You can see how
>> the alternatives are configured and point the configuration to your new
>> folder like this:
>>
>> alternatives --display hadoop-conf
>> alternatives --set hadoop-conf /etc/hadoop/conf.steven
>>
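
The two commands above assume the new directory is already known to the alternatives system. As far as I know, --set only accepts a path that has first been registered with --install, so a fuller sequence might look like the sketch below. The name conf.steven, the priority 50, and the choice of conf.empty as the template are illustrative assumptions, and on Debian-style systems the command is update-alternatives rather than alternatives. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: register and activate a custom configuration directory.
# "conf.steven" and the priority 50 are illustrative values.

# Print each command by default; set DRY_RUN=0 (as root) to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

activate_conf() {
  component=$1   # e.g. hadoop
  suffix=$2      # e.g. steven (hypothetical name)
  new_dir="/etc/$component/conf.$suffix"

  # Start from the packaged template rather than an empty directory.
  run cp -r "/etc/$component/conf.empty" "$new_dir"
  # Register the directory as an alternative for the conf symlink.
  run alternatives --install "/etc/$component/conf" "$component-conf" "$new_dir" 50
  # Make it the active configuration.
  run alternatives --set "$component-conf" "$new_dir"
}

activate_conf hadoop steven
```

After a real run, "readlink -f /etc/hadoop/conf" should resolve to the new directory if the switch took effect.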
>> >> Is Hue supposed to be configured separately, or is BigTop supposed
>> to do that?
>>
>> As I recall, the misconfigurations that are reported at startup are
>> things like services not running (like Oozie, etc.). Once you configure
>> and start those services, these warnings should disappear. For other
>> warnings, post them here and we'll see if we can help you.
>>
>> >> What is the target time to set-up a Hadoop installation via BigTop?
>>
>> Not sure what to tell you here. I regularly set up pseudo-distributed
>> Hadoop installations in minutes with little more than "yum install
>> hadoop-conf-pseudo", "sudo service hadoop-hdfs-namenode init" and a
>> reboot. If you're using a bunch of other services on a fully-distributed
>> cluster and you're completely new to this, I would expect it to take hours
>> / days to get everything running. Bigtop also maintains puppet code that
>> will configure everything with a pretty good default configuration and
>> have your cluster working pretty much out-of-the-box. Maybe this is a