You can definitely do this. As a very basic solution, you can FTP the log
files to the local Linux file system (LFS) and then do a copyFromLocal into
HDFS. Create a Hive table with the appropriate regex support and load the
data in. Hive has classes that support parsing and loading Apache log files
into Hive tables.
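
To make the regex idea concrete, here is a minimal Python sketch of the kind
of pattern you would hand to Hive's regex-based SerDe. The pattern, the field
names and the sample line are my own assumptions for the Apache Common Log
Format; adjust them to whatever format your servers actually write.

```python
import re

# Assumed layout (Apache Common Log Format):
#   host ident user [time] "request" status size
# The pattern and field names are illustrative, not taken from any Hive class.
LOG_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)$'
)

FIELDS = ("host", "identity", "user", "time", "request", "status", "size")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


# Hypothetical sample line in Common Log Format.
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
print(parse_line(sample))
```

Each capture group would map to one column of the Hive table, so it is worth
testing the pattern on a few real lines before creating the table.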
For the entire data transfer, you just need to write a shell script. Log
analysis won't be real time, right? So you can schedule the job with a
scheduler like cron, or, to run it in conjunction with Hadoop jobs, you can
use a workflow management tool from the Hadoop ecosystem.
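
A rough sketch of that transfer script follows, written in Python rather than
shell so the command construction is easy to check. The local path, HDFS
directory and table name are hypothetical placeholders, and the script only
prints the commands unless you switch off dry_run.

```python
import subprocess


def build_commands(local_path, hdfs_dir, table):
    """Build the two CLI calls: copy the file into HDFS, then load it into Hive."""
    file_name = local_path.replace("\\", "/").rsplit("/", 1)[-1]
    hdfs_path = hdfs_dir.rstrip("/") + "/" + file_name
    copy_cmd = ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_path]
    load_stmt = "LOAD DATA INPATH '{}' INTO TABLE {}".format(hdfs_path, table)
    load_cmd = ["hive", "-e", load_stmt]
    return [copy_cmd, load_cmd]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical values: replace with your own file, staging directory and table.
commands = build_commands(
    "/data/incoming/access.log", "/user/hive/staging", "apache_logs"
)
run(commands)
```

Scheduling it is then a single cron entry that calls the script at whatever
interval suits your log rotation.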
On Wed, Oct 5, 2011 at 3:43 PM, Aditya Singh30
> We want to use Hadoop and Hive to store and analyze some web servers' log
> files. The servers are running on the Windows platform. As noted for
> Hadoop, it is only supported for development on Windows. I wanted to know
> whether there is a way to run the Hadoop server (namenode server) and
> cluster nodes on Linux, and have an interface through which we can send
> files and run analysis queries from the web servers' Windows environment.
> I would really appreciate it if you could point me in the right direction.
> Aditya Singh
> Infosys. India