Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
MapReduce >> mail # user >> any suggestions on IIS log storage and analysis?


Re: any suggestions on IIS log storage and analysis?
You can run a MapReduce job first to join these data sets into a single data set,
then analyze the joined data set.
On Mon, Dec 30, 2013 at 3:58 PM, Fengyun RAO <[EMAIL PROTECTED]> wrote:

> Hi,
>
> HDFS splits files into blocks, and MapReduce runs a map task for each
> block. However, the fields can change within an IIS log file, which means
> the meaning of the lines in one block may depend on another block, and that
> makes the raw files unsuitable for a MapReduce job. It seems there should be
> a preprocessing step before storing and analyzing the IIS log files. We plan
> to parse each line into the same set of fields and store the result in
> compressed Avro files. Are there any other alternatives? HBase? Or any other
> suggestions on analyzing IIS log files?
>
> thanks!
>
>
>
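
The preprocessing step described in the quoted message (parsing each line into the same fields before storage) could look roughly like the sketch below. It assumes the logs are in the W3C extended format, where a `#Fields:` directive names the columns for the lines that follow it; the function name `normalize_w3c_log` and the `TARGET_FIELDS` schema are hypothetical and chosen only for illustration. Writing the records to Avro is left out.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical target schema: pick whichever fields the analysis needs.
TARGET_FIELDS = ["date", "time", "cs-method", "cs-uri-stem", "sc-status"]


def normalize_w3c_log(
    lines: Iterable[str], target_fields: List[str] = TARGET_FIELDS
) -> Iterator[Dict[str, str]]:
    """Yield one dict per data line, always with the same keys.

    The file must be read sequentially from its start, because each
    #Fields directive applies to every data line after it.
    """
    current_fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only the #Fields directive changes how data lines are read.
            if line.startswith("#Fields:"):
                current_fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if not current_fields or len(values) != len(current_fields):
            # No directive seen yet, or a malformed line: skip it.
            continue
        record = dict(zip(current_fields, values))
        # Fields absent from the current layout are filled with "-".
        yield {name: record.get(name, "-") for name in target_fields}
```

Once every record carries its own field names, a record no longer depends on a directive that may sit in a different HDFS block, so the normalized output can be split freely across map tasks.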