Missing logs in hbase because of same timestamp
I noticed that TsProcessor uses the timestamp as the key when putting logs
into HBase. But my logs arrive so fast that several of them share the same
timestamp, like this:

2012-01-20 20:03:14,041 [INFO] [communication thread]
[org.apache.hadoop.mapred.LocalJobRunner.statusUpdate()] 10 threads, 28
requests, 0 errors, 0 forbidden, 0.6 pages/s, 80 kb/s,
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.doWork()] -activeThreads=10,
spinWaiting=7, fetchQueues.totalSize=649
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.feedQueueManager()] feeding 649 input
urls ...
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.logHeapUsage()] Fetcher feeding queue
manager. Heap usage: 327668152 out of 932118528 bytes.

I think that because of this the entries are being collapsed, so only one log
is kept for a given timestamp.
Any idea how to fix this?
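
To make the suspected collision concrete, here is a minimal Java sketch. It
does not use Chukwa or HBase code: a TreeMap stands in for the table (one
value per row key), and the class and method names are hypothetical. The
second method shows one possible workaround, appending a per-timestamp
sequence number to the key; this is an assumption about what could help, not
how TsProcessor actually works.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a TreeMap simulates a table that holds one value per row key.
class RowKeyCollisionSketch {

    // Extracts the leading "yyyy-MM-dd HH:mm:ss,SSS" timestamp (23 characters).
    static String timestampOf(String logLine) {
        return logLine.substring(0, 23);
    }

    // Keys each line by its timestamp alone: a later line with the same
    // timestamp replaces the earlier one.
    static Map<String, String> storeByTimestamp(List<String> lines) {
        Map<String, String> table = new TreeMap<>();
        for (String line : lines) {
            table.put(timestampOf(line), line);
        }
        return table;
    }

    // Keys each line by timestamp plus a zero-padded counter of how many
    // lines with that timestamp were seen before, so keys stay unique.
    static Map<String, String> storeByTimestampAndSequence(List<String> lines) {
        Map<String, String> table = new TreeMap<>();
        Map<String, Integer> seen = new HashMap<>();
        for (String line : lines) {
            String ts = timestampOf(line);
            int seq = seen.merge(ts, 1, Integer::sum) - 1;
            table.put(String.format("%s-%04d", ts, seq), line);
        }
        return table;
    }

    public static void main(String[] args) {
        // Shortened versions of the log lines quoted above.
        List<String> lines = Arrays.asList(
            "2012-01-20 20:03:14,041 [INFO] statusUpdate()",
            "2012-01-20 20:03:14,852 [INFO] doWork()",
            "2012-01-20 20:03:14,852 [INFO] feedQueueManager()",
            "2012-01-20 20:03:14,852 [INFO] logHeapUsage()");
        System.out.println("timestamp key only: " + storeByTimestamp(lines).size() + " rows");
        System.out.println("timestamp + sequence: " + storeByTimestampAndSequence(lines).size() + " rows");
    }
}
```

With the four sample lines, the timestamp-only map ends up with two entries,
while the timestamp-plus-sequence map keeps all four. The zero-padded suffix
keeps keys for the same timestamp sorted in arrival order.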


View this message in context: http://apache-chukwa.679492.n3.nabble.com/Missing-logs-in-hbase-because-of-same-timestamp-tp3677271p3677271.html
Sent from the Chukwa - Users mailing list archive at Nabble.com.