Chukwa, mail # user - Missing logs in hbase because of same timestamp


Missing logs in hbase because of same timestamp
Abhijit Dhar 2012-01-21, 04:20
I noticed that TsProcessor uses the timestamp as the key when putting logs
into HBase. But my logs arrive so fast that several of them share the same
timestamp, like this:

2012-01-20 20:03:14,041 [INFO] [communication thread] [org.apache.hadoop.mapred.LocalJobRunner.statusUpdate()] 10 threads, 28 requests, 0 errors, 0 forbidden, 0.6 pages/s, 80 kb/s,
2012-01-20 20:03:14,852 [INFO] [Thread-274] [jcrawler.fetch.mapreduce.FetchMapper.doWork()] -activeThreads=10, spinWaiting=7, fetchQueues.totalSize=649
2012-01-20 20:03:14,852 [INFO] [Thread-274] [jcrawler.fetch.mapreduce.FetchMapper.feedQueueManager()] feeding 649 input urls ...
2012-01-20 20:03:14,852 [INFO] [Thread-274] [jcrawler.fetch.mapreduce.FetchMapper.logHeapUsage()] Fetcher feeding queue manager. Heap usage: 327668152 out of 932118528 bytes.
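
To make the collision concrete, here is a small sketch that counts how many of the sample entries above share a leading timestamp. The regex and the function name `timestamp_counts` are illustrative assumptions, not part of Chukwa or TsProcessor.

```python
import re
from collections import Counter

# The sample entries from the message, as pasted (long entries wrap).
SAMPLE = """\
2012-01-20 20:03:14,041 [INFO] [communication thread]
[org.apache.hadoop.mapred.LocalJobRunner.statusUpdate()] 10 threads, 28
requests, 0 errors, 0 forbidden, 0.6 pages/s, 80 kb/s,
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.doWork()] -activeThreads=10,
spinWaiting=7, fetchQueues.totalSize=649
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.feedQueueManager()] feeding 649 input
urls ...
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.logHeapUsage()] Fetcher feeding queue
manager. Heap usage: 327668152 out of 932118528 bytes.
"""

# Assumed layout: "yyyy-MM-dd HH:mm:ss,SSS" at the start of each entry.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def timestamp_counts(text):
    """Count log entries per millisecond timestamp.

    Wrapped continuation lines do not start with a timestamp and are skipped.
    """
    counts = Counter()
    for line in text.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for ts, n in sorted(timestamp_counts(SAMPLE).items()):
        print(ts, n)
```

Any timestamp with a count above one is a candidate for being overwritten if the timestamp alone is used as the key.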

I think that because of this, entries with the same timestamp get collapsed
and only one log is kept for a given timestamp.
Any idea how to fix this?
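
The suspected behaviour can be sketched with a plain dict standing in for the HBase table, assuming (as described above) that the key is derived from the timestamp alone. The second function shows one possible workaround, appending a per-timestamp sequence number to the key. Both function names and the key format are hypothetical; this is not how TsProcessor is implemented and not necessarily the fix the Chukwa developers would recommend.

```python
# (timestamp, body) pairs taken from the sample log entries; the bodies are
# shortened to the logging method name for readability.
RECORDS = [
    ("2012-01-20 20:03:14,041", "LocalJobRunner.statusUpdate()"),
    ("2012-01-20 20:03:14,852", "FetchMapper.doWork()"),
    ("2012-01-20 20:03:14,852", "FetchMapper.feedQueueManager()"),
    ("2012-01-20 20:03:14,852", "FetchMapper.logHeapUsage()"),
]


def store_by_timestamp(records):
    """Hypothetical: key each record by its timestamp only."""
    table = {}
    for ts, body in records:
        # A later record with the same key replaces the earlier one.
        table[ts] = body
    return table


def store_with_sequence(records):
    """Hypothetical workaround: disambiguate equal timestamps with a counter."""
    table = {}
    seen = {}  # timestamp -> number of records already stored for it
    for ts, body in records:
        seq = seen.get(ts, 0)
        seen[ts] = seq + 1
        table[f"{ts}-{seq:04d}"] = body
    return table


if __name__ == "__main__":
    print(len(store_by_timestamp(RECORDS)), "rows with timestamp-only keys")
    print(len(store_with_sequence(RECORDS)), "rows with timestamp+sequence keys")
```

A counter like this is only unique within a single writer. With several writers, something such as a source identifier or a hash of the record body would also have to go into the key.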

Thanks,

--
View this message in context: http://apache-chukwa.679492.n3.nabble.com/Missing-logs-in-hbase-because-of-same-timestamp-tp3677271p3677271.html
Sent from the Chukwa - Users mailing list archive at Nabble.com.
Eric Yang 2012-01-21, 04:31
Abhijit Dhar 2012-01-26, 03:16
Abhijit Dhar 2012-01-26, 03:50
Eric Yang 2012-01-26, 05:39