HBase, mail # user - Getting less write throughput due to more number of columns


Ankit Jain 2013-03-25, 16:49
RE: Getting less write throughput due to more number of columns
Anoop Sam John 2013-03-26, 06:28
When the number of columns (qualifiers) is large, yes, it can impact performance. In HBase, storage everywhere is in terms of KVs (KeyValues). The key is something like rowkey+cfname+columnname+TS...

So when you have 26 cells in a Put, many bytes in the key are repeated (one KV per column), and you end up transferring more data. Within the memstore, more data (actual KV size) gets written, leading to more frequent flushes, etc.
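To make that concrete, here is a rough back-of-the-envelope sketch in plain Java (no HBase dependency; the byte counts are illustrative assumptions, not the exact KeyValue encoding): each cell's key repeats the row key, family name, qualifier and timestamp, so 26 cells carry roughly 26x that fixed key overhead for the same 2 KB of payload.

```java
public class KvOverhead {
    // Illustrative sizes (assumptions, not the exact HBase KeyValue layout):
    static final int ROW_KEY   = 16; // row key bytes
    static final int FAMILY    = 2;  // column family name bytes
    static final int QUALIFIER = 8;  // average qualifier name bytes
    static final int TS_TYPE   = 9;  // timestamp (8) + key type (1)

    // Key bytes written for one row split across `cells` cells.
    static int keyBytes(int cells) {
        return cells * (ROW_KEY + FAMILY + QUALIFIER + TS_TYPE);
    }

    public static void main(String[] args) {
        int payload = 2048; // same 2 KB of value data in both cases
        System.out.println("1 cell  : " + (payload + keyBytes(1)) + " bytes");  // 2083
        System.out.println("26 cells: " + (payload + keyBytes(26)) + " bytes"); // 2958
    }
}
```

With these (assumed) sizes the 26-column row writes roughly 40% more bytes for the same payload, which matches the direction of the throughput difference reported below.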

Have a look at Intel Panthera Document Store impl.
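Along the same lines as a document-store layout, one common workaround is to serialize all fields into a single cell, so a record becomes one KV instead of 26. A minimal length-prefixed packing sketch in plain Java (the encoding and field names here are illustrative assumptions, not Panthera's format):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SingleCellCodec {

    // Pack all fields into one value blob so the whole record can be stored
    // as a single KV instead of one KV per column (length-prefixed encoding).
    static byte[] pack(Map<String, byte[]> fields) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            for (Map.Entry<String, byte[]> e : fields.entrySet()) {
                byte[] name = e.getKey().getBytes(StandardCharsets.UTF_8);
                out.writeInt(name.length);
                out.write(name);
                out.writeInt(e.getValue().length);
                out.write(e.getValue());
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen on in-memory streams
        }
    }

    // Recover the field map from the packed blob.
    static Map<String, byte[]> unpack(byte[] blob) {
        try {
            Map<String, byte[]> fields = new LinkedHashMap<>();
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob));
            while (in.available() > 0) {
                byte[] name = new byte[in.readInt()];
                in.readFully(name);
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                fields.put(new String(name, StandardCharsets.UTF_8), value);
            }
            return fields;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> fields = new LinkedHashMap<>();
        fields.put("name", "ankit".getBytes(StandardCharsets.UTF_8));
        fields.put("city", "delhi".getBytes(StandardCharsets.UTF_8));
        byte[] blob = pack(fields);
        System.out.println("packed " + fields.size() + " fields into " + blob.length + " bytes");
    }
}
```

The packed blob would go into a single qualifier in the Put; the trade-off is losing per-column addressability inside HBase (server-side filters, per-column timestamps) in exchange for lower key overhead.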

-Anoop-
________________________________________
From: Ankit Jain [[EMAIL PROTECTED]]
Sent: Monday, March 25, 2013 10:19 PM
To: [EMAIL PROTECTED]
Subject: Getting less write throughput due to more number of columns

Hi All,

I am writing records into HBase. I ran a performance test on the following
two cases:

Set1: Input record contains 26 columns and record size is 2Kb.

Set2: Input record contains 1 column and record size is 2Kb.

In the second case I am getting 8MBps more throughput than in Set1.

Does a large number of columns have any impact on write performance, and if
yes, how can we overcome it?

--
Thanks,
Ankit Jain
Pankaj Gupta 2013-03-28, 14:26
Ted Yu 2013-03-28, 14:35