Accumulo, mail # user - What is the Communication and Time Complexity for Bulk Inserts?


Re: What is the Communication and Time Complexity for Bulk Inserts?
Josh Elser 2012-10-18, 15:20
By "bulk inserts", do you mean importing a pre-sorted RFile of
Key/Values, or using a BatchWriter?

On 10/18/12 10:49 AM, Jeff Kubina wrote:
> I am deriving the time complexities for an algorithm I implemented in
> Hadoop using Accumulo and need to know the time complexity of bulk
> inserting m records evenly distributed across p nodes into an empty
> table with p tablet servers. Assuming B is the bandwidth of the
> network, would the communication complexity be O(m/B) and the
> computation complexity O(m/p * log(m/p))? If the table contained n
> records would the values be O(m/B) and O(m/p * log(m/p) + n/p)?
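
To make the bounds proposed in the question concrete, here is a minimal Python sketch that only evaluates those expressions for given m, p, B and n, with constant factors dropped. The function name and the choice of log base 2 are assumptions made for illustration. It does not model what Accumulo actually does for either a bulk import or a BatchWriter, and it does not confirm that the conjectured bounds are correct.

```python
import math


def proposed_bulk_insert_costs(m, p, bandwidth, n=0):
    """Evaluate the bounds conjectured in the question (constants dropped).

    m: records inserted, p: nodes / tablet servers,
    bandwidth: network bandwidth B, n: records already in the table.
    Returns (communication, computation) in abstract units.
    """
    per_node = m / p
    # Conjectured communication term: O(m / B)
    communication = m / bandwidth
    # Conjectured sort term: O(m/p * log(m/p)); log base 2 is an assumption
    computation = per_node * math.log2(per_node) if per_node > 1 else 0.0
    # Conjectured extra term for a table already holding n records: O(n / p)
    computation += n / p
    return communication, computation


if __name__ == "__main__":
    print(proposed_bulk_insert_costs(1024, 4, 128))         # empty table
    print(proposed_bulk_insert_costs(1024, 4, 128, n=400))  # table with n records
```

Plugging in a few values of p this way shows how the conjectured computation term shrinks as tablet servers are added, while the conjectured communication term depends only on m and B.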