Accumulo >> mail # user >> Accumulo, Nutch, and GORA


Re: Accumulo, Nutch, and GORA
Eric,

Thanks.  I haven't run a large-scale test with this yet; so far it was
just one node.  I suspect that in a distributed environment, as long as
you pre-split your table, you should get excellent ingest rates.  As I
do more testing, I will blog about it.
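
To illustrate what pre-splitting could involve, here is a minimal sketch (not taken from the blog post) that picks split points from a sorted sample of row keys so that each tablet would receive roughly the same share of rows. The function name, the sample keys, and the reversed-URL key layout are assumptions made for the example. The resulting list is the kind of input that Accumulo's `TableOperations.addSplits` or the shell's `addsplits` command accepts.

```python
def choose_split_points(sample_keys, num_tablets):
    """Pick up to num_tablets - 1 split points from a sample of row keys.

    Hypothetical helper: the sample is sorted and de-duplicated, then cut
    at evenly spaced positions so each range holds a similar number of
    sampled keys.
    """
    if num_tablets < 2:
        return []
    keys = sorted(set(sample_keys))
    if not keys:
        return []

    splits = []
    for i in range(1, num_tablets):
        # Position of the i-th boundary within the sorted sample.
        idx = i * len(keys) // num_tablets
        candidate = keys[idx]
        # Skip repeats, which occur when the sample is smaller than num_tablets.
        if not splits or splits[-1] != candidate:
            splits.append(candidate)
    return splits


if __name__ == "__main__":
    # Assumed row keys in a reversed-host form; real keys depend on the
    # Gora mapping in use.
    sample = [
        "com.example.www:http/",
        "org.apache.accumulo:http/",
        "org.apache.nutch:http/",
        "net.sourceforge:http/",
        "edu.mit.www:http/",
        "io.covert.www:http/",
    ]
    # One split point per line, e.g. for use as a splits file.
    for split in choose_split_points(sample, 3):
        print(split)
```

A sample drawn from an earlier crawl or a seed list would probably give better boundaries than evenly spaced guesses, since URL keys tend to cluster under a few top-level domains.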

Thanks,

--Jason

On Tue, Feb 28, 2012 at 10:11 AM, Eric Newton <[EMAIL PROTECTED]> wrote:
> Very cool.  Thanks for the link back to the wikiexample page!
>
> What sort of performance do you see?   How fast can you ingest the internet?
>
> -Eric
>
>
> On Tue, Feb 28, 2012 at 6:54 AM, Jason Trost <[EMAIL PROTECTED]> wrote:
>>
>> Blog post for anyone who's interested.  I cover a basic howto for
>> getting Nutch to use Apache Gora to store web crawl data in Accumulo.
>> Let me know if you have any questions.
>>
>> Accumulo, Nutch, and GORA
>> http://www.covert.io/post/18414889381/accumulo-nutch-and-gora
>>
>> --Jason
>
>