Accumulo >> mail # user >> Using Accumulo as input to a MapReduce job frequently hangs due to lost Zookeeper connection


Arjumand Bonhomme 2012-08-16, 07:59
Jim Klucar 2012-08-16, 11:22
Adam Fuchs 2012-08-16, 13:32
William Slacum 2012-08-16, 11:24
Arjumand Bonhomme 2012-08-16, 18:36
John Vines 2012-08-16, 19:24
Arjumand Bonhomme 2012-08-16, 19:48
Arjumand Bonhomme 2012-08-17, 02:10
Arjumand Bonhomme 2012-08-20, 17:00
Keith Turner 2012-08-20, 17:34
David Medinets 2012-08-21, 00:26
Keith Turner 2012-08-21, 12:23
Re: Using Accumulo as input to a MapReduce job frequently hangs due to lost Zookeeper connection
I have a related problem where I need to do a one-to-one join (every row
in table A joins with a unique row in table B and vice versa). My join
key is the row id of the table. In the past, I've used Hadoop's
CompositeInputFormat to do a map-side join over data in HDFS
(described here:
http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/). My
tables in Accumulo seem to meet the eligibility criteria of
CompositeInputFormat: both tables are sorted by the join key, since
the join key is the row id in my case, and the tables are partitioned
the same way (i.e., same split points).
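
To make the eligibility criteria concrete, here is a minimal sketch in plain Java of the merge join that a map-side join performs within one partition. It is not Hadoop or Accumulo code: the class name `MergeJoinSketch`, the method `mergeJoin`, and the sample rows are all hypothetical, and a `TreeMap` stands in for one tablet's sorted rows. It only shows why both inputs need to be sorted by row id and split at the same points.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MergeJoinSketch {

    // Joins two inputs that are BOTH sorted by row id (assumption of this sketch).
    // Each matching pair of rows becomes one "rowId: valueA | valueB" line.
    static List<String> mergeJoin(Iterator<Map.Entry<String, String>> a,
                                  Iterator<Map.Entry<String, String>> b) {
        List<String> joined = new ArrayList<>();
        Map.Entry<String, String> ea = a.hasNext() ? a.next() : null;
        Map.Entry<String, String> eb = b.hasNext() ? b.next() : null;
        while (ea != null && eb != null) {
            int cmp = ea.getKey().compareTo(eb.getKey());
            if (cmp == 0) {
                // Same row id on both sides: emit the joined row, advance both.
                joined.add(ea.getKey() + ": " + ea.getValue() + " | " + eb.getValue());
                ea = a.hasNext() ? a.next() : null;
                eb = b.hasNext() ? b.next() : null;
            } else if (cmp < 0) {
                // Row exists only in A so far: skip ahead in A.
                ea = a.hasNext() ? a.next() : null;
            } else {
                // Row exists only in B so far: skip ahead in B.
                eb = b.hasNext() ? b.next() : null;
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; TreeMap keeps rows sorted by row id.
        TreeMap<String, String> tableA = new TreeMap<>();
        tableA.put("row1", "a1");
        tableA.put("row2", "a2");
        TreeMap<String, String> tableB = new TreeMap<>();
        tableB.put("row1", "b1");
        tableB.put("row2", "b2");

        for (String line : mergeJoin(tableA.entrySet().iterator(),
                                     tableB.entrySet().iterator())) {
            System.out.println(line);
        }
    }
}
```

Because each side is read once, in order, the join needs no shuffle or reduce phase, provided the matching partitions of both tables cover the same row range.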

Has anyone tried using CompositeInputFormat over Accumulo tables? Is
it possible to configure CompositeInputFormat with
AccumuloInputFormat?

Thanks,
Ameet
On Tue, Aug 21, 2012 at 8:23 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
> Yeah, that would certainly work.
>
> You could run two map only jobs (could run concurrently).  A job that
> reads D1 and writes to Table3 and a job that reads D2 and writes
> Table3.   Map reduce may be faster, unless you want the final result
> in Accumulo in which case this may be faster.  The two map reduce jobs
> could also produce files to bulk import into table3.
>
> Keith
>
> On Mon, Aug 20, 2012 at 8:26 PM, David Medinets
> <[EMAIL PROTECTED]> wrote:
>> Can you use a new table to join and then scan the new table? Use the foreign
>> key as the rowid. Basically create your own materialized view.
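
The join-table approach quoted above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the nested `TreeMap` is a stand-in for an Accumulo table (row id, then column, then value), and `JoinTableSketch`, `writeAll`, and the column names `d1` and `d2` are hypothetical names, not Accumulo API. Each call to `writeAll` plays the role of one map-only job writing its data set into Table3 under the shared join key.

```java
import java.util.Map;
import java.util.TreeMap;

public class JoinTableSketch {

    // Writes every (joinKey, value) pair from one source into the shared table,
    // using the join key as the row id and a per-source column name.
    static void writeAll(TreeMap<String, TreeMap<String, String>> table,
                         String column,
                         Map<String, String> source) {
        for (Map.Entry<String, String> e : source.entrySet()) {
            table.computeIfAbsent(e.getKey(), k -> new TreeMap<>())
                 .put(column, e.getValue());
        }
    }

    public static void main(String[] args) {
        // Hypothetical data sets D1 and D2, keyed by the join key.
        Map<String, String> d1 = new TreeMap<>();
        d1.put("k1", "left-1");
        d1.put("k2", "left-2");
        Map<String, String> d2 = new TreeMap<>();
        d2.put("k1", "right-1");
        d2.put("k2", "right-2");

        // Stand-in for Table3: the two writers may run in either order.
        TreeMap<String, TreeMap<String, String>> table3 = new TreeMap<>();
        writeAll(table3, "d1", d1);
        writeAll(table3, "d2", d2);

        // Scanning the table in row order yields the joined rows.
        for (Map.Entry<String, TreeMap<String, String>> row : table3.entrySet()) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```

The join happens implicitly: values from both sources land in the same row because they share a row id, so a single scan of the new table returns them together.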
Billie Rinaldi 2012-10-11, 18:57
ameet kini 2012-10-17, 14:10
ameet kini 2012-10-17, 14:13
David Medinets 2012-08-17, 02:33
Arjumand Bonhomme 2012-08-17, 03:14