Accumulo, mail # dev - MultipleInputs with AccumuloInputFormat


Re: MultipleInputs with AccumuloInputFormat
Kevin Faro 2013-11-05, 17:13
I recently looked into that and came to the same realization.

I ended up writing a new input format that computes the Cartesian product of
two tables. To do that, I had to store separate values for the left and right
configurations and then copy over whichever config settings I wanted the AIF
to use, depending on which split I needed in the RecordReader.
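
A minimal sketch of that store-then-copy idea, in plain Java so it runs without a cluster. A Map stands in for Hadoop's Configuration, and the prefixes, key names, and helper methods are all hypothetical, chosen for illustration; they are not real Accumulo property names or the actual input format described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java stand-in: a Map plays the role of Hadoop's Configuration. */
public class PrefixedConfigSketch {
    // Hypothetical prefixes and key names, NOT real Accumulo properties.
    static final String LEFT = "left.";
    static final String RIGHT = "right.";
    static final String[] AIF_KEYS = {"aif.table", "aif.ranges"};

    /** Store one side's settings under its prefix so the two sides cannot overwrite each other. */
    static void store(Map<String, String> conf, String prefix, String table, String ranges) {
        conf.put(prefix + "aif.table", table);
        conf.put(prefix + "aif.ranges", ranges);
    }

    /** Return a copy of conf whose unprefixed keys hold the chosen side's settings. */
    static Map<String, String> activate(Map<String, String> conf, String prefix) {
        Map<String, String> copy = new HashMap<>(conf);
        for (String key : AIF_KEYS) {
            String value = conf.get(prefix + key);
            if (value != null) {
                copy.put(key, value);
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        store(conf, LEFT, "tableA", "a-m");
        store(conf, RIGHT, "tableB", "n-z");
        // A record reader would pick the prefix based on which split it was handed.
        System.out.println(activate(conf, LEFT).get("aif.table"));  // prints "tableA"
        System.out.println(activate(conf, RIGHT).get("aif.table")); // prints "tableB"
    }
}
```

The copy in activate() leaves the shared configuration untouched, so the left and right sides can each be activated from the same stored values.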

It would have been awesome if I could have just used the MultipleInputs ...

--Kevin
On Tue, Nov 5, 2013 at 10:24 AM, Josh Elser <[EMAIL PROTECTED]> wrote:

> In executing some MapReduce over Accumulo with the AccumuloInputFormat, I
> came to the realization that AIF fundamentally doesn't work with concepts
> like MultipleInputs in Hadoop
> (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html).
> Given that you can only write one set of configuration for AIF into a
> Configuration object, there's not a mechanism to support multiple. This
> appears to be the case across all versions.
>
> Is this correct? Have I overlooked something?
>