Accumulo >> mail # user >> Accumulo / HBase migration


Donald Miner 2013-07-09, 17:26
Christopher 2013-07-09, 17:35
Kurt Christensen 2013-07-09, 17:31
Donald Miner 2013-07-09, 17:36
dlmarion@... 2013-07-09, 18:20
Re: Accumulo / HBase migration
We could also just add a transformation from HFileReader ->
LocalityGroupReader, since I think HBase's storage model (forgive me if
there's a better term) maps pretty well to that.
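
To make the mapping concrete, here is a minimal sketch of how one HBase cell could line up with one Accumulo key/value pair. It does not use the real HFileReader or LocalityGroupReader classes; `HBaseCell`, `AccumuloKey`, `to_accumulo`, and `sort_for_bulk_load` are hypothetical names made up for illustration, and the empty default visibility is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class HBaseCell:
    """Hypothetical stand-in for the fields of an HBase cell."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int
    value: bytes


@dataclass(frozen=True)
class AccumuloKey:
    """Hypothetical stand-in for the fields of an Accumulo key."""
    row: bytes
    column_family: bytes
    column_qualifier: bytes
    column_visibility: bytes
    timestamp: int


def to_accumulo(cell: HBaseCell, visibility: bytes = b"") -> Tuple[AccumuloKey, bytes]:
    """Map one cell to a (key, value) pair.

    HBase has no visibility field, so it is filled with a default
    (assumed empty here, i.e. no restriction).
    """
    key = AccumuloKey(cell.row, cell.family, cell.qualifier, visibility, cell.timestamp)
    return key, cell.value


def sort_for_bulk_load(entries: Iterable[Tuple[AccumuloKey, bytes]]) -> List[Tuple[AccumuloKey, bytes]]:
    """Order entries the way Accumulo orders keys: row, family, qualifier and
    visibility ascending, then timestamp descending (newest first)."""
    return sorted(
        entries,
        key=lambda kv: (
            kv[0].row,
            kv[0].column_family,
            kv[0].column_qualifier,
            kv[0].column_visibility,
            -kv[0].timestamp,
        ),
    )
```

Every field except visibility carries over one to one, which is why a reader-level transformation looks plausible.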
On Tue, Jul 9, 2013 at 2:20 PM, <[EMAIL PROTECTED]> wrote:

> I believe that Brian Loss committed code in 1.5 for a column visibility
> correction iterator or something that you could use to do this. You could
> use that and compact the table after the import.
>
> ------------------------------
> From: "Donald Miner" <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Tuesday, July 9, 2013 1:36:20 PM
> Subject: Re: Accumulo / HBase migration
>
>
> I did think about this. My naive answer is to ignore visibilities by
> default (meaning make everything public or give everything the same
> visibility). It would be interesting however to be able to insert a chunk
> of code that inferred the visibility from the record itself. That is, you'd
> have a function you can pass in that returns a ColumnVisibility and takes
> in a value/rowkey/etc.
>
>
> On Tue, Jul 9, 2013 at 1:31 PM, Kurt Christensen <[EMAIL PROTECTED]> wrote:
>
>>
>> I don't have a response to your question, but it seems to me that the big
>> capability difference is the visibility field. When doing bulk translations
>> like this, do you just fill visibility with some default value?
>>
>> -- Kurt
>>
>>
>> On 7/9/13 1:26 PM, Donald Miner wrote:
>>
>>>  Has anyone developed tools to migrate data from an existing HBase
>>> implementation to Accumulo? My team has done it "manually" in the past but
>>> it seems like it would be reasonable to write a process that handled the
>>> steps in a more automated fashion.
>>>
>>> Here are a few sample designs I've kicked around:
>>>
>>> HBase -> mapreduce -> mappers bulk write to accumulo -> Accumulo
>>> or
>>> HBase -> mapreduce -> tfiles via AccumuloFileOutputFormat -> Accumulo
>>> bulk load -> Accumulo
>>> or
>>> HBase -> bulk export -> map-only mapreduce to translate hfiles into
>>> tfiles (how hard would this be??) -> Accumulo bulk load -> Accumulo
>>>
>>> I guess this could be extended to go the other way around (and also
>>> include Cassandra perhaps).
>>>
>>> Maybe we'll start working on this soon. I just wanted to kick the idea
>>> out there to see if it's been done before or if anyone has some gut
>>> reactions to the process.
>>>
>>> -Don
>>>
>>> This communication is the property of ClearEdge IT Solutions, LLC and
>>> may contain confidential and/or privileged information. Any review,
>>> retransmissions, dissemination or other use of or taking of any action in
>>> reliance upon this information by persons or entities other than the
>>> intended recipient is prohibited. If you receive this communication in
>>> error, please immediately notify the sender and destroy all copies of the
>>> communication and any attachments.
>>>
>>
>> --
>>
>> Kurt Christensen
>> P.O. Box 811
>> Westminster, MD 21158-0811
>>
>> ------------------------------------------------------------------------
>> I'm not really a trouble maker. I just play one on TV.
>>
>
>
>
> --
> Donald Miner
> Chief Technology Officer
> ClearEdge IT Solutions, LLC
> Cell: 443 799 7807
> www.clearedgeit.com
>
>
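
The idea quoted above, passing in a function that takes a record and returns its visibility, could look roughly like the sketch below. This is plain Python rather than the Accumulo API: `Cell`, `public_visibility`, `visibility_from_family`, and `translate` are hypothetical names, visibilities are modelled as plain strings instead of ColumnVisibility objects, and the "personal" family rule is an invented example.

```python
from typing import Callable, Iterable, Iterator, NamedTuple, Tuple


class Cell(NamedTuple):
    """Hypothetical record read from the source table."""
    row: str
    family: str
    qualifier: str
    value: str


# A visibility function takes a cell and returns a visibility expression.
VisibilityFn = Callable[[Cell], str]


def public_visibility(cell: Cell) -> str:
    """Naive default: ignore visibilities and leave every entry unrestricted."""
    return ""


def visibility_from_family(cell: Cell) -> str:
    """Invented example rule: infer the label from the column family."""
    return "PII" if cell.family == "personal" else ""


def translate(
    cells: Iterable[Cell],
    visibility_fn: VisibilityFn = public_visibility,
) -> Iterator[Tuple[str, str, str, str, str]]:
    """Yield (row, family, qualifier, visibility, value) tuples, asking the
    supplied function for the visibility of each cell."""
    for cell in cells:
        yield (cell.row, cell.family, cell.qualifier, visibility_fn(cell), cell.value)
```

Calling `translate(cells)` with no second argument gives the make-everything-public behaviour; passing `visibility_from_family` or any other callable swaps in per-record inference without touching the translation loop.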
Keith Turner 2013-07-09, 17:39
Donald Miner 2013-07-09, 17:49