Re: Data Deduplication in HBase
bq. Will HBase do some sort of deduplication?

I don't think so.

What is the granularity of the segment overlap? In your example, it seems
to be 0.5 minutes.
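
If the segment boundaries always fall on a fixed granularity, one possible
approach is to split every incoming segment into chunks of that size and key
each chunk by its position in the song, so that overlapping segments map to
the same row keys instead of creating new rows. Below is a minimal Python
sketch of that idea, not HBase client code. It assumes a 0.5 minute
granularity, uses a plain dict as a stand-in for an HBase table, and the
names `chunk_keys`, `store_segment`, and the row-key format are hypothetical.

```python
from fractions import Fraction

# Assumed chunk size in minutes, taken from the 0.5 overlap in the example.
GRANULARITY = Fraction(1, 2)


def chunk_keys(song_id, start, end, granularity=GRANULARITY):
    """Return one row key per fixed-size chunk covering [start, end)."""
    start, end = Fraction(str(start)), Fraction(str(end))
    if start % granularity or end % granularity:
        raise ValueError("segment boundaries must align to the granularity")
    first = int(start / granularity)
    last = int(end / granularity)
    # Zero-padded index keeps the keys of one song in playback order.
    return [f"{song_id}:{i:06d}" for i in range(first, last)]


def store_segment(table, song_id, start, end, chunks):
    """Store a segment chunk by chunk, skipping chunks already present.

    `table` is a dict standing in for an HBase table (an assumption of this
    sketch); `chunks` holds one payload per granularity-sized chunk.
    Returns the number of chunks that were newly written.
    """
    keys = chunk_keys(song_id, start, end)
    if len(keys) != len(chunks):
        raise ValueError("expected one payload per chunk")
    written = 0
    for key, payload in zip(keys, chunks):
        if key not in table:
            table[key] = payload
            written += 1
    return written


# The four overlapping segments from the original question.
table = {}
for seg_start, seg_end in [(0, 1), (0.5, 2), (2, 4), (3, 5)]:
    n_chunks = len(chunk_keys("song42", seg_start, seg_end))
    store_segment(table, "song42", seg_start, seg_end, [b"..."] * n_chunks)
```

With these four segments, 13 chunks arrive but only 10 distinct keys are
stored, one per half minute of the 5 minute song. Storing the same segment a
second time would add nothing. If the overlap does not fall on a fixed
boundary, this scheme does not apply as written.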

On Tue, Aug 27, 2013 at 7:12 AM, Anand Nalya <[EMAIL PROTECTED]> wrote:

> Hi,
> I have a use case in which I need to store segments of mp3 files in HBase.
> A song may come to the application in different overlapping segments. For
> example, a 5 min song can have the following segments: 0-1, 0.5-2, 2-4, 3-5. As
> seen, some of the data is duplicated (3-4 is present in the last 2
> segments).
> What would be the ideal way of removing this duplicate storage? Will Snappy
> compression help here, or do I need to write some logic over HBase? Also,
> what if I store a single segment multiple times? Will HBase do some sort of
> deduplication?
> Regards,
> Anand