HBase >> mail # user >> Hbase export / import Why doubling the Table Size ?


Re: Hbase export / import Why doubling the Table Size ?
Could you use CompressionTest to verify that the library path is set up properly?

$ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://<your-namenode>:8020/<some-writable-path>/test.lzo lzo

Does it report OK? Same for Snappy? The reason I am asking is that when it does not find the native libs it uses no compression at all, and if your original was compressed then you will see the copied one being uncompressed and therefore much larger.
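To make the two checks concrete, here is a minimal sketch that runs the same test once per codec. The namenode host and path are the same placeholders as above and must be replaced; the `RUN` and `TARGET_DIR` variables and the `check_codecs` function are names made up for this sketch, and I am assuming your build accepts `snappy` as a codec name. By default it only prints the commands; set `RUN=` (empty) to actually execute them.

```shell
#!/bin/sh
# Placeholder target: substitute your own namenode and a writable HDFS directory.
DEFAULT_DIR='hdfs://<your-namenode>:8020/<some-writable-path>'
TARGET_DIR="${TARGET_DIR:-$DEFAULT_DIR}"

# Dry run by default: RUN=echo only prints each command.
# Export RUN= (empty) to execute the commands for real.
RUN="${RUN-echo}"

# Run CompressionTest once per codec, writing a small test file for each.
check_codecs() {
  for codec in lzo snappy; do
    $RUN hbase org.apache.hadoop.hbase.util.CompressionTest \
      "$TARGET_DIR/test.$codec" "$codec"
  done
}

check_codecs
```

If either run fails on the new cluster, the native library setup there is the first thing to fix before importing again.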

Also, what is the content like? How large are the cells that are stored?

Lars
On Dec 10, 2011, at 8:53 AM, Lord Khan Han wrote:

> I will check the reverse export/import to CDH3B4 today to see whether it is the
> same size in the cluster.
>
> When we use Hadoop distcp, how can we deal with .META? We are copying
> one table, not all of them, and .META also holds region info,
> including the hosts' DNS names, which are of course different in the new cluster.
>
> I tried the import again today with no compression. It doubled the
> exported file size! I mean, the exported HBase table is 200 GB; when I
> import it without compression it grows to 400 GB. It is definitely writing
> something twice.
>
> thanks
>
>
>
> On Sat, Dec 10, 2011 at 2:19 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> There's copytable (also an MR job - written by J-D), but it reuses the
>> mapper class from the Import.java, so it
>> probably won't make a difference.
>>
>> What I meant to say below... When you export/import the table from your
>> CDH3u2 cluster back to your CDH3B4
>> cluster, is the size still doubled?
>>
>>
>> If both clusters are shut down, you can use Hadoop's distcp to copy
>> directly on the filesystem level; in fact that might be your
>> best option.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Lord Khan Han <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
>> Cc:
>> Sent: Friday, December 9, 2011 4:05 PM
>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>>
>> Thanks for your time..
>>
>> Is there any reliable way to copy a table between these clusters instead of
>> export/import?
>>
>>
>>
>> On Sat, Dec 10, 2011 at 1:39 AM, lars hofhansl <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Hmm... I'm afraid I am out of options. If you want, you can try to copy the table
>>> from CDH3u2 back to your CDH3B4 system, and see if the size remains doubled.
>>>
>>> Does this happen with a very small table, too? If so, you could take a small sample
>>> HFile and upload it (both the CDH3B4 and CDH3u2 versions) somewhere so
>>> that we can have a look.
>>>
>>>
>>> -- Lars
>>>
>>>
>>> ----- Original Message -----
>>> From: Lord Khan Han <[EMAIL PROTECTED]>
>>> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
>>> Cc:
>>> Sent: Friday, December 9, 2011 2:45 PM
>>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>>>
>>> In an identically configured cluster (a carbon copy), when I ran the import there was no
>>> increase in size; it stayed the same.
>>>
>>> The problem is in CDH3u2.
>>>
>>>
>>> On Sat, Dec 10, 2011 at 12:42 AM, lars hofhansl <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> What happens when you export/import into the same (CDH3B4) cluster using a
>>>> new table name?
>>>> Does the size double as well?
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Lord Khan Han <[EMAIL PROTECTED]>
>>>> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
>>>> Cc:
>>>> Sent: Friday, December 9, 2011 2:27 PM
>>>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>>>>
>>>> I flushed and major_compacted; nothing changed. I have been stuck on this for the
>>>> last two days :( Any ideas?
>>>>
>>>>
>>>> On Sat, Dec 10, 2011 at 12:11 AM, Lord Khan Han <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Now flushed  and compacting again..
>>>>>
>>>>> one more clue:
>>>>>
>>>>> I tested importing into CDH3B4 (same as the exporting cluster) with LZO. All is
>>>>> okay; the table size is the same.