Sqoop, mail # user - Importing more than one column family in Hbase through Sqoop


Re: Importing more than one column family in Hbase through Sqoop
Kathleen Ting 2012-03-19, 19:58
Anil -

Understood. As it happens, the HBase release that supported atomicity came
after the Sqoop release that included HBase integration, hence the
limitation.

Please go ahead and file a Sqoop JIRA requesting a CLI option that lets
the user specify multiple column families.

Regards, Kathleen
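
Note on the workaround quoted further down in this thread (one import per column family): a minimal sketch of how the two runs could be generated, assuming the merchant_data table and the info / user_reviews families from Anil's shell example. The JDBC URL, the source table name "merchants", and the row-key column "merchant_id" are hypothetical placeholders, not values from the thread. The script only builds and prints the command lines; it does not run Sqoop.

```python
import shlex


def sqoop_import_commands(connect, table, hbase_table, row_key, families):
    """Build one `sqoop import` argument list per HBase column family.

    `families` maps a column-family name to the source columns that
    should land in that family.
    """
    commands = []
    for family, columns in families.items():
        # Assumption: the row-key column is included in every run so that
        # both imports write to the same HBase rows.
        cols = [row_key] + [c for c in columns if c != row_key]
        commands.append([
            "sqoop", "import",
            "--connect", connect,
            "--table", table,
            "--columns", ",".join(cols),
            "--hbase-table", hbase_table,
            "--column-family", family,
            "--hbase-row-key", row_key,
        ])
    return commands


if __name__ == "__main__":
    cmds = sqoop_import_commands(
        connect="jdbc:mysql://db.example.com/shop",  # hypothetical URL
        table="merchants",                           # hypothetical source table
        hbase_table="merchant_data",
        row_key="merchant_id",                       # hypothetical key column
        families={"info": ["name"], "user_reviews": ["id"]},
    )
    for argv in cmds:
        print(shlex.join(argv))
```

Each printed line is a separate import, so the source table is still read once per column family, which is the extra load time Anil raises below.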

On Fri, Mar 16, 2012 at 3:09 PM, anil gupta <[EMAIL PROTECTED]> wrote:

> Hi Kathleen,
>
> Sorry for the delayed reply; I have been working on HBase rather than
> Sqoop.
> Here is example code from the book "HBase: The Definitive Guide" which
> shows that it is possible to load data into more than one column family
> through the Java API, which was exactly the point I was trying to make.
>
> Have a look at these two classes:
>
> https://github.com/larsgeorge/hbase-book/blob/master/ch04/src/main/java/util/HBaseHelper.java
>
> https://github.com/larsgeorge/hbase-book/blob/master/ch04/src/main/java/filters/PrefixFilterExample.java
>
> Please let me know if you have further questions.
>
> Thanks,
> Anil
>
> On Fri, Feb 24, 2012 at 9:36 PM, Kathleen Ting <[EMAIL PROTECTED]> wrote:
>
>> Hi Anil,
>>
>> re: Is the above scenario not possible in the HBase Java API?
>> I would suggest asking that on [EMAIL PROTECTED].
>>
>> Thanks,
>> Kathleen
>>
>> On Wed, Feb 22, 2012 at 2:26 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Kathleen,
>>>
>>> I think my previous messages were misinterpreted. In my previous message I
>>> was talking about generating a separate put statement for each
>>> column family. I am having a hard time understanding how this would violate
>>> the HBase atomicity rule.
>>>
>>> For instance, in the hbase shell my put statements for two column
>>> families would look like this:
>>> hbase shell>put 'merchant_data', '1', 'info:name', 'starbucks'
>>> hbase shell>put 'merchant_data', '1', 'user_reviews:id', '4545'
>>>
>>> Similarly, this can be achieved with the HBase Java API, which Sqoop
>>> is using. Is the above scenario not possible in the HBase Java API?
>>>
>>> Thanks,
>>> Anil
>>>
>>>
>>>
>>> On Wed, Feb 22, 2012 at 2:02 PM, Kathleen Ting <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Anil -
>>>>
>>>> Good question, and sorry for any confusion earlier. To be sure, because
>>>> HBase permits atomic operations across a single column family only, Sqoop
>>>> cannot support multiple column families.
>>>>
>>>> Regards, Kathleen
>>>>
>>>> On Wed, Feb 22, 2012 at 12:43 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi Kathleen,
>>>>>
>>>>> Yes, that is always an option. Thanks for the suggestion.
>>>>>
>>>>> I am a beginner at HBase. However, I was hoping to cut down the
>>>>> time it takes to dump the data from the database. If I run the import
>>>>> twice (assuming I have 2 column families), it increases the time to load
>>>>> the entire HBase table.
>>>>> AFAIK, Sqoop generates put statements to import data into HBase. If we
>>>>> could generate put statements for more than one column family, would it
>>>>> violate the atomicity principle of HBase? I went through the atomicity
>>>>> section of http://hbase.apache.org/acid-semantics.html and I can't
>>>>> find anything that would stop Sqoop from loading more than one column
>>>>> family. HBase bulk load also allows more than one column family, although
>>>>> the approach of HBase bulk loading might be different from Sqoop's. Could
>>>>> you provide more insight? Sorry if my question is dumb.
>>>>>
>>>>> Thanks,
>>>>> Anil Gupta
>>>>>
>>>>>
>>>>> On Wed, Feb 22, 2012 at 11:51 AM, Kathleen Ting <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi Anil,
>>>>>>
>>>>>> Sqoop does not support multiple column families because HBase only
>>>>>> permits atomic operations.
>>>>>>
>>>>>> One workaround is to run two imports, specifying a different column
>>>>>> family each time.
>>>>>>
>>>>>> Regards,
>>>>>> Kathleen
>>>>>>
>>>>>> On Wed, Feb 22, 2012 at 11:31 AM, anil gupta <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I went through the Sqoop User Guide but I could not find anything