HBase, mail # user - bulk-load bug ?


Re: bulk-load bug ?
Anoop John 2013-06-21, 08:44
When data is added to HBase with the same row key, it is the timestamp
(ts) that determines the version: different timestamps produce different
versions of the cell. In a bulk load with the ImportTsv tool, however, the
ts used by one mapper is the same, so all the Puts created from it carry
the same ts. The tool allows the user to supply a ts for each row in the
raw data file. When running the tool, you can specify which column in the
raw data file should be used as the Put ts. If you pass this, you can
achieve what you are looking for.
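
To illustrate the point about timestamps, here is a minimal Python sketch.
It is a toy model, not HBase code: it only assumes that a cell is
identified by (row, column, timestamp). The column name "cf:time" and the
value of JOB_TS are made up for illustration. Which of the colliding values
actually survives in a real bulk load is not modeled here.

```python
def load(cells):
    """Toy store: one value per (row, column, timestamp) coordinate."""
    store = {}
    for row, column, ts, value in cells:
        # Identical coordinates collapse into a single cell.
        store[(row, column, ts)] = value
    return store


def versions(store, row, column):
    """Return (timestamp, value) pairs for one cell, newest first."""
    found = [(ts, v) for (r, c, ts), v in store.items()
             if r == row and c == column]
    return sorted(found, reverse=True)


JOB_TS = 1371804240000  # hypothetical timestamp shared by all Puts of a job
values = ["18:20", "16:20", "19:20"]

# Case 1: every Put gets the same ts, as in the default bulk load.
same_ts = load([("mike", "cf:time", JOB_TS, v) for v in values])

# Case 2: each input line carries its own ts (here simply JOB_TS + i).
own_ts = load([("mike", "cf:time", JOB_TS + i, v)
               for i, v in enumerate(values)])

print(len(versions(same_ts, "mike", "cf:time")))  # one cell left
print(len(versions(own_ts, "mike", "cf:time")))   # three versions kept
```

In ImportTsv itself, the per-row timestamp column is, as far as I know,
mapped with the HBASE_TS_KEY entry in importtsv.columns in releases that
support it; check the usage text of your version. The column family also
has to be configured to keep enough versions (its VERSIONS setting),
otherwise older versions are dropped.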

-Anoop-

On Fri, Jun 21, 2013 at 1:58 PM, fx_bull <[EMAIL PROTECTED]> wrote:

> hello everyone
>
>
> When I use bulk load to import data into HBase, I found that if several
> rows have the same rowkey, only one of them is imported into HBase!
>
> But I want to import all of them into HBase as different versions. How
> should I do that?
>
>
>
> Original data
>
> mike    18:20
> mike    16:20
> mike    19:20
> jone     17:20
>
> ….
>
>
> Data imported to HBase:
>
> mike  16:20
> jone   17:20
> ….
>
>
>
>