HBase, mail # user - Quick Question about Bulk loading of HFiles & Timestamps


Re: Quick Question about Bulk loading of HFiles & Timestamps
Jacques 2011-08-05, 23:24
Perfect.

thanks,
Jacques

On Fri, Aug 5, 2011 at 3:53 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> Hi Jacques,
>
> Yes, the timestamps are set at the time the MR job runs, not the time
> they're loaded. So, you'll see the values from the job that wrote its
> output most recently.
>
> You can also specify timestamps explicitly for each KeyValue, if you
> prefer.
>
> -Todd
>
> On Fri, Aug 5, 2011 at 2:10 PM, Jacques <[EMAIL PROTECTED]> wrote:
> > Can someone confirm that bulk loading HFiles keeps cell timestamps from
> > overwriting each other?
> >
> > For example:
> > I run MapReduce job A on Monday.
> > I run MapReduce job B on Tuesday.
> >
> > I then run LoadIncrementalHFiles on job B first, followed by A.
> >
> > Please confirm that the values at the intersection of outputs A & B will be
> > those from B.
> >
> > Thanks,
> > Jacques
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
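
For anyone reading this thread later, the behavior Todd describes can be sketched with a toy model. This is not the HBase API: `bulk_load`, `read_latest`, the `cf:q` column, and the integer timestamps are all made up for illustration. The only assumption carried over from the thread is that each cell keeps the timestamp assigned when its MR job ran, and that a default read returns the version with the highest timestamp.

```python
# Toy model only: this is NOT the HBase API. All names here are hypothetical.
from collections import defaultdict


def bulk_load(table, hfile):
    """Add every (row, column, timestamp, value) cell of an 'HFile' to the table."""
    for row, column, timestamp, value in hfile:
        table[(row, column)].append((timestamp, value))


def read_latest(table, row, column):
    """Return the value with the highest timestamp, or None if the cell is empty."""
    versions = table[(row, column)]
    return max(versions, key=lambda v: v[0])[1] if versions else None


# Stand-in timestamps: job A ran on Monday, job B on Tuesday.
MONDAY, TUESDAY = 1, 2
job_a = [("row1", "cf:q", MONDAY, "from A")]
job_b = [("row1", "cf:q", TUESDAY, "from B")]

# Load B first, then A, as in the original question.
table = defaultdict(list)
bulk_load(table, job_b)
bulk_load(table, job_a)

print(read_latest(table, "row1", "cf:q"))  # prints "from B"
```

The load order does not matter in this model because the timestamp travels with the cell; loading A first and B second gives the same answer.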