Re: Quick Question about Bulk loading of HFiles & Timestamps
Jacques 2011-08-05, 23:24
On Fri, Aug 5, 2011 at 3:53 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hi Jacques,
> Yes, the timestamps are set at the time the MR job runs, not the time
> they're loaded. So, you'll see the values from the job that wrote its
> output most recently.
> You can also specify timestamps explicitly for each KeyValue, if you
> On Fri, Aug 5, 2011 at 2:10 PM, Jacques <[EMAIL PROTECTED]> wrote:
> > Can someone confirm that bulk loading HFiles preserves cell timestamps, so
> > that cells are not overwritten based on load order?
> > For example:
> > I run MapReduce job A on Monday.
> > I run MapReduce job B on Tuesday.
> > I then run LoadIncrementalHFiles on the output of job B first, followed by A.
> > Please confirm that the values at the intersection of outputs A & B will be the
> > ones from B.
> > Thanks,
> > Jacques
> Todd Lipcon
> Software Engineer, Cloudera
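
To make the behaviour described above concrete, here is a small toy model in
plain Java. It is only a sketch and does not use the HBase API: the class
BulkLoadTimestampModel, its Cell type, and the bulkLoad/get methods are all
hypothetical names made up for illustration. The one assumption it encodes is
the point from the thread: each cell carries the timestamp assigned when the
MR job wrote it, and a read returns the version with the highest timestamp,
whatever order the files were loaded in.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only; NOT the HBase API. All names here are hypothetical.
class BulkLoadTimestampModel {

    // One version of one cell, with the timestamp fixed at write time.
    static final class Cell {
        final String key;      // stands in for row/family:qualifier
        final long timestamp;  // set when the job wrote the cell
        final String value;

        Cell(String key, long timestamp, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // cell key -> (timestamp -> value), versions sorted by timestamp
    private final Map<String, TreeMap<Long, String>> store = new TreeMap<>();

    // "Loading" a file adds its cells without touching their timestamps.
    void bulkLoad(List<Cell> file) {
        for (Cell c : file) {
            store.computeIfAbsent(c.key, k -> new TreeMap<>())
                 .put(c.timestamp, c.value);
        }
    }

    // A read returns the version with the highest timestamp, or null.
    // Cells with identical timestamps are not modelled faithfully here.
    String get(String key) {
        TreeMap<Long, String> versions = store.get(key);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        long monday = 1_000L;   // placeholder for job A's write time
        long tuesday = 2_000L;  // placeholder for job B's write time

        List<Cell> jobA = Arrays.asList(
            new Cell("row1/cf:q", monday, "from A"),
            new Cell("row2/cf:q", monday, "only in A"));
        List<Cell> jobB = Arrays.asList(
            new Cell("row1/cf:q", tuesday, "from B"));

        BulkLoadTimestampModel table = new BulkLoadTimestampModel();
        table.bulkLoad(jobB);  // B is loaded first ...
        table.bulkLoad(jobA);  // ... then A

        System.out.println(table.get("row1/cf:q")); // prints "from B"
        System.out.println(table.get("row2/cf:q")); // prints "only in A"
    }
}
```

In this model the overlapping cell reads back as the value from job B even
though A was loaded last, because only the timestamps decide which version is
newest. Passing an explicit timestamp into Cell is the toy equivalent of
setting the timestamp yourself on each KeyValue.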