HBase >> mail # user >> Quick Question about Bulk loading of HFiles & Timestamps


Re: Quick Question about Bulk loading of HFiles & Timestamps
Perfect.

thanks,
Jacques

On Fri, Aug 5, 2011 at 3:53 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> Hi Jacques,
>
> Yes, the timestamps are set at the time the MR job runs, not the time
> they're loaded. So, you'll see the values from the job that wrote its
> output most recently.
>
> You can also specify timestamps explicitly for each KeyValue, if you
> prefer.
>
> -Todd
>
> On Fri, Aug 5, 2011 at 2:10 PM, Jacques <[EMAIL PROTECTED]> wrote:
> > Can someone confirm that bulk loading HFiles preserves cell timestamps,
> > so that cells don't overwrite each other based on load order?
> >
> > For example:
> > I run MapReduce job A on Monday.
> > I run MapReduce job B on Tuesday.
> >
> > I then run LoadIncrementalHFiles on job B first, followed by A.
> >
> > Please confirm that at the intersection of outputs A & B will be the
> > values from B.
> >
> > Thanks,
> > Jacques
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
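
To make the answer above concrete, here is a minimal sketch of timestamp-based version resolution. It is a toy model in plain Python, not the HBase API: the ToyTable class, its bulk_load and get methods, the row and column names, and the timestamp constants are all hypothetical and exist only for illustration. It assumes, as Todd describes, that each cell carries the timestamp assigned when the MR job ran, and that a read returns the version with the highest timestamp regardless of the order in which the files were loaded.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of timestamp-based cell versioning (NOT the HBase API)."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def bulk_load(self, hfile):
        """Load an iterable of (row, column, timestamp, value) tuples."""
        for row, column, timestamp, value in hfile:
            self._cells[(row, column)][timestamp] = value

    def get(self, row, column):
        """Return the value with the highest timestamp, or None if absent."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        return versions[max(versions)]


# Hypothetical timestamps in epoch milliseconds, one day apart.
MONDAY = 1_312_156_800_000
TUESDAY = MONDAY + 86_400_000

# Output of job A (written Monday) and job B (written Tuesday).
job_a = [("row1", "cf:q", MONDAY, "from A"), ("row2", "cf:q", MONDAY, "only A")]
job_b = [("row1", "cf:q", TUESDAY, "from B")]

table = ToyTable()
table.bulk_load(job_b)  # B is loaded first...
table.bulk_load(job_a)  # ...then A, as in the question

print(table.get("row1", "cf:q"))  # from B
print(table.get("row2", "cf:q"))  # only A
```

Where the two outputs intersect (row1), the read returns B's value because B's timestamp is newer, even though A was loaded last. Setting the timestamp explicitly per cell, as Todd also suggests, corresponds here to choosing the third element of each tuple yourself.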