-
Question regarding bulkload : overwriting duplicate records
Gaetan Deputier 2013-02-08, 00:23
Hi HBase users,
I am using HBase 0.92.1 from the Cloudera distribution CDH 4.1.1. I am bulk loading files with the ImportTsv job, but I have an issue with duplicate records that have different cell values.
I am guessing that the underlying MapReduce job sets the timestamp to the current time. Is there a way to tell the ImportTsv job to read the timestamp from a column?
I can still write my own Hadoop mapper that splits the lines and processes them, but I was wondering whether the fix for the HBase JIRA issue HBASE-5564, which solves this problem, will be released soon.
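For reference, here is a minimal sketch of the line-splitting step such a custom mapper would need. It assumes a hypothetical three-column layout (row key, timestamp in milliseconds, value); the class and method names are illustrative only, and it has no HBase dependency. In a real mapper the parsed timestamp would be passed to `Put.add(family, qualifier, ts, value)` rather than compared in memory as done here.

```java
import java.util.HashMap;
import java.util.Map;

public class TsvTimestampSketch {

    /** One parsed TSV line: row key, explicit timestamp, and cell value. */
    static final class Cell {
        final String rowKey;
        final long timestamp;
        final String value;

        Cell(String rowKey, long timestamp, String value) {
            this.rowKey = rowKey;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Assumed (hypothetical) layout: rowKey TAB timestampMillis TAB value. */
    static Cell parse(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 tab-separated fields: " + line);
        }
        // In a real mapper this timestamp would go into
        // Put.add(family, qualifier, timestamp, value) instead of the job's current time.
        return new Cell(fields[0], Long.parseLong(fields[1]), fields[2]);
    }

    /** Keeps, per row key, the cell with the highest timestamp, whatever the input order. */
    static Map<String, Cell> latestPerRow(String[] lines) {
        Map<String, Cell> latest = new HashMap<>();
        for (String line : lines) {
            Cell cell = parse(line);
            Cell previous = latest.get(cell.rowKey);
            if (previous == null || cell.timestamp > previous.timestamp) {
                latest.put(cell.rowKey, cell);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        // The newer record comes first in the file, yet it still wins.
        String[] lines = {"row1\t200\tnew", "row1\t100\told"};
        System.out.println(latestPerRow(lines).get("row1").value); // prints "new"
    }
}
```

With the timestamp taken from the data, the record that wins no longer depends on the order in which the lines are loaded.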
Regards,
G.
-
Re: Question regarding bulkload : overwriting duplicate records
Ted Yu 2013-02-08, 01:04
I logged HBASE-7793 to backport.
Cheers
-
Re: Question regarding bulkload : overwriting duplicate records
Gaetan Deputier 2013-02-08, 01:30
Thanks, I appreciate it. Keep up the good work.
G.