-
exporting from hbase as text (tsv)
Jack Levin 2011-06-06, 22:57
Hello, does anyone have any tools you could share that would take a table and dump its contents in TSV text format? We want it in TSV for quick Hive processing on another data-mining cluster. We do not want to write custom map-reduce jobs for HBase, because we already have an extensive Hive framework that does map-reduce processing. We could then use bulk import to load new tables in, if we have to.

Thanks.

-Jack
-
Re: exporting from hbase as text (tsv)
Bill Graham 2011-06-06, 23:56
You can do this in a few lines of Pig; check out the HBaseStorage class. You'll need to know the names of your column families, but besides that it could be done fairly generically.
-
Re: exporting from hbase as text (tsv)
Jack Levin 2011-06-07, 00:15
There is an export tool that exports tables into sequence files. The question is, what do I do with those sequence files to convert them to text?

-Jack
-
Re: exporting from hbase as text (tsv)
Stack 2011-06-07, 03:18
You could hook up
http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapreduce/lib/output/TextOutputFormat.html
to a map that emits TSV lines (use the TSV escaping lib du jour to make sure tabs are properly escaped, or just search and replace tabs in the source yourself if that would not corrupt your data).

Can you hook hive to hbase?

St.Ack
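The tab-escaping step is the part that is easy to get wrong, so here is a minimal sketch of it in Python. It is not the mapper itself and uses no HBase or Hadoop API: `escape_field` and `row_to_tsv` are hypothetical helper names, the backslash escaping convention is an assumption (whatever reads the file on the Hive side has to be set up to match), and cell values are assumed to be already decoded to strings.

```python
# Hypothetical helpers for illustration; not part of HBase, Hive, or Pig.

# Assumed convention: backslash-escape the characters that would break a
# one-record-per-line, tab-separated file. The backslash itself is escaped
# too, so the encoding stays unambiguous.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def escape_field(value: str) -> str:
    """Return `value` with backslash, tab, and newline characters escaped."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def row_to_tsv(row_key: str, cells: dict, columns: list) -> str:
    """Render one row as a TSV line: row key first, then one field per column.

    `cells` maps "family:qualifier" names to string values. Columns missing
    from `cells` become empty fields, so every line has the same field count.
    """
    fields = [row_key] + [cells.get(col, "") for col in columns]
    return "\t".join(escape_field(field) for field in fields)


if __name__ == "__main__":
    columns = ["d:name", "d:note"]
    print(row_to_tsv("row1", {"d:name": "a\tb", "d:note": "x"}, columns))
```

Fixing the column list up front keeps the field count constant per line, which is what a Hive table definition over the resulting files would expect.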
-
Re: exporting from hbase as text (tsv)
Jack Levin 2011-06-07, 04:09
> Can you hook hive to hbase?

Yes, we have used HBase to Hive and back before, but it's not very flexible, especially going the HBase -> Hive route. We much prefer the bulk uploader tool for tables modified via Hive map-reduce of TSV or CSV.

-Jack
-
Re: exporting from hbase as text (tsv)
Jack Levin 2011-06-07, 17:20
That would be a real nice feature though, imagine going to the shell, and requesting a dump of your table.

> load into outfile/hdfs '/tmp/table_a' from table_a

Or something similar.

> load from outfile/hdfs -- could power bulk uploader

-Jack
-
Re: exporting from hbase as text (tsv)
Stack 2011-06-07, 17:34
On Tue, Jun 7, 2011 at 10:20 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
> That would be a real nice feature though, imagine going to the shell,
> and requesting a dump of your table.

But it would only work for mickey mouse tables, no? If your input/output is of any substantial size, wouldn't you want to MR it?

St.Ack
-
Re: exporting from hbase as text (tsv)
Joey Echeverria 2011-06-07, 18:32
I think Jack was suggesting that the shell would launch the MR jobs.

-Joey
-
Re: exporting from hbase as text (tsv)
Jack Levin 2011-06-07, 18:35
Yes, the shell could launch MR jobs; I bet a lot of people would find this feature very useful.

-Jack
-
Re: exporting from hbase as text (tsv)
Stack 2011-06-07, 21:04
Dunno. The implication is that there is an MR cluster out there just sitting idle, waiting on the occasional job of whimsy (if the data is 'big'). And we'd have to make our load/dump at least as smart as the ? equivalent (fill in one from the list: pig, hive, cascading, etc.). Then why not just fire up the pig/hive/etc. shells?

File an issue though, lads. The above is just an opinion.

St.Ack
-
Re: exporting from hbase as text (tsv)
Joey Echeverria 2011-06-08, 02:35
Filed as HBASE-3960 (https://issues.apache.org/jira/browse/HBASE-3960).

-Joey

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434