Mohit Anchlia 2012-08-04, 21:34
I am prototyping Flume NG as a real-time sink to HBase and HDFS for clickstream data. I was wondering whether Flume also provides something that can read from HBase. That is, an API request comes to our web server, and after we receive the request we use Flume to retrieve the result from HBase. Is Flume NG meant for this type of scenario?
Hari Shreedharan 2012-08-04, 21:49
Mohit,
Flume NG does not really have an HBase source. More importantly, our sources do not actually pull data. They wait for data to be pushed to them; at least that is what all the current sources do. I don't know if it is a good idea to have a source proactively pull data from an external system. But Flume is designed to make practically everything pluggable, so you can write your own source that does this.
Hari
Patrick Wendell 2012-08-04, 22:28
Mohit,
This sounds like something where you would want your web tier to directly access HBase through its existing client API.
Do you need functionality not offered there?
- Patrick
Mohit Anchlia 2012-08-04, 22:37
On Sat, Aug 4, 2012 at 3:28 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
> Do you need functionality not offered there?
Yes, that is how I am currently testing, but I thought it would be nice to use one framework for both reads and writes.
Hari Shreedharan 2012-08-05, 01:25
Let me correct what I said. You could write a source that implements the PollableSource interface to actually pull data. We do have a source that pulls data, the ExecSource, though it pulls from the output stream of a local process. You could write a source that connects to HBase and pulls data out. The implementation can follow the lines of the ExecSource, except that you would replace the polling logic with your own.
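To make the pull idea concrete, here is a minimal sketch of the poll-and-forward pattern. Every type in it (`Status`, `ListChannel`, `RowStore`, `PollingSource`) is a simplified stand-in invented for this illustration, not Flume's or HBase's actual classes; a real source would implement Flume's own interfaces and read through the HBase client rather than an in-memory list.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PollingSourceSketch {

    /** Hypothetical stand-in for the result a pollable source reports after each poll. */
    enum Status { READY, BACKOFF }

    /** Hypothetical stand-in for a channel: it simply collects event bodies. */
    static class ListChannel {
        final List<byte[]> events = new ArrayList<>();
        void put(byte[] body) { events.add(body); }
    }

    /** Hypothetical stand-in for an HBase scan: iterates over rows held in memory. */
    static class RowStore {
        private final Iterator<String> rows;
        RowStore(List<String> rows) { this.rows = rows.iterator(); }

        /** Returns up to max rows that have not been handed out yet. */
        List<String> nextBatch(int max) {
            List<String> batch = new ArrayList<>();
            while (batch.size() < max && rows.hasNext()) {
                batch.add(rows.next());
            }
            return batch;
        }
    }

    /** The source: each process() call pulls one batch and forwards it to the channel. */
    static class PollingSource {
        private final RowStore store;
        private final ListChannel channel;
        private final int batchSize;

        PollingSource(RowStore store, ListChannel channel, int batchSize) {
            this.store = store;
            this.channel = channel;
            this.batchSize = batchSize;
        }

        Status process() {
            List<String> batch = store.nextBatch(batchSize);
            if (batch.isEmpty()) {
                return Status.BACKOFF; // nothing new; the caller should wait before polling again
            }
            for (String row : batch) {
                channel.put(row.getBytes(StandardCharsets.UTF_8));
            }
            return Status.READY;
        }
    }

    /**
     * Polls until the first BACKOFF and returns the number of successful polls.
     * A long-running driver would sleep and poll again instead of stopping.
     */
    static int drain(PollingSource source) {
        int polls = 0;
        while (source.process() == Status.READY) {
            polls++;
        }
        return polls;
    }

    public static void main(String[] args) {
        ListChannel channel = new ListChannel();
        RowStore store = new RowStore(Arrays.asList("row1", "row2", "row3"));
        int polls = drain(new PollingSource(store, channel, 2));
        System.out.println(polls + " polls, " + channel.events.size() + " events");
    }
}
```

The part you would replace for HBase is `RowStore.nextBatch`; the rest only shows where the polling logic sits relative to the channel. Note that this is still a continuous feed into a channel, not a per-request lookup, so it does not by itself cover the request/response case from the original question.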
Hope this helps. Thanks, Hari