Ioan Eugen Stan
2012-01-25, 15:24
Ioan Eugen Stan
2012-01-25, 16:41
Doug Meil
2012-01-25, 16:44
Mike Spreitzer
2012-01-25, 17:05
Mike Spreitzer
2012-01-25, 21:04
Ulrich Staudinger
2012-01-25, 21:20
Doug Meil
2012-01-25, 21:23
Mike Spreitzer
2012-01-25, 21:28
Dalia Sobhy
2012-01-25, 23:22
Dalia Sobhy
2012-01-25, 23:29
Dalia Sobhy
2012-01-26, 08:57
Dalia Sobhy
2012-01-26, 10:13
Rohit Kelkar
2012-01-27, 07:06
-
Re: Important Question - Ioan Eugen Stan, 2012-01-25, 15:24
On 25.01.2012 17:01, Dalia Sobhy wrote:
> Dear all,
> I am developing an API for medical use, i.e. hospital admissions and everything about patients, so transactions, queries and real-time data are important here.
> Both real-time and analytical processing are a must.
> Which best suits my application: HBase, Hive or another method?
> Please reply quickly because this is critical, thanks a million ;)

HBase does real time. Hive is more batch oriented. Please read each project's description:
http://hive.apache.org/
http://hbase.apache.org/

--
Ioan Eugen Stan
http://ieugen.blogspot.com
-
Re: Important Question - Ioan Eugen Stan, 2012-01-25, 16:41
On 25.01.2012 18:30, Dalia Sobhy wrote:
> So what about HBQL?
> And if I had complex queries, would I get stuck with HBase?

HBQL seems to be unmaintained. The last update seems to have been in January 2011, one year ago.

> Also, can anyone provide me with examples of an RDBMS table transformed into HBase, with real-time queries and analytical processing?

--
Ioan Eugen Stan
http://ieugen.blogspot.com
-
Re: Important Question - Doug Meil, 2012-01-25, 16:44
Because you specifically cited the medical domain in your question, I think you might want to talk to Explorys (disclaimer: I work there).

Otherwise, you probably want to look at the HBase book.

On 1/25/12 11:30 AM, "Dalia Sobhy" <[EMAIL PROTECTED]> wrote:
> So what about HBQL?
> And if I had complex queries, would I get stuck with HBase?
>
> Also, can anyone provide me with examples of an RDBMS table transformed into HBase, with real-time queries and analytical processing?
>
> Sent from my iPhone
>
> On 2012-01-25, at 6:15 PM, [EMAIL PROTECTED] wrote:
>
>> Real time: definitely not Hive. Go for HBase, but don't expect HBase to be as flexible as an RDBMS. You need to choose your row key and column families wisely, according to your requirements.
>> For data mining and analytics you can mount a Hive table over the corresponding HBase table and work with SQL-like queries.
>>
>> Regards
>> Bejoy K S
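Editor's sketch, in answer to the request above for an example of an RDBMS table mapped to HBase: the snippet below uses plain Python dictionaries, not an HBase client, to show the shape of the mapping. The table, the field names, the `info` and `visit` column families, and the helper `to_hbase_cells` are all hypothetical; the only HBase facts assumed are that rows are sorted lexicographically by row key bytes and that each cell is addressed as family:qualifier.

```python
from typing import Dict, Tuple

# Hypothetical relational row from a "patients" table (names are illustrative).
patient_row = {
    "patient_id": 42,
    "name": "Jane Doe",
    "city": "Alexandria",
    "diagnosis": "measles",
    "admitted_on": "2012-01-20",
}


def to_hbase_cells(row: dict) -> Tuple[bytes, Dict[bytes, bytes]]:
    """Map one relational row to (row key, {b"family:qualifier": value})."""
    # Zero-pad the ID so that byte-wise (lexicographic) ordering of row keys
    # matches numeric ordering of patient IDs.
    row_key = f"{row['patient_id']:010d}".encode("utf-8")
    # Assumed layout: demographic fields in "info", admission fields in "visit".
    cells = {
        b"info:name": row["name"].encode("utf-8"),
        b"info:city": row["city"].encode("utf-8"),
        b"visit:diagnosis": row["diagnosis"].encode("utf-8"),
        b"visit:admitted_on": row["admitted_on"].encode("utf-8"),
    }
    return row_key, cells


row_key, cells = to_hbase_cells(patient_row)
```

With this layout a lookup by patient ID is a single row read, whereas a query such as "all measles cases in one city" has no index to use and would need a scan or a second, differently keyed table. That trade-off is what "choose your row key wisely" refers to.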
-
Re: Important Question - Mike Spreitzer, 2012-01-25, 17:05
BTW, what do you mean by "realtime"? Do you mean you want to run some non-trivial query quickly enough for some sort of interactive use? Can you give us a feel for the sort of queries that interest you?

Thanks,
Mike
-
RE: Important Question - Mike Spreitzer, 2012-01-25, 21:04
Just a couple more questions. Your data will all be in one place; this is not a federated architecture, right? How much data are we talking about? It sounds like you want to find/create/update/delete individual records and do simple aggregations over records identified by a conjunction of predicates on fields; is that right?

Thanks,
Mike (not on the hive mailing list)
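Editor's sketch of the workload described above (create, find and delete individual records, plus a simple aggregation over records selected by a conjunction of predicates), using Python's standard `sqlite3` module with an in-memory database. The `patient` table, its columns, the sample values and the helper functions are assumptions made for illustration; nothing here comes from the original poster's system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE patient (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        city      TEXT NOT NULL,
        diagnosis TEXT
    )
    """
)
# Composite index that serves the conjunction "diagnosis = ? AND city = ?".
conn.execute("CREATE INDEX idx_patient_diag_city ON patient(diagnosis, city)")


def add_patient(name: str, city: str, diagnosis: str) -> int:
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO patient (name, city, diagnosis) VALUES (?, ?, ?)",
            (name, city, diagnosis),
        )
    return cur.lastrowid


def delete_patient(patient_id: int) -> None:
    with conn:
        conn.execute("DELETE FROM patient WHERE id = ?", (patient_id,))


def find_by_name(name: str) -> list:
    return conn.execute(
        "SELECT id, name, city, diagnosis FROM patient WHERE name = ?", (name,)
    ).fetchall()


def count_cases(diagnosis: str, city: str) -> int:
    # Simple aggregation over a conjunction of two field predicates.
    return conn.execute(
        "SELECT COUNT(*) FROM patient WHERE diagnosis = ? AND city = ?",
        (diagnosis, city),
    ).fetchone()[0]


first = add_patient("Jane Doe", "Alexandria", "measles")
second = add_patient("John Roe", "Alexandria", "measles")
third = add_patient("Ann Poe", "Cairo", "measles")
delete_patient(second)
remaining_cases = count_cases("measles", "Alexandria")
```

In a relational store each of these operations is a single statement and the index is maintained automatically; in HBase the same aggregation would have to be designed into the row key or computed by a scan or batch job.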
-
Re: Important Question - Ulrich Staudinger, 2012-01-25, 21:20
Hey everybody,

at the risk of being flamed and barbecued... to be absolutely honest, I think the NoSQL approach, and with it HBase and all the other alternatives, doesn't fit your use case at all.

You have a complex domain model, where it is very likely that you will want to search through your domain space by all possible attributes of your domain model. For example, a patient has had diseases, prescriptions, etc. So, to make access to your data space fast, you want to have indices. You want to be able to sort ascending and descending by all attributes. And preferably you don't want to have to think about building indexing logic yourself.

Above all, you want to have referential integrity in your data space. Patient data is not like wall messages, where it really doesn't matter that much if one in a million is lost because something went awry. So transactions should be supported.

On top of that, your patient data (not counting MRI or CT scans) is probably not going to be more than 10 MB per patient (if that). With 1 million users, you would have something like 10 terabytes of data. With proper partitioning, you can easily manage that within an average database.

But I may be wrong, and I am looking forward to hearing another opinion.

cheers,
ulrich

--
Ulrich Staudinger
http://www.activequant.com
Connect online: https://www.xing.com/profile/Ulrich_Staudinger
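Editor's sketch of the two properties stressed above, referential integrity and transactions, again with Python's standard `sqlite3` module and an in-memory database. The `patient` and `prescription` tables and the function name are hypothetical. The point shown is that a failed step undoes the whole unit of work, so no half-written patient record is left behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE patient (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE prescription (
        id         INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(id),
        drug       TEXT NOT NULL
    );
    CREATE INDEX idx_prescription_drug ON prescription(drug);
    """
)


def insert_with_orphan_prescription(db: sqlite3.Connection) -> bool:
    """Try to insert a patient plus a prescription for a non-existent patient."""
    try:
        with db:  # one transaction: commit on success, roll back on exception
            db.execute("INSERT INTO patient (id, name) VALUES (1, 'Jane Doe')")
            # patient_id 999 does not exist, so the foreign key check fails.
            db.execute(
                "INSERT INTO prescription (patient_id, drug) VALUES (999, 'aspirin')"
            )
        return True
    except sqlite3.IntegrityError:
        return False


succeeded = insert_with_orphan_prescription(conn)
patient_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
```

Here `succeeded` is false and `patient_count` stays at zero, because the rollback also discards the first, valid insert.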
-
Re: Important Question - Doug Meil, 2012-01-25, 21:23
Hi there-

As someone who works with medical data I take such analysis very seriously, but according to the World Health Organization there were 608 cases of measles reported in Egypt in 2011 (page 82). Granted, these are probably incidence and not prevalence statistics, but the order of magnitude of data in your use case is relatively small.

www.who.int/whosis/whostat/EN_WHS2011_Full.pdf

Should you be considering something like MySQL? Or Microsoft Access? Or a spreadsheet?

One of the things that the overview points out...

http://hbase.apache.org/book.html#arch.overview

... is that HBase is really useful when you have a *lot* of data, but is also serious overkill and over-complexity if you don't. I'm saying this because I'd like to support your epidemiological research, and also because I'd like to prevent you from having a bad HBase experience, especially when the use case doesn't seem to warrant it.

Doug

On 1/25/12 3:56 PM, "Dalia Sobhy" <[EMAIL PROTECTED]> wrote:
> I will explain more to you, Mike.
> I am building a Software Oriented Architecture. I want my API to provide some services such as add/delete patients, search for a patient by name/ID, and count the number of people who are suffering from measles in Alexandria, Egypt.
> Something like that, so I am wondering which best suits my API?
-
RE: Important Question - Mike Spreitzer, 2012-01-25, 21:28
A bit more grist for our mill: what transaction rate do you need to support? Are you concerned with a lookup or aggregation query "correctly" including a record that is being concurrently updated?

Thanks,
Mike
-
RE: Important Question - Dalia Sobhy, 2012-01-25, 23:22
So maybe you are all right; I found HBase really complex.

So what are the other alternatives, given that I am already using Hadoop as my backend system? Kindly check Apixio, which is a similar medical system that adopts Hadoop, so please check and reply, because this concerns part of my thesis.

Thanks all for your sincere help :)
-
RE: Important Question - Dalia Sobhy, 2012-01-25, 23:29
Yes that's right!!

> Just a couple more questions. Your data will all be in one place; this is not a federated architecture, right? How much data are we talking about? It sounds like you want to find/create/update/delete individual records and do simple aggregations over records identified by a conjunction of predicates on fields; is that right?
>
> Thanks,
> Mike (not on the hive mailing list)
-
RE: Important Question - Dalia Sobhy, 2012-01-26, 08:57
What about Pig? Please check this and tell me your opinions:
http://hstreaming.com/docs/developer-guide/pig/
-
Re: Important Question - Dalia Sobhy, 2012-01-26, 10:13
Hi Doug,

How can I talk to you about Explorys? Maybe it suits my application.

Contact me ASAP.

Sent from my iPhone
-
Re: Important Question - Rohit Kelkar, 2012-01-27, 07:06
Dalia,

You mentioned realtime: which of your use cases are realtime, and what is an acceptable response time for them? You may want to try a combination of SQL and NoSQL: NoSQL to store your data for analytics purposes and SQL for realtime. I am assuming that your analytics needs would be based on a huge amount of historical data which does not depend on the data that is required in realtime. It would be very helpful if you could elaborate a typical analytics use case and a typical realtime use case that you want handled.

- Rohit Kelkar