Re: Best practice for storage of data that changes
bharath vissapragada 2012-11-25, 04:10
Please look at  . You can store your data in HBase tables and query them
as usual by mapping them to Hive tables. Regarding Cassandra support,
please follow JIRA ; it's not yet in the trunk, I suppose!
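
As a rough sketch of the HBase-to-Hive mapping, assuming an existing HBase
table named "users" with a column family "info" (both names are made up
for illustration, as are the columns), the Hive statement could be
generated like this. The storage handler class and the two property keys
are the ones used by Hive's HBase integration; everything else is an
assumption:

```python
# Sketch: build the HiveQL that maps an existing HBase table to a Hive table.
# The table name "users", the column family "info", and the columns are
# hypothetical; adjust them to your own schema.

def hbase_mapping_ddl(hive_table, hbase_table, columns):
    """Return a CREATE EXTERNAL TABLE statement for an HBase-backed Hive table.

    columns is a list of (hive_column, hive_type, hbase_mapping) tuples.
    The HBase row key uses the special mapping ':key'; other columns use
    'family:qualifier'.
    """
    col_defs = ", ".join(f"{name} {typ}" for name, typ, _ in columns)
    mapping = ",".join(m for _, _, m in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({col_defs})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}');"
    )


ddl = hbase_mapping_ddl(
    "users",
    "users",
    [("user_id", "string", ":key"), ("email", "string", "info:email")],
)
print(ddl)
```

Once a statement like that has been run in Hive, the table can be queried
with ordinary HiveQL alongside tables stored directly in HDFS, while
updates to a user's email go to HBase.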
On Sun, Nov 25, 2012 at 2:26 AM, jeff l <[EMAIL PROTECTED]> wrote:
> Hi All,
> I'm coming from the RDBMS world and am looking at hdfs for long term data
> storage and analysis.
> I've done some research and set up some smallish hdfs clusters with hive
> for testing but I'm having a little trouble understanding how everything
> fits together and was hoping someone could point me in the right direction.
> I'm looking at storing two types of data:
> 1. Append-only data - e.g. weblogs or user logins
> 2. Account/User data
> HDFS seems to be perfect for append-only data like #1, but I'm having
> trouble figuring out what to do with data that may change frequently.
> A simple example would be user data where various bits of information:
> email, etc may change from day to day. Would hbase or cassandra be the
> better way to go for this type of data, and can I overlay hive over all (
> hdfs, hbase, cassandra ) so that I can query the data through a single
> Thanks in advance for any help.