Re: Best practice for storage of data that changes
bharath vissapragada 2012-11-25, 04:10
Hi Jeff,
Please look at [1]. You can store your data in HBase tables and query them normally by mapping them to Hive tables. Regarding Cassandra support, please follow JIRA [2]; I don't think it's in trunk yet.

[1] https://cwiki.apache.org/Hive/hbaseintegration.html
[2] https://issues.apache.org/jira/browse/HIVE-1434

Thanks,

On Sun, Nov 25, 2012 at 2:26 AM, jeff l <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I'm coming from the RDBMS world and am looking at hdfs for long term data
> storage and analysis.
>
> I've done some research and set up some smallish hdfs clusters with hive
> for testing but I'm having a little trouble understanding how everything
> fits together and was hoping someone could point me in the right direction.
>
> I'm looking at storing two types of data:
>
> 1. Append-only data - e.g. weblogs or user logins
> 2. Account/User data
>
> HDFS seems to be perfect for append-only data like #1, but I'm having
> trouble figuring out what to do with data that may change frequently.
>
> A simple example would be user data where various bits of information
> (email, etc.) may change from day to day. Would hbase or cassandra be the
> better way to go for this type of data, and can I overlay hive over all
> (hdfs, hbase, cassandra) so that I can query the data through a single
> interface?
>
> Thanks in advance for any help.
>

--
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v
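
To make the HBase-to-Hive mapping from [1] a bit more concrete, here is a minimal sketch in Python that only builds the HiveQL statement for an external Hive table backed by an existing HBase table. The storage handler class and the `hbase.columns.mapping` / `hbase.table.name` properties come from the Hive HBase integration docs; the helper name `hbase_backed_hive_ddl`, the table names, and the `info` column family are hypothetical, and every column is declared as `string` just to keep it short.

```python
def hbase_backed_hive_ddl(hive_table, hbase_table, key_column, column_map):
    """Build HiveQL DDL for an external Hive table over an existing HBase table.

    column_map maps a Hive column name to its HBase "family:qualifier".
    The HBase row key is exposed as key_column via the special ":key" entry.
    """
    # Hive-side column list: row key first, then the mapped columns.
    hive_cols = [f"{key_column} string"] + [f"{name} string" for name in column_map]
    # HBase-side mapping, positionally aligned with the Hive columns.
    mapping = ",".join([":key"] + list(column_map.values()))
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({', '.join(hive_cols)})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


# Hypothetical example: user records whose email may change from day to day.
ddl = hbase_backed_hive_ddl(
    hive_table="users",
    hbase_table="users_hbase",
    key_column="user_id",
    column_map={"email": "info:email", "last_login": "info:last_login"},
)
print(ddl)
```

Running the generated statement in Hive should let you query the frequently changing user data with ordinary SELECTs, alongside the append-only tables that live directly on HDFS.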