HBase >> mail # dev >> Question HFile Insert Key or Update Key


Re: Question HFile Insert Key or Update Key
Another question: how is data locality maintained when a file's data is
stored (considering we use HDFS as the layer between the hardware and HBase)?
As per Bigtable, data is stored per column family, and I assume HBase does
the same. Does each column family have a separate HFile (really?)...
If yes, how is the relation with a row key maintained across all such files?

[Zhu, I hope you don't mind my stealing your thread... but my questions are
somewhat related to yours, so I did it that way (smile)...]

Thanks,
~Himanshu

On Sat, Aug 28, 2010 at 10:21 PM, zhixuan zhu <[EMAIL PROTECTED]> wrote:

>  Ryan,
>
> Thanks for your quick response.
>
> Since HFiles cannot be modified in HDFS once written, I guess the write
> buffer holds the modified data block and then overwrites the whole
> HDFS data block corresponding to the HFile that was changed.
>
> I need to reread the Bigtable paper; I always have questions...
>
> Thanks
>
>
>
> On Sat, Aug 28, 2010 at 11:58 PM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
>
> > HFiles are write once, read many. Once written they cannot be modified, so
> > there is no way to move things around.
> >
> > HBase deals with this by having a robust write buffer and writing large
> > files.
> >
> > For more architectural details, check out the Bigtable paper.
> >
> > On Aug 28, 2010 8:32 PM, "zhixuan zhu" <[EMAIL PROTECTED]> wrote:
> > > hey guys,
> > >
> > > I am studying the HFiles now and have a couple of questions.
> > >
> > > The keys in an HFile are sorted. Suppose a key is inserted into a data
> > > block that is full, and the key is smaller than the greatest key and
> > > greater than the smallest key in that data block. In this case, does
> > > the data block need to be reorganized, say by moving the keys greater
> > > than the inserted key into the next data block?
> > >
> > > And if the value for a key needs an update, how is this achieved in an HFile?
> > >
> > > Appreciate your time for answering my questions!
> > >
> > > Thanks
> > >
> > > Tim Zhu
> >
>
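
To make Ryan's answer concrete, here is a rough sketch of the write-once idea: inserts and updates both go into an in-memory write buffer, the buffer is periodically written out as a new sorted file that is never modified afterwards, and reads consult the buffer first and then the files from newest to oldest. This is a toy model under stated assumptions, not HBase code. The class name `WriteOnceStore`, the `flush_threshold` parameter, and every method name are hypothetical, and the `compact` step is borrowed from the Bigtable design rather than from anything said in this thread.

```python
import bisect


class WriteOnceStore:
    """Toy model: a mutable write buffer plus immutable sorted files."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.buffer = {}   # in-memory write buffer: key -> latest value
        self.files = []    # immutable sorted files, oldest first

    def put(self, key, value):
        # An insert and an update look the same: both land in the buffer.
        self.buffer[key] = value
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out as one new sorted file; old files are untouched.
        if self.buffer:
            self.files.append(tuple(sorted(self.buffer.items())))
            self.buffer = {}

    def get(self, key):
        # The buffer holds the newest data, then files from newest to oldest.
        if key in self.buffer:
            return self.buffer[key]
        for f in reversed(self.files):
            i = bisect.bisect_left(f, (key,))  # binary search in a sorted file
            if i < len(f) and f[i][0] == key:
                return f[i][1]
        return None

    def compact(self):
        # Merge all files into a single new file; newer values win.
        merged = {}
        for f in self.files:  # oldest first, so later files overwrite
            merged.update(f)
        self.files = [tuple(sorted(merged.items()))] if merged else []


store = WriteOnceStore(flush_threshold=2)
store.put("row1", "a")
store.put("row3", "c")    # buffer is full: flushed as the first file
store.put("row2", "b")    # sorts between row1 and row3, but no file is rewritten
store.put("row1", "a2")   # an "update": flushed into a second file
print(store.get("row1"))  # the newer file shadows the older value
store.compact()
print(store.files)
```

In this model a full block never has to be reorganized, because a key that sorts into the middle of an existing file simply goes into a later file, and an update is just a newer entry that shadows the older one until a merge drops it.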