This is a good point. Our user guide is vague on the subject, but under the
hood, we are actually storing in each cell an integer id that is assigned
to the writer schema. KijiSchema maintains the id-to-schema mappings in a
metadata table (also stored in HBase) and looks them up as needed. I have
logged https://jira.kiji.org/browse/DOCS-2 to note this improvement.
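
To illustrate the idea, here is a minimal sketch. It is not KijiSchema's
actual code: the class name SchemaIdRegistry, the in-memory maps standing in
for the metadata table, the use of schema JSON strings as keys, and the
fixed 4-byte id prefix are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the id-to-schema metadata table. */
final class SchemaIdRegistry {
    private final Map<String, Integer> idBySchema = new HashMap<>();
    private final List<String> schemaById = new ArrayList<>();

    /** Returns the id of a writer schema, assigning the next free id on first use. */
    int getOrCreateId(String schemaJson) {
        Integer existing = idBySchema.get(schemaJson);
        if (existing != null) {
            return existing;
        }
        int id = schemaById.size();
        schemaById.add(schemaJson);
        idBySchema.put(schemaJson, id);
        return id;
    }

    /** Looks up the writer schema for an id that was read from a cell. */
    String getSchema(int id) {
        if (id < 0 || id >= schemaById.size()) {
            throw new IllegalArgumentException("Unknown schema id: " + id);
        }
        return schemaById.get(id);
    }

    /** Prefixes the encoded datum with the schema id (assumed here to be 4 bytes). */
    byte[] encodeCell(String schemaJson, byte[] payload) {
        int id = getOrCreateId(schemaJson);
        return ByteBuffer.allocate(4 + payload.length).putInt(id).put(payload).array();
    }

    /** Reads the schema id back from the front of a cell value. */
    int readSchemaId(byte[] cell) {
        return ByteBuffer.wrap(cell).getInt();
    }
}
```

In this sketch each cell carries only a few extra bytes regardless of how
large the schema is, and a reader can cache the parsed schema per id rather
than parsing it again for every cell.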
On Wed, Nov 14, 2012 at 9:58 PM, Asaf Mesika <[EMAIL PROTECTED]> wrote:
> This looks great!
> I have a question regarding schema. It is written in the user guide that
> the schema of a cell is saved next to the data in the cell. I presume it:
> - Takes more space, as the schema is duplicated for each row this cell is
>   saved in
> - Makes reading records slower, since it needs to parse the Avro schema
>   before reading each cell
> Did I manage to understand the guide correctly?
> On 15 Nov 2012, at 00:18, Aaron Kimball <[EMAIL PROTECTED]> wrote:
> > HBase fans,
> > I’m writing to announce the first release of KijiSchema, a new project to
> > help developers build applications on HBase. You can download it at
> > www.kiji.org. It is open source and published under the Apache 2
> > KijiSchema simplifies the development of applications on HBase by providing
> > developer-friendly Java APIs for storing and managing typed data using
> > As an application grows, developers can gracefully evolve the application
> > schema at the cell level to handle new fields. These features are
> > particularly well suited for entity-centric data schemas where all
> > information about a given entity, including dimensional and transaction
> > data, is encoded within the same row.
> > Column names and associations of columns with schemas are maintained in a
> > data dictionary; developers don’t need to rely on reading source code to
> > remember where data is stored.
> > Table schemas can be defined in JSON or by using KijiSchema’s declarative
> > DDL. Developers can also easily run MapReduce over Kiji tables in HBase
> > using included MR Input- and OutputFormats.
> > KijiSchema is an open and highly modular system. It runs on top of an
> > existing HBase 0.92 (CDH4) cluster, and can be run entirely on the client
> > with no server-side daemons. KijiSchema can also be downloaded as part of a
> > Kiji BentoBox, which provides a clean install of a mini-cluster of
> > HBase and Kiji on your laptop in under 15 min. You do not need to have
> > Hadoop or HBase pre-installed to run the BentoBox.
> > KijiSchema is inspired by work we have done at WibiData developing
> > applications for recommendations and personalization on top of HBase. We
> > will be developing and releasing other components into the Kiji project to
> > provide additional functionality enabling easy development of data
> > applications on HBase, including improvements for MapReduce support and
> > querying tools. We welcome feedback and contributions from the community
> > the Kiji Project at www.kiji.org.
> > Regards,
> > - Aaron Kimball