Re: Announcing KijiSchema for HBase schema management
Lee Sheng 2012-11-15, 23:57
The row keys (EntityIds) in Kiji act as a translation layer between
some unique identifier (possibly, but not necessarily, derived from the
row data) and the HBase row keys. Composite row keys are not yet
supported. Depending on what you're trying to
do, some of the KijiRowFilters may suffice for doing partial matches
of the data for scanners.
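
To make the composite-key idea from the question concrete, here is a minimal sketch of how the three components (CustomerID, StartTimeMs, RequestId) could be packed by hand into a single byte[] key. This is plain Java and not part of the KijiSchema or HBase APIs: the class CompositeRowKey and all of its methods are hypothetical names made up for illustration. It assumes that customer ids never contain a NUL character and that timestamps are non-negative. HBase orders rows by comparing key bytes as unsigned values, so a big-endian timestamp keeps one customer's rows in time order; whether that ordering survives inside Kiji depends on how the table's row keys are configured, which is not covered here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not a KijiSchema or HBase class. */
final class CompositeRowKey {
    private CompositeRowKey() {}

    /** Packs (customerId, startTimeMs, requestId) into one byte[] key. */
    static byte[] encode(String customerId, long startTimeMs, String requestId) {
        if (startTimeMs < 0) {
            throw new IllegalArgumentException("startTimeMs must be non-negative");
        }
        byte[] prefix = customerPrefix(customerId);
        byte[] request = requestId.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(prefix.length + Long.BYTES + request.length);
        buf.put(prefix);           // customer id followed by a 0x00 terminator
        buf.putLong(startTimeMs);  // ByteBuffer writes big-endian by default
        buf.put(request);
        return buf.array();
    }

    /** Key prefix shared by every row of one customer (id plus 0x00 terminator). */
    static byte[] customerPrefix(String customerId) {
        if (customerId.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("customerId must not contain NUL");
        }
        byte[] customer = customerId.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[customer.length + 1]; // last byte stays 0x00
        System.arraycopy(customer, 0, out, 0, customer.length);
        return out;
    }

    /** Returns true if key begins with prefix (a partial match on the key). */
    static boolean startsWith(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Lexicographic comparison treating bytes as unsigned values. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }
}
```

The 0x00 terminator after the customer id is what keeps a prefix match for "cust42" from also matching "cust421". A partial match over one customer then reduces to checking startsWith(key, customerPrefix(id)), which is the kind of test a prefix-style row filter would perform.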
On Thu, Nov 15, 2012 at 11:19 AM, Asaf Mesika <[EMAIL PROTECTED]> wrote:
> Thanks, that's great! Truly an awesome project.
> Is there a way to specify a composite row key composed of the fields
> specified in the table schema, much like the definition of a primary key
> in an Oracle table?
> For example a rowkey can look like: (CustomerID)(StartTimeMs)(RequestId)
> On 15 Nov 2012, at 20:26, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>> Hi Asaf,
>> This is a good point. Our user guide is vague on the subject, but under the
>> hood, we are actually storing in each cell an integer id that is assigned
>> to the writer schema. KijiSchema maintains the id-to-schema mappings in a
>> metadata table (also stored in HBase) and looks them up as needed. I have
>> logged https://jira.kiji.org/browse/DOCS-2 to note this improvement.
>> - Aaron
>> On Wed, Nov 14, 2012 at 9:58 PM, Asaf Mesika <[EMAIL PROTECTED]> wrote:
>>> This looks great!
>>> I have a question regarding schema. It is written in the user guide that
>>> the schema of a cell is saved next to the data in the cell. I presume it:
>>> - Takes more space, as the schema is duplicated for each row in which
>>>   this cell is saved
>>> - Makes reading records slower, since the Avro schema needs to be parsed
>>>   before reading each cell
>>> Did I manage to understand the guide correctly?
>>> On 15 Nov 2012, at 00:18, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>>>> HBase fans,
>>>> I’m writing to announce the first release of KijiSchema, a new project to
>>>> help developers build applications on HBase. You can download it at
>>>> www.kiji.org. It is open source and published under the Apache 2 license.
>>>> KijiSchema simplifies the development of applications on HBase by providing
>>>> developer-friendly Java APIs for storing and managing typed data using Avro.
>>>> As an application grows, developers can gracefully evolve the application
>>>> schema at the cell level to handle new fields. These features are
>>>> particularly well suited for entity-centric data schemas where all
>>>> information about a given entity, including dimensional and transaction
>>>> data, is encoded within the same row.
>>>> Column names and associations of columns with schemas are maintained in a
>>>> data dictionary; developers don’t need to rely on reading source code to
>>>> remember where data is stored.
>>>> Table schemas can be defined in JSON or by using KijiSchema’s declarative
>>>> DDL. Developers can also easily run MapReduce over Kiji tables in HBase
>>>> using included MR Input- and OutputFormats.
>>>> KijiSchema is an open and highly modular system. It runs on top of an
>>>> existing HBase 0.92 (CDH4) cluster, and can be run entirely on the client
>>>> with no server-side daemons. KijiSchema can also be downloaded as part of a
>>>> Kiji BentoBox, which provides a clean install of a mini-cluster of
>>>> HBase and Kiji on your laptop in under 15 min. You do not need to have
>>>> Hadoop or HBase pre-installed to run the BentoBox.
>>>> KijiSchema is inspired by work we have done at WibiData developing
>>>> applications for recommendations and personalization on top of HBase. We
>>>> will be developing and releasing other components into the Kiji project to
>>>> provide additional functionality enabling easy development of data
>>>> applications on HBase, including improvements for MapReduce support and
>>>> querying tools. We welcome feedback and contributions from the community.