HBase, mail # user - Announcing KijiSchema for HBase schema management


Re: Announcing KijiSchema for HBase schema management
Asaf Mesika 2012-11-15, 19:19
Thanks, that's great! Truly an awesome project.
Is there a way to specify a composite row key built from the fields
defined in the table schema, much like defining a primary key on an
Oracle table?
For example, a row key could look like: (CustomerID)(StartTimeMs)(RequestId)
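
To make the shape of such a key concrete, here is a minimal sketch in plain Python. It is not KijiSchema's API. It assumes all three fields are non-negative integers that fit in 64 bits, and the helper names make_row_key and parse_row_key are hypothetical. Packing each field as fixed-width big-endian bytes means the keys sort bytewise in the same order as (CustomerID, StartTimeMs, RequestId).

```python
import struct
from typing import Tuple

# Assumed layout: three unsigned 64-bit big-endian fields, 24 bytes total.
_KEY_FORMAT = ">QQQ"


def make_row_key(customer_id: int, start_time_ms: int, request_id: int) -> bytes:
    """Concatenate the three fields into one fixed-width byte string."""
    return struct.pack(_KEY_FORMAT, customer_id, start_time_ms, request_id)


def parse_row_key(key: bytes) -> Tuple[int, int, int]:
    """Split a key produced by make_row_key back into its three fields."""
    return struct.unpack(_KEY_FORMAT, key)
```

With this layout, all rows for one customer are contiguous and ordered by start time, so a range scan over a customer and time window only needs a start key and a stop key.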

Sent from my iPhone

On 15 Nov 2012, at 20:26, Aaron Kimball <[EMAIL PROTECTED]> wrote:

> Hi Asaf,
>
> This is a good point. Our user guide is vague on the subject, but under the
> hood, each cell actually stores an integer ID that is assigned to the
> writer schema. KijiSchema maintains the ID-to-schema mappings in a
> metadata table (also stored in HBase) and looks them up as needed. I have
> logged https://jira.kiji.org/browse/DOCS-2 to track this documentation fix.
>
> Cheers,
> - Aaron
>
>
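The ID-based scheme described above can be sketched in a few lines of Python. This is an illustration only, not KijiSchema's implementation: the SchemaTable class is a hypothetical in-memory stand-in for the HBase metadata table, and the 4-byte big-endian ID prefix is an assumed encoding that the message does not specify.

```python
import json
import struct
from typing import Dict, Tuple


class SchemaTable:
    """Hypothetical in-memory stand-in for the ID-to-schema metadata table."""

    def __init__(self) -> None:
        self._id_by_schema: Dict[str, int] = {}
        self._schema_by_id: Dict[int, str] = {}

    def get_or_create_id(self, schema_json: str) -> int:
        # Normalize the JSON text so equivalent schemas share one ID.
        canonical = json.dumps(json.loads(schema_json), sort_keys=True,
                               separators=(",", ":"))
        if canonical not in self._id_by_schema:
            new_id = len(self._id_by_schema)
            self._id_by_schema[canonical] = new_id
            self._schema_by_id[new_id] = canonical
        return self._id_by_schema[canonical]

    def get_schema(self, schema_id: int) -> str:
        return self._schema_by_id[schema_id]


def encode_cell(table: SchemaTable, schema_json: str, payload: bytes) -> bytes:
    """Prefix the payload with the writer schema's ID (assumed 4 bytes)."""
    return struct.pack(">I", table.get_or_create_id(schema_json)) + payload


def decode_cell(table: SchemaTable, cell: bytes) -> Tuple[str, bytes]:
    """Look up the writer schema by ID and return it with the payload."""
    (schema_id,) = struct.unpack(">I", cell[:4])
    return table.get_schema(schema_id), cell[4:]
```

Under these assumptions, each cell carries a fixed 4 bytes of overhead regardless of schema size, and a reader can cache parsed schemas by ID instead of parsing one per cell.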
> On Wed, Nov 14, 2012 at 9:58 PM, Asaf Mesika <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> This looks great!
>>
>> I have a question regarding schemas. The user guide says that the schema
>> of a cell is saved next to the data in the cell. I presume this would:
>> - Take more space, since the schema is duplicated in every row where this
>>   cell is stored
>> - Make reading records slower, since the Avro schema must be parsed
>>   before each cell is read
>>
>> Did I manage to understand the guide correctly?
>>
>> Thanks!
>>
>> Asaf
>>
>>
>> On 15 Nov 2012, at 00:18, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>>
>>> HBase fans,
>>>
>>> I’m writing to announce the first release of KijiSchema, a new project to
>>> help developers build applications on HBase. You can download it at
>>> www.kiji.org. It is open source and published under the Apache 2 license.
>>>
>>> KijiSchema simplifies the development of applications on HBase by
>>> providing developer-friendly Java APIs for storing and managing typed
>>> data using Avro.
>>>
>>> As an application grows, developers can gracefully evolve the application
>>> schema at the cell level to handle new fields. These features are
>>> particularly well suited for entity-centric data schemas where all
>>> information about a given entity, including dimensional and transaction
>>> data, is encoded within the same row.
>>>
>>> Column names and associations of columns with schemas are maintained in a
>>> data dictionary; developers don’t need to rely on reading source code to
>>> remember where data is stored.
>>>
>>> Table schemas can be defined in JSON or by using KijiSchema’s declarative
>>> DDL. Developers can also easily run MapReduce over Kiji tables in HBase
>>> using included MR Input- and OutputFormats.
>>>
>>> KijiSchema is an open and highly modular system. It runs on top of an
>>> existing HBase 0.92 (CDH4) cluster, and can be run entirely on the client
>>> with no server-side daemons. KijiSchema can also be downloaded as part of
>>> a Kiji BentoBox, which provides a clean install of a mini-cluster of
>>> Hadoop, HBase and Kiji on your laptop in under 15 min. You do not need to
>>> have Hadoop or HBase pre-installed to run the BentoBox.
>>>
>>> KijiSchema is inspired by work we have done at WibiData developing
>>> applications for recommendations and personalization on top of HBase. We
>>> will be developing and releasing other components into the Kiji project
>>> to provide additional functionality enabling easy development of data
>>> applications on HBase, including improvements for MapReduce support and
>>> querying tools. We welcome feedback and contributions from the community
>>> to the Kiji Project at www.kiji.org.
>>>
>>> Regards,
>>> - Aaron Kimball
>>
>>