HBase >> mail # user >> Schema design for filters


Kristoffer Sjögren 2013-06-27, 17:59
Michael Segel 2013-06-27, 19:21
Kristoffer Sjögren 2013-06-27, 21:41
Michael Segel 2013-06-27, 21:51
Re: Schema design for filters
I see your point. Everything is just bytes.
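
To make the "just bytes" point concrete, here is a small plain-Java sketch with no HBase dependency. The class and method names are made up for illustration. The same eight bytes decode to different values depending on the type the reader assumes, which is why the store itself cannot tell a long from a double.

```java
import java.nio.ByteBuffer;

public class JustBytes {
    // Encode a long as 8 big-endian bytes, as a client might before a put.
    public static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    // Read the cell back assuming it holds a long.
    public static long asLong(byte[] cell) {
        return ByteBuffer.wrap(cell).getLong();
    }

    // Read the very same cell back assuming it holds a double.
    public static double asDouble(byte[] cell) {
        return ByteBuffer.wrap(cell).getDouble();
    }

    public static void main(String[] args) {
        byte[] cell = encodeLong(42L);
        System.out.println(asLong(cell));   // 42
        System.out.println(asDouble(cell)); // a tiny denormal number, not 42.0
    }
}
```

Nothing in the cell records which reading is the intended one; only the application's schema does.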

However, the schema is known and every row is formatted according to this
schema, although some columns may not exist, that is, no value exists for
this property on this row.

So if I'm able to apply these "typed comparators" to the right cell values
it may be possible? But I can't find a filter that targets specific columns?
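
On targeting a column: as far as I know, HBase ships `SingleColumnValueFilter`, whose constructor takes a family, a qualifier, a compare operator and a comparator, so the comparator only sees that one column's value, and `setFilterIfMissing(true)` drops rows that lack the column. The sketch below only mimics that behaviour in plain Java so it runs without a cluster. `ColumnTargetedMatch`, `containsLong` and `matches` are hypothetical names, and a row is modelled as a simple map from qualifier to cell bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ColumnTargetedMatch {
    // Typed check: does a cell holding packed 8-byte longs contain target?
    public static boolean containsLong(byte[] cell, long target) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        while (buf.remaining() >= 8) {
            if (buf.getLong() == target) {
                return true;
            }
        }
        return false;
    }

    // Apply the typed check to one named column only.
    // filterIfMissing decides the outcome when the row has no such column.
    public static boolean matches(Map<String, byte[]> row, String qualifier,
                                  long target, boolean filterIfMissing) {
        byte[] cell = row.get(qualifier);
        if (cell == null) {
            return !filterIfMissing;
        }
        return containsLong(cell, target);
    }

    public static void main(String[] args) {
        Map<String, byte[]> row = new HashMap<String, byte[]>();
        row.put("ids", ByteBuffer.allocate(16).putLong(7L).putLong(9L).array());
        row.put("name", "nine".getBytes(StandardCharsets.UTF_8));

        System.out.println(matches(row, "ids", 9L, true)); // true
        System.out.println(matches(row, "ids", 8L, true)); // false
        System.out.println(matches(row, "age", 9L, true)); // false: column missing
    }
}
```

The "name" cell is never handed to the long check, which is the property the typed comparators need.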

Seems like all filters scan every column/qualifier and there is no way of
knowing what column is currently being evaluated?
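
On knowing which column is being evaluated: in a custom filter the per-cell callback (`filterKeyValue(KeyValue)` in the `Filter` interface of this HBase generation) receives the whole cell, including its qualifier, so a filter can look the qualifier up in the known schema and pick a decoder. The plain-Java sketch below imitates that dispatch. `SchemaDispatch`, `ColumnType`, `SCHEMA` and `decode` are stand-ins invented for illustration, not HBase classes, and the three columns are an assumed example schema.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class SchemaDispatch {
    public enum ColumnType { LONG, DOUBLE, STRING }

    // Hypothetical schema known to the application: qualifier -> value type.
    static final Map<String, ColumnType> SCHEMA = new HashMap<String, ColumnType>();
    static {
        SCHEMA.put("age", ColumnType.LONG);
        SCHEMA.put("score", ColumnType.DOUBLE);
        SCHEMA.put("name", ColumnType.STRING);
    }

    // Called once per cell with that cell's qualifier, the way a per-cell
    // filter callback would be. Returns the decoded value, or null for a
    // qualifier the schema does not know.
    public static Object decode(String qualifier, byte[] value) {
        ColumnType type = SCHEMA.get(qualifier);
        if (type == null) {
            return null;
        }
        switch (type) {
            case LONG:
                return ByteBuffer.wrap(value).getLong();
            case DOUBLE:
                return ByteBuffer.wrap(value).getDouble();
            default:
                return new String(value, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        byte[] age = ByteBuffer.allocate(8).putLong(35L).array();
        byte[] name = "Ann".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode("age", age));   // 35
        System.out.println(decode("name", name)); // Ann
    }
}
```
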
On Thu, Jun 27, 2013 at 11:51 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> You have to remember that HBase doesn't enforce any sort of typing.
> That's why this can be difficult.
>
> You'd have to write a coprocessor to enforce a schema on a table.
> Even then YMMV if you're writing JSON structures to a column because while
> the contents of the structures could be the same, the actual strings could
> differ.
>
> HTH
>
> -Mike
>
> On Jun 27, 2013, at 4:41 PM, Kristoffer Sjögren <[EMAIL PROTECTED]> wrote:
>
> > I realize standard comparators cannot solve this.
> >
> > However I do know the type of each column so writing custom list
> > comparators for boolean, char, byte, short, int, long, float, double seems
> > quite straightforward.
> >
> > Long arrays, for example, are stored as a byte array with 8 bytes per item
> > so a comparator might look like this.
> >
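> > import org.apache.hadoop.hbase.filter.WritableByteArrayComparable;
> > import org.apache.hadoop.hbase.util.Bytes;
> >
> > public class LongsComparator extends WritableByteArrayComparable {
> >    public LongsComparator() {} // no-arg constructor required for Writable
> >
> >    // Assumed constructor: the base class stores the target long as bytes.
> >    public LongsComparator(long val) { super(Bytes.toBytes(val)); }
> >
> >    // Returns 0 (match) if any stored long equals the target, otherwise 1.
> >    @Override
> >    public int compareTo(byte[] value, int offset, int length) {
> >        long val = Bytes.toLong(getValue());
> >        long[] values = BytesUtils.toLongs(value, offset, length);
> >        for (long longValue : values) {
> >            if (longValue == val) {
> >                return 0;
> >            }
> >        }
> >        return 1;
> >    }
> > }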
> >
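> > // Decodes `length` bytes starting at `offset` as consecutive 8-byte longs.
> > // Assumes each item was written with Bytes.toBytes(long).
> > public static long[] toLongs(byte[] value, int offset, int length) {
> >    int num = length / 8; // length already excludes the offset
> >    long[] values = new long[num];
> >    for (int i = 0; i < num; i++) {
> >        values[i] = Bytes.toLong(value, offset + i * 8);
> >    }
> >    return values;
> > }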
> >
> >
> > Strings are similar but would require charset and length for each string.
> >
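> > import org.apache.hadoop.hbase.filter.WritableByteArrayComparable;
> > import org.apache.hadoop.hbase.util.Bytes;
> >
> > public class StringsComparator extends WritableByteArrayComparable {
> >    public StringsComparator() {} // no-arg constructor required for Writable
> >
> >    // Assumed constructor: the base class stores the target string as UTF-8 bytes.
> >    public StringsComparator(String val) { super(Bytes.toBytes(val)); }
> >
> >    // Returns 0 (match) if any stored string equals the target, otherwise 1.
> >    @Override
> >    public int compareTo(byte[] value, int offset, int length) {
> >        String val = Bytes.toString(getValue());
> >        String[] values = BytesUtils.toStrings(value, offset, length);
> >        for (String stringValue : values) {
> >            if (val.equals(stringValue)) {
> >                return 0;
> >            }
> >        }
> >        return 1;
> >    }
> > }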
> >
> > // Decodes a sequence of [4-byte length][string bytes] entries.
> > // Assumes UTF-8, which is what Bytes.toString uses, instead of the
> > // platform default charset.
> > public static String[] toStrings(byte[] value, int offset, int length) {
> >    ArrayList<String> values = new ArrayList<String>();
> >    ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
> >    while (buffer.hasRemaining()) {
> >        int size = buffer.getInt();
> >        byte[] bytes = new byte[size];
> >        buffer.get(bytes);
> >        values.add(Bytes.toString(bytes));
> >    }
> >    return values.toArray(new String[values.size()]);
> > }
> >
> >
> > Am I on the right track or maybe overlooking some implementation details?
> > Not really sure how to target each comparator to a specific column value?
> >
> >
> > On Thu, Jun 27, 2013 at 9:21 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >
> >> Not an easy task.
> >>
> >> You first need to determine how you want to store the data within a column
> >> and/or apply a type constraint to a column.
> >>
> >> Even if you use JSON records to store your data within a column, does an
> >> equality comparator exist? If not, you would have to write one.
> >> (I kinda think that one may already exist...)
> >>
> >>
> >> On Jun 27, 2013, at 12:59 PM, Kristoffer Sjögren <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hi
> >>>
> >>> Working with the standard filtering mechanism to scan rows that have
> >>> columns matching certain criteria.
> >>>
> >>> There are columns of numeric (integer and decimal) and string types. These
Michael Segel 2013-06-27, 22:58
Kristoffer Sjögren 2013-06-27, 23:39
James Taylor 2013-06-28, 01:55
Kristoffer Sjögren 2013-06-28, 09:24
Otis Gospodnetic 2013-06-28, 18:34
Kristoffer Sjögren 2013-06-28, 18:53
Otis Gospodnetic 2013-06-28, 18:58
Asaf Mesika 2013-06-28, 21:30
Michel Segel 2013-06-28, 23:45
Kristoffer Sjögren 2013-06-29, 11:29
Michael Segel 2013-06-28, 12:45