HBase, mail # user - Cluster load


Re: Cluster load
Mohit Anchlia 2012-07-30, 19:37
On Mon, Jul 30, 2012 at 11:58 AM, Alex Baranau <[EMAIL PROTECTED]> wrote:

> Glad to hear that answers & suggestions helped you!
>
> The format you are seeing is the output of
> org.apache.hadoop.hbase.util.Bytes.toStringBinary(..) method [1]. As you
> can see below, for "printable characters" it outputs the character itself,
> while for "non-printable" characters it outputs data in format "\xNN" (e.g.
> "\x00").
>
> I.e. in your case "\x00\x00\x00:\x00\x01\x7F\xFF\xFE\xC7'\x05\x11\xBF" ->
> "\x00\x00\x00" + ":" + "\x00\x01\x7F\xFF\xFE\xC7" + "'" + "\x05\x11\xBF",
> which is 3+1+6+1+3=14 bytes.
>
> It's better to use the Bytes.toBytesBinary(String) method, which converts
> the string back to a byte array. Or, if you are using the ResultScanner API
> for fetching data, just invoke Result.getRow().length.
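
The counting rule described above can be sketched without any HBase dependency. This is only an illustration: the class `EscapedKeyLength` and its method `byteLength` are hypothetical names, not part of HBase, and the sketch assumes that every `\xNN` sequence in the printed key is an escape (a key holding a literal backslash followed by `x` and two hex digits would be miscounted).

```java
/** Counts the bytes behind a key printed in the "\xNN" escape format. */
final class EscapedKeyLength {

    /**
     * Hypothetical helper, not an HBase API. Returns how many bytes the
     * escaped string stands for: each "\xNN" escape is one byte, and each
     * character printed as itself is one byte.
     */
    static int byteLength(String escaped) {
        int count = 0;
        int i = 0;
        while (i < escaped.length()) {
            if (isHexEscape(escaped, i)) {
                i += 4; // skip '\', 'x' and the two hex digits
            } else {
                i += 1; // printable character shown as itself
            }
            count++;
        }
        return count;
    }

    /** True if the four characters starting at i look like "\xNN". */
    private static boolean isHexEscape(String s, int i) {
        return i + 3 < s.length()
                && s.charAt(i) == '\\'
                && s.charAt(i + 1) == 'x'
                && Character.digit(s.charAt(i + 2), 16) >= 0
                && Character.digit(s.charAt(i + 3), 16) >= 0;
    }

    public static void main(String[] args) {
        // The row key from the shell output, including the wrapped "\xBF".
        String key = "\\x00\\x00\\x00:\\x00\\x01\\x7F\\xFF\\xFE\\xC7'\\x05\\x11\\xBF";
        System.out.println(byteLength(key)); // 3 + 1 + 6 + 1 + 3 = 14
    }
}
```

With HBase on the classpath, the length of the array returned by Bytes.toBytesBinary(String) should give the same number for keys like this one.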
>
>
Thanks! Really appreciate your help.
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> Solr
>
> [1]
>
>   /**
>    * Write a printable representation of a byte array. Non-printable
>    * characters are hex escaped in the format \\x%02X, eg:
>    * \x00 \x05 etc
>    *
>    * @param b array to write out
>    * @param off offset to start at
>    * @param len length to write
>    * @return string output
>    */
>   public static String toStringBinary(final byte [] b, int off, int len) {
>     StringBuilder result = new StringBuilder();
>     try {
>       String first = new String(b, off, len, "ISO-8859-1");
>       for (int i = 0; i < first.length() ; ++i ) {
>         int ch = first.charAt(i) & 0xFF;
>         if ( (ch >= '0' && ch <= '9')
>             || (ch >= 'A' && ch <= 'Z')
>             || (ch >= 'a' && ch <= 'z')
>             || " `~!@#$%^&*()-_=+[]{}\\|;:'\",.<>/?".indexOf(ch) >= 0 ) {
>           result.append(first.charAt(i));
>         } else {
>           result.append(String.format("\\x%02X", ch));
>         }
>       }
>     } catch (UnsupportedEncodingException e) {
>       LOG.error("ISO-8859-1 not supported?", e);
>     }
>     return result.toString();
>   }
>
>
> On Mon, Jul 30, 2012 at 1:56 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > On Fri, Jul 27, 2012 at 6:03 PM, Alex Baranau <[EMAIL PROTECTED]> wrote:
> >
> > > Yeah, your row keys start with \x00 which is = (byte) 0. This is not the
> > > same as "0" (which is = (byte) 48). You know what to fix now ;)
> > >
> > >
> >
> > I made the required changes and it seems to be load balancing pretty well.
> > I do have a follow-up question about how to interpret the output of the
> > hbase shell. If I want to visually calculate the length of the row key, can
> > I assume that \x00\x00 is equal to 2 bytes? I am just trying to get my head
> > around the hex format displayed in the shell.
> >
> >  \x00\x00\x00:\x00\x01\x7F\xFF\xFE\xC7'\x05\x11  column=S_T_MTX:\x00\x00?\xB8, timestamp=1343670017892, value=1343670136312
> >  \xBF
> >
> >
> > > Alex Baranau
> > > ------
> > > Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> > > Solr
> > >
> > >
> > > On Fri, Jul 27, 2012 at 8:43 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > >
> > > > > On Fri, Jul 27, 2012 at 4:51 PM, Alex Baranau <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Can you scan your table and show one record?
> > > > >
> > > > > > I guess you might be confusing Bytes.toBytes("0") with byte[] {(byte) 0},
> > > > > > which I mentioned in the other thread. I.e. it looks like the first region
> > > > > > holds records whose keys start with any byte up to "0", which is (byte) 48.
> > > > > > Hence, if you set the first byte of your key to anything from (byte) 0 to
> > > > > > (byte) 9, all of them will fall into the first region, which holds records
> > > > > > with prefixes (byte) 0 - (byte) 48.
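
The point about (byte) 0 versus "0" can be checked with a small standalone sketch. It assumes row keys are ordered byte by byte with each byte treated as unsigned, which is how HBase's byte comparison is generally described; the class `RowKeyOrder` and its method `compareUnsigned` are hypothetical names used only for illustration.

```java
/** Shows where keys sort relative to a split key of "0". */
final class RowKeyOrder {

    /** Lexicographic comparison treating each byte as unsigned (0..255). */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // Equal common prefix: the shorter array sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // The character "0" encodes to the single byte 48, not 0.
        byte[] splitKey = "0".getBytes(java.nio.charset.StandardCharsets.ISO_8859_1);

        byte[] startsWithByte0 = {(byte) 0, 1, 2};
        byte[] startsWithByte9 = {(byte) 9, 1, 2};
        byte[] startsWithChar5 = "512".getBytes(java.nio.charset.StandardCharsets.ISO_8859_1);

        // Negative result: the key sorts before the split key "0".
        System.out.println(compareUnsigned(startsWithByte0, splitKey) < 0); // true
        System.out.println(compareUnsigned(startsWithByte9, splitKey) < 0); // true
        // Keys that start with the character '5' (byte 53) sort after it.
        System.out.println(compareUnsigned(startsWithChar5, splitKey) > 0); // true
    }
}
```

Under that ordering, every key whose first byte is in the range (byte) 0 to (byte) 9 sorts before a split key of "0", so they would all land in the first region.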
> > > > >
> > > > > Could you check that?
> > > > >
> > > > >
> > > > > I thought that if I give Bytes.toBytes("0") it really means that the row
> > > > > keys starting with "0" will go in that region. Here is my code that