Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
HBase >> mail # user >> Problem in filters


+
Omkar Joshi 2013-04-17, 11:51
+
Ted Yu 2013-04-17, 13:16
+
Omkar Joshi 2013-04-17, 13:27
+
Jean-Marc Spaggiari 2013-04-17, 13:32
Copy link to this message
-
Re: Problem in filters
Omkar,

Have you considered using Phoenix (https://github.com/forcedotcom/phoenix), a SQL skin over HBase to execute your SQL directly? That'll save you from learning all the nuances of HBase filters and give you as good or better performance.

Once you've downloaded and installed Phoenix, here's what you'd need to do:

// One time DDL statement
Connection conn = DriverManager.getConnection("jdbc:phoenix:your-zookeeper-quorum-host");
conn.createStatement().execute("CREATE VIEW ORDERS(\n" +
    // Not sure what the PK is, so I added this column
    "ORDER_DETAILS.ORDER_DETAILS_ID VARCHAR NOT NULL PRIMARY KEY,\n" +
    // If you have fixed length IDs, then use CHAR(xxx)
    "ORDER_DETAILS.CUSTOMER_ID VARCHAR,\n" +
    "ORDER_DETAILS.PRODUCT_ID VARCHAR,\n" +
    "ORDER_DETAILS.REQUEST_DATE DATE,\n" +
    "ORDER_DETAILS.PRODUCT_QUANTITY INTEGER,\n" +
    "ORDER_DETAILS.PRICE DECIMAL(10,2),\n" +
     // not sure on the type here, but this might map to an Enum
    "ORDER_DETAILS.PAYMENT_MODE CHAR(1)\n" +
    ")");

// Running the query:
Connection conn = DriverManager.getConnection("jdbc:phoenix:your-zookeeper-quorum-host");
PreparedStatement stmt = conn.prepareStatement("SELECT ORDER_ID,CUSTOMER_ID,PRODUCT_ID,QUANTITY\n" +
    "FROM ORDERS WHERE QUANTITY >= ? and PRODUCT_ID=?");
stmt.setInt(1,16);
stmt.setString(2,"P60337998");
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
    System.out.println("ORDER_ID=" + rs.getString("ORDER_ID") + ",CUSTOMER_ID=" + rs.getString("CUSTOMER_ID")+
        ",PRODUCT_ID=" + rs.getString("PRODUCT_ID") + ",QUANTITY=" + rs.getInt("QUANTITY"));
}

There are different trade-offs for the make up of the columns in your PK, depending on your access patterns. Getting this right could prevent full table scans and make your query execute much faster. Also, there are performance trade-offs for using a VIEW versus a TABLE.

Ian
On Apr 17, 2013, at 8:32 AM, Jean-Marc Spaggiari wrote:

Hi Omkar,

Using the shell, can you scan the few first lines from your table to make
sure it's store with the expected format? Don't forget the limit the number
of rows retrieved.

JM
2013/4/17 Omkar Joshi <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>>

Hi Ted,

I tried using only productIdFilter without FilterList but still no output.

public void executeOrdersQuery() {
               /*
                * SELECT ORDER_ID,CUSTOMER_ID,PRODUCT_ID,QUANTITY FROM
ORDERS WHERE
                * QUANTITY >=16 and PRODUCT_ID='P60337998'
                */
               String tableName = "ORDERS";

               String family = "ORDER_DETAILS";
               int quantity = 16;
               String productId = "P60337998";

               SingleColumnValueFilter quantityFilter = new
SingleColumnValueFilter(
                               Bytes.toBytes(family),
Bytes.toBytes("PRODUCT_QUANTITY"),
                               CompareFilter.CompareOp.GREATER_OR_EQUAL,
                               Bytes.toBytes(quantity));

               SingleColumnValueFilter productIdFilter = new
SingleColumnValueFilter(
                               Bytes.toBytes(family),
Bytes.toBytes("PRODUCT_ID"),
                               CompareFilter.CompareOp.EQUAL,
Bytes.toBytes(productId));

               FilterList filterList = new FilterList(
                               FilterList.Operator.MUST_PASS_ALL);
               // filterList.addFilter(quantityFilter);
               filterList.addFilter(productIdFilter);

               Scan scan = new Scan();
               scan.addColumn(Bytes.toBytes(family),
Bytes.toBytes("ORDER_ID"));
               scan.addColumn(Bytes.toBytes(family),
Bytes.toBytes("CUSTOMER_ID"));
               scan.addColumn(Bytes.toBytes(family),
Bytes.toBytes("PRODUCT_ID"));
               scan.addColumn(Bytes.toBytes(family),
Bytes.toBytes("QUANTITY"));

               // scan.setFilter(filterList);
               scan.setFilter(productIdFilter);

               HTableInterface tbl hTablePool.getTable(Bytes.toBytes(tableName));
               ResultScanner scanResults = null;
               try {
                       scanResults = tbl.getScanner(scan);

                       System.out.println("scanResults : ");

                       for (Result result : scanResults) {
                               System.out.println("The result is " +
result);
                       }

               } catch (IOException e) {
                       // TODO Auto-generated catch block
                       e.printStackTrace();
               } finally {
                       try {
                               tbl.close();
                       } catch (IOException e) {
                               // TODO Auto-generated catch block
                               e.printStackTrace();
                       }
               }

       }

Regards,
Omkar Joshi
From: Ted Yu [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 17, 2013 6:46 PM
To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>
Subject: Re: Problem in filters

If you specify producIdFilter without using FilterList, what would you get
?

Thanks

On Apr 17, 2013, at 4:51 AM, Omkar Joshi <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>>
wrote:

Hi,

I'm having the a table named ORDERS with 1000851 rows:

rowkey :                       ORDER_ID

column family : ORDER_DETAILS
          columns : CUSTOMER_ID
                                  PRODUCT_ID
                                  REQUEST_DATE
                                  PRODUCT_QUANTITY
                                  PRICE
                                  PAYMENT_MODE

I'm using the following code to access the data :

public void executeOrdersQuery() {
          /*
          * SELECT ORDER_ID,CUSTOMER_ID,PRODUCT_ID,QUANTITY FROM ORDERS
WHERE
          * QUANTITY >=16 and PRODUCT_ID='P60337998'
          */
          String tableName = "ORDERS";

          String
+
Omkar Joshi 2013-04-19, 09:59
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB