John 2013-09-13, 14:31
Pradeep Gollakota 2013-09-13, 16:25
Re: Sort Order in HBase with Pig/Piglatin in Java
John 2013-09-13, 16:29
Hi, thanks for your quick answer! I figured it out by myself, since the
mailing server seems to have been down for the last two hours. By the way,
I went with option 1, but I used a LinkedHashMap instead. Do you know which
is the better choice, TreeMap or LinkedHashMap?
Anyway, thanks :)
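To make the TreeMap vs. LinkedHashMap question concrete, here is a minimal sketch of how the two behave. It is not HBaseStorage code; the class and method names (MapOrderDemo, iterationOrder) are made up for illustration. A LinkedHashMap only replays the insertion order, so it stays sorted only as long as the columns are inserted in sorted order. A TreeMap re-sorts by key on every insert, at O(log n) per put instead of O(1).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of Pig or HBase.
class MapOrderDemo {
    // Puts the keys into the map in the given order and returns
    // the order in which the map iterates over them afterwards.
    static List<String> iterationOrder(Map<String, String> map, String... keys) {
        for (String k : keys) {
            map.put(k, "val");
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is kept as-is: sorted in, sorted out.
        System.out.println(iterationOrder(new LinkedHashMap<String, String>(), "a", "c", "d"));
        // Unsorted in, unsorted out: LinkedHashMap does not sort.
        System.out.println(iterationOrder(new LinkedHashMap<String, String>(), "c", "d", "a"));
        // TreeMap sorts by key no matter how the keys arrive.
        System.out.println(iterationOrder(new TreeMap<String, String>(), "c", "d", "a"));
    }
}
```

So if the columns reach the map already in HBase's order, LinkedHashMap is the cheaper choice; TreeMap is the safer one if that is not guaranteed. One caveat: a TreeMap with String keys sorts by String.compareTo, while HBase sorts column qualifiers byte-wise. For plain ASCII qualifiers the two orders agree, but they may differ for other characters.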
2013/9/13 Pradeep Gollakota <[EMAIL PROTECTED]>
> That's a great observation, John! The problem is that HBaseStorage maps
> column families into a HashMap, so the sort ordering is completely lost.
> You have two options:
> 1. Modify HBaseStorage to use a SortedMap data structure (e.g. TreeMap) and
> use the modified HBaseStorage (or make it configurable).
> 2. Since you convert the map to a bag, you can sort the bag in a nested
> foreach statement.
> I prefer option 1 myself because it would be more performant than option 2.
> On Fri, Sep 13, 2013 at 7:31 AM, John <[EMAIL PROTECTED]> wrote:
> > I have created an HBase table in the hbase shell and added some data. In
> > http://hbase.apache.org/book/dm.sort.html it is written that the data is
> > first sorted by the row key and then by the column. So I tried something in
> > HBase Shell: http://pastebin.com/gLVAX0rJ
> > Everything looks fine. I got the right order a -> c -> d, as expected.
> > Now I tried the same with Apache Pig in Java:
> > I got this result:
> > (key1,[c#val,d#val,a#val])
> > So now the order is c -> d -> a. That seems a little odd to me; shouldn't
> > it be the same as in HBase? It's important for me to get the right order
> > because I transform the map afterwards into a bag and then join it with
> > other tables. If both inputs are sorted I could use a merge join without
> > sorting these two datasets. So does anyone know how it is possible to get
> > the sorted map (or bag) of the columns?
> > thanks
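To illustrate why the column order matters for a merge join, here is a rough sketch of the general sort-merge technique on two key lists. It is not Pig's implementation; the class and method names (MergeJoinSketch, mergeJoin) are hypothetical, and it assumes each list has unique keys, as column qualifiers within one row and column family do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a sort-merge join on keys; not Pig code.
class MergeJoinSketch {
    // Returns the keys present in both lists. Only correct when both
    // inputs are sorted ascending and contain no duplicate keys.
    static List<String> mergeJoin(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).compareTo(right.get(j));
            if (cmp == 0) {
                // Same key on both sides: emit it and advance both cursors.
                out.add(left.get(i));
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left key is smaller, so it cannot match anything later on the right
            } else {
                j++; // right key is smaller, same reasoning
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("a", "c", "d");
        List<String> asReturnedByPig = Arrays.asList("c", "d", "a");
        // Both inputs sorted: all three keys match.
        System.out.println(mergeJoin(sorted, sorted));
        // One input in c -> d -> a order: the match on "a" is silently lost.
        System.out.println(mergeJoin(asReturnedByPig, sorted));
    }
}
```

The single forward pass is what makes the join cheap, and it is also why an input in c -> d -> a order drops matches instead of failing loudly.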
Shahab Yunus 2013-09-13, 16:45
John 2013-09-13, 16:50
Shahab Yunus 2013-09-13, 16:55
Pradeep Gollakota 2013-09-13, 16:44