Pig >> mail # user >> Implement Binary Search in PIG
Re: Implement Binary Search in PIG
Bags can be very large and might not fit into memory; in such cases some
or all of the bag might have to be stored on disk. Random access on a bag
that has been stored on disk is not efficient, which is why the DataBag
interface does not support it.

As Prashant suggested, storing the items in a tuple would be a good
alternative if you want random access for a binary search, since a
tuple's fields can be read by index.
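
To make the tuple suggestion concrete, here is a minimal sketch of a binary search over sorted, index-addressable fields. It is written against a plain java.util.List so that it runs without Pig on the classpath; the class name SortedFieldSearch and the method indexOf are made up for illustration. Inside a real UDF you would presumably read the fields with Tuple.get(int) and Tuple.size() (or pass Tuple.getAll()) instead, and handle the ExecException that get can throw.

```java
import java.util.Arrays;
import java.util.List;

public class SortedFieldSearch {

    // Hypothetical helper: returns the index of target in sortedFields,
    // or -1 if it is absent. Assumes the list is sorted in ascending order
    // and supports cheap access by index (as a tuple's fields do).
    public static <T extends Comparable<T>> int indexOf(List<T> sortedFields, T target) {
        int lo = 0;
        int hi = sortedFields.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;  // midpoint without int overflow
            int cmp = sortedFields.get(mid).compareTo(target);
            if (cmp < 0) {
                lo = mid + 1;           // target can only be in the upper half
            } else if (cmp > 0) {
                hi = mid - 1;           // target can only be in the lower half
            } else {
                return mid;             // found
            }
        }
        return -1;                      // not found
    }

    public static void main(String[] args) {
        // Stand-in for the fields of a tuple holding the sorted items.
        List<Integer> fields = Arrays.asList(2, 5, 8, 13, 21);
        System.out.println(indexOf(fields, 13));
        System.out.println(indexOf(fields, 7));
    }
}
```

Each lookup touches O(log n) fields, so nothing has to be copied into an ArrayList first. The tuple itself still has to fit in memory, which is the trade-off against a bag.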

-Thejas
On 12/12/11 7:54 PM, 唐亮 wrote:
> Hi all,
> How can I implement a binary search in pig?
>
> In one relation, there is a bag whose items are sorted.
> I want to check whether a specific item exists in the bag.
>
> In a UDF, I can't randomly access items in a DataBag container.
> So I have to copy the items from the DataBag into an ArrayList, and this is
> time consuming.
>
> How can I implement the binary search efficiently in pig?
>