Pig >> mail # user >> Best way to work on Cassandra Data?


Re: Best way to work on Cassandra Data?

On 9/26/10 8:46 AM, "Christian Decker" <[EMAIL PROTECTED]> wrote:

> It's been a while since I started using Cassandra in combination with
> Pig, but I still haven't figured out the best way to work with the data. I
> wrote some index readers based on the format that the contributed
> CassandraStorage introduced (a tuple of the key and a databag which in turn
> holds key-value tuples for the columns), but with this method my Pig
> scripts are riddled with $-signs:
>
>
>> C1 = order rows by $1.$4.$1;
>
>
> which break as soon as I add more columns to my CF. I
> was working on a converter that takes the above-mentioned format and
> converts it to a Map<String, Object>, but to no avail, since when trying to
> work on it I run into this:
>
> java.lang.ClassCastException: java.util.HashMap cannot be cast to
>> org.apache.pig.data.Tuple
>
>
> as soon as I run this:
>
> crows = FOREACH rows GENERATE CassandraConvert($0,$1)
>
> scrows = FOREACH crows GENERATE $0.uid;

Does CassandraConvert return Map<String, Object>? In that case your Pig
statement should be -

 scrows = FOREACH crows GENERATE $0#'uid';
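To illustrate the distinction: in Pig Latin, `$n` and `.` dereference positional fields of a tuple, while `#` looks up a key in a map. A minimal sketch of the full flow, assuming CassandraConvert returns a Map<String, Object> keyed by column name (the keyspace/column family names here are hypothetical placeholders):

```pig
-- Load rows as (key, {(colname, value), ...}) pairs, as CassandraStorage emits them
rows = LOAD 'cassandra://Keyspace/ColumnFamily' USING CassandraStorage();

-- Convert the (key, columns-bag) pair into a map keyed by column name
crows = FOREACH rows GENERATE CassandraConvert($0, $1) AS cols;

-- Map lookup uses #, not the tuple dereference operators $n or .
scrows = FOREACH crows GENERATE cols#'uid';
```

Because map lookup is by name rather than by position, adding columns to the column family no longer shifts the indexes the script depends on.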

-Thejas