Pig, mail # user - Best way to work on Cassandra Data?


Re: Best way to work on Cassandra Data?
Thejas M Nair 2010-09-27, 13:38

On 9/26/10 8:46 AM, "Christian Decker" <[EMAIL PROTECTED]> wrote:

> It's been a while since I started using Cassandra in combination with
> Pig, but I still haven't figured out the best way to work with the data. I
> wrote some Index Readers based on the format that the contributed
> CassandraStorage introduced (a tuple of the key and a databag which in turn
> holds key-value tuples for the columns), but by using this method my Pig
> scripts are riddled with $-signs:
>
>
>> C1 = order rows by $1.$4.$1;
>
>
> which continually break down as soon as I add more columns to my CF. I
> was working on a converter that takes the above-mentioned format and
> converts it to a Map<String, Object>, but to no avail, since when trying to
> work on it I bump my head against this:
>
> java.lang.ClassCastException: java.util.HashMap cannot be cast to
>> org.apache.pig.data.Tuple
>
>
>  as soon as I do this:
>
> crows = FOREACH rows GENERATE CassandraConvert($0,$1);
>
> scrows = FOREACH crows GENERATE $0.uid;

Does CassandraConvert return Map<String, Object>? In that case your Pig
statement should be:

 scrows = FOREACH crows GENERATE $0#'uid';
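
For example, assuming CassandraConvert does return a map and the rows come
from the contributed CassandraStorage loader, the end-to-end script might look
roughly like this (the keyspace, column family, and alias names are only
placeholders, not taken from your setup):

 -- rough sketch: the loader URI and the CassandraConvert UDF are placeholders
 rows   = LOAD 'cassandra://Keyspace1/MyCF' USING CassandraStorage()
          AS (key, columns);
 -- turn the (key, bag of column tuples) layout into a map keyed by column name
 crows  = FOREACH rows GENERATE CassandraConvert($0, $1) AS cols:map[];
 -- '#' looks a value up by key in a map; '.' only dereferences tuples and bags
 scrows = FOREACH crows GENERATE cols#'uid' AS uid;

With the map type declared on cols, adding more columns to the column family
no longer shifts any positional ($n) references.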

-Thejas