Pig >> mail # user

How can I get the column value? Need help! Cassandra 1.2.8 and Pig 0.11.1
Hi all,
I'm testing the new CqlStorage() with Cassandra 1.2.8 and Pig 0.11.1.
I am using the sample data from this test:
http://frommyworkshop.blogspot.com.es/2013/07/hadoop-map-reduce-with-cassandra.html

I can load and dump the data correctly with this script:

rows = LOAD 'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30' USING CqlStorage();

dump rows;
describe rows;

Results:

((id,6),(age,30),(title,QA))

((id,5),(age,30),(title,QA))

rows: {id: chararray,age: int,title: chararray}
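To make the mismatch concrete: the dump shows every field as a (name, value) pair, while describe reports three plain fields. Below is a minimal Python sketch (not Pig, and it does not talk to Cassandra) that models the dumped rows and shows what "getting the column value" means here. The names dumped_rows, column_values and column_names are hypothetical and exist only for this illustration.

```python
# Rows modelled on the DUMP output above: each field is a (name, value) pair.
# id is a string because the schema reports it as chararray.
dumped_rows = [
    (("id", "6"), ("age", 30), ("title", "QA")),
    (("id", "5"), ("age", 30), ("title", "QA")),
]

def column_values(row):
    """Return only the value half of each (name, value) pair."""
    return tuple(value for _name, value in row)

def column_names(row):
    """Return only the name half of each (name, value) pair."""
    return tuple(name for name, _value in row)

values = [column_values(r) for r in dumped_rows]
names = [column_names(r) for r in dumped_rows]
print(values)
print(names)
```

The values list is what I am trying to obtain in Pig; the names list matches the field names that describe prints.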
But I cannot get the column values.

I tried to define another schema in the LOAD statement, as I used to do with
CassandraStorage():

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-and-Pig-how-to-get-column-values-td5641158.html
Example:

rows = LOAD 'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30' USING CqlStorage() AS (columns: bag {T: tuple(name, value)});

and I get this error:

2013-08-22 12:24:45,426 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1031: Incompatable schema: left is "columns:bag{T:tuple(name:bytearray,value:bytearray)}", right is "id:chararray,age:int,title:chararray"

I tried to use the FLATTEN, SUBSTRING and SPLIT UDFs, but I have not gotten a
good result.

Example:
   - When I flatten, I get a set of tuples like:

(title,QA)

(title,QA)

2013-08-22 12:42:20,673 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1

A: {title: chararray}

but I cannot get the value QA.

SUBSTRING only works on the name "title", not on the value.

Example:

B = FOREACH A GENERATE SUBSTRING(title,2,5);

dump B;
describe B;

Results:

(tle)
(tle)
B: {chararray}

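The output (tle) is exactly what a substring from index 2 up to index 5 of the text "title" gives, which suggests that SUBSTRING received the column name rather than the value QA. A small Python sketch of that arithmetic follows. The helper pig_substring is hypothetical and only assumes that SUBSTRING takes an inclusive start index and an exclusive stop index; it says nothing about how Pig treats out-of-range indexes.

```python
def pig_substring(s, start, stop):
    # Assumption: start index is inclusive, stop index is exclusive,
    # which Python slicing also follows.
    return s[start:stop]

# One flattened field as shown in the dump: a (name, value) pair.
name, value = ("title", "QA")

from_name = pig_substring(name, 2, 5)
print(from_name)
```

Applied to the name, the slice yields the same three characters that dump B printed; the value QA is only two characters long, so it could not have produced that output.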
I also tried the same things as Eric Lee in the other mail, and got the same results:

 Anyways, what I really want is the column value, not the name. Is there a
way to do that? I listed all of the failed attempts I made below.

   - colnames = FOREACH cols GENERATE $1 and was told $1 was out of bounds.
   - casted = FOREACH cols GENERATE (tuple(chararray, chararray))$0; but
   all I got back were empty tuples
   - values = FOREACH cols GENERATE $0.$1; but I got an error telling me
   data byte array can't be casted to tuple

I would appreciate any help.
Regards