Pig, mail # user - how can i get the column value? Need help!.. cassandra 1.28 and pig 0.11.1


Re: how can i get the column value? Need help!.. cassandra 1.28 and pig 0.11.1
Miguel Angel Martin junqu... 2013-08-28, 06:42
Hi all,

I still cannot resolve this issue.

Has anybody else run into it, or tried to test this simple example?
I am stumped; I cannot find a working solution.

I would appreciate any comments or help.

Regards
2013/8/22 Miguel Angel Martin junquera <[EMAIL PROTECTED]>

> hi all:
>
>
>
>
> I'm testing the new CqlStorage() with Cassandra 1.28 and Pig 0.11.1.
>
>
> I am using the sample test data from:
>
>
> http://frommyworkshop.blogspot.com.es/2013/07/hadoop-map-reduce-with-cassandra.html
>
> I can load and dump the data correctly with this script:
>
>     rows = LOAD
>       'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30'
>       USING CqlStorage();
>
>     dump rows;
>     describe rows;
>
> Results:
>
>     ((id,6),(age,30),(title,QA))
>     ((id,5),(age,30),(title,QA))
>
>     rows: {id: chararray,age: int,title: chararray}
>
>
>
> But I cannot get the column values.
>
> I tried to define a different schema in the LOAD statement, as I did with
> CassandraStorage():
>
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-and-Pig-how-to-get-column-values-td5641158.html
>
>
> Example:
>
>     rows = LOAD
>       'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30'
>       USING CqlStorage() AS (columns: bag {T: tuple(name, value)});
>
>
> and I get this error:
>
>     2013-08-22 12:24:45,426 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>     ERROR 1031: Incompatable schema: left is
>     "columns:bag{T:tuple(name:bytearray,value:bytearray)}", right is
>     "id:chararray,age:int,title:chararray"
>
>
>
>
> I tried the FLATTEN, SUBSTRING and SPLIT UDFs, but I have not had good
> results.
>
> Example:
>
>
>    - When I flatten, I get a set of tuples like:
>
>     (title,QA)
>     (title,QA)
>
>     2013-08-22 12:42:20,673 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
>
>     A: {title: chararray}
>
>
>
> but I cannot get the value QA.
>
> SUBSTRING only operates on the column name (title), not on the value.
>
>
>
> Example:
>
>     B = FOREACH A GENERATE SUBSTRING(title,2,5);
>
>     dump B;
>     describe B;
>
> Results:
>
>     (tle)
>     (tle)
>     B: {chararray}
>
>
>
>
> I also tried what Eric Lee describes in the other mail, and I get the
> same results:
>
>
>  Anyways, what I really want is the column value, not the name. Is there a
> way to do that? I listed all of the failed attempts I made below.
>
>    - colnames = FOREACH cols GENERATE $1 and was told $1 was out of
>    bounds.
>    - casted = FOREACH cols GENERATE (tuple(chararray, chararray))$0; but
>    all I got back were empty tuples
>    - values = FOREACH cols GENERATE $0.$1; but I got an error telling me
>    data byte array can't be cast to tuple
>
>
> I would appreciate any help.
>
>
> Regards
>
>
>
>
>
>
>
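
The dump output and the SUBSTRING result in the quoted message suggest that each field actually holds a (name, value) pair, even though describe reports a flat schema. What follows is only a small Python model of that data shape, written to make the mismatch concrete. It is not Pig code and not a fix for CqlStorage. The names rows, column_values and column_value are made up for illustration, and the id values are assumed to be strings because the reported schema says chararray.

```python
# Rows as printed by `dump rows;` in the quoted message:
# each row is a tuple of (column_name, column_value) pairs.
rows = [
    (("id", "6"), ("age", 30), ("title", "QA")),
    (("id", "5"), ("age", 30), ("title", "QA")),
]


def column_values(row):
    """Return only the values, in column order, dropping the names."""
    return tuple(value for _name, value in row)


def column_value(row, name):
    """Return the value stored under one column name."""
    return dict(row)[name]


values = [column_values(row) for row in rows]
titles = [column_value(row, "title") for row in rows]

# Characters 2..4 of the column *name* give "tle", the output quoted above.
# The same slice of the column *value* "QA" is empty in this Python model.
name_slice = "title"[2:5]
value_slice = "QA"[2:5]
```

In this model the (tle) output is what you get when the string being sliced is the column name "title" rather than the value "QA", which is consistent with the flattened field carrying the name first.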
--

Miguel Angel Martín Junquera
Analyst Engineer.
[EMAIL PROTECTED]
Tel. / Fax: (+34) 91 485 56 66
*http://www.brainsins.com*
Smart eCommerce
*Madrid*: http://goo.gl/4B5kv
*London*: http://goo.gl/uIXdv
*Barcelona*: http://goo.gl/NZslW
