Pig >> mail # user >> Seeing DataByteArray values for chararray field in 0.8.0


Re: Seeing DataByteArray values for chararray field in 0.8.0

On 1/19/11 3:18 AM, "Kaluskar, Sanjay" <[EMAIL PROTECTED]> wrote:

> I have a script as follows:
>
>
>
> register lookup.jar;
>
> a = load 'lookupfile.dat' as(emp_id: chararray);
>
> b = foreach a generate flatten(com.mycompany.pig.lookup());

The UDF in the statement above takes no argument. I assume you meant:
"b = foreach a generate flatten(com.mycompany.pig.lookup(emp_id));"

> My UDF works as expected in versions 0.5.0, 0.6.0 and 0.7.0. In version
> 0.8.0, I notice that the input tuple "input" has 1 field with value of
> type DataByteArray, whereas in earlier versions the value is of type
> String (as expected). Why is this different? I am assuming this is an
> intentional change in 0.8.0. Is there some way to force conversion from
> the raw data before the UDF is invoked, i.e., the old behaviour? What is
> the recommended approach in 0.8.0 for EvalFunc UDFs?

The tuple should contain a field of type CHARARRAY in 0.8 as well. I looked at
the explain plan of a similar query and it appeared to be correct.
Could you please open a JIRA and attach a simplified form of your UDF that
reproduces this problem?
Thanks,
Thejas
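
Until the cause is found, a UDF could defensively accept either representation of the field. The following is only a sketch of that idea, not a fix for the behaviour reported above and not the poster's actual UDF. The class FieldCoercion and its method asString are hypothetical names. To keep the sketch compilable without the Pig jar, a raw byte[] stands in for the bytes that a DataByteArray wraps; inside a real EvalFunc one would presumably unwrap the value with DataByteArray.get() before calling it. UTF-8 encoding of the input file is also an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Pig: normalises a tuple field to a String
// whether it arrives as a String (chararray) or as raw bytes (bytearray).
final class FieldCoercion {
    private FieldCoercion() {}

    static String asString(Object field) {
        if (field == null) {
            return null; // Pig fields may be null; pass that through
        }
        if (field instanceof String) {
            return (String) field; // already the expected chararray form
        }
        if (field instanceof byte[]) {
            // Assumption: the underlying data is UTF-8 encoded text
            return new String((byte[]) field, StandardCharsets.UTF_8);
        }
        return field.toString(); // fallback for any other field type
    }

    public static void main(String[] args) {
        // Both calls print the same employee id
        System.out.println(asString("E1001"));
        System.out.println(asString("E1001".getBytes(StandardCharsets.UTF_8)));
    }
}
```

A check like this hides the type difference rather than explaining it, so a simplified reproduction attached to a JIRA issue is still the useful next step.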