Pig >> mail # user >> Python UDF got problems converting Strings to Integers


Python UDF got problems converting Strings to Integers
Hi all,

I have a UDF that sums up histograms given as tuples. The function I
wrote looks like this:

@outputSchema("res_histo:tuple()")
def aggHisto(aHistogramSet):
    if aHistogramSet is None: return None;
    hist_len = len(aHistogramSet[0])
    result=[0]*hist_len

    for aHistogram in aHistogramSet:
        for i in range(0,hist_len):
            value = int(''.join(map(str,aHistogram[i])));
            result[i] = result[i] + (value)
    return tuple(result)

So for the input {(1,23,45),(0,0,0)} I should get the output (1,23,45),
but instead I get: (49,5051,52,5353)
I played around with this for some time and found that the program does
the following:
The line "value = int(''.join(map(str,aHistogram[i])));" does not
convert the "23" to 23. Instead, it takes every single digit, starting
with the most significant one, adds 48 to it (2+48=50 and 3+48=51), and
concatenates the results, giving 5051.

Why does this happen? Can anybody help me here?

Best regards,
Elmar
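
A possible explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: if no schema is declared for the histogram fields, each field may reach the UDF as raw bytes instead of a string or an int. Iterating over bytes yields the ASCII code of each character ('2' is 50, '3' is 51), which matches the "digit + 48" pattern. The names `to_int` and `agg_histo` below are hypothetical, the `bytearray` inputs only stand in for whatever byte type Pig actually passes, and the `@outputSchema` decorator is left out so the snippet runs outside Pig.

```python
def to_int(raw):
    """Convert one histogram field to int.

    Assumption: a field is either already an int/str, or a sequence of
    ASCII byte values (e.g. a bytearray) as described in the lead-in.
    """
    if isinstance(raw, (int, str)):
        return int(raw)
    # Turn each byte value back into its character before parsing.
    return int("".join(chr(b) for b in raw))


def agg_histo(histograms):
    """Sum a collection of equally long histograms element by element."""
    if not histograms:
        return None
    hist_len = len(histograms[0])
    result = [0] * hist_len
    for histogram in histograms:
        for i in range(hist_len):
            result[i] += to_int(histogram[i])
    return tuple(result)


# Reproducing the symptom: str() of each byte gives "50" and "51".
raw = bytearray(b"23")
buggy = int("".join(map(str, raw)))
fixed = to_int(raw)

# Stand-in for the bag {(1,23,45),(0,0,0)} with byte-valued fields.
bag = [
    (bytearray(b"1"), bytearray(b"23"), bytearray(b"45")),
    (bytearray(b"0"), bytearray(b"0"), bytearray(b"0")),
]
total = agg_histo(bag)
```

Here `buggy` comes out as 5051 while `fixed` is 23, and `total` is the expected (1, 23, 45). If the fields can be given an int type in the Pig script before the UDF is called, the conversion inside the UDF may not be needed at all.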
Cheolsoo Park 2012-10-31, 04:59
Björn-Elmar Macek 2012-10-31, 09:36
Björn-Elmar Macek 2012-10-31, 10:49