Hadoop >> mail # user >> What is the best way to output string values?


What is the best way to output string values?
Hi,

In my MapReduce job I parse documents' metadata, so for each document I
know things like author=john, last_printed=3/3/10, and so on.

What is the best way to put all this into a single text line of output in
the reducer? In other words, I want to imitate HBase, storing (qualifier,
value) pairs under the same row key, but in text output. I will probably
need to Base64-encode the values, because Hadoop does not like any
characters other than ASCII.

So is "Key=Value", "Key=Value" the best practice?
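
One possible shape for such a line, as a minimal sketch and not an established Hadoop convention: the row key comes first, followed by tab-separated key=value fields whose values are Base64-encoded. The class name MetadataLine, the tab separator, and the choice of the URL-safe Base64 alphabet without padding (so that '=' never appears inside an encoded value) are all assumptions made for illustration. It also assumes that row keys and field names contain no tab, newline, or '=' characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: one text line per document, "rowKey<TAB>k1=v1<TAB>k2=v2".
final class MetadataLine {
    // URL-safe alphabet without padding, so encoded values never contain '='.
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    // Builds the line; values are Base64-encoded, keys are written as-is.
    static String encode(String rowKey, Map<String, String> fields) {
        StringJoiner line = new StringJoiner("\t");
        line.add(rowKey);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            byte[] raw = e.getValue().getBytes(StandardCharsets.UTF_8);
            line.add(e.getKey() + "=" + ENC.encodeToString(raw));
        }
        return line.toString();
    }

    // Parses a line produced by encode() back into its fields (row key skipped).
    static Map<String, String> decode(String line) {
        String[] parts = line.split("\t");
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            String key = parts[i].substring(0, eq);
            byte[] raw = DEC.decode(parts[i].substring(eq + 1));
            fields.put(key, new String(raw, StandardCharsets.UTF_8));
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("author", "john");
        fields.put("last_printed", "3/3/10");
        String line = encode("doc1", fields);
        System.out.println(line);
        System.out.println(decode(line));
    }
}
```

In a reducer, the string returned by encode() could be emitted as the output value (or split so the row key is the output key); decode() is what a follow-up job's mapper would call on each input line.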

What do I do with those values afterwards? Eventually I want to create a
CSV file with all the metadata values in table format, produced by the
next MapReduce job.
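
The table-building step can be sketched as follows, under stated assumptions: each document is already available as a map from field name to plain (decoded) value, the column set is the union of all field names, and a document lacking a field gets an empty cell. The class name MetadataCsv and its methods are hypothetical. The sketch runs in memory; in an actual MapReduce job the column set would have to be known or collected before the rows are written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper: turns per-document metadata maps into CSV lines.
final class MetadataCsv {
    // Quotes a field if it contains a comma, quote, or line break,
    // doubling any embedded quotes.
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Returns a header line followed by one line per document.
    // Columns are the sorted union of all field names.
    static List<String> toCsv(List<Map<String, String>> documents) {
        TreeSet<String> columns = new TreeSet<>();
        for (Map<String, String> doc : documents) {
            columns.addAll(doc.keySet());
        }
        List<String> lines = new ArrayList<>();
        List<String> header = new ArrayList<>();
        for (String column : columns) {
            header.add(quote(column));
        }
        lines.add(String.join(",", header));
        for (Map<String, String> doc : documents) {
            List<String> row = new ArrayList<>();
            for (String column : columns) {
                // Missing fields become empty cells.
                row.add(quote(doc.getOrDefault(column, "")));
            }
            lines.add(String.join(",", row));
        }
        return lines;
    }
}
```

Sorting the columns keeps the header stable from run to run, which matters when different documents carry different subsets of fields.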

Thank you,
Mark