Hadoop >> mail # user >> What is the best way to output string values?


What is the best way to output string values?
Hi,

In my MapReduce job I parse documents' metadata, so for each document I
know things like author=john, last_printed=3/3/10, and so on.

What is the best way to put all this into a text line of output in the
reducer? In other words, I want to imitate HBase, putting (qualifier, value)
pairs with the same rowkey - but in text output. I will probably need to
Base64-encode the values, because as far as I can tell Hadoop's text output
does not cope well with characters other than ASCII.

So is "Key=Value", "Key=Value" the best practice?
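
To make the question concrete, here is a minimal sketch of the kind of line I have in mind. It assumes tab-separated key=value pairs, keys that never contain a tab or '=', and values Base64-encoded from UTF-8 so that no tab or newline can appear inside a value. MetadataLine and its methods are hypothetical names of my own, not Hadoop classes; a reducer would presumably write the encoded string as its output value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Hadoop): turns metadata pairs into one text line and back.
final class MetadataLine {
    private MetadataLine() {}

    // Builds "key=<base64 value>" pairs joined by tabs.
    // Assumption: keys contain neither a tab nor '='.
    static String encode(Map<String, String> metadata) {
        StringJoiner line = new StringJoiner("\t");
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            byte[] raw = entry.getValue().getBytes(StandardCharsets.UTF_8);
            line.add(entry.getKey() + "=" + Base64.getEncoder().encodeToString(raw));
        }
        return line.toString();
    }

    // Reverses encode(). Splits each pair at the first '=' only,
    // because Base64 padding may add further '=' characters to the value.
    static Map<String, String> decode(String line) {
        Map<String, String> metadata = new LinkedHashMap<>();
        if (line.isEmpty()) {
            return metadata;
        }
        for (String pair : line.split("\t")) {
            int eq = pair.indexOf('=');
            String key = pair.substring(0, eq);
            byte[] raw = Base64.getDecoder().decode(pair.substring(eq + 1));
            metadata.put(key, new String(raw, StandardCharsets.UTF_8));
        }
        return metadata;
    }
}
```

With this layout the keys stay readable in the output file while the values survive any character they happen to contain; the cost is that the values are no longer human-readable without decoding.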

What do I do with those values? Eventually I want to create a CSV file, with
all the metadata values in a table format - running this as the next
MapReduce job.
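
For the CSV step, this is roughly what I picture, as a sketch only. It assumes the set of columns is fixed and known up front, that each document's metadata is already available as a map of decoded strings, and that a missing key becomes an empty cell. CsvRow is a hypothetical name of my own; the quoting follows the usual CSV convention of wrapping a field in double quotes and doubling any embedded quotes.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Hadoop): formats one document's metadata as a CSV row.
final class CsvRow {
    private CsvRow() {}

    // Quotes a field only when it contains a comma, a double quote, or a line break.
    static String quote(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Emits the values in the order given by columns; missing keys become empty cells.
    static String format(List<String> columns, Map<String, String> metadata) {
        StringJoiner row = new StringJoiner(",");
        for (String column : columns) {
            row.add(quote(metadata.getOrDefault(column, "")));
        }
        return row.toString();
    }
}
```

Fixing the column order up front is what makes the rows line up as a table; if the set of keys is not known in advance, an earlier pass would have to collect it first.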

Thank you,
Mark