Hadoop, mail # user - What is the best way to output string values?


What is the best way to output string values?
Mark Kerzner 2011-03-10, 06:15
Hi,

In my MapReduce job I parse documents' metadata, so for each document I
know things like author=john, last_printed=3/3/10, and so on.

What is the best way to put all this into a text line of output in the
reducer? In other words, I want to imitate HBase, putting (qualifier, value)
pairs with the same rowkey - but in text output. I will probably need to
base64 encode them, because Hadoop does not like any characters other than
ASCII.

So is "Key=Value", "Key=Value" the best practice?
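
To make the idea concrete, here is a minimal sketch of one possible line format: the rowkey first, then tab-separated key=value pairs, with every piece base64-encoded. The class name MetadataLine and its methods are hypothetical helpers, not part of Hadoop, and the sketch assumes java.util.Base64 (Java 8 or later) is available. It uses the URL-safe alphabet without padding so that "=" never appears inside an encoded piece and can serve as the separator.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not a Hadoop class): one text line per rowkey.
final class MetadataLine {
    // URL-safe, unpadded base64 emits only A-Z, a-z, 0-9, '-' and '_',
    // so tab and '=' are free to act as separators.
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    private static String b64(String s) {
        return ENC.encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    private static String unb64(String s) {
        return new String(DEC.decode(s), StandardCharsets.UTF_8);
    }

    // Builds "rowkey<TAB>key=value<TAB>key=value...", every piece encoded.
    static String encode(String rowKey, Map<String, String> fields) {
        StringJoiner line = new StringJoiner("\t");
        line.add(b64(rowKey));
        for (Map.Entry<String, String> e : fields.entrySet()) {
            line.add(b64(e.getKey()) + "=" + b64(e.getValue()));
        }
        return line.toString();
    }

    // Recovers the rowkey (the first tab-separated piece).
    static String decodeRowKey(String line) {
        return unb64(line.split("\t")[0]);
    }

    // Recovers the (qualifier, value) pairs in their original order.
    static Map<String, String> decodeFields(String line) {
        String[] parts = line.split("\t");
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            fields.put(unb64(parts[i].substring(0, eq)),
                       unb64(parts[i].substring(eq + 1)));
        }
        return fields;
    }
}
```

In a reducer, the string returned by encode could be written out as the value of a text record; decodeFields would then be the first step of whatever job reads those lines back.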

What do I do with those values? Eventually I want to create a CSV file, with
all the metadata values in a table format - running this as the next
MapReduce job.
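
For the CSV step, a rough sketch of what I have in mind, written as plain Java rather than as a real MapReduce job: collect the union of all metadata keys as columns, then emit one row per document with empty cells for missing values. MetadataCsv, quote, and toCsv are hypothetical names, and the sketch assumes all records fit in memory, which a real second job could not assume (it would need to know the column set up front).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeSet;

// Hypothetical helper (not a Hadoop class): turns per-document metadata into CSV lines.
final class MetadataCsv {
    // Quotes a cell only when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes.
    static String quote(String v) {
        if (v.contains(",") || v.contains("\"") || v.contains("\n") || v.contains("\r")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    // rows maps rowkey -> (metadata key -> value). Returns a header line plus one line per row.
    static List<String> toCsv(Map<String, Map<String, String>> rows) {
        // Columns are the sorted union of every key seen in any row.
        TreeSet<String> columns = new TreeSet<>();
        for (Map<String, String> fields : rows.values()) {
            columns.addAll(fields.keySet());
        }

        List<String> out = new ArrayList<>();
        StringJoiner header = new StringJoiner(",");
        header.add("rowkey");
        for (String c : columns) {
            header.add(quote(c));
        }
        out.add(header.toString());

        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            StringJoiner line = new StringJoiner(",");
            line.add(quote(row.getKey()));
            for (String c : columns) {
                // A document without this metadata key gets an empty cell.
                line.add(quote(row.getValue().getOrDefault(c, "")));
            }
            out.add(line.toString());
        }
        return out;
    }
}
```

With two documents, one of which has no last_printed value, this would give a header row followed by two data rows, the second ending in an empty cell.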

Thank you,
Mark