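Just a warning: if you use the Text output format, you will have a hard time with "\n" inside your logs, in stack traces for example, because every line is read back as a separate record.
Also, a text file will be either uncompressed or non-splittable (a text file compressed as a whole, with gzip for example, cannot be split across map tasks).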
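One way around the "\n" problem is to escape line breaks before a record is written as text, so that each record stays on a single line, and to reverse the escaping when reading it back. The sketch below is only an illustration under that assumption: LogLineEscaper is a hypothetical helper, not part of Chukwa or Hadoop.

```java
// Hypothetical helper for illustration; not a Chukwa or Hadoop class.
final class LogLineEscaper {

    // Turn a multi-line record into a single line.
    // Backslashes are escaped first so that unescape() can reverse the mapping.
    static String escape(String record) {
        return record.replace("\\", "\\\\")
                     .replace("\r", "\\r")
                     .replace("\n", "\\n");
    }

    // Reverse escape(): restore the original line breaks and backslashes.
    static String unescape(String line) {
        StringBuilder out = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    default:  out.append(next); // covers the escaped backslash
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String trace = "java.io.IOException: boom\n\tat Foo.bar(Foo.java:42)";
        String oneLine = escape(trace);
        // The escaped form contains no real line break, so it is one text record.
        System.out.println(oneLine.indexOf('\n') < 0);
        // The round trip gives back the original stack trace.
        System.out.println(unescape(oneLine).equals(trace));
    }
}
```

This does nothing for the compression and splitting trade-off; it only keeps one record per line.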
On 11/19/10 9:30 AM, "Eric Yang" <[EMAIL PROTECTED]> wrote:
On 11/19/10 12:37 AM, "Ying Tang" <[EMAIL PROTECTED]> wrote:
Hi all,
1. I have installed a two-node Chukwa setup for testing, with one agent and one collector, and I also have an HDFS. Looking at the log the collector writes to HDFS, I found that the file name is
The time format is yyyyddHHmmssSSS; there is no month? And this is hard-coded.
I need the month, so do I have to change the code and recompile it?
2. Another question: in the log file in HDFS, the metadata shows up as garbled characters, while the log content from the agent is fine.
My adaptor uses UTF-8; how can I solve this?
1. Looks like a mistake in the temp filename. Please open a JIRA and we will fix it.
2. The data is recorded in sequence file format to make it easier to process with MapReduce. If you are expecting the log content as plain text, you will need to write a map/reduce job that sets its output format to the text output format and channels the log file types accordingly.
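On point 1, the effect of the missing month field can be reproduced with plain java.text.SimpleDateFormat. This is a sketch only: it assumes the file name timestamp is produced with a SimpleDateFormat-style pattern, and the pattern with MM added is an assumed correction, not the confirmed fix. TimestampPatternDemo is a hypothetical class name.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; not Chukwa code.
final class TimestampPatternDemo {

    // Format 09:30:00.000 UTC on the given date with the given pattern.
    static String format(String pattern, int year, int month, int day) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(utc);
        Calendar cal = new GregorianCalendar(utc);
        cal.clear();                         // zero all fields, including milliseconds
        cal.set(year, month, day, 9, 30, 0);
        return fmt.format(cal.getTime());
    }

    public static void main(String[] args) {
        String reported = "yyyyddHHmmssSSS";   // pattern from the question
        String assumed  = "yyyyMMddHHmmssSSS"; // assumed fix: month field added

        // Without MM, the same day of two different months collides.
        System.out.println(format(reported, 2010, Calendar.OCTOBER, 19));  // 201019093000000
        System.out.println(format(reported, 2010, Calendar.NOVEMBER, 19)); // 201019093000000

        // With MM the two timestamps are distinct and sort chronologically.
        System.out.println(format(assumed, 2010, Calendar.OCTOBER, 19));   // 20101019093000000
        System.out.println(format(assumed, 2010, Calendar.NOVEMBER, 19));  // 20101119093000000
    }
}
```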