-
Reg: parsing all files & file append
Manoj Babu 2012-09-09, 15:58
Hi All,
I have two questions; any information on them would be helpful.
1. I am using Hadoop to analyze logs and find the top-N search term metrics. Log files arrive daily, and whenever a new log file is added to HDFS we rerun the job over all the log files, old and new, to get the latest metrics. Is there any way to avoid reparsing everything each day?
2. Is file append production stable?
Cheers! Manoj.
-
Re: Reg: parsing all files & file append
Bejoy KS 2012-09-09, 16:49
Hi Manoj
You can load each day's logs into a separate directory in HDFS and process only that directory each day. Keep the results in HDFS, HBase, a database, etc. Each day, run the job on the new logs and aggregate its output with the results accumulated to date. Regards Bejoy KS
Sent from handheld, please excuse typos.
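The incremental approach described above can be sketched in plain Python. This is only an illustration of the aggregation logic, not a Hadoop job: the function names (count_terms, merge_counts, top_n), the sample data, and the assumption of one search term per log line are all hypothetical. In a real setup the daily counting would be the MapReduce job over the new day's directory, and the running totals would live in HDFS, HBase, or a database.

```python
from collections import Counter


def count_terms(log_lines):
    """Count search terms in one day's log lines.

    Assumption for this sketch: each non-empty line holds exactly one term.
    """
    return Counter(line.strip().lower() for line in log_lines if line.strip())


def merge_counts(running_totals, daily_counts):
    """Return new running totals with one day's counts added in."""
    merged = Counter(running_totals)
    merged.update(daily_counts)
    return merged


def top_n(counts, n):
    """Return the n most frequent terms; ties are broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical sample data standing in for two daily log directories.
day1 = ["hadoop", "hdfs", "hadoop"]
day2 = ["hdfs", "hbase", "hdfs"]

totals = Counter()
for day in (day1, day2):
    # Only the new day's lines are parsed; earlier days are never re-read.
    totals = merge_counts(totals, count_terms(day))

print(top_n(totals, 2))
```

Note that the stored results need to be the full per-term counts, not just each day's top N: a term that never makes a daily top-N list can still rank highly once the days are summed.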
-
Re: Reg: parsing all files & file append
Manoj Babu 2012-09-10, 05:33
Thank you Bejoy.
Is file append production stable? Cheers! Manoj.
-
Re: Reg: parsing all files & file append
Bejoy Ks 2012-09-10, 08:06
Hi Manoj
From my limited knowledge of file appends in HDFS, I have seen more recommendations to use sync() in the latest releases than to use append(). Let us wait for a committer to comment authoritatively on the production readiness of append(). :)
Regards Bejoy KS
-
Re: Reg: parsing all files & file append
Manoj Babu 2012-09-10, 09:08
Thank you Bejoy.
Cheers! Manoj.