Stan Rosenberg 2012-04-09, 21:25
Hi,
I just came across a use case requiring CombineFileInputFormat under Hadoop 0.20.2. I was surprised that the API does not provide a default implementation. A cursory check against newer APIs returned the same result. What's the rationale? I ended up writing my own implementation. However, it struck me that this might be a common use case, so why not provide an implementation?
Thanks,
stan
Re: CombineFileInputFormat
Deepak Nettem 2012-04-09, 21:29
Hi Stan,
Just out of curiosity, care to explain the use case a bit?
On Mon, Apr 9, 2012 at 5:25 PM, Stan Rosenberg <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I just came across a use case requiring CombineFileInputFormat under
> Hadoop 0.20.2. I was surprised that the API does not provide a default
> implementation. A cursory check against newer APIs returned the same
> result. What's the rationale? I ended up writing my own implementation.
> However, it struck me that this might be a common use case, so why not
> provide an implementation?
>
> Thanks,
>
> stan
Deepak Nettem
Re: CombineFileInputFormat
Stan Rosenberg 2012-04-09, 21:35
On Mon, Apr 9, 2012 at 5:29 PM, Deepak Nettem <[EMAIL PROTECTED]> wrote:
> Hi Stan,
>
> Just out of curiosity, care to explain the use case a bit?
Very simply: lots of reasonably small files that I don't control, so changing the block size is not an option. Note that this is not an issue in Pig or Hive, both of which come with their own implementation of CombineFileInputFormat.
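
To make the small-files case concrete, here is a minimal sketch of the idea behind combining inputs: pack many small files into fewer groups so that one map task can process a whole group. This is not Hadoop code and not the implementation mentioned above. The class name `SmallFilePacker`, the method `pack`, and the size-only greedy rule are assumptions made for illustration. The real CombineFileInputFormat also takes node and rack locality into account and leaves record reading to the subclass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical helper, not part of the Hadoop API.
final class SmallFilePacker {

    /**
     * Greedily packs files, in the map's iteration order, into groups whose
     * total size stays within maxSplitBytes. A file larger than the limit
     * ends up in a group of its own.
     */
    static List<List<String>> pack(Map<String, Long> fileSizes, long maxSplitBytes) {
        if (maxSplitBytes <= 0) {
            throw new IllegalArgumentException("maxSplitBytes must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;

        for (Map.Entry<String, Long> entry : fileSizes.entrySet()) {
            long size = entry.getValue();
            // Close the current group if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(entry.getKey());
            currentBytes += size;
        }
        // Flush the last, possibly partial, group.
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }
}
```

Passing an ordered map (for example a `LinkedHashMap`) of path to length, with a limit near the block size, yields one group per intended map task. Each group would correspond roughly to one combined split.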