Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HDFS, mail # dev - Feature request to provide DFSInputStream subclassing mechanism


Copy link to this message
-
Re: Feature request to provide DFSInputStream subclassing mechanism
Colin McCabe 2013-08-08, 19:10
There is work underway to decouple the block layer and the namespace
layer of HDFS from each other.  Once this is done, block behaviors
like the one you describe will be easy to implement.  It's a use case
very similar to the hierarchical storage management (HSM) use case
that we've discussed before.  Check out HDFS-2832.  Hopefully there
will be a design document posted soon.

cheers,
Colin
On Thu, Aug 8, 2013 at 11:52 AM, Matevz Tadel <[EMAIL PROTECTED]> wrote:
> Hi everybody,
>
> I'm jumping in as Jeff is away due to an unexpected annoyance involving
> Californian wildlife.
>
>
> On 8/7/13 7:47 PM, Andrew Wang wrote:
>>
>> Blocks are supposed to be an internal abstraction within HDFS, and aren't
>> an
>> inherent part of FileSystem (the user-visible class used to access all
>> Hadoop
>> filesystems).
>
>
> Yes, but it's a really useful abstraction :) Do you really believe the
> blocks could be abandoned in the next couple of years? I mean, it's such a
> simple and effective solution ...
>
>
>> Is it possible to instead deal with files and offsets? On a read failure,
>> you
>> could open a stream to the same file on the backup filesystem, seek to the
>> old
>> file position, and retry the read. This feels like it's possible via
>> wrapping.
>
>
> As Jeff briefly mentioned, all USCMS sites export their data via XRootd (not
> all of them use HDFS!) and we developed a specialization of XRootd caching
> proxy that can fetch only requested blocks (block size is passed from our
> input stream class to XRootd client (via JNI) and on to the proxy server)
> and keep them in a local cache. This allows as to do three things:
>
> 1. the first time we notice a block is missing, a whole block is fetched
> from elsewhere and further access requests from the same process get
> fulfilled with zero latency;
>
> 2. later requests from other processes asking for this block are fulfilled
> immediately (well, after the initial 3 retries);
>
> 3. we have a list of blocks that were fetched and we could (this is what we
> want to look into in the near future) re-inject them into HDFS if the data
> loss turns out to be permanent (bad disk vs. node that was
> offline/overloaded for a while).
>
> Handling exceptions at the block level thus gives us just what we need. As
> input stream is the place where these errors become known it is, I think,
> also the easiest place to handle them.
>
> I'll understand if you find opening-up of the interfaces in the central
> repository unacceptable. We can always apply the patch at the OSG level
> where rpms for all our deployments get built.
>
> Thanks & Best regards,
> Matevz
>
>>
>> On Wed, Aug 7, 2013 at 3:29 PM, Jeff Dost <[EMAIL PROTECTED]
>> <mailto:[EMAIL PROTECTED]>> wrote:
>>
>>     Thank you for the suggestion, but we don't see how simply wrapping a
>>     FileSystem object would be sufficient in our use case.  The reason why
>> is we
>>     need to catch and handle read exceptions at the block level.  There
>> aren't
>>     any public methods available in the high level FileSystem abstraction
>> layer
>>     that would give us the fine grained control we need at block level
>> read
>>     failures.
>>
>>     Perhaps if I outline the steps more clearly it will help explain what
>> we are
>>     trying to do.  Without our enhancements, suppose a user opens a file
>> stream
>>     and starts reading the file from Hadoop. After some time, at some
>> position
>>     into the file, if there happen to be no replicas available for a
>> particular
>>     block for whatever reason, datanodes have gone down due to disk
>> issues, etc.
>>     the stream will throw an IOException (BlockMissingException or
>> similar) and
>>     the read will fail.
>>
>>     What we are doing is rather than letting the stream fail, we have
>> another
>>     stream queued up that knows how to fetch the blocks elsewhere outside
>> of our
>>     Hadoop cluster that couldn't be retrieved.  So we need to be able to
>> catch
>>     the exception at this point, and these externally fetched bytes then