HDFS, mail # user - Hadoop HDFS Backup/Restore Solutions


Re: Hadoop HDFS Backup/Restore Solutions
Ted Dunning 2012-01-03, 22:07
MapR provides this out of the box in a completely Hadoop compatible
environment.

Doing this with straight Hadoop involves a fair bit of baling wire.

On Tue, Jan 3, 2012 at 1:10 PM, alo alt <[EMAIL PROTECTED]> wrote:

> Hi Mac,
>
> HDFS currently has no solution for a complete backup and restore
> process in the sense of ITIL or ISO 9000. One strategy could be to "park"
> the HDFS data you want to back up to tape on a separate backup cluster
> using "distcp", and then snapshot that cluster with the SAN's mechanisms.
> For this, the DataNode (DN) storage has to be located on the SAN box.
>
> - Alex
>
> On Tuesday, January 3, 2012, Mac Noland <[EMAIL PROTECTED]> wrote:
> > Good day,
> >
> > I’m guessing this question has been asked a myriad of times, but we’re
> > about to get serious with some of our Hadoop implementations, so I
> > wanted to re-ask to see if I’m missing anything, or if others happen to
> > know whether this might be on a future road map.
> >
> > For our current storage offerings (e.g. NAS or SAN), we give businesses
> > the opportunity to choose 7, 14, or 45 day “backups” for their storage.
> > The purpose of the backup isn’t so much that they are worried about
> > losing their current data (we’re RAID’ed and have some stuff mirrored to
> > remote datacenters), but more that if they were to delete some data
> > today, they can recover it from yesterday’s backup, or the day before’s
> > backup, or the day before that, etc. And to be honest, business units
> > buy a good portion of their backups to make people feel better and to
> > fulfill custom contracts.
> >
> >
> > So far with HDFS we haven’t found too many formalized offerings for
> > this specific feature. While I haven’t done a ton of research, the best
> > solution I’ve found is an idea where we’d schedule a job to pull the
> > data locally to a mount that is backed up via our traditional methods.
> > See Michael Segel’s first post on this site:
> > http://lucene.472066.n3.nabble.com/Backing-up-HDFS-td1019184.html
> >
> > Though we’d have to work through the details of what this would look
> > like for our support folks, it looks like something that could
> > potentially fit into our current model. We’d basically need to allocate
> > the same amount of SAN or NAS disk as we have for HDFS, then coordinate
> > a snap on the SAN or NAS via our traditional methods. Not sure what a
> > restore would look like, other than that we could give the end users
> > read access to the NAS or SAN mounts so they can pick through what they
> > need to recover and let them figure out how to get it back into HDFS.
> >
> > For use cases like ours where we’d need multi-day backups to fulfill
> > business needs, is this kind of what people are thinking or doing?
> > Moreover, are there any things in the Hadoop HDFS road map for
> > providing, for lack of a better word, an “enterprise” backup/restore
> > solution?
> >
> > Thanks in advance,
> >
> > Mac Noland – Thomson Reuters
> >
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
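For reference, the "distcp to a backup cluster" idea from this thread, combined with the 7/14/45-day retention windows Mac mentions, could be scripted roughly as below. This is only a minimal sketch under stated assumptions, not something anyone in the thread describes in detail: the NameNode URIs (`hdfs://prod-nn:8020`, `hdfs://backup-nn:8020`), the `/backups/<date>` layout, and all function names are hypothetical. The only real command assumed is `hadoop distcp <source> <destination>`. The functions just build the command lines and work out which dated copies have aged out; actually running them (e.g. from cron) and deleting expired directories is left out.

```python
from datetime import date, timedelta

PROD = "hdfs://prod-nn:8020"      # hypothetical production NameNode URI
BACKUP = "hdfs://backup-nn:8020"  # hypothetical backup-cluster NameNode URI


def backup_path(data_dir, day):
    """Dated target directory on the backup cluster (assumed layout)."""
    return f"{BACKUP}/backups/{day.isoformat()}{data_dir}"


def distcp_backup_cmd(data_dir, day):
    """Argument list that copies data_dir from production into that day's backup."""
    return ["hadoop", "distcp", f"{PROD}{data_dir}", backup_path(data_dir, day)]


def distcp_restore_cmd(data_dir, day, restore_dir):
    """Argument list that copies one day's backup into a scratch directory on
    production, so users can pick out what they need."""
    return ["hadoop", "distcp", backup_path(data_dir, day), f"{PROD}{restore_dir}"]


def expired_days(existing_days, today, retention_days):
    """Backup dates older than the retention window, oldest first."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in existing_days if d < cutoff)


if __name__ == "__main__":
    today = date(2012, 1, 3)
    print(" ".join(distcp_backup_cmd("/data/logs", today)))
    print(" ".join(distcp_restore_cmd("/data/logs", today, "/restore/logs")))
    existing = [date(2011, 12, 20), date(2011, 12, 27), date(2012, 1, 2)]
    print(expired_days(existing, today, retention_days=7))
```

Note that each dated directory in this layout is a full copy, so the backup cluster needs roughly the size of the data set multiplied by the number of days retained. Snapshotting the backup cluster's storage on the SAN, as Alex suggests, is one way to avoid keeping multiple full copies.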