Re: ExportSnapshot very slow. bug?
Bryan Beaudreault 2013-11-08, 19:19
OK, thanks. I guess I am paying the cost of having too many regions, which,
multiplied by the number of store files, results in many thousands of small
files. Is there any reason I couldn't modify this step to parallelize it a little?
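
A minimal sketch of what that parallelization might look like: walk the tree once, then hand each small file to a fixed-size thread pool. This is not the ExportSnapshot code. It uses plain java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem API, and the class name ParallelCopySketch, the method copyTree, and the threads parameter are all hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only: local-filesystem stand-in, not HBase/Hadoop code.
public class ParallelCopySketch {

    /**
     * Copies every regular file under the directory src to the same relative
     * path under dst, using a pool of worker threads.
     * Returns the number of files copied. Empty directories are not recreated.
     */
    public static int copyTree(Path src, Path dst, int threads)
            throws IOException, InterruptedException, ExecutionException {
        // Listing stays single-threaded; only the per-file copies are parallel.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(src)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Path file : files) {
                // Keep the same relative layout under the destination root.
                Path target = dst.resolve(src.relativize(file).toString());
                pending.add(pool.submit(() -> {
                    try {
                        Files.createDirectories(target.getParent());
                        Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }));
            }
            // Wait for every copy; get() rethrows the first failure it meets.
            for (Future<?> f : pending) {
                f.get();
            }
            return files.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

With many thousands of tiny files the per-file round trip dominates, so even a small pool (say 8 to 16 threads) should help. Whether it helps enough against a remote filesystem is something I would have to measure.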
On Fri, Nov 8, 2013 at 2:06 PM, Matteo Bertozzi <[EMAIL PROTECTED]> wrote:
> The first copy doesn't resolve the links, so you're copying empty files.
> The actual data copy happens only in "step 2", with the MR job.
> On Fri, Nov 8, 2013 at 10:54 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:
> > Hello all. I'm trying out the ExportSnapshot tool and it is extremely
> > slow. I took a look at the code and I think I know why.
> > In step 1 it is for some reason copying from fs1 to fs2. This basically
> > means that in a single-threaded process we are copying an entire HBase
> > table to another cluster. I can understand wanting to copy from fs1 to fs1
> > (i.e. a different path on the same fs), so as to dereference all the soft
> > links of snapshots. But why between filesystems?
> > In step 2 you finally do the MR job, which makes much more sense, but as
> > far as I can tell all of the files would already exist, as FileUtil.copy
> > just does a recursive copy of all paths in a tree.
> > Am I missing something? I appreciate any input.
> > - Bryan