-
discp versus export
Rita 2013-03-04, 11:50
Is it better to do a distcp or to export the table?
distcp is much faster, but it seems export is preferred. Is that correct?
--
Get your facts first, then you can distort them as you please.
-
Re: discp versus export
Manish Bhoge 2013-03-04, 12:00
Export and distcp have different applications. Use distcp when you need to move data across clusters. Do you want to export table data outside your cluster? If not, then exporting the table is better.
Sent from HTC via Rocket! excuse typo.
-
Re: discp versus export
Kevin O'dell 2013-03-04, 12:10
DistCp is typically used for HDFS-level backup jobs. It can be used for HBase, but that can be quite tricky. I would recommend using Export, CopyTable, or Replication. These are tools designed for HBase backup. What is the end goal?
--
Kevin O'Dell
Customer Operations Engineer, Cloudera
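For readers who want to try one of the HBase-level tools named above, here is a minimal sketch of how a CopyTable invocation could be assembled. It is not from the thread: the table name, the ZooKeeper quorum, and the helper `copytable_command` are hypothetical, and it assumes the stock `org.apache.hadoop.hbase.mapreduce.CopyTable` job with its `--peer.adr` and `--new.name` options.

```python
from typing import List, Optional


def copytable_command(table: str, peer_adr: str,
                      new_name: Optional[str] = None) -> List[str]:
    """Build the argument list for HBase's CopyTable MapReduce job.

    peer_adr is the destination cluster, written as
    <zookeeper quorum>:<client port>:<znode parent>.
    """
    cmd = [
        "hbase",
        "org.apache.hadoop.hbase.mapreduce.CopyTable",
        f"--peer.adr={peer_adr}",
    ]
    if new_name is not None:
        # Optional: write into a differently named table on the destination.
        cmd.append(f"--new.name={new_name}")
    cmd.append(table)
    return cmd


# Hypothetical table and quorum, for illustration only.
print(" ".join(copytable_command("mytable", "zk1,zk2,zk3:2181:/hbase")))
```

The sketch only prints the command; running it would require a working HBase client configuration on the machine that submits the job.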
-
Re: discp versus export
Rita 2013-03-05, 00:18
The end goal is to have a backup of our HBase tables.
-
Re: discp versus export
Leonid Fedotov 2013-03-05, 21:52
Rita,
it seems like replication will be the best option for you. Take a look at this doc: http://hbase.apache.org/replication.html
Thank you!
Sincerely,
Leonid Fedotov
-
Re: discp versus export
Damien Hardy 2013-03-05, 22:15
IMO the easiest would be HBase export, for long-term offline backup (for disaster recovery). The export can even be stored on a different HDFS storage than the one used by HBase, by giving a full hdfs:// URL as the destination directory.
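A rough sketch of the suggestion above, assuming the stock `org.apache.hadoop.hbase.mapreduce.Export` job, which takes a table name followed by an output directory. The helper `export_command`, the table name, and the NameNode address `backup-nn:8020` are hypothetical placeholders, not values from this thread.

```python
from typing import List
from urllib.parse import urlparse


def export_command(table: str, output_dir: str) -> List[str]:
    """Build the argument list for HBase's Export MapReduce job.

    output_dir must be a full hdfs:// URL so the dump can land on an
    HDFS instance other than the one backing HBase.
    """
    if urlparse(output_dir).scheme != "hdfs":
        raise ValueError("expected a full hdfs:// URL, got: " + output_dir)
    return ["hbase", "org.apache.hadoop.hbase.mapreduce.Export",
            table, output_dir]


# Hypothetical table and backup NameNode, for illustration only.
print(" ".join(export_command("mytable",
                              "hdfs://backup-nn:8020/backups/mytable")))
```

The URL check is only a guard for this sketch; Export itself also accepts a plain path, which resolves against the default file system.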
-
Re: discp versus export
Asaf Mesika 2013-04-30, 16:07
The replication.html page appears to contain a reference to a bug (2611) which was solved two years ago :)
-
Re: discp versus export
Suraj Varma 2013-04-30, 16:14
Read this: http://blog.sematext.com/2011/03/11/hbase-backup-options/ for the high-level difference between export and distcp. The key factor here is the data in the memstore that has not been flushed out to disk yet ... and the resultant inconsistency if you just do distcp.
--Suraj
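The memstore point can be illustrated with a toy model. This is not HBase code: `ToyRegion` and `file_level_copy` are made-up names, and the model ignores the write-ahead log, compactions, and everything else a real region server does. It only shows why a copy of the flushed files can miss recent writes that a client-level read (which is what Export does) would still see.

```python
class ToyRegion:
    """Toy stand-in for an HBase region: an in-memory buffer plus flushed files."""

    def __init__(self):
        self.memstore = {}      # recent writes, not yet on disk
        self.store_files = []   # flushed snapshots, i.e. what sits on "HDFS"

    def put(self, row, value):
        self.memstore[row] = value

    def flush(self):
        # Persist the buffered writes as a new file and empty the buffer.
        if self.memstore:
            self.store_files.append(dict(self.memstore))
            self.memstore.clear()

    def scan(self):
        # A client read merges flushed files with the in-memory buffer.
        merged = {}
        for store_file in self.store_files:
            merged.update(store_file)
        merged.update(self.memstore)
        return merged


def file_level_copy(region):
    """What a file copy (distcp-style) sees: flushed files only."""
    merged = {}
    for store_file in region.store_files:
        merged.update(store_file)
    return merged


region = ToyRegion()
region.put("row1", "a")
region.flush()
region.put("row2", "b")   # still in the memstore, never flushed

print("file-level copy:", file_level_copy(region))
print("client scan:    ", region.scan())
```

In this toy run the file-level copy contains only `row1`, while the scan returns both rows; flushing before the copy would close the gap in the model, though on a live cluster new writes keep arriving.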