Re: Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7880/#review13260
-----------------------------------------------------------

Ship it!
Thank you for your changes, Zoltan. Please upload your patch to the JIRA (as a file) and I'll commit it.

Jarcec

- Jarek Cecho
On Nov. 8, 2012, 6:35 p.m., Zoltán Tóth-Czifra wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7880/
> -----------------------------------------------------------
>
> (Updated Nov. 8, 2012, 6:35 p.m.)
>
>
> Review request for Sqoop.
>
>
> Description
> -------
>
> Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.
>
>
> Diffs
> -----
>
>   src/docs/user/compatibility.txt 3576fd7
>
> Diff: https://reviews.apache.org/r/7880/diff/
>
>
> Testing
> -------
>
> Converted to XML with asciidoc, the affected part:
>
> <simpara>Sometimes you need to export large amounts of data with Sqoop to a live MySQL cluster that
> is under a high load serving random queries from the users of your product.
> While data consistency issues during the export can be easily solved with a
> staging table, there is still a problem: the performance impact caused by the
> heavy export.</simpara>
> <simpara>First off, the resources of MySQL dedicated to the import process can affect
> the performance of the live product, both on the master and on the slaves.
> Second, even if the servers can handle the import with no significant
> performance impact (mysqlimport should be relatively "cheap"), importing big
> tables can cause serious replication lag in the cluster, risking data
> inconsistency.</simpara>
> <simpara>With <literal>-D sqoop.mysql.export.sleep.ms=time</literal>, where <emphasis>time</emphasis> is a value in
> milliseconds, you can let the server relax between checkpoints and the replicas
> catch up by pausing the export process after transferring the number of bytes
> specified in <literal>sqoop.mysql.export.checkpoint.bytes</literal>. Experiment with different
> settings of these two parameters to achieve an export pace that doesn’t
> endanger the stability of your MySQL cluster.</simpara>
> <important><simpara>Note that any arguments to Sqoop that are of the form <literal>-D
> parameter=value</literal> are Hadoop <emphasis>generic arguments</emphasis> and must appear before
> any tool-specific arguments (for example, <literal>--connect</literal>, <literal>--table</literal>, etc.).
> Don’t forget that these parameters only work with the <literal>--direct</literal> flag set.</simpara></important>
>
>
> Thanks,
>
> Zoltán Tóth-Czifra
>
>
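
For anyone trying the documented settings, the argument ordering described in the note above can be sketched as follows. This is only an illustration, not part of the patch: the helper name build_export_command, the JDBC URL, the table name, the export directory and the numeric values are all hypothetical. The only things taken from the documentation are the two property names, the --direct requirement, and the rule that -D generic arguments come before the tool-specific ones.

```python
import shlex


def build_export_command(connect, table, export_dir, sleep_ms, checkpoint_bytes):
    """Assemble a `sqoop export` argument list (hypothetical helper).

    Hadoop generic arguments (-D name=value) are placed right after the
    tool name, ahead of every tool-specific argument.
    """
    generic_args = [
        "-D", f"sqoop.mysql.export.sleep.ms={sleep_ms}",
        "-D", f"sqoop.mysql.export.checkpoint.bytes={checkpoint_bytes}",
    ]
    tool_args = [
        "--connect", connect,
        "--table", table,
        "--export-dir", export_dir,
        "--direct",  # the throttling properties only apply to direct exports
    ]
    return ["sqoop", "export"] + generic_args + tool_args


# Illustrative values only; tune them against your own cluster.
cmd = build_export_command(
    connect="jdbc:mysql://db.example.com/shop",  # hypothetical database
    table="orders",                              # hypothetical table
    export_dir="/data/orders",                   # hypothetical HDFS path
    sleep_ms=500,
    checkpoint_bytes=16 * 1024 * 1024,
)

# Print a shell-safe command line for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to subprocess.run on a machine where Sqoop is installed. Starting with a longer sleep and shortening it while watching replication lag is one reasonable way to run the experiment the documentation suggests.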