Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7880/
-----------------------------------------------------------

Review request for Sqoop.

Description
-------

Code review for SQOOP-683; see https://issues.apache.org/jira/browse/SQOOP-683.

Diffs
-----

  src/docs/user/compatibility.txt 3576fd7

Diff: https://reviews.apache.org/r/7880/diff/

Testing
-------

Converted to XML with asciidoc, the affected part:

<simpara>Sometimes you need to export large amounts of data with Sqoop to a live MySQL cluster that
is under high load serving random queries from the users of your product.
While data consistency issues during the export can be easily solved with a
staging table, there is still a problem: the performance impact caused by the
heavy export.</simpara>
<simpara>First, the MySQL resources dedicated to the import process can affect
the performance of the live product, both on the master and on the slaves.
Second, even if the servers can handle the import with no significant
performance impact (mysqlimport should be relatively "cheap"), importing big
tables can cause serious replication lag in the cluster, risking data
inconsistency.</simpara>
<simpara>With <literal>-D sqoop.mysql.export.sleep.ms=time</literal>, where <emphasis>time</emphasis> is a value in
milliseconds, you can let the server rest between checkpoints and let the replicas
catch up by pausing the export process after transferring the number of bytes
specified in <literal>sqoop.mysql.export.checkpoint.bytes</literal>. Experiment with different
settings of these two parameters to achieve an export pace that doesn’t
endanger the stability of your MySQL cluster.</simpara>
<important><simpara>Note that any arguments to Sqoop that are of the form <literal>-D
parameter=value</literal> are Hadoop <emphasis>generic arguments</emphasis> and must appear before
any tool-specific arguments (for example, <literal>--connect</literal>, <literal>--table</literal>, etc).
Don’t forget that these parameters only work with the <literal>--direct</literal> flag set.</simpara></important>
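
As a starting point for experimenting with the two parameters, the following Python sketch estimates the average export pace. It is only a rough model, not part of Sqoop: it assumes one pause of sqoop.mysql.export.sleep.ms after every sqoop.mysql.export.checkpoint.bytes bytes and a steady unthrottled transfer rate that you have measured yourself. The function names are hypothetical.

```python
def effective_rate(raw_bytes_per_s, checkpoint_bytes, sleep_ms):
    """Estimate the average export pace in bytes per second.

    Assumption: the export transfers `checkpoint_bytes` at the unthrottled
    rate `raw_bytes_per_s`, then pauses for `sleep_ms`, and repeats.
    """
    if raw_bytes_per_s <= 0 or checkpoint_bytes <= 0 or sleep_ms < 0:
        raise ValueError("rates and sizes must be positive, sleep_ms non-negative")
    transfer_s = checkpoint_bytes / raw_bytes_per_s  # time spent sending one checkpoint
    pause_s = sleep_ms / 1000.0                      # time spent sleeping afterwards
    return checkpoint_bytes / (transfer_s + pause_s)


def sleep_ms_for_target(raw_bytes_per_s, checkpoint_bytes, target_bytes_per_s):
    """Estimate the pause (in ms) per checkpoint needed to average the target pace."""
    if not 0 < target_bytes_per_s <= raw_bytes_per_s:
        raise ValueError("target pace must be positive and no faster than the raw rate")
    # Time one checkpoint should take at the target pace, minus the time it
    # takes unthrottled, is the pause that has to be added.
    return 1000.0 * (checkpoint_bytes / target_bytes_per_s
                     - checkpoint_bytes / raw_bytes_per_s)
```

Under this model, an unthrottled rate of 10 MB/s with a 10 MB checkpoint and a 1000 ms pause averages 5 MB/s. Real behaviour depends on the cluster, so treat the result as a first guess to refine by watching replication lag.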
Thanks,

Zoltán Tóth-Czifra