Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # dev >> Review Request 14240: SQOOP-1032: Add the --bulk-load-dir option to support the HBase doBulkLoad function


Copy link to this message
-
Re: Review Request 14240: SQOOP-1032: Add the --bulk-load-dir option to support the HBase doBulkLoad function

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14240/#review26576
-----------------------------------------------------------
Thank you Alexandre for taking up this ticket!

Would you mind adding automated tests to ensure that the functionality is indeed working?
src/java/org/apache/sqoop/SqoopOptions.java
<https://reviews.apache.org/r/14240/#comment51816>

    Considering we already have output directory and we are using it as a temporary directory in case of a Hive import, what about converting this to boolean property and using the usual output directory for storing the data? This way we will be consistent with the rest of Sqoop.

src/java/org/apache/sqoop/SqoopOptions.java
<https://reviews.apache.org/r/14240/#comment51819>

    I would also recommend to make validations that this parameter is used only with hbase import job.

src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java
<https://reviews.apache.org/r/14240/#comment51820>

    This code seems to be to very high extent duplicated from ImportJobBase.runImport(). I'm wondering if we can just use the usual handlers to add the required functionality? Otherwise, I would much rather add new handlers if needed than copy and past the code.

src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java
<https://reviews.apache.org/r/14240/#comment51818>

    This seems to be very dangerous as at some point everyone can read the files, wouldn't be much secure to simply change the owner to the hbase user?
Jarcec

- Jarek Cecho
On Sept. 28, 2013, 12:35 a.m., Alexandre Normand wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14240/
> -----------------------------------------------------------
>
> (Updated Sept. 28, 2013, 12:35 a.m.)
>
>
> Review request for Sqoop, Jarek Cecho and vasanthkumar.
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> This is the patch to address some of the comments on Zhancheng Deng's review request (https://reviews.apache.org/r/13052/) which was basically the change to add the --bulk-load-dir option to support the HBase doBulkLoad function.
>
>
> Diffs
> -----
>
>   src/java/org/apache/sqoop/SqoopOptions.java 01805f9
>   src/java/org/apache/sqoop/hbase/HBasePutProcessor.java 9ceb5bd
>   src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java 5ccf311
>   src/java/org/apache/sqoop/manager/SqlManager.java 2a4992d
>   src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/HBaseBulkImportMapper.java PRE-CREATION
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java ebb1857
>
> Diff: https://reviews.apache.org/r/14240/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Alexandre Normand
>
>