MapReduce, mail # user - Re: Database insertion by Hadoop


Re: Database insertion by Hadoop
Hemanth Yamijala 2013-02-19, 15:52
Sqoop can be used to export as well.
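
For example, a minimal export invocation might look like this (the
connection string, table, HDFS directory, and delimiter below are
placeholders for your setup):

  # Push tab-separated job output from HDFS into an existing SQL Server
  # table; -P prompts for the database password at runtime.
  sqoop export \
    --connect "jdbc:sqlserver://dbhost:1433;databaseName=experiments" \
    --username hadoop -P \
    --table similarity_scores \
    --export-dir /user/masoud/similarity-output \
    --input-fields-terminated-by '\t' \
    -m 4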

Thanks
Hemanth

On Tuesday, February 19, 2013, Masoud wrote:

>  Dear Tariq,
>
> No, exactly the opposite way: we compute the similarity between
> documents and insert the results into the database, almost 2,000,000
> records in every table.
>
> Best Regards
>
> On 02/19/2013 06:41 PM, Mohammad Tariq wrote:
>
>  Hello Masoud,
>
>        So you want to pull your data from SQL Server to your Hadoop
> cluster first and then do the processing. Please correct me if I am wrong.
> You can do that using Sqoop, as mentioned by Hemanth sir. BTW, what exactly
> is the kind of processing you are planning to do on your data?
>
>  Warm Regards,
> Tariq
> https://mtariq.jux.com/
>  cloudfront.blogspot.com
>
>
> On Tue, Feb 19, 2013 at 6:44 AM, Hemanth Yamijala <
> [EMAIL PROTECTED]> wrote:
>
> Hi,
>
>  You could consider using Sqoop: http://sqoop.apache.org/. There seems to
> be a SQL Server connector from Microsoft:
> http://www.microsoft.com/en-gb/download/details.aspx?id=27584
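>
> A minimal import invocation with that connector might look like this
> (the host, database name, table, and target directory are placeholders):
>
>   # Pull a SQL Server table into HDFS with 4 parallel map tasks;
>   # -P prompts for the database password at runtime.
>   sqoop import \
>     --connect "jdbc:sqlserver://dbhost:1433;databaseName=experiments" \
>     --username hadoop -P \
>     --table documents \
>     --target-dir /user/masoud/documents \
>     -m 4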
>
> Thanks
>  Hemanth
>
> On Tuesday, February 19, 2013, Masoud wrote:
>
>  Hello Tariq,
>
> Our database is SQL Server 2008, and we don't need to develop a
> professional app; we just need to build it quickly and get our experiment
> results soon.
> Thanks
>
>
> On 02/18/2013 11:58 PM, Hemanth Yamijala wrote:
>
> What database is this? Was HBase mentioned?
>
> On Monday, February 18, 2013, Mohammad Tariq wrote:
>
> Hello Masoud,
>
>           You can use the Bulk Load feature. You might find it more
> efficient than normal client APIs or using the TableOutputFormat.
>
>  The bulk load feature uses a MapReduce job to output table data
> in HBase's internal data format, and then directly loads the
> generated StoreFiles into a running cluster. Using bulk load will use
> less CPU and network resources than simply using the HBase API.
>
>  For detailed info you can go here:
> http://hbase.apache.org/book/arch.bulk.load.html
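>
> A sketch of that flow with the stock ImportTsv and completebulkload
> tools (the table name "similarity", column family "d", and the paths
> are placeholders; input is assumed to be tab-separated):
>
>   # Step 1: MapReduce job writes HFiles directly instead of issuing puts.
>   hadoop jar hbase-VERSION.jar importtsv \
>     -Dimporttsv.columns=HBASE_ROW_KEY,d:score \
>     -Dimporttsv.bulk.output=/tmp/bulk-hfiles \
>     similarity /user/masoud/similarity-input
>
>   # Step 2: hand the generated StoreFiles to the running cluster.
>   hadoop jar hbase-VERSION.jar completebulkload /tmp/bulk-hfiles similarity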
>
>  Warm Regards,
> Tariq
> https://mtariq.jux.com/
>  cloudfront.blogspot.com
>
>