Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Pig >> mail # user >> DBStorage incompatibility with other Storage in pig Script


+
Hardik Shah 2013-05-27, 09:47
+
Hardik Shah 2013-05-27, 09:48
+
Hardik Shah 2013-05-27, 12:41
+
Shahab Yunus 2013-05-28, 02:03
+
Hardik Shah 2013-05-28, 05:30
+
Shahab Yunus 2013-05-28, 12:23
Copy link to this message
-
Re: DBStorage incompatibility with other Storage in pig Script
I'd recommend to try Sqoop for RDBMS-related tasks.
On Mon, May 27, 2013 at 4:41 PM, Hardik Shah <[EMAIL PROTECTED]> wrote:

> DBStorage is not working with other storage in pig script. means DBStorage
> is not working with multiple storage statement.
>
> What I was trying for: 1) I was trying to Store one output using DBStorage
> And was trying to store same or different output using simple Store to file
> system 2) I also tried to store using DBStorage and using my custom store
> function
>
> But in both cases it not storing the data to database. If I comment out
> another storage than DBStorage is working properly.
>
> Even its not throwing any exception or error on reducer's machine..
>
> Can anyone point out the problem?
>
> DBStorage is not working with Simple Store to file system. Its only working
> if I put only DBStorage no other Store Statement..
>
> pv_by_industry = GROUP profile_view by viewee_industry_id
>
> pv_avg_by_industry = FOREACH pv_by_industry GENERATE
>     group as viewee_industry_id, AVG(profie_view) AS average_pv;
>
> STORE pv_avg_by_industry INTO '/tmp/hardik';
>
> STORE pv_avg_by_industry into /tmp/hardik/db' INTO
>     DBStorage('com.mysql.jdbc.Driver',
>        'dbc:mysql://hostname/dbname', 'user',
>        'pass',
>        'INSERT INTO table (viewee_industry_id,average_pv) VALUES(?,?)');
>
>
>
>
> Few things are came into picture when I was debugging it.
>
> DBStorage is setting Auto commit to False.
> So when the batch is executed it will not be auto committed.
>
> After executing batch OutputCommiter's method commitTask in DBStorage
> (inline class' method) was called in which commit is written
>
>  if (ps != null) {
>             try {System.out.println("Executing Batch in commitTask");
>               ps.executeBatch();
>               con.commit();
>               ps.close();
>               con.close();
>               ps = null;
>               con = null;
>             } catch (SQLException e) {System.out.println("Exception in
> commitTask");
>               log.error("ps.close", e);
>               throw new IOException("JDBC Error", e);
>             }
>
> and this method is called by PigOutputCommiter
>
> public void commitTask(TaskAttemptContext context) throws IOException {
>         if(HadoopShims.isMap(context.getTaskAttemptID())) {
>             for (Pair<OutputCommitter, POStore> mapCommitter :
>                 mapOutputCommitters) {
>                 if (mapCommitter.first!=null) {
>                     TaskAttemptContext updatedContext > setUpContext(context,
>                             mapCommitter.second);
>                     mapCommitter.first.commitTask(updatedContext);
>                 }
>             }
>         } else {
>             for (Pair<OutputCommitter, POStore> reduceCommitter :
>                 reduceOutputCommitters) {
>                 if (reduceCommitter.first!=null) {
>                     TaskAttemptContext updatedContext > setUpContext(context,
>                             reduceCommitter.second);
>                     reduceCommitter.first.commitTask(updatedContext);
>                 }
>             }
>         }
>     }
>
>
> But when this commitTask is called its connection and preparedStatment
> Objects become null... so it ll not commit so data is not available in
> Database.....
>
>
>
> But if you write only DBStorage without any other Store statement in script
> it will work properly..
>
>
> Any clues???
>
>
>
>
>
>
>
> --
> Regards,
> Hardik Shah.
>