Pig, mail # user - DBStorage incompatibility with other Storage in pig Script

Re: DBStorage incompatibility with other Storage in pig Script
Shahab Yunus 2013-05-28, 12:23
The -no_multiquery option should work without the code change that you
mentioned. But be aware that you will then lose the benefits of
multi-query execution, so it is a trade-off that you need to weigh
according to your needs.
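
For illustration, here is a minimal sketch of what multi-query execution
buys you (the load path and schema here are made up):

A = LOAD '/tmp/page_views' AS (viewee_industry_id:int, pv:double);
STORE A INTO '/tmp/out1';  -- with multi-query on, both STOREs share one job
STORE A INTO '/tmp/out2';  -- with -no_multiquery, A is recomputed per STORE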

Regards,
Shahab
On Tue, May 28, 2013 at 1:30 AM, Hardik Shah <[EMAIL PROTECTED]> wrote:

> @Shahab
>
> One of the things I tried is:
> 1) I set auto-commit to true
> 2) reduced the default batch size to 1 and changed the condition for
> executing the batch from > to >= (count >= batchSize), as sketched below
>
> and this helped me in this way:
> as I set the batch size to 1, each query is executed instead of being
> piled up into the batch, and as auto-commit is true it is committed
> immediately after execution.
>
> And this works for me.... *Don't know the pros and cons of it.... *
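>
> Roughly, the change inside DBStorage amounts to this (a sketch; the
> field names follow the stock piggybank DBStorage source and may differ
> across versions):
>
>   // writer setup: commit each statement as soon as it executes
>   con.setAutoCommit(true);    // stock DBStorage sets this to false
>
>   // in putNext(Tuple), after ps.addBatch():
>   count++;
>   if (count >= batchSize) {   // stock condition: count > batchSize
>     count = 0;
>     ps.executeBatch();
>     ps.clearBatch();
>   }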
>
>
>
> Now I will try the no_multiquery option.
>
>
>
>
> On Tue, May 28, 2013 at 7:33 AM, Shahab Yunus <[EMAIL PROTECTED]> wrote:
>
> > @Hardik,
> >
> > Try running your script with the 'no_multiquery' option:
> >
> > pig -no_multiquery myscript.pig
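> >
> > This turns multi-query optimization off, so each STORE is planned and
> > executed separately (the short form -M should be equivalent).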
> >
> > Regards,
> > Shahab
> >
> >
> >
> > On Mon, May 27, 2013 at 8:41 AM, Hardik Shah <[EMAIL PROTECTED]>
> > wrote:
> >
> > > DBStorage is not working with other storage in a Pig script; that
> > > is, DBStorage is not working when there are multiple STORE
> > > statements.
> > >
> > > What I was trying:
> > > 1) I was trying to store one output using DBStorage, and was trying
> > > to store the same or a different output using a simple STORE to the
> > > file system
> > > 2) I also tried storing using DBStorage together with my custom
> > > store function
> > >
> > > But in both cases it does not store the data to the database. If I
> > > comment out the other storage, then DBStorage works properly.
> > >
> > > It is not even throwing any exception or error on the reducer's
> > > machine.
> > >
> > > Can anyone point out the problem?
> > >
> > > DBStorage is not working with a simple STORE to the file system; it
> > > only works if DBStorage is the only STORE statement.
> > >
> > > pv_by_industry = GROUP profile_view BY viewee_industry_id;
> > >
> > > pv_avg_by_industry = FOREACH pv_by_industry GENERATE
> > >     group AS viewee_industry_id, AVG(profile_view) AS average_pv;
> > >
> > > STORE pv_avg_by_industry INTO '/tmp/hardik';
> > >
> > > STORE pv_avg_by_industry INTO '/tmp/hardik/db' USING
> > >     DBStorage('com.mysql.jdbc.Driver',
> > >        'jdbc:mysql://hostname/dbname', 'user',
> > >        'pass',
> > >        'INSERT INTO table (viewee_industry_id,average_pv) VALUES(?,?)');
> > >
> > >
> > >
> > >
> > > A few things came to light when I was debugging it.
> > >
> > > DBStorage sets auto-commit to false,
> > > so when the batch is executed it is not auto-committed.
> > >
> > > After executing the batch, the OutputCommitter's commitTask method in
> > > DBStorage (an inner class' method) is called, in which the commit is
> > > written:
> > >
> > > if (ps != null) {
> > >   try {
> > >     System.out.println("Executing Batch in commitTask");
> > >     ps.executeBatch();
> > >     con.commit();
> > >     ps.close();
> > >     con.close();
> > >     ps = null;
> > >     con = null;
> > >   } catch (SQLException e) {
> > >     System.out.println("Exception in commitTask");
> > >     log.error("ps.close", e);
> > >     throw new IOException("JDBC Error", e);
> > >   }
> > > }
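> > >
> > > With auto-commit off, anything not committed before the connection
> > > closes is rolled back, so if this commitTask path is never reached
> > > the batched inserts are lost. A standalone illustration (not Pig
> > > code; the table and column names are hypothetical, and the MySQL
> > > driver must be on the classpath):
> > >
> > >   import java.sql.*;
> > >
> > >   public class BatchCommitDemo {
> > >     public static void main(String[] args) throws SQLException {
> > >       Connection con = DriverManager.getConnection(
> > >           "jdbc:mysql://hostname/dbname", "user", "pass");
> > >       con.setAutoCommit(false);            // what DBStorage does
> > >       PreparedStatement ps = con.prepareStatement(
> > >           "INSERT INTO pv_avg (viewee_industry_id, average_pv) VALUES (?,?)");
> > >       ps.setInt(1, 42);
> > >       ps.setDouble(2, 3.14);
> > >       ps.addBatch();
> > >       ps.executeBatch();  // rows sent to the server, not yet committed
> > >       con.commit();       // skip this and the inserts vanish on close
> > >       ps.close();
> > >       con.close();
> > >     }
> > >   }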
> > >
> > > and this method is called by PigOutputCommitter:
> > >
> > > public void commitTask(TaskAttemptContext context) throws IOException {
> > >         if(HadoopShims.isMap(context.getTaskAttemptID())) {
> > >             for (Pair<OutputCommitter, POStore> mapCommitter :
> > >                 mapOutputCommitters) {
> > >                 if (mapCommitter.first!=null) {
> > >                     TaskAttemptContext updatedContext = setUpContext(context,
> > >                             mapCommitter.second);
> > >                     mapCommitter.first.commitTask(updatedContext);