Home | About | Sematext search-lucene.com search-hadoop.com
Pig >> mail # user >> DBStorage incompatibility with other Storage in pig Script


Re: DBStorage incompatibility with other Storage in pig Script
Running with the no_multiquery option should work without the code change that
you mentioned. But be aware that you will then lose the benefits of
multi-query execution, so it is a trade-off that you need to weigh
according to your needs.

Regards,
Shahab
On Tue, May 28, 2013 at 1:30 AM, Hardik Shah <[EMAIL PROTECTED]> wrote:

> @Shahab
>
> One of the things I tried:
> 1) I set auto-commit to true
> 2) I reduced the default batch size to 1 and changed the flush condition from > to >= (count >= batchSize)
>
> and this helps in the following way:
> since the batch size is 1, each query is executed immediately instead of
> piling up in the batch, and since auto-commit is true it is committed right
> after execution.
>
> And this works for me.... *Don't know the pros and cons of it..... *
>
>
>
> Now will try for no_multiquery option.
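[The workaround described above (batch size 1, flush condition loosened from > to >=, auto-commit on) can be sketched as a toy model. This is illustrative code only, not the actual DBStorage source; the names BatchFlushSketch, putNext, and getFlushes are invented for the sketch:]

```java
// Toy model of the described change: with batchSize = 1 and the flush
// condition (count >= batchSize) instead of (count > batchSize), every
// record triggers an immediate flush rather than piling up in a batch.
public class BatchFlushSketch {
    private final int batchSize;
    private int count = 0;
    private int flushes = 0; // stands in for ps.executeBatch() + commit

    public BatchFlushSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Stands in for DBStorage.putNext(): buffer a record, flush when "full".
    public void putNext() {
        count++;
        if (count >= batchSize) { // the reported change: was count > batchSize
            flushes++;            // with auto-commit true, this write is
            count = 0;            // committed immediately after execution
        }
    }

    public int getFlushes() {
        return flushes;
    }

    public static void main(String[] args) {
        BatchFlushSketch s = new BatchFlushSketch(1);
        for (int i = 0; i < 3; i++) s.putNext();
        System.out.println(s.getFlushes()); // prints 3: one flush per record
    }
}
```

[The obvious cost of this workaround is one database round trip and one commit per record instead of one per batch.]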
>
>
>
>
> On Tue, May 28, 2013 at 7:33 AM, Shahab Yunus <[EMAIL PROTECTED]
> >wrote:
>
> > @Hardik,
> >
> > Try to run your script with the 'no_multiquery' option:
> >
> > pig -no_multiquery myscript.pig
> >
> > Regards,
> > Shahab
> >
> >
> >
> > On Mon, May 27, 2013 at 8:41 AM, Hardik Shah <[EMAIL PROTECTED]>
> > wrote:
> >
> > > DBStorage is not working together with other storage functions in a Pig
> > > script; that is, DBStorage is not working when there are multiple STORE
> > > statements.
> > >
> > > What I was trying: 1) I was trying to store one output using DBStorage,
> > > and to store the same or a different output using a simple STORE to the
> > > file system. 2) I also tried to store using DBStorage together with my
> > > custom store function.
> > >
> > > But in both cases it is not storing the data to the database. If I comment
> > > out the other storage, then DBStorage works properly.
> > >
> > > It's not even throwing any exception or error on the reducer's machine.
> > >
> > > Can anyone point out the problem?
> > >
> > > DBStorage is not working alongside a simple STORE to the file system. It
> > > only works if I use DBStorage alone, with no other STORE statement.
> > >
> > > pv_by_industry = GROUP profile_view BY viewee_industry_id;
> > >
> > > pv_avg_by_industry = FOREACH pv_by_industry GENERATE
> > >     group AS viewee_industry_id, AVG(profile_view) AS average_pv;
> > >
> > > STORE pv_avg_by_industry INTO '/tmp/hardik';
> > >
> > > STORE pv_avg_by_industry INTO '/tmp/hardik/db' USING
> > >     DBStorage('com.mysql.jdbc.Driver',
> > >        'jdbc:mysql://hostname/dbname', 'user',
> > >        'pass',
> > >        'INSERT INTO table (viewee_industry_id,average_pv) VALUES(?,?)');
> > >
> > >
> > >
> > >
> > > A few things came to light while I was debugging it.
> > >
> > > DBStorage sets auto-commit to false,
> > > so when the batch is executed it is not committed automatically.
> > >
> > > After the batch executes, commitTask, the OutputCommitter method defined
> > > in DBStorage (in an inner class), is called; that is where the commit happens:
> > >
> > > if (ps != null) {
> > >     try {
> > >         System.out.println("Executing Batch in commitTask");
> > >         ps.executeBatch();
> > >         con.commit();
> > >         ps.close();
> > >         con.close();
> > >         ps = null;
> > >         con = null;
> > >     } catch (SQLException e) {
> > >         System.out.println("Exception in commitTask");
> > >         log.error("ps.close", e);
> > >         throw new IOException("JDBC Error", e);
> > >     }
> > > }
> > >
> > > and this method is called by PigOutputCommitter:
> > >
> > > public void commitTask(TaskAttemptContext context) throws IOException {
> > >     if (HadoopShims.isMap(context.getTaskAttemptID())) {
> > >         for (Pair<OutputCommitter, POStore> mapCommitter :
> > >                 mapOutputCommitters) {
> > >             if (mapCommitter.first != null) {
> > >                 TaskAttemptContext updatedContext = setUpContext(context,
> > >                         mapCommitter.second);
> > >                 mapCommitter.first.commitTask(updatedContext);
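[The auto-commit behaviour traced above can be modelled in a small stand-alone sketch. This is toy code, not real JDBC; the class and method names here are invented. With auto-commit off, executeBatch() only stages rows, and nothing becomes visible until commit() runs, which is what commitTask is responsible for:]

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not real JDBC) of why the batch stays invisible until commit:
// with autoCommit = false, executeBatch() only stages rows; they reach the
// "database" only when commit() runs (as DBStorage does in commitTask).
public class AutoCommitSketch {
    private final boolean autoCommit;
    private final List<String> staged = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();

    public AutoCommitSketch(boolean autoCommit) {
        this.autoCommit = autoCommit;
    }

    public void executeBatch(List<String> rows) {
        staged.addAll(rows);
        if (autoCommit) {
            commit(); // real JDBC would commit each statement as it executes
        }
    }

    public void commit() {
        committed.addAll(staged);
        staged.clear();
    }

    public int committedRows() {
        return committed.size();
    }

    public static void main(String[] args) {
        AutoCommitSketch sketch = new AutoCommitSketch(false);
        sketch.executeBatch(List.of("row1", "row2"));
        System.out.println(sketch.committedRows()); // prints 0: staged only
        sketch.commit();
        System.out.println(sketch.committedRows()); // prints 2
    }
}
```

[In this model, if commit() is never reached for one of the stores, the staged rows are silently dropped with no exception — one way rows could go missing without any error on the reducer.]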