Re: Count empty relation after filtering
So basically this means that we were looking at this from an RDBMS SQL
perspective, where 'SELECT COUNT(*) FROM TABLE' returns 0 even if there is
nothing in the result set, and that is why we ignored the possibility that
FOREACH might not be executed at all (which could be by design)?
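
A minimal sketch of one possible workaround, not something tested in this
thread: GROUP ALL only emits its single group when the input relation has at
least one tuple, so grouping the empty 'test' relation leaves FOREACH with
nothing to iterate over. Grouping the unfiltered 'records' first and filtering
inside a nested FOREACH should keep that single group alive, so COUNT sees an
empty bag and can return 0. The aliases 'all_records' and 'valid_rows' are
made up for illustration.

```pig
records = LOAD 'test.csv' USING PigStorage(',') AS (id:long, valid:boolean);

-- Group BEFORE filtering: one group exists as long as 'records' is non-empty.
all_records = GROUP records ALL;

-- Filter inside the nested block; the bag may be empty, but the group remains.
test_count = FOREACH all_records {
    valid_rows = FILTER records BY valid == true;
    GENERATE COUNT(valid_rows);
};

DUMP test_count;
```

If 'test.csv' itself contains no rows, this would still print nothing, because
then there is no group for FOREACH to process either.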

-Shahab
On Wed, May 29, 2013 at 10:13 AM, Marco Brinkmann
<[EMAIL PROTECTED]> wrote:

> Thanks, but this does not change anything. My personal guess (and I have
> only been working with Pig for a few days) is that FOREACH is never
> executed, because the relation 'test' is empty.
>
>
> 2013/5/29 Shahab Yunus <[EMAIL PROTECTED]>
>
> > Try COUNT_STAR.
> >
> > -Shahab
> >
> >
> > On Wed, May 29, 2013 at 9:55 AM, Marco Brinkmann <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Hi everybody,
> > >
> > > I have a rather simple question and scenario, but still I could not find
> > > an answer in the documentation or in other resources:
> > >
> > > id, valid
> > > (1, false)
> > > (2, false)
> > >
> > > records = LOAD 'test.csv' USING PigStorage(',') AS (id:long,
> > > valid:boolean);
> > >
> > > test = FILTER records BY valid == true;
> > > test_count = FOREACH (GROUP test ALL) GENERATE COUNT(test);
> > >
> > > DUMP test_count;
> > >
> > >
> > > I would expect 'test_count' to now contain '0'. But the dump is
> > > completely empty (with 'valid == false' I get '(2)' as expected). I use
> > > Pig 0.11.1.
> > >
> > > Could someone point me in the right direction?
> > >
> > > Cheers, Marco
> > >
> >
>