Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Drill, mail # dev - Review Request 14027: Adding Limit operator end to end


Copy link to this message
-
Re: Review Request 14027: Adding Limit operator end to end
Jacques Nadeau 2013-09-19, 01:34


> On Sept. 18, 2013, 6:39 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/limit/LimitRecordBatch.java, line 14
> > <https://reviews.apache.org/r/14027/diff/2/?file=350250#file350250line14>
> >
> >     Two questions/concerns here:
> >    
> >     It is possible that the limit might somewhere other than the first record batch isn't long enough to contain the records needed but a subsequent batch contains the requested records.  For example, if the requested limit is 100..200 but each batch only carries 50 records, you will return zero records as far as I can see.
> >    
> >     Secondly, since you're not necessarily consuming the entire upstream record batch, you should tell it to stop using kill so that those don't just hang/remain and stop reading upstream data.
>
> Timothy Chen wrote:
>     I was thinking about cross batches myself and I guess I didn't follow up after seeing it working with just one batch.
>     How do you usually carry state between batches then? Essentially I need to remember how many more records I need to filter between batches and potentially can even drop further batches once the limit is already fulfilled.

Your LimitBatch object should be able to hold this information fine.  You don't actually have to hold on to the batches just return them until you get to the last one.

It is better to stop further batches as opposed to read and drop them otherwise a simple select * limit 10 would still read all the data.
> On Sept. 18, 2013, 6:39 p.m., Jacques Nadeau wrote:
> > sqlparser/src/main/java/org/apache/drill/optiq/DrillSortRule.java, line 38
> > <https://reviews.apache.org/r/14027/diff/2/?file=350260#file350260line38>
> >
> >     This seems weird/wrong.  Maybe it is just the comment.  Should the comment read: // this rule only supports conversion of sorts that don't have offset and fetch.
> >    
> >     I guess the part that is weird is rules should be independent.  And there is no guarantee of what order they are applied in so referring to another rule in a rule seems wrong.  Am I missing something here?
>
> Julian Hyde wrote:
>     Jacques, yes you're missing something. It's simpler if a rule just does a small amount of work, and relies on work done by other rules. It would be wrong if an Optiq rule called another rule explicitly (and Optiq doesn't allow that). Rules are independent in the same way that targets in a make file are independent; they may rely on output from previous targets.
>
> Jacques Nadeau wrote:
>     I'm confused.  I feel like you just said the same thing I said.  But you said I'm missing something...
>    
>     I was specifically responding to the in code comment "//Sort already handled by DrillLimitRule".
>    
>     I was saying that it shouldn't talk about other rules and simply should state that it only does a transformation on sorts that have no limit or offset.  Or am I misinterpreting what you are saying, too?
>
> Timothy Chen wrote:
>     I think my comment is probably misleading as it was referring to other rules, I was just thinking to point there is already another rule covering the case when offset or fetch is defined.
>     I can remove this comment and I don't think there is any other concern about this right?

Got it, yes.
- Jacques
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14027/#review26231
-----------------------------------------------------------
On Sept. 10, 2013, 8:29 a.m., Timothy Chen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14027/
> -----------------------------------------------------------
>
> (Updated Sept. 10, 2013, 8:29 a.m.)
>
>
> Review request for drill and Jacques Nadeau.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Adding Limit operator end to end