Re: LIKE filter pushdown for tables and partitions
sorry to be dumb-ass but what does that translate into in the HSQL dialect?

Judging from the name you use, getPartitionsByFilter, you're saying you
want to remove the use case of using like clause on a partition column?

if so, um, yeah, i would think that's surely used.

On Mon, Aug 26, 2013 at 7:48 PM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:

> Adding user list. Any objections to removing LIKE support from
> getPartitionsByFilter?
>
> On Mon, Aug 26, 2013 at 2:54 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:
>
> > Couple of questions:
> >
> > 1. What about the LIKE operator in Hive itself? Will that continue to work
> > (presumably because there is an alternative path for it)?
> > 2. This will nonetheless break other direct consumers of the metastore
> > client API (like HCatalog).
> >
> > I see your point that we have a buggy implementation, so what's out there
> > is not safe to use. The question then really is: shall we remove this code,
> > thereby breaking people for whom the current buggy implementation is good
> > enough (or, you could say, saving them from breaking in the future)? Or
> > shall we try to fix it now?
> > My take is that if there are no users of this anyway, then there is no
> > point fixing it for non-existent users, but if there are, we probably have
> > to. I suggest you send an email to users@hive to ask whether anyone uses
> > this.
> >
> > Thanks,
> > Ashutosh
> >
> >
> >
> > On Mon, Aug 26, 2013 at 2:08 PM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:
> >
> > > Since there's no response I am assuming nobody cares about this code...
> > > Jira is HIVE-5134; I will attach a patch with the removal this week.
> > >
> > > On Wed, Aug 21, 2013 at 2:28 PM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi.
> > > >
> > > > I think there are issues with the way Hive currently does JDO pushdown
> > > > of the LIKE operator, and the code should be removed for partitions
> > > > and tables.
> > > > Are there objections to removing LIKE from Filter.g and related areas?
> > > > If not, I will file a JIRA and do it.
> > > >
> > > > Details:
> > > > There's code in the metastore that is capable of pushing down LIKE
> > > > expressions into JDO for string partition keys, as well as for tables.
> > > > The code for tables doesn't appear to be used, and the partition code
> > > > definitely doesn't run in Hive proper because the metastore client
> > > > doesn't send LIKE expressions to the server. It may be used in e.g. HCat
> > > > and other places, but after asking some people here, I found out it
> > > > probably isn't.
> > > > I was trying to make it run and noticed some problems:
> > > > 1) For partitions, Hive sends SQL patterns in a filter for LIKE, e.g.
> > > > "%foo%", whereas the metastore passes them into the matches() JDOQL
> > > > method, which expects a Java regex.
> > > > 2) After converting the pattern to a Java regex via the UDFLike method,
> > > > I found out that not all regexes appear to work in DN. ".*foo" seems to
> > > > work, but anything complex (such as escaping the pattern using
> > > > Pattern.quote, which UDFLike does) breaks and no longer matches
> > > > properly.
> > > > 3) I tried to implement the common cases using the JDO methods
> > > > startsWith/endsWith/indexOf (I will file a JIRA), but when I run tests
> > > > on Derby, they also appear to have problems with some strings. For
> > > > example, a partition with a backslash in its name cannot be matched by
> > > > LIKE "%\%" (single backslash in a string) after being converted to
> > > > .indexOf(param) where param is "\". Escaping the backslash once again
> > > > doesn't work either, and anyway there's no documented reason why it
> > > > shouldn't work properly. Other characters match correctly, even
> > > > e.g. "%".
> > > >
> > > > For tables, there's no SQL-style LIKE; it expects a Java regex, but I am
> > > > not convinced all Java regexes are going to work.
> > > >
> > > > So I think that, for the sake of future correctness, it's better to
> > > > remove this code.
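
To make points 1) and 2) of the original message concrete, here is a minimal plain-Java sketch of the mismatch between a SQL LIKE pattern and a Java regex. It is not the UDFLike code and it does not touch JDO or DN at all; the class name LikePatternDemo and the helper likeToRegex are hypothetical, and LIKE escape sequences are deliberately not handled. It only shows that a raw "%foo%" handed to a regex matcher does not behave like LIKE, and what a Pattern.quote-based conversion looks like in plain Java.

```java
import java.util.regex.Pattern;

public class LikePatternDemo {

    /**
     * Hypothetical helper: converts a SQL LIKE pattern to a Java regex.
     * '%' becomes ".*", '_' becomes ".", and every other run of characters
     * is quoted literally with Pattern.quote. LIKE escape sequences are
     * NOT handled in this sketch; a backslash is treated as a literal.
     */
    public static String likeToRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%' || c == '_') {
                // Flush any pending literal text before the wildcard.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String value = "xfoox";

        // Used directly as a regex, "%foo%" only matches the literal text "%foo%".
        System.out.println(value.matches("%foo%"));              // false

        // After conversion, the regex matches the way LIKE '%foo%' would.
        System.out.println(value.matches(likeToRegex("%foo%"))); // true

        // The converted regex contains the \Q...\E quoting produced by Pattern.quote.
        System.out.println(likeToRegex("%foo%"));
    }
}
```

In java.util.regex the quoted form matches as expected, including for a literal backslash. Whether a given JDO implementation or backing database accepts the same quoted regex is a separate question, and the report above suggests it does not in every case.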