prakash sejwani 2010-03-08, 16:31
Amr Awadallah 2010-03-09, 02:51
김영우 2010-03-09, 04:10
Re: regex_extract in hive
Nick Dimiduk 2010-03-10, 19:37
The parse_url UDF works in general, but the common use case is querying
Apache logs, which do not include the protocol or host portions of the URL,
so you need to add a concat() call. Also, the docs on parse_url are wrong
about the query parameter parsing feature. The describe statement in the
quoted message shows the actual syntax. In your case, you'll likely want
something like:

SELECT parse_url(concat("http://www.foo.com/", request), 'QUERY', 'tag') FROM log_table;

Cheers,
-Nick

On Mon, Mar 8, 2010 at 8:10 PM, 김영우 <[EMAIL PROTECTED]> wrote:

> Hi Prakash,
>
> You can extract the query string from a URL using the 'parse_url' UDF.
>
> hive> describe function parse_url;
> OK
> parse_url(url, partToExtract[, key]) - extracts a part from a URL
> Time taken: 0.024 seconds
> hive>
> hive> select parse_url('http://www.example.com/searches/tagged_with?company=2-Opico&page=8&product=36154-7653-BACKUP-PLATE-F-STORAGE-STAND&tag=demco',
> 'QUERY', 'tag') from r;
> .
> .
> Ended Job = job_200911251712_2046
> OK
> demco
> Time taken: 19.405 seconds
> hive>
>
> Regards,
> Youngwoo
>
> 2010/3/9 prakash sejwani <[EMAIL PROTECTED]>
>
>> Hi All,
>> I have the query below:
>> SELECT regexp_extract(resource,'/\&tag=([^\&]+)/') FROM a_log;
>>
>> It gives a blank result.
>>
>> The sample resource string looks like this:
>> "/searches/tagged_with?company=2-Opico&page=8&product=36154-7653-BACKUP-PLATE-F-STORAGE-STAND&tag=demco"
>> I want to extract demco out of it.
>>
>> Please help me with this.
>>
>> Thanks,
>> prakash
>> Econify Infotech
>> Mumbai
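
A note on the original regexp_extract query, offered as a sketch rather than a tested fix: Hive passes the pattern to Java's regex engine, which has no /.../ delimiter syntax, so the leading and trailing slashes are most likely being matched as literal characters. The sample string has no "/" directly before "&tag=", which would explain the blank result. Dropping the slashes and passing the capture group index explicitly should return demco for the sample row. The table and column names (a_log, resource) come from the question; that every row follows the sample's shape is an assumption.

```sql
-- Pattern without the Perl-style /.../ delimiters.
-- [?&] also matches when tag is the first query parameter.
-- The third argument (1) selects the first capture group.
SELECT regexp_extract(resource, '[?&]tag=([^&]+)', 1) FROM a_log;
```

This avoids the concat() step that parse_url needs, at the cost of not URL-aware parsing: a parameter such as "hashtag=..." is not matched here only because the pattern requires "?" or "&" immediately before "tag=".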