Hive >> mail # user >> parse_url returning NULL


Re: parse_url returning NULL
Hello Edward,

       Thank you so much for the quick response. I'll try it out. But I
would like to know, is this something Hive-specific? Links do work without a
scheme, like *hive.apache.org*.

Thanks again.

Warm Regards,
Tariq
cloudfront.blogspot.com
On Tue, Jun 11, 2013 at 3:40 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> It is not a valid URL if it does not have a scheme, so it cannot be parsed.
>
> SELECT if (column like 'http%', column, concat( 'http://', column) ) as
> column might do what you need.
>
>
> On Mon, Jun 10, 2013 at 5:59 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
>> Hello list,
>>
>>          I have a file stored in HDFS that contains some URLs. The file
>> looks like this:
>> abc.in
>> xyz.net
>> http://tariq.com
>> http://tariq.in/sompath
>>
>> And I'm trying to get the hostnames from these URLs using *parse_url*.
>> It works fine except for the URLs that do not contain a scheme. So when
>> I issue
>>
>> hive> select parse_url(url, 'HOST') from url;
>>
>> it gives me:
>>
>> NULL
>> NULL
>> tariq.com
>> tariq.in
>>
>> Could someone please point out the mistake? Many thanks.
>>
>>  Warm Regards,
>> Tariq
>> cloudfront.blogspot.com
>>
>
>
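
On the question of whether this is Hive-specific: as far as I know, Hive's parse_url relies on java.net.URL under the hood (an assumption, not confirmed in this thread), and that class rejects any string with no scheme. Browsers accept hive.apache.org only because they add http:// themselves. Below is a minimal Java sketch of that behavior, using the sample values from the original message. The class UrlHostDemo and its helpers host and withScheme are hypothetical names chosen for illustration, not part of Hive.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlHostDemo {
    // Returns the host part, or null when java.net.URL cannot parse the
    // string. This mirrors the NULL rows reported in the thread (assumed
    // to come from the same underlying parser).
    static String host(String s) {
        try {
            return new URL(s).getHost();
        } catch (MalformedURLException e) {
            return null; // e.g. "no protocol: abc.in"
        }
    }

    // Hypothetical helper: same idea as the suggested
    // if(column like 'http%', column, concat('http://', column)).
    static String withScheme(String s) {
        return s.startsWith("http") ? s : "http://" + s;
    }

    public static void main(String[] args) {
        String[] samples = {"abc.in", "xyz.net", "http://tariq.com", "http://tariq.in/sompath"};
        for (String s : samples) {
            // Print the raw result next to the result after adding a scheme.
            System.out.println(s + " -> " + host(s) + " | " + host(withScheme(s)));
        }
    }
}
```

The prefix check is deliberately as loose as the one in the suggested query: a bare hostname that happens to start with "http" would be left without a scheme and would still come back as null.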