Hive >> mail # user >> about hive limit optimization settings


Wu, James C. 2013-01-24, 23:12
Nitin Pawar 2013-01-25, 06:13
Re: about hive limit optimization settings
Hi James.

Suppose we have a Hive table called a that is mapped to the directory /data/a. Let n be the number of files under /data/a, and let s be the size of each row.

hive -e "select * from a limit 10"

To return the result very quickly, set

hive.limit.optimize.limit.file < n
which in this case will be 10

and hive.limit.row.max.size = s, which may vary according to the actual data.
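
As a rough sketch of how this could look in a Hive session: the values below are simply the defaults from the hive-site.xml snippet quoted further down, not tuned recommendations. The hive.limit.optimize.enable line is not mentioned in this thread; it is included on the assumption that the LIMIT optimization has to be switched on before the other two settings take effect.

```sql
-- Assumption: the LIMIT optimization must be enabled for the settings below to matter.
SET hive.limit.optimize.enable=true;

-- Maximum number of files to sample for a simple LIMIT query
-- (10 is the default shown in the quoted configuration).
SET hive.limit.optimize.limit.file=10;

-- Per-row size, in bytes, assumed when sizing the sample
-- (100000 is the default shown in the quoted configuration; adjust to the actual data).
SET hive.limit.row.max.size=100000;

-- Table a is the example table mapped to /data/a.
SELECT * FROM a LIMIT 10;
```

The same statements could also be passed on the command line with hive -e, separated by semicolons.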

Hope this helps.

The hive.limit.row.max.size controls the size of each
Hortonworks, Inc.
Technical Support Engineer
Abdelrahman Shettia
[EMAIL PROTECTED]
Office phone: (708) 689-9609
How am I doing?   Please feel free to provide feedback to my manager Rick Morris at [EMAIL PROTECTED]
On Jan 24, 2013, at 3:12 PM, "Wu, James C." <[EMAIL PROTECTED]> wrote:

> Hi,
>  
> Does anyone know the meaning of these Hive settings? Their descriptions are not clear to me. If someone could give me an example of how they should be used, that would be great!
>  
> <property>
>   <name>hive.limit.row.max.size</name>
>   <value>100000</value>
>   <description>When trying a smaller subset of data for simple LIMIT, how much size we need to guarantee
>    each row to have at least.</description>
> </property>
>  
> <property>
>   <name>hive.limit.optimize.limit.file</name>
>   <value>10</value>
>   <description>When trying a smaller subset of data for simple LIMIT, maximum number of files we can
>    sample.</description>
> </property>
>  
> Regards,
>  
> James
>  
>  
