
Re: only one mapper

Try this setting in your hive query

SET mapreduce.input.fileinputformat.split.maxsize=<some bytes>;

If you set this value low, the MR job will use it as the maximum split size for the input LZO files, and you will get multiple mappers. Also make sure the input LZO files are indexed, i.e. that the .lzo.index files have been created.
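
As a rough illustration of how the maximum split size relates to the mapper count, here is a minimal Python sketch. It is not Hive or Hadoop code: the function name `estimate_mappers` is hypothetical, and the model (one mapper per split of at most the maximum size) is a simplifying assumption. The real count also depends on the input format, on where the LZO index allows a split, and on any minimum split or combine settings.

```python
def estimate_mappers(input_bytes: int, max_split_bytes: int) -> int:
    """Estimate mappers as ceil(input_bytes / max_split_bytes).

    Assumption: the input is fully splittable (e.g. indexed LZO) and each
    split gets its own mapper. Real splits follow LZO block boundaries,
    so treat the result as an approximation only.
    """
    if input_bytes < 0 or max_split_bytes <= 0:
        raise ValueError("input_bytes must be >= 0 and max_split_bytes > 0")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-input_bytes // max_split_bytes)


if __name__ == "__main__":
    # Input size taken from the job log quoted further down in this thread.
    input_bytes = 2304560827
    for split_mb in (256, 128, 64):
        max_split = split_mb * 1024 * 1024
        print(f"split.maxsize={max_split}: "
              f"about {estimate_mappers(input_bytes, max_split)} mappers")
```

With the 2304560827-byte input from the quoted log, a 128 MB maximum split gives an estimate of 18, which matches the 18 blocks the original poster mentions; halving the value to 64 MB roughly doubles the estimate.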

From: Edward Capriolo <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>>
Date: Wednesday, August 21, 2013 10:43 AM
Subject: Re: only one mapper

LZO files are only splittable if you index them. Sequence files compressed with LZO are splittable without being indexed.

Snappy + SequenceFile is a better option than LZO.
On Wed, Aug 21, 2013 at 1:39 PM, Igor Tatarinov <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
LZO files are combinable so check your max split setting.
http://mail-archives.apache.org/mod_mbox/hive-user/201107.mbox/%[EMAIL PROTECTED]%3E


On Wed, Aug 21, 2013 at 2:17 AM, 闫昆 <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
Hi all,
When I use Hive, the job launches only one mapper, even though my file spans 18 blocks. My block size is 128 MB and the data size is 2 GB.
I use LZO compression: I created file.lzo and built the index file.lzo.index.
I am using Hive 0.10.0.

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Cannot run job locally: Input Size (= 2304560827) is larger than hive.exec.mode.local.auto.inputbytes.max (= 134217728)
Starting Job = job_1377071515613_0003, Tracking URL = http://hydra0001:8088/proxy/application_1377071515613_0003/
Kill Command = /opt/module/hadoop-2.0.0-cdh4.3.0/bin/hadoop job  -kill job_1377071515613_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-08-21 16:44:30,237 Stage-1 map = 0%,  reduce = 0%
2013-08-21 16:44:40,495 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 6.81 sec
2013-08-21 16:44:41,710 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 6.81 sec
2013-08-21 16:44:42,919 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 6.81 sec
2013-08-21 16:44:44,117 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 9.95 sec
2013-08-21 16:44:45,333 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 9.95 sec
2013-08-21 16:44:46,530 Stage-1 map = 5%,  reduce = 0%, Cumulative CPU 13.0 sec


In the Hadoop world I am just a novice, exploring the entire Hadoop ecosystem. I hope that one day I can contribute my own code.
