Johannes Schwenk 2012-10-15, 10:04
Joe Crobak 2012-10-22, 02:01
Re: Set block size of output
On 22.10.2012 04:01, Joe Crobak wrote:
> Hi Johannes,
>
> HDFS block size is controlled by the property 'dfs.blocksize'. You should
> be able to use `set` to control this within your Pig script (see
> http://pig.apache.org/docs/r0.10.0/cmds.html#set). I think that it should
> also work to pass that in via PIG_OPTS, e.g.
> PIG_OPTS='-Ddfs.blocksize=1048576'

Hi Joe,

thanks, this works well. It's dfs.block.size by the way.
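
For reference, the PIG_OPTS form with the property name that worked here can be sketched as follows. This is only a minimal sketch, assuming a POSIX shell; the script name myscript.pig is hypothetical, and which property name is accepted (dfs.block.size or dfs.blocksize) may depend on the Hadoop version in use.

```shell
# 1 MiB expressed in bytes, the value used earlier in this thread.
BLOCK_SIZE=$((1024 * 1024))

# Property name as confirmed in this thread; other Hadoop versions
# may expect dfs.blocksize instead.
export PIG_OPTS="-Ddfs.block.size=${BLOCK_SIZE}"

# Show what will be handed to Pig.
echo "$PIG_OPTS"

# Then run the script, e.g.: pig myscript.pig   (hypothetical file name)
```

The in-script form would be a SET statement with the same property name and value, placed before the STORE.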

Now, is it possible to set this on a per-STORE-statement basis? If I
have two STORE statements and want the first of them to use the default
block size and the second a very small block size, I would expect this
to work:
[...]
STORE a INTO '/user/schwenk/out/a';
SET dfs.block.size 2048;
STORE b INTO '/user/schwenk/out/b';
To my surprise, the files in out/a also had a block size of only 2 KB!

What can I do? Do I have to write my own storage function for this?
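
One possible workaround, not confirmed anywhere in this thread: since the SET apparently affects the whole script run, the two STORE statements could be moved into separate scripts, each run as its own pig invocation with its own PIG_OPTS. The following is a minimal sketch under that assumption, for a POSIX shell; store_a.pig and store_b.pig are hypothetical file names, and block_size_opts is a helper made up for illustration.

```shell
# Build the -D option for a given HDFS block size in bytes.
# (Hypothetical helper; property name as used in this thread.)
block_size_opts() {
    printf '%s\n' "-Ddfs.block.size=$1"
}

# Intended usage, left commented out because it needs a Pig installation
# and the two hypothetical scripts:
#
#   PIG_OPTS="" pig store_a.pig                          # default block size
#   PIG_OPTS="$(block_size_opts 2048)" pig store_b.pig   # 2 KB blocks

# Print the option string that the second run would receive.
block_size_opts 2048
```

The drawback is that any relations shared by both outputs would be computed twice, once per invocation.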

Thanks,
Johannes

> HTH,
> Joe
>
> On Mon, Oct 15, 2012 at 6:04 AM, Johannes Schwenk <
> [EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I would like to set the HDFS block size of my Pig script's output files.
>> How do I do that? I tried to use
>>
>> PIG_OPTS="-Dpig.path.block.size=1048576";
>>
>> which seemed to me the only appropriate option I could find.
>>
>> Thanks for any hints!
>> Johannes Schwenk
>>
>> --
>> Software Developer (Reporting)
>> ________________________________________________________
>>
>> ADITION technologies AG
>> Schwarzwaldstraße 78b
>> 79117 Freiburg
>>
>> http://www.adition.com
>>
>> T +49 / (0)761 / 88147 - 30
>> F +49 / (0)761 / 88147 - 77
>> SUPPORT +49  / (0)1805 - ADITION
>>
>> (landline rate 14 ct/min; mobile rates at most 42 ct/min)
>>
>> Registered at the Düsseldorf District Court under HRB 54076
>> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
>> Chairman of the Supervisory Board: Daniel Raimer, attorney-at-law
>> VAT ID No.: DE 218 858 434
>>
>>
>

Johannes Schwenk


Johannes Schwenk 2012-10-15, 11:53