Sqoop, mail # user - Creating compressed data with Sqoop


Re: Creating compressed data with Sqoop
Jarek Jarcec Cecho 2013-01-07, 12:31
Hi Santosh,
Sqoop delegates compression/decompression to the MapReduce framework. Thus Sqoop's options might be overridden by your MapReduce configuration (for example, by a setting that forbids compressed MapReduce output).

Would you mind sharing with us your:

* Hadoop version
* Sqoop version
* TaskTracker configuration file (mapred-site.xml)
* Job XML (i.e. the configuration file) for the job generated by Sqoop

I'm particularly looking for:

* io.compression.codecs - must contain the codec you're using with Sqoop
* mapred.compress.map.output - must be set to true at the job XML level, and must not be marked final with value false in the TaskTracker configuration
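For reference, the two properties above would look roughly like this in the job XML (a sketch only; the codec list shown is an example and should match whatever codecs are actually installed on your cluster):

```xml
<!-- io.compression.codecs: the codec passed to Sqoop
     (here SnappyCodec) must appear in this list -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

<!-- mapred.compress.map.output: must end up true in the job XML;
     if mapred-site.xml sets it with <final>true</final> and value
     false, the job-level setting is silently ignored -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
```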

Jarcec

On Mon, Jan 07, 2013 at 07:34:35PM +0800, Santosh Achhra wrote:
> Hello,
>
> I am trying to import data from a table, and I would like the final data to
> be compressed on HDFS, which would help save some space.
>
> I am executing the command mentioned below.
> The command completes successfully and I don't see any errors reported;
> however, when I look at the final data using the hadoop ls command, it is
> not in compressed format on HDFS.
>
> sqoop --options-file /export/home/sqoop/connect.parm --table TEST
>  --split-by F1  --compression-codec
> org.apache.hadoop.io.compress.SnappyCodec -z
>
> Am I missing something ?
>
> Also, I would like to know if I can import data into Hive in compressed
> format. I executed the command mentioned below; in this case the data on
> HDFS is not in compressed format, and the describe table command in Hive
> says that the table is not compressed.
>
> sqoop --options-file /export/home/sqoop/connect.parm --table TEST
>  --split-by F1  --hive-import   -m 1  --compress --compression-codec
> org.apache.hadoop.io.compress.GzipCodec
>
> Good wishes, always!
> Santosh