Sqoop >> mail # user >> /tmp dir for import configurable?


Re: /tmp dir for import configurable?
Hi Jarcec,

I am running the command on the CLI of a cluster node. It appears to run a
local MR job that writes the results to /tmp before sending them to S3:

[..]
[hostaddress] out: 13/04/02 01:52:49 INFO mapreduce.MySQLDumpMapper:
Beginning mysqldump fast path import
[hostaddress] out: 13/04/02 01:52:49 INFO mapreduce.MySQLDumpMapper:
Performing import of table image from database some_db
[hostaddress] out: 13/04/02 01:52:49 INFO mapreduce.MySQLDumpMapper:
Converting data to use specified delimiters.
[hostaddress] out: 13/04/02 01:52:49 INFO mapreduce.MySQLDumpMapper: (For
the fastest possible import, use
[hostaddress] out: 13/04/02 01:52:49 INFO mapreduce.MySQLDumpMapper:
--mysql-delimiters to specify the same field
[hostaddress] out: 13/04/02 01:52:49 INFO mapreduce.MySQLDumpMapper:
delimiters as are used by mysqldump.)
[hostaddress] out: 13/04/02 01:52:54 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:52:55 INFO mapred.JobClient:  map 100%
reduce 0%
[hostaddress] out: 13/04/02 01:52:57 INFO mapred.LocalJobRunner:
[..]
[hostaddress] out: 13/04/02 01:53:03 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:54:42 INFO mapreduce.MySQLDumpMapper:
Transfer loop complete.
[hostaddress] out: 13/04/02 01:54:42 INFO mapreduce.MySQLDumpMapper:
Transferred 668.9657 MB in 113.0105 seconds (5.9195 MB/sec)
[hostaddress] out: 13/04/02 01:54:42 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:54:42 INFO s3native.NativeS3FileSystem:
OutputStream for key
'some_table/_temporary/_attempt_local555455791_0001_m_000000_0/part-m-00000'
closed. Now beginning upload
[hostaddress] out: 13/04/02 01:54:42 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:54:45 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:55:31 INFO s3native.NativeS3FileSystem:
OutputStream for key
'some_table/_temporary/_attempt_local555455791_0001_m_000000_0/part-m-00000'
upload complete
[hostaddress] out: 13/04/02 01:55:31 INFO mapred.Task:
Task:attempt_local555455791_0001_m_000000_0 is done. And is in the process
of commiting
[hostaddress] out: 13/04/02 01:55:31 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:55:31 INFO mapred.Task: Task
attempt_local555455791_0001_m_000000_0 is allowed to commit now
[hostaddress] out: 13/04/02 01:55:36 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:56:03 WARN output.FileOutputCommitter:
Failed to delete the temporary output directory of task:
attempt_local555455791_0001_m_000000_0 - s3n://secret@bucketsomewhere
/some_table/_temporary/_attempt_local555455791_0001_m_000000_0
[hostaddress] out: 13/04/02 01:56:03 INFO output.FileOutputCommitter: Saved
output of task 'attempt_local555455791_0001_m_000000_0' to
s3n://secret@bucketsomewhere/some_table
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.LocalJobRunner:
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.Task: Task
'attempt_local555455791_0001_m_000000_0' done.
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.LocalJobRunner: Finishing
task: attempt_local555455791_0001_m_000000_0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.LocalJobRunner: Map task
executor complete.
[hostaddress] out: 13/04/02 01:56:03 INFO s3native.NativeS3FileSystem:
OutputStream for key 'some_table/_SUCCESS' writing to tempfile '*
/tmp/hadoop-jenkins/s3/output-1400873345908825433.tmp*'
[hostaddress] out: 13/04/02 01:56:03 INFO s3native.NativeS3FileSystem:
OutputStream for key 'some_table/_SUCCESS' closed. Now beginning upload
[hostaddress] out: 13/04/02 01:56:03 INFO s3native.NativeS3FileSystem:
OutputStream for key 'some_table/_SUCCESS' upload complete
[...deleting cached jars...]
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient: Job complete:
job_local555455791_0001
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient: Counters: 23
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:   File System
Counters
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     FILE:
Number of bytes read=6471451
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     FILE:
Number of bytes written=6623109
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     FILE:
Number of read operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     FILE:
Number of large read operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     FILE:
Number of write operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     HDFS:
Number of bytes read=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     HDFS:
Number of bytes written=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     HDFS:
Number of read operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     HDFS:
Number of large read operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     HDFS:
Number of write operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     S3N: Number
of bytes read=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     S3N: Number
of bytes written=773081963
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     S3N: Number
of read operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     S3N: Number
of large read operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     S3N: Number
of write operations=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:   Map-Reduce
Framework
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     Map input
records=1
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     Map output
records=14324124
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     Input split
bytes=87
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     Spilled
Records=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     CPU time
spent (ms)=0
[hostaddress] out: 13/04/02 01:56:03 INFO mapred.JobClient:     Physical
memory (bytes) snapshot=0
[hostaddress] out: 13/04/02
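
To confirm where the local buffering happens, the tempfile path can be pulled out of the NativeS3FileSystem lines in the log above. The following is only a rough sketch: the function name `buffer_dirs` and the regular expression are my own, not part of Sqoop or Hadoop, and the regex assumes the message wording shown in this log. As far as I know, in Hadoop 1.x this directory is governed by the `fs.s3.buffer.dir` property, which defaults to `${hadoop.tmp.dir}/s3`; please verify that against the core-default.xml of your own distribution before relying on it.

```python
import re
from pathlib import PurePosixPath

# Matches the NativeS3FileSystem message that names the local buffer file.
# The optional asterisks cover the emphasis markers added in this mail.
TEMPFILE_RE = re.compile(r"writing to tempfile '\*?\s*([^'*]+?)\*?'")


def buffer_dirs(log_text):
    """Return the set of local directories used for S3 upload tempfiles.

    Hypothetical helper for illustration only.
    """
    # Mail clients wrap long log lines, so rejoin them with single spaces
    # before matching. A path that was itself wrapped would not survive this.
    joined = " ".join(line.strip() for line in log_text.splitlines())
    return {
        str(PurePosixPath(match.group(1).strip()).parent)
        for match in TEMPFILE_RE.finditer(joined)
    }


sample = """\
[hostaddress] out: 13/04/02 01:56:03 INFO s3native.NativeS3FileSystem:
OutputStream for key 'some_table/_SUCCESS' writing to tempfile '*
/tmp/hadoop-jenkins/s3/output-1400873345908825433.tmp*'
"""

print(buffer_dirs(sample))
```

For the excerpt above this reports `/tmp/hadoop-jenkins/s3`, which is consistent with a buffer directory derived from a `hadoop.tmp.dir` of `/tmp/hadoop-${user.name}` for the jenkins user.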