Sqoop >> mail # user >> Events get lost. File encode issue


Events get lost. File encode issue
Dear all,

I have a very simple configuration:

agent.sources = source
agent.channels = channel
agent.sinks = sink

agent.sources.source.type = org.apache.flume.source.SpoolDirectorySource

agent.sources.source.spoolDir = /home/data
agent.sources.source.channels = channel
agent.sources.source.fileHeader = true
agent.sources.source.deletionPolicy = never
agent.sources.source.batchSize= 1000

agent.channels.channel.type = file
agent.channels.channel.checkpointDir = /checkpoint
agent.channels.channel.dataDirs = /data

agent.sinks.sink.channel = channel
agent.sinks.sink.type = file_roll
agent.sinks.sink.sink.directory = /tmp/output
agent.sinks.sink.batchSize= 67108864

and I am missing events. For instance, for a given file of 4000 events, I only receive 3000. If I edit the data file and delete line 3000, I receive all the remaining records (3999 events). If I instead remove the end-of-line from the line containing event 3001 and insert it again (both in the vi editor), it also works (4000 events received).
My data file:
> file -b data.csv
UTF-8 Unicode text, with CRLF line terminators
> grep -c $'\r\n' data.csv
4000

It looks like all the lines have the same line ending.
I have several files from the same repository with the same problem (each failing at a different line number).
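
A byte-level scan can double-check this more directly than the grep count, since a newline inside a grep pattern separates alternative patterns rather than matching a literal line feed. Below is a minimal Python sketch, not part of my setup: the function name `scan_lines` is hypothetical, and it assumes the file is meant to be UTF-8 with CRLF terminators, as `file` reports. It lists the lines whose terminator is not CRLF or whose bytes do not decode.

```python
import sys


def scan_lines(data: bytes, encoding: str = "utf-8"):
    """Return (line_number, problem) pairs for suspicious lines.

    Hypothetical helper: assumes every line should end in CRLF and
    decode cleanly in the given encoding.
    """
    problems = []
    # bytes.splitlines splits on \n, \r and \r\n, so a stray bare \r
    # in the middle of a record shows up as its own short "line".
    for number, raw in enumerate(data.splitlines(keepends=True), start=1):
        if not raw.endswith(b"\r\n"):
            problems.append((number, "terminator is not CRLF"))
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            problems.append(
                (number, f"invalid {encoding} byte at offset {exc.start}")
            )
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_lines.py data.csv
    with open(sys.argv[1], "rb") as handle:
        for number, problem in scan_lines(handle.read()):
            print(f"line {number}: {problem}")
```

If this reports something around line 3000 or 3001, that would at least narrow down whether the problem is in the data or in the agent.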
Any idea?

Thanks,

Zoraida.-

From: Venkat Ranganathan <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Tuesday, July 30, 2013 17:40
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: Re: Sqoop Connect with oracle

You only need the Oracle Type 4 (thin) JDBC driver to use Sqoop, not the complete client installation. The JDBC URL has the format jdbc:oracle:thin:@host:port:sid or jdbc:oracle:thin:@//host:port/service
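
As a rough illustration of the two URL shapes, here is a small Python sketch. The helper name `oracle_thin_url` and the host, port, and database names in the usage lines are hypothetical, not taken from this thread.

```python
def oracle_thin_url(host, port, sid=None, service=None):
    """Build an Oracle thin-driver JDBC URL from a SID or a service name.

    Hypothetical helper for illustration; exactly one of sid or
    service must be given.
    """
    if (sid is None) == (service is None):
        raise ValueError("pass exactly one of sid or service")
    if sid is not None:
        # SID form: host, port and SID separated by colons
        return f"jdbc:oracle:thin:@{host}:{port}:{sid}"
    # Service-name form: note the leading // and the slash before the name
    return f"jdbc:oracle:thin:@//{host}:{port}/{service}"


# Placeholder values, for illustration only
print(oracle_thin_url("dbhost", 1521, sid="ORCL"))
print(oracle_thin_url("dbhost", 1521, service="orcl.example.com"))
```

The resulting string is what would be passed as the connection URL in the import command.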

Thanks

Venkat
On Tue, Jul 30, 2013 at 4:53 AM, Manickam P <[EMAIL PROTECTED]> wrote:
Hi Experts,

Do we need to install the Oracle client to connect to an Oracle database from Sqoop? I ask because we need to specify the service name in the import command. Will it work without the Oracle client?

Please tell me.
Thanks,
Manickam P