Flume, mail # user - ExecSource copy does not match original. Thoughts please?


ExecSource copy does not match original. Thoughts please?
Chris Neal 2013-07-31, 18:48
Hi all.

I have an ExecSource doing a tail -F on a log4j log file for an app,
copying it into HDFS.  I get no errors, warnings, or exceptions from the Flume
nodes, but when I checked whether the contents of the files actually
matched, I found that they did not. :(  I tested several days' worth of
files, and none matched.  I'm not sure where to even start looking into this
discrepancy. Does anyone have any thoughts?

If I had come across errors somewhere, I would understand some
differences, but for everything to appear to work fine and then not match
up concerns me.

Thank you very much for any input.
Chris

Line count of the copy in HDFS, as written by Flume:
[root@hadoopnn01 ~]# time sudo -u hdfs hadoop fs -text /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.* | wc -l

2812850

Line count of the actual source file:
cneal@pegslog14[504]:/pegs/logcabin01/udprodae01/pegs/logs/udprodae01/d1c1_udprodae01/UD> time wc -l UDXMLTrans.log.2013-07-27

 2812843 UDXMLTrans.log.2013-07-27
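
The HDFS copy has 2812850 - 2812843 = 7 more lines than the source. A raw count only shows that net surplus; it cannot tell whether lines were duplicated, dropped, or both. Because the copy is split across several files, line order across parts is not preserved either, so the comparison has to be order-insensitive. The following is a minimal sketch of such a comparison, assuming the source file and the decompressed HDFS output can both be streamed as bytes on one machine. The names `line_digests` and `compare` are hypothetical helpers for illustration, not part of Flume or Hadoop.

```python
import hashlib
from collections import Counter
from typing import Iterable, Tuple


def line_digests(lines: Iterable[bytes]) -> Counter:
    """Count occurrences of each distinct line, keyed by its SHA-1 digest.

    Storing digests instead of full lines keeps memory bounded even when
    individual log lines are several kilobytes long.
    """
    counts: Counter = Counter()
    for line in lines:
        # Strip the terminator so "\n" vs "\r\n" does not cause false mismatches.
        counts[hashlib.sha1(line.rstrip(b"\r\n")).digest()] += 1
    return counts


def compare(source: Iterable[bytes], copy: Iterable[bytes]) -> Tuple[int, int]:
    """Return (lines missing from copy, extra lines in copy), ignoring order."""
    src = line_digests(source)
    dst = line_digests(copy)
    # Counter subtraction keeps only positive counts, i.e. the surplus on one side.
    missing = sum((src - dst).values())
    extra = sum((dst - src).values())
    return missing, extra


if __name__ == "__main__":
    # Net surplus from the line counts reported in the post.
    print(2812850 - 2812843)  # prints 7

    # Toy data: the copy repeats one line and loses none.
    original = [b"a\n", b"b\n", b"c\n"]
    copied = [b"b\n", b"a\n", b"b\n", b"c\n"]
    print(compare(original, copied))  # prints (0, 1)
```

If `compare` reports zero missing lines and seven extra ones, the copy contains duplicates only; a nonzero first value would mean lines were also lost, which the net count of 7 would hide.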

The source file:
cneal@pegslog14[505]:/pegs/logcabin01/udprodae01/pegs/logs/udprodae01/d1c1_udprodae01/UD> ls -l UDXMLTrans.log.2013-07-27
-rw-r--r--   1 logger   other    19228787343 Jul 28 00:00 UDXMLTrans.log.2013-07-27

The files in HDFS:
[root@hadoopnn01 ~]# time sudo -u hdfs hadoop fs -ls /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.*
Found 1 items
-rw-r--r--   3 flume supergroup  200021549 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_1.1374883211499.gz
Found 1 items
-rw-r--r--   3 flume supergroup  195398211 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_10.1374883210982.gz
Found 1 items
-rw-r--r--   3 root  supergroup  193557330 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_13.1374883212709.gz
Found 1 items
-rw-r--r--   3 root  supergroup  194163091 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_14.1374883212712.gz
Found 1 items
-rw-r--r--   3 flume supergroup  192546288 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_2.1374883211446.gz
Found 1 items
-rw-r--r--   3 root  supergroup  191863735 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_5.1374883208056.gz
Found 1 items
-rw-r--r--   3 root  supergroup  196733297 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_6.1374883208056.gz
Found 1 items
-rw-r--r--   3 flume supergroup  193451845 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_9.1374883210989.gz
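
The eight .gz parts listed above add up to 1,557,735,346 compressed bytes, against 19,228,787,343 bytes for the uncompressed source, so the byte sizes in the two listings cannot be compared directly. A fairer check is to decompress each part and tally lines and uncompressed bytes. The sketch below does that under the assumption that the parts have already been fetched from HDFS into a local directory; the `parts/` path and the function name `tally_gzip_parts` are hypothetical and only serve the illustration.

```python
import glob
import gzip
from typing import Iterable, Tuple


def tally_gzip_parts(paths: Iterable[str]) -> Tuple[int, int]:
    """Return (newline count, uncompressed byte count) across gzip files.

    Newlines are counted the way `wc -l` counts them, so a final line
    without a terminator adds bytes but not a line.
    """
    total_lines = 0
    total_bytes = 0
    for path in paths:
        # gzip.open in binary mode streams the file, so memory use stays small.
        with gzip.open(path, "rb") as fh:
            for line in fh:
                total_bytes += len(line)
                if line.endswith(b"\n"):
                    total_lines += 1
    return total_lines, total_bytes


if __name__ == "__main__":
    # Hypothetical local directory holding the parts copied out of HDFS.
    parts = sorted(glob.glob("parts/UDXMLTrans.log.2013-07-27_*.gz"))
    lines, size = tally_gzip_parts(parts)
    print(f"{len(parts)} parts, {lines} lines, {size} uncompressed bytes")
```

If the uncompressed byte total differs from the source file size by about the length of a few log lines, that would be consistent with the 7-line surplus seen in the line counts; a much larger gap would point to a different problem.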