Re: Format of Kafka storage on disk
The DumpLogSegments tool should do that for you:
https://github.com/apache/kafka/blob/0.8/core/src/main/scala/kafka/tools/DumpLogSegments.scala

bin/kafka-run-class.sh kafka.tools.DumpLogSegments

Option                                  Description
------                                  -----------
--deep-iteration                        if set, uses deep instead of shallow
                                          iteration
--files <file1, file2, ...>             REQUIRED: The comma separated list of
                                          data and index log files to be
                                          dumped
--max-message-size <Integer: size>      Size of largest message. (default:
                                          5242880)
--print-data-log                        if set, printing the messages content
                                          when dumping data logs
--verify-index-only                     if set, just verify the index log
                                          without printing its content

Or use the code as an entry point for whatever you want to do :)
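
For anyone who wants to read a .log file directly rather than through the tool, here is a rough Python sketch. It assumes the 0.8 on-disk layout as I understand it: each entry is an 8-byte offset and a 4-byte size, followed by a message made of a 4-byte CRC32, a magic byte, an attributes byte, and a length-prefixed key and value, all big-endian. Please check that against the Scala source linked above before relying on it. The names parse_log and _read_field and the dictionary fields are made up for illustration. It only does shallow iteration, so a compressed message set comes back as one opaque wrapper message.

```python
import struct
import sys
import zlib

# Assumed 0.8 layout, big-endian throughout (verify against the Kafka source).
ENTRY_HEADER = struct.Struct(">qi")     # 8-byte offset, 4-byte message size
MESSAGE_HEADER = struct.Struct(">Ibb")  # 4-byte CRC32, magic byte, attributes byte
# crc(4) + magic(1) + attributes(1) + key length(4) + value length(4)
MIN_MESSAGE_SIZE = 14


def _read_field(buf, pos):
    """Read a 4-byte length and that many bytes; a length of -1 means null."""
    (length,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    if length < 0:
        return None, pos
    return buf[pos:pos + length], pos + length


def parse_log(buf):
    """Yield one dict per log entry found in buf (shallow iteration only)."""
    pos = 0
    while pos + ENTRY_HEADER.size <= len(buf):
        offset, size = ENTRY_HEADER.unpack_from(buf, pos)
        pos += ENTRY_HEADER.size
        if size < MIN_MESSAGE_SIZE or pos + size > len(buf):
            break  # truncated or unexpected data; stop rather than guess
        message = buf[pos:pos + size]
        crc, magic, attributes = MESSAGE_HEADER.unpack_from(message, 0)
        key, field_pos = _read_field(message, MESSAGE_HEADER.size)
        value, _ = _read_field(message, field_pos)
        yield {
            "offset": offset,
            "magic": magic,
            "attributes": attributes,
            "key": key,
            "value": value,
            # The CRC is assumed to cover everything after the CRC field itself.
            "crc_ok": (zlib.crc32(message[4:]) & 0xFFFFFFFF) == crc,
        }
        pos += size


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python parse_log.py <path to a .log segment file>
    with open(sys.argv[1], "rb") as f:
        for entry in parse_log(f.read()):
            print(entry)
```

This reads the whole segment into memory, which is fine for a quick look but not for large segments. If the output disagrees with what DumpLogSegments prints for the same file, trust the tool.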
/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/
On Fri, Jan 3, 2014 at 5:10 PM, Subbu Srinivasan <[EMAIL PROTECTED]> wrote:

> Is there any place where I can learn about the internal structure of
> the log file where Kafka stores the data? A topic has a .index and a .log
> file.
>
> I want to read the entire log file and parse the contents out.
>
> Thanks
> Subbu
>