[jira] [Updated] (KAFKA-739) Handle null values in Message payload

     [ https://issues.apache.org/jira/browse/KAFKA-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Kreps updated KAFKA-739:
----------------------------

    Attachment: KAFKA-739-v1.patch

This patch is more extensive than I expected because I found a hole in the logic when handling deletes in the log compactor. The changes are as follows:

1. Handle null properly in Message.scala and miscellaneous other places.
2. Fix the logic for handling deletes. Previously we only guaranteed that delete records would be retained while they were in the dirty section of the log. This is not sufficient, because a bootstrapping consumer might see a message and then have the subsequent delete record garbage-collected before the consumer reads it.
3. OffsetMap.scala: make the map exact using a probing scheme. This means that the tail of the log is actually now fully deduplicated. The motivation for this is making delete-handling easier since to remove a delete tombstone you need to ensure that there are no prior occurrences of that message. Also added a counter on the number of collisions, just to help with any debugging.
4. Added a new configuration, log.cleaner.delete.retention.ms, that controls the length of time for which delete records are retained. This is implicitly a limit on the amount of time the consumer can spend bootstrapping and still get a consistent bootstrap. Once the topic-level config patch goes in, this will be made available at the topic level and can be set with the create topic tool.
5. Added a peek() method to the iterator template. Didn't end up using it, but it is a useful feature.
6. Changed the integration test tool to issue deletes and changed the verification to handle delete records properly. Redid testing now with deletes included.
7. Added a variety of unit tests for null messages.
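
To make item 3 concrete, here is a minimal sketch of an exact key-to-offset map that uses linear probing and counts collisions. It is an illustration only and not the code in OffsetMap.scala: the class name ProbingOffsetMap, its methods, and the choice to store whole String keys in Java arrays are all assumptions made for the example.

```java
/** Hypothetical sketch of an exact offset map; not the patch's implementation. */
final class ProbingOffsetMap {
    private final String[] keys;   // null marks an empty slot
    private final long[] offsets;  // latest offset seen for the key in the same slot
    private int size = 0;
    private long collisions = 0;

    /** Capacity should be at least 2; one slot is always kept empty so probing terminates. */
    ProbingOffsetMap(int capacity) {
        keys = new String[capacity];
        offsets = new long[capacity];
    }

    /** Walk the probe path until we find the key itself or an empty slot. */
    private int slotFor(String key, boolean countCollisions) {
        int slot = Math.floorMod(key.hashCode(), keys.length);
        while (keys[slot] != null && !keys[slot].equals(key)) {
            if (countCollisions) collisions++;
            slot = (slot + 1) % keys.length;
        }
        return slot;
    }

    /** Record the latest offset for a key, overwriting any earlier offset. */
    void put(String key, long offset) {
        int slot = slotFor(key, true);
        if (keys[slot] == null) {
            if (size == keys.length - 1) throw new IllegalStateException("map is full");
            keys[slot] = key;
            size++;
        }
        offsets[slot] = offset;
    }

    /** Latest offset for the key, or -1 if the key was never inserted. */
    long get(String key) {
        int slot = slotFor(key, false);
        return keys[slot] == null ? -1L : offsets[slot];
    }

    int size() { return size; }

    /** Number of occupied slots skipped during put(), useful for debugging. */
    long collisions() { return collisions; }
}
```

Because a lookup compares the stored key rather than trusting the hash slot alone, two keys that hash to the same slot never overwrite each other. That exactness is the property that makes it safe to decide that no earlier occurrence of a key remains.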

                
> Handle null values in Message payload
> -------------------------------------
>
>                 Key: KAFKA-739
>                 URL: https://issues.apache.org/jira/browse/KAFKA-739
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>             Fix For: 0.8.1
>
>         Attachments: KAFKA-739-v1.patch
>
>
> Add tests for null message payloads in producer, server, and consumer.
> Ensure log cleaner treats these as deletes.
> Test that null keys are rejected on dedupe logs.
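
For illustration, the behavior requested above (null payloads treated as deletes, null keys rejected, and delete records kept only for a retention window) could be sketched as follows. This is a simplified model under stated assumptions, not the cleaner's actual code: the names CompactionSketch, Record and compact are hypothetical, and the per-record timestamp is an assumption made so the retention check can be shown in a few lines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of compaction with delete records; not the log cleaner itself. */
final class CompactionSketch {
    /** One log entry. A null value marks a delete (tombstone). */
    static final class Record {
        final long offset;
        final String key;
        final String value;
        final long timestampMs; // assumed here purely to model retention time

        Record(long offset, String key, String value, long timestampMs) {
            this.offset = offset;
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }

        boolean isDelete() { return value == null; }
    }

    /**
     * Keep only the latest record per key. A delete record that is the latest
     * entry for its key is kept until it is older than deleteRetentionMs,
     * so a slow bootstrapping reader still has a chance to see it.
     */
    static List<Record> compact(List<Record> log, long nowMs, long deleteRetentionMs) {
        Map<String, Long> latestOffset = new HashMap<>();
        for (Record r : log) {
            if (r.key == null) {
                throw new IllegalArgumentException("null key at offset " + r.offset);
            }
            latestOffset.put(r.key, r.offset);
        }
        List<Record> retained = new ArrayList<>();
        for (Record r : log) {
            boolean superseded = latestOffset.get(r.key) != r.offset;
            boolean expiredDelete = r.isDelete() && nowMs - r.timestampMs > deleteRetentionMs;
            if (!superseded && !expiredDelete) {
                retained.add(r);
            }
        }
        return retained;
    }
}
```

In this model, a reader that takes longer than deleteRetentionMs to work through the log can miss a delete, which is why the retention setting acts as an implicit bound on bootstrap time.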


 
Jay Kreps 2013-03-08, 19:22