Issues
- Kafka Producer deadlock (NMS-13499, resolved)
- Reflected XSS in webapp notice wizard (NMS-13496, resolved), Jeff Gehlbach
- Migrate foreign source content from Discourse to docs (NMS-13482, resolved)
- Update table formatting in docs (NMS-13472, resolved), Bonnie Robinson
- Document search panel (NMS-13408, resolved), Bonnie Robinson
Kafka Producer deadlock
Description
Details
Assignee: Unassigned
Reporter: Jesse White
Affects versions:
Priority: Minor
Activity
Jesse White, August 16, 2021 at 6:48 PM
Duplicate of
Jesse White, August 16, 2021 at 6:47 PM
After some further investigation it looks like it can take a long time (8 minutes+) to load some of the HwEntity trees. This could explain the observed behavior.
Jesse White, August 10, 2021 at 12:57 PM
As a workaround, the nodeTopic can be set to an empty string so that no node data is forwarded and this mapping does not occur.
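The effect of the workaround can be pictured with a minimal Java sketch. Everything here is hypothetical and for illustration only: `NodeForwardingSketch`, `mapNode`, and the `alarms` topic name are assumptions, not the OpenNMS classes or configuration. The sketch also assumes that alarm forwarding continues when node forwarding is switched off.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical forwarder illustrating an empty-topic guard; not the OpenNMS class. */
final class NodeForwardingSketch {
    private final String nodeTopic;
    // Records what would have been sent, in place of a real Kafka producer.
    private final List<String> sent = new ArrayList<>();
    // Counts how often the (potentially slow) node mapping ran.
    private int nodeMappings = 0;

    NodeForwardingSketch(String nodeTopic) {
        this.nodeTopic = nodeTopic;
    }

    /** Node data is forwarded only when a non-empty topic is configured. */
    boolean forwardsNodes() {
        return nodeTopic != null && !nodeTopic.isEmpty();
    }

    /** Stand-in for the expensive mapping step (assumed, not the real mapper). */
    private String mapNode(long nodeId) {
        nodeMappings++;
        return "node-" + nodeId;
    }

    void onAlarm(long nodeId) {
        if (forwardsNodes()) {
            // The node mapping runs only on this branch.
            sent.add(nodeTopic + ":" + mapNode(nodeId));
        }
        // Assumption: the alarm itself is still forwarded either way.
        sent.add("alarms:" + nodeId);
    }

    List<String> sent() {
        return sent;
    }

    int nodeMappings() {
        return nodeMappings;
    }
}
```

With an empty topic, `onAlarm` never reaches `mapNode`, which is the step suspected of blocking.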
Jesse White, August 10, 2021 at 12:56 PM
Here is the line in the Kafka Producer code where it appears to be stuck:
https://github.com/OpenNMS/opennms/blob/opennms-28.0.1-1/features/kafka/producer/src/main/java/org/opennms/features/kafka/producer/ProtobufMapper.java#L148
There may be an issue with loading this specific HwEntity tree, or perhaps a deadlock with the database.
In these calls, Alarmd already has a read/write database transaction open, and a nested read-only transaction is opened by the NodeCache here: https://github.com/OpenNMS/opennms/blob/opennms-28.0.1-1/features/kafka/producer/src/main/java/org/opennms/features/kafka/producer/NodeCache.java#L77
Updating the code in the NodeCache to open the transaction conditionally (skipping it if one is already open) may help resolve the problem.
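A rough sketch of the "open only if none is active" idea follows. It is not the NodeCache code and does not use the real transaction API: `TxContext`, `inTransactionIfNeeded`, and `loadNode` are hypothetical stand-ins that track transaction state in a thread-local flag. In a Spring-based codebase, a check such as `TransactionSynchronizationManager.isActualTransactionActive()` could play the role of `TxContext.isActive()`, though whether that fits the NodeCache is an assumption.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a transaction manager; not the OpenNMS or Spring API. */
final class TxContext {
    // Whether the current thread already has a transaction open.
    private static final ThreadLocal<Boolean> ACTIVE = ThreadLocal.withInitial(() -> false);
    // How many transactions this thread has actually opened (for illustration).
    private static final ThreadLocal<Integer> OPENED = ThreadLocal.withInitial(() -> 0);

    static boolean isActive() {
        return ACTIVE.get();
    }

    static int openedCount() {
        return OPENED.get();
    }

    /** Always opens a new transaction scope around the work. */
    static <T> T inNewTransaction(Supplier<T> work) {
        boolean previous = ACTIVE.get();
        ACTIVE.set(true);
        OPENED.set(OPENED.get() + 1);
        try {
            return work.get();
        } finally {
            // Restore the outer state so nesting unwinds correctly.
            ACTIVE.set(previous);
        }
    }

    /** Opens a transaction only when the caller has not already opened one. */
    static <T> T inTransactionIfNeeded(Supplier<T> work) {
        if (isActive()) {
            return work.get(); // reuse the caller's transaction
        }
        return inNewTransaction(work);
    }
}

final class ConditionalTxDemo {
    /** Hypothetical node lookup standing in for the cache load. */
    static String loadNode(long nodeId) {
        return TxContext.inTransactionIfNeeded(() -> "node-" + nodeId);
    }

    public static void main(String[] args) {
        // The caller (for example, alarm processing) already holds a transaction,
        // so the lookup does not open a second, nested one.
        String node = TxContext.inNewTransaction(() -> loadNode(42));
        System.out.println(node + " opened=" + TxContext.openedCount());
    }
}
```

When `loadNode` is called outside any transaction, the same method opens one itself, so callers on other code paths keep working unchanged.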
In a production deployment, we have noticed that the Kafka Producer gets stuck in an apparent deadlock, which causes most event processing to halt.
The following stack trace was observed in such a state: