Scriptd consumes CPU even when it does nothing
Activity
Zoë Knox March 16, 2022 at 6:26 PM
Zoë Knox March 15, 2022 at 5:56 PM
It is simple enough to disable Scriptd and Actiond when there are no scripts or actions configured. It saves a small amount of CPU, and may help under high event loads. Before the changes, at 2000 ev/s:
2022-03-15T12:42:06.970-0400 Process summary
process cpu=183.72%
application cpu=159.19% (user=79.22% sys=79.97%)
other: cpu=24.53%
thread count: 871
GC time=0.92% (young=0.92%, old=0.00%)
heap allocation rate 74mb/s
safe point rate: 0.9 (events/s) avg. safe point pause: 11.79ms
safe point sync time: 0.03% processing time: 1.03% (wallclock time)
[000476] user= 4.64% sys= 5.22% alloc= 4086kb/s - Scriptd:BroadcastEventProcessor-Thread
[000475] user= 3.47% sys= 4.06% alloc= 3429kb/s - Actiond:BroadcastEventProcessor-Thread
[000479] user= 2.07% sys= 2.38% alloc= 2003kb/s - Scriptd-Executor-Thread
and with Scriptd auto-disabled for having no scripts configured:
2022-03-15T13:52:01.574-0400 Process summary
process cpu=175.57%
application cpu=145.14% (user=74.02% sys=71.12%)
other: cpu=30.43%
thread count: 877
GC time=0.69% (young=0.69%, old=0.00%)
heap allocation rate 77mb/s
safe point rate: 1.0 (events/s) avg. safe point pause: 8.25ms
safe point sync time: 0.03% processing time: 0.81% (wallclock time)
[000479] user= 3.57% sys= 2.74% alloc= 4151kb/s - Actiond:BroadcastEventProcessor-Thread
So is it worth disabling Scriptd when it is not configured? (Detecting whether Actiond has a configuration is harder and possibly not a "quick win".)
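As a rough cross-check of the two captures above, the sketch below sums the user and sys percentages of the Scriptd threads in the first capture and compares the total with the drop in application CPU between the two captures. The regular expression and the helper name `thread_cpu` are illustrative assumptions, not part of sjk; the thread lines are copied from the output above.

```python
import re

# Thread lines copied from the "before" capture above.
BEFORE = """\
[000476] user= 4.64% sys= 5.22% alloc= 4086kb/s - Scriptd:BroadcastEventProcessor-Thread
[000475] user= 3.47% sys= 4.06% alloc= 3429kb/s - Actiond:BroadcastEventProcessor-Thread
[000479] user= 2.07% sys= 2.38% alloc= 2003kb/s - Scriptd-Executor-Thread
"""

# Assumed shape of a ttop thread line: user%, sys%, then " - <thread name>".
LINE = re.compile(r"user=\s*([\d.]+)%\s+sys=\s*([\d.]+)%.*? - (.+)$")


def thread_cpu(text, prefix):
    """Sum user+sys CPU (percent) for threads whose name starts with prefix."""
    total = 0.0
    for line in text.splitlines():
        match = LINE.search(line)
        if match and match.group(3).startswith(prefix):
            total += float(match.group(1)) + float(match.group(2))
    return total


scriptd = thread_cpu(BEFORE, "Scriptd")
app_drop = 159.19 - 145.14  # application cpu: before minus after

print(f"Scriptd threads before: {scriptd:.2f}%")   # 14.31%
print(f"Application CPU drop:   {app_drop:.2f}%")  # 14.05%
```

The two numbers are close, which is consistent with the Scriptd threads accounting for most of the difference. Each capture is a single sample, so this is only indicative.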
Alberto November 18, 2021 at 12:40 AM (edited)
@Alejandro Galue
I'm new to OpenNMS and tried to follow the same steps.
Started a clean instance 27.1.2
Started monitoring Scriptd
Started stress-events for 2000 events/s
Couldn't replicate the CPU usage problem
Running the command
sudo java -jar sjk-plus-0.17.jar ttop --pid $(cat /var/log/opennms/opennms.pid) --filter 'Scriptd' --verbose
The highest values found were:
2021-11-17T19:29:30.184-0500 Process summary
process cpu=47.82%
application cpu=34.97% (user=26.89% sys=8.07%)
other: cpu=12.86%
thread count: 2
GC time=0.25% (young=0.25%, old=0.00%)
heap allocation rate 16mb/s
safe point rate: 7.0 (events/s) avg. safe point pause: 9.95ms
safe point sync time: 0.54% processing time: 6.40% (wallclock time)
[000386] user= 0.15% sys= 0.06% alloc= 56kb/s - Scriptd:BroadcastEventProcessor-Thread
[000389] user= 0.11% sys= 0.03% alloc= 27kb/s - Scriptd-Executor-Th
Maybe there are other steps I should have followed to be able to replicate?
Fixed
Details
Assignee
Benjamin Reed
Reporter
Alejandro Galue
Labels
HB Grooming Date
Mar 29, 2021
HB Backlog Status
NB
Components
Sprint
None
Affects versions
Priority
PagerDuty
PagerDuty Incident
Created March 23, 2021 at 2:54 PM
Updated March 29, 2023 at 1:27 PM
Resolved March 29, 2023 at 1:25 PM
The default configuration for Scriptd is empty, meaning it should do nothing.
However, the CPU usage of the Scriptd threads increases in proportion to the event injection rate (fluctuating around some average). On a busy system that processes thousands of events per second, the CPU taken by Scriptd can therefore reduce the overall performance of OpenNMS and prevent other features from working properly.
I think it would be useful for Scriptd to analyze its configuration and refrain from listening to events when nothing in the configuration requires it. When a listener is needed, Scriptd should make sure it does not overwhelm the rest of the JVM.
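The behaviour proposed here could look roughly like the sketch below: the daemon inspects its configuration at start-up and subscribes to the event bus only if at least one event script is defined. It is written in Python for brevity, while OpenNMS itself is Java. `ScriptdConfig`, `EventBus`, and `Scriptd` are hypothetical stand-ins for illustration, not the real OpenNMS classes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ScriptdConfig:
    """Hypothetical stand-in for the parsed Scriptd configuration."""
    event_scripts: List[str] = field(default_factory=list)

    def needs_events(self) -> bool:
        # An empty configuration has nothing to run on incoming events.
        return bool(self.event_scripts)


class EventBus:
    """Hypothetical bus that fans every event out to registered listeners."""

    def __init__(self) -> None:
        self.listeners: List[Callable[[str], None]] = []

    def add_listener(self, listener: Callable[[str], None]) -> None:
        self.listeners.append(listener)

    def publish(self, event: str) -> None:
        for listener in self.listeners:
            listener(event)


class Scriptd:
    """Hypothetical daemon that only listens when its config requires it."""

    def __init__(self, config: ScriptdConfig, bus: EventBus) -> None:
        self.config = config
        self.bus = bus
        self.handled = 0

    def start(self) -> bool:
        if not self.config.needs_events():
            return False  # stay off the event stream entirely
        self.bus.add_listener(self.on_event)
        return True

    def on_event(self, event: str) -> None:
        self.handled += 1  # a real daemon would run its scripts here


bus = EventBus()
idle = Scriptd(ScriptdConfig(), bus)
print(idle.start(), len(bus.listeners))  # False 0
```

With the default (empty) configuration the daemon never registers, so publishing events costs it nothing. This sketch does not address the second half of the request, limiting the load when a listener is actually needed.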
On the system where I first observed this, Actiond behaved similarly. I have never seen a customer using Actiond, but Scriptd is certainly more widely used, which is why I focused this issue on it.
I'm targeting M2020 and the latest H27 because before the refactoring to use Immutable Events, the impact on CPU was not that high, which makes me believe that code change might be related.
I used jvm-tools to analyze a clean system running 27.1.0:
sudo java -jar sjk-plus-0.17.jar ttop --pid $(cat /var/log/opennms/opennms.pid) --filter '*Scriptd*' --verbose
When using stress-events via the Karaf shell to generate 2000 events per second, I see:
2021-03-23T10:51:45.525-0400 Process summary
process cpu=96.07%
application cpu=94.14% (user=74.36% sys=19.77%)
other: cpu=1.93%
thread count: 2
GC time=0.79% (young=0.79%, old=0.00%)
heap allocation rate 173mb/s
safe point rate: 1.1 (events/s) avg. safe point pause: 7.68ms
safe point sync time: 0.01% processing time: 0.84% (wallclock time)
[000356] user= 9.35% sys= 2.74% alloc= 44mb/s - Scriptd-Executor-Thread
[000355] user= 2.31% sys= 0.56% alloc= 5847kb/s - Scriptd:BroadcastEventProcessor-Thread
I believe that's excessive for something that is not in use.
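To put that figure in per-event terms, the sketch below divides the combined CPU share of the two Scriptd threads by the injection rate. It assumes that ttop reports per-thread percentages relative to a single core and that the rate of 2000 events per second was sustained during the sample; neither is stated in the output itself.

```python
# user and sys percentages of the two Scriptd threads, from the capture above
threads = {
    "Scriptd-Executor-Thread": (9.35, 2.74),
    "Scriptd:BroadcastEventProcessor-Thread": (2.31, 0.56),
}
EVENTS_PER_SECOND = 2000  # stress-events rate used for the capture

# Combined share of one core spent in Scriptd threads.
total_pct = sum(user + system for user, system in threads.values())

# Percent of a core -> CPU-seconds per wall-clock second -> microseconds per event.
cpu_seconds_per_second = total_pct / 100.0
micros_per_event = cpu_seconds_per_second / EVENTS_PER_SECOND * 1_000_000

print(f"Scriptd CPU: {total_pct:.2f}% of one core")               # 14.96%
print(f"About {micros_per_event:.1f} microseconds of CPU per event")  # 74.8
```

Under those assumptions, an unconfigured Scriptd spends roughly 75 microseconds of CPU on every event it receives.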