Hellinger Distance for ALEC: Implementation for DBSCAN

Description

Implement the Hellinger Distance measure in ALEC so that DBSCAN can use it for clustering.

This is necessary so that we can assess the performance of this new measure (Hellinger Distance) and its impact on ALEC. Also, once this distance measure is available in code, it would be possible to make it selectable via user interaction.

DONE CRITERIA:

  • Make the "first-event-time" timestamp of an alarm available for the Hellinger Distance calculation.

  • Parameterize "distance measure", "epsilon", "alpha" and "beta" so that we can modify them via Karaf and eventually a settings page. 

  • Default values:

    • Distance measure: default/original formula (called AlarmInSpaceTimeDistanceMeasure in the code).

    • Epsilon: 100d

    • Alpha: 144.47117699d

    • Beta: 0.55257784d

  • Use the Hellinger Distance as a factor that modifies the original formula.

    • Add another DBSCAN clustering parameter to choose between the original formula and original+hellinger.

      • original formula:
        |alpha * ( beta * (Math.abs(timeA - timeB) / 1000d / 60d) + (1-beta) * spatialDistance / DEFAULT_WEIGHT)|

      • original+hellinger:
        |(alpha * ( beta * (Math.abs(timeA - timeB) / 1000d / 60d) + (1-beta) * spatialDistance / DEFAULT_WEIGHT)) * (1 + hellinger_h)|

      • Implement the calculation of hellinger_h.
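
The two formulas above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the ALEC implementation: the class name DistanceSketch and its method names are hypothetical, the value of DEFAULT_WEIGHT is a placeholder because the ticket does not state it, and hellinger() uses the standard Hellinger distance between two discrete probability distributions. How ALEC derives those distributions from alarms (for example from "first-event-time") is not specified here and is left open.

```java
// Hypothetical sketch; names and DEFAULT_WEIGHT are assumptions, not ALEC code.
final class DistanceSketch {
    // Default values listed in this ticket.
    static final double DEFAULT_EPSILON = 100d;
    static final double DEFAULT_ALPHA = 144.47117699d;
    static final double DEFAULT_BETA = 0.55257784d;
    // Placeholder: the ticket does not give the real value of DEFAULT_WEIGHT.
    static final double DEFAULT_WEIGHT = 1d;

    // Original formula: weighted mix of time difference (in minutes) and spatial distance.
    static double original(double alpha, double beta,
                           long timeA, long timeB, double spatialDistance) {
        double minutesApart = Math.abs(timeA - timeB) / 1000d / 60d;
        return alpha * (beta * minutesApart + (1 - beta) * spatialDistance / DEFAULT_WEIGHT);
    }

    // original+hellinger: the original distance scaled by (1 + hellinger_h).
    static double originalPlusHellinger(double alpha, double beta,
                                        long timeA, long timeB,
                                        double spatialDistance, double hellingerH) {
        return original(alpha, beta, timeA, timeB, spatialDistance) * (1 + hellingerH);
    }

    // Standard Hellinger distance between two discrete distributions p and q:
    // H(p, q) = sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2) / sqrt(2), which lies in [0, 1].
    static double hellinger(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same length");
        }
        double sum = 0d;
        for (int i = 0; i < p.length; i++) {
            double diff = Math.sqrt(p[i]) - Math.sqrt(q[i]);
            sum += diff * diff;
        }
        return Math.sqrt(sum) / Math.sqrt(2d);
    }
}
```

Because hellinger_h lies between 0 and 1 under this definition, the original+hellinger variant scales the original distance by a factor between 1 and 2: identical distributions leave it unchanged, and fully disjoint ones double it.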

This doesn't include:

  • A screen that allows the user to select the engine and its distance measure. This may come later.

Acceptance / Success Criteria

None

Activity

Benjamin Janssens July 29, 2022 at 2:22 AM

Done

Created June 7, 2022 at 10:37 PM
Updated August 8, 2023 at 2:28 PM
Resolved July 29, 2022 at 2:22 AM