Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Content]: Anomalies in the "Anomaly Detection" SQL Examples #5465

Open
jslacks opened this issue Sep 30, 2023 · 0 comments
Open

[Content]: Anomalies in the "Anomaly Detection" SQL Examples #5465

jslacks opened this issue Sep 30, 2023 · 0 comments
Assignees
Labels
content When the bug is about content that needs to get fixed triaged

Comments

@jslacks
Copy link

jslacks commented Sep 30, 2023

What article on segment.com/docs is affected?

https://segment.com/docs/protocols/apis-and-extensions/anomaly_detection/#anomaly-detection-solutions

What part(s) of the article would you like to see updated?

On the Anomaly Detection page, in the section titled Create customized Anomaly Detection dashboards in a BI tool there are a handful of sample queries.

The first query in the section is listed as Source-level event to violation count comparison and it seems like some column names were transposed... the query as written is almost certainly wrong.

The issues seems to be the final step that join the CTEs:

      select  v.date,
              t.total_event_count "Total Violation Count",
              nvl(v.violation_count, 0) "Total Event Count",
              nvl(v.violation_count, 0)::float/t.total_event_count::float as "Violations Per Event"
        from  total_track_event_volume t
   left join  total_violations v
          on  t.date = v.date
    order by  v.date desc
  1. In the first line, v.date should probably be t.date. Typically that would allow the track events to populate the "date spine" with all of the days where tracking was live - then the violations can be joined to the daily tracks with a reasonably safe assumption that there will won't be days omitted. Assuming there were no other errors, this would only return daily metrics for days with violations... any non-violation days end up with NULL dates. (Strictly speaking, I suppose it's possible to have mostly violations and only the occasional compliant track event... but I don't think that was the point here)

  2. The more glaring field transpositions take place in lines 2-3 where the event column is labeled as "violations" and the violations are labeled as "events". Good news is that the next line where the ratio is calculated looks ok!

It appears like the rest of the query is correct. I glanced at the remainder of the queries briefly and they seem to be free of any clerical errors.

Screenshot 2023-09-30 at 5 51 13 AM

Additional information

No response

@jslacks jslacks added content When the bug is about content that needs to get fixed triage labels Sep 30, 2023
@jslacks jslacks changed the title [Content]: [Content]: Anomalies in the "Anomaly Detection" SQL Examples Sep 30, 2023
@forstisabella forstisabella self-assigned this Oct 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
content When the bug is about content that needs to get fixed triaged
Projects
None yet
Development

No branches or pull requests

2 participants