Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[metrics-1] added basic metrics #445

Merged
merged 3 commits into from
Mar 16, 2023
Merged

[metrics-1] added basic metrics #445

merged 3 commits into from
Mar 16, 2023

Conversation

CommanderStorm
Copy link
Member

Adds these mertics:

  • how often each detail (/api/get/) has been requested (2 Categories: Bot Traffic/Human Traffic)
  • how often each detail (/api/legacy_redirect/) has been requested
  • how long geting map images for the preview API took

From #440

Proposed Changes (include Screenshots if possible)

added the following metrics:

  • navigatum_APPNAME_incoming_requests (labels: endpoint, method, status): the total number of HTTP requests handled by the actix HttpServer.
  • navigatum_APPNAME_response_code (labels: endpoint, method, statuscode, type): Response codes of all HTTP requests handled by the actix HttpServer.
  • navigatum_APPNAME_response_time (labels: endpoint, method, status): Total the request duration of all HTTP requests handled by the actix HttpServer.

How to test this PR

  1. Launch one of the servers
  2. Do some requests
  3. curl http://localhost:8080/metrics
    (note that for example a browser does not work, as these are not meant to be consumed by a browser)

How has this been tested?

  • curl request

Checklist:

  • I have updated the documentation / No need to update the documentation
  • I have run the linter

@CommanderStorm CommanderStorm self-assigned this Mar 14, 2023
@github-actions
Copy link
Contributor

👋 Thank you for contributing. A staging environment for this PR for this change will be available shortly

github-actions bot added a commit that referenced this pull request Mar 14, 2023
@CommanderStorm
Copy link
Member Author

CommanderStorm commented Mar 14, 2023

This opens us up to a trivial DOS possiblity ⇒ is thus unaceptable
See nlopes/actix-web-prom#46 for further details

This gives me the motivation needed to look if migrating to rocket and https://github.com/sd2k/rocket_prometheus is a better plan of action (this repo does not open us up to this vulnerability and has lazy instantiation for custom metrics)

github-actions bot added a commit that referenced this pull request Mar 14, 2023
@github-actions
Copy link
Contributor

👋 Thank you for contributing. A staging environment for this PR for this change will be available shortly

@CommanderStorm
Copy link
Member Author

I have checked out rocket and the features.
While they are really nice and some of the notation is way better, this comes at a cost:
This notation is achieved via procedural macros, which don't work sooo great with my IDE ⇒ not an option.

I have checked if this is really exploitable and have come to this conclusion:
While this is definitely exploitable, the effect is very limited.
The main reason is, that we filter URLs in our central ingress ⇒ employability is limited.

I have looked into, if we can provide a patch to nlopes/actix-web-prom or atomix-team/actix-web-prometheus: This is not possible due to the architecture of actix (what routes are available is not exposed to middlewares as far as I can tell)

Furthermore, the impact of this is, that Prometheus gets taken offline ⇒ this is picked up by Grafana ⇒ we are alerted and could fix this
Impact is only limited to one of our supporting services ⇒ not production

Thus, while not being ideal, I think this is a barely acceptabe risk (I am obviously not happy, but this will work)

Copy link
Contributor

@octycs octycs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a typo but apart from that seems all good. If I understand this correctly, including the Cargo.lock files is better. I was not sure about that.

Since we do not store permanent data in any service that cannot be reretrived, I only see a risk of downtime but no data loss in the case someone tries to spam our metrics until storage space runs out.

server/calendar/src/main.rs Outdated Show resolved Hide resolved
server/main-api/src/main.rs Outdated Show resolved Hide resolved
server/feedback/src/main.rs Outdated Show resolved Hide resolved
Co-authored-by: octycs <octycs@users.noreply.github.com>
@CommanderStorm CommanderStorm enabled auto-merge (squash) March 16, 2023 12:46
@CommanderStorm CommanderStorm merged commit 02121c2 into main Mar 16, 2023
@CommanderStorm CommanderStorm deleted the basic_metrics branch March 16, 2023 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants