chore: script that discovers minimum tested versions #9418

emmettbutler · 2024-05-28T21:09:54Z

This script discovers the minimum version of every package referenced in riotfile.py's Venv tree (excluding those envs that do not execute a pytest command) and writes that information to a CSV file. It also takes into account the installation dependencies in pyproject.toml.

For use in #9323
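As a rough illustration of the approach described above (not the script in this PR), the sketch below walks a nested environment tree, skips environments whose command does not invoke pytest, and keeps the lowest specifier seen per package before rendering CSV. `FakeVenv`, `version_key`, `walk`, `collect_min_versions`, `to_csv`, and the CSV column names are all hypothetical; real riot `Venv` objects and the script's actual comparison rules may differ.

```python
import csv
import io
import re
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class FakeVenv:
    """Hypothetical stand-in for one node of riotfile.py's Venv tree."""

    command: Optional[str] = None
    pkgs: Dict[str, List[str]] = field(default_factory=dict)
    venvs: List["FakeVenv"] = field(default_factory=list)


def version_key(spec: str) -> Tuple[int, ...]:
    """Naive sort key: every integer found in the specifier, in order."""
    return tuple(int(n) for n in re.findall(r"\d+", spec))


def walk(
    venv: FakeVenv,
    command: Optional[str] = None,
    pkgs: Optional[Dict[str, List[str]]] = None,
) -> Iterator[Tuple[str, str]]:
    """Yield (package, specifier) pairs from leaf environments that run pytest."""
    command = venv.command or command      # children inherit the parent's command
    merged = {**(pkgs or {}), **venv.pkgs}  # and the parent's packages
    if venv.venvs:
        for child in venv.venvs:
            yield from walk(child, command, merged)
    elif command and "pytest" in command:
        for name, specs in merged.items():
            for spec in specs:
                if spec:  # an empty specifier means "latest" and sets no minimum
                    yield name, spec


def collect_min_versions(root: FakeVenv) -> Dict[str, str]:
    """Keep, per package, the specifier with the lowest naive version key."""
    lowest: Dict[str, str] = {}
    for name, spec in walk(root):
        if name not in lowest or version_key(spec) < version_key(lowest[name]):
            lowest[name] = spec
    return lowest


def to_csv(min_versions: Dict[str, str]) -> str:
    """Render the mapping as CSV text (column names are an assumption)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["pkg_name", "min_version"])
    for name in sorted(min_versions):
        writer.writerow([name, min_versions[name]])
    return buf.getvalue()


# Made-up tree: one pytest environment and one lint-only environment.
example_tree = FakeVenv(
    venvs=[
        FakeVenv(
            command="pytest tests/contrib/flask",
            pkgs={"flask": ["~=1.1", "~=2.0"], "pytest": [""]},
        ),
        FakeVenv(command="black --check .", pkgs={"black": ["==23.10.1"]}),
    ]
)
```

With this made-up tree, only `flask` survives: `black` belongs to an environment that never runs pytest, and `pytest` itself is unpinned. The integer-tuple key is deliberately naive and would mis-order pre-release tags such as `1.0rc1`.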

Checklist

Change(s) are motivated and described in the PR description
Testing strategy is described if automated tests are not included in the PR
Risks are described (performance impact, potential for breakage, maintainability)
Change is maintainable (easy to change, telemetry, documentation)
Library release note guidelines are followed or label changelog/no-changelog is set
Documentation is included (in-code, generated user docs, public corp docs)
Backport labels are set (if applicable)
If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

Title is accurate
All changes are related to the pull request's stated goal
Description motivates each change
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Change is maintainable (easy to change, telemetry, documentation)
Release note makes sense to a user of the library
Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

brettlangdon

do we need to do this without riot?

are we able to use riot to generate an artifact during build time? e.g. the Build deploy job could generate sdist, wheels, and some other artifacts like supported library versions

then that can be packaged with the other artifacts into the OCI artifact.

(we could even generate the result as Python and have it just embedded into the sitecustomize.py file if we don't want to read something from disk)
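One way to picture the "generate the result as Python" idea: render the package-to-minimum mapping as a module-level dict literal that a build step could paste into another file. This is only a sketch; `render_min_versions`, the `MIN_VERSIONS` name, and the sample data are hypothetical, and nothing here reflects how the build job is actually wired.

```python
import pprint
from typing import Dict


def render_min_versions(min_versions: Dict[str, str], var_name: str = "MIN_VERSIONS") -> str:
    """Return Python source that defines var_name as a dict literal."""
    # pformat emits a valid literal for a dict of plain strings.
    literal = pprint.pformat(dict(sorted(min_versions.items())), indent=4)
    return f"# Generated at build time; do not edit by hand.\n{var_name} = {literal}\n"


# Hypothetical sample data, for illustration only.
sample_source = render_min_versions({"flask": "~=1.1", "django": ">=2.2"})
```

The generated text can be checked by executing it in a scratch namespace and comparing the resulting dict with the input, which avoids reading a separate data file at runtime.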

all_requirements.py

datadog-dd-trace-py-rkomorn · 2024-05-28T22:08:47Z

Datadog Report

Branch report: emmett.butler/min_versions
Commit report: 5ade18a
Test service: dd-trace-py

✅ 0 Failed, 140838 Passed, 36443 Skipped, 7h 50m 47.26s Total duration (2h 36m 3.2s time saved)
❄️ 1 New Flaky

New Flaky Tests (1)

test_otel_trace_across_fork - test_context.py - Last Failure

 At request <Request GET /test/session/snapshot >:
    At snapshot (token='tests.opentelemetry.test_context.test_otel_trace_across_fork'):
     - Directory: /snapshots
     - CI mode: 1
     - Trace File: /snapshots/tests.opentelemetry.test_context.test_otel_trace_across_fork.json
     - Stats File: /snapshots/tests.opentelemetry.test_context.test_otel_trace_across_fork_tracestats.json
     At compare of 1 expected trace(s) to 1 received trace(s):
      At trace 'internal' (2 spans):
 Received fewer spans (1) than expected (2). Expected unmatched spans: 'internal'

emmettbutler · 2024-05-29T19:28:47Z

@brettlangdon I realized a reason using import riotfile isn't ideal: it would require the script to live in the same directory as the riotfile, because the riotfile does not live inside a Python package (a directory with an __init__.py file).

I think this change is stable now.

brettlangdon · 2024-05-30T12:11:44Z

> @brettlangdon I realized a reason using import riotfile isn't ideal: it would require the script to live in the same directory as the riotfile, because the riotfile does not live inside a Python package (a directory with an __init__.py file).
>
> I think this change is stable now.

it just means the script needs to be executed from the root directory, no?

we could also just add the root directory to the python path either via shell script or directly in python.

🤷🏻 not trying to argue for one specific way vs the other
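The "directly in python" option mentioned above could look roughly like the sketch below, which puts a directory on `sys.path` before importing a top-level module from it. `import_from_dir` is a hypothetical helper, not code from this PR, and the repository layout in the trailing comment is an assumption.

```python
import importlib
import sys
from pathlib import Path
from types import ModuleType


def import_from_dir(directory: Path, module_name: str) -> ModuleType:
    """Import a top-level module from directory by adding it to sys.path."""
    root = str(directory.resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    # Make sure files created after interpreter start-up are visible to the finders.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Assumed layout: a script at <repo>/scripts/some_script.py could call
#     riotfile = import_from_dir(Path(__file__).resolve().parents[1], "riotfile")
# so it no longer depends on the current working directory.
```

The shell alternative is to set `PYTHONPATH` to the repository root before invoking the script; either way the module being imported does not have to live inside a package.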

emmettbutler · 2024-05-30T15:04:38Z

@brettlangdon yup, you're right. Updated.

brettlangdon

overall riot parsing looks good, other than I have a question about how to convert specifiers into actual minimum versions

scripts/min_compatible_versions.py

min_compatible_versions.csv

emmettbutler · 2024-05-31T15:08:16Z

@brettlangdon I've adjusted the script to avoid stripping out the specifier information (<, >=, etc) from the version string that it writes to the CSV file. Because the determination of "minimum" is done based on naive gt/lt comparison, there are probably some edge cases where the wrong thing gets written to the file, but I think that in the majority of cases this approach will be solid enough for use in the single-step guardrail logic.

brettlangdon · 2024-05-31T15:15:38Z

> I think that in the majority of cases this approach will be solid enough for use in the single-step guardrail logic.

@emmettbutler what are you thinking, just parsing the specifiers in SSI and then comparing actual versions against them?

if yes, we need to keep in mind that packaging might not be present on the system.

emmettbutler · 2024-05-31T15:29:01Z

@brettlangdon pretty much, yes. It seems to me that including the range markers in the minimum versions file is the best we can do without going to PyPI during that file's creation. I'm trying to avoid having the script go to PyPI because I suspect the large overhead it would add is unnecessary to achieve the guardrails goal.

I think we can compare version specifiers to the degree necessary without the packaging module. That comparison really just boils down to whether or not the minimum version specifier includes a less-than sign. See my other change for an illustration of what I mean.

emmettbutler · 2024-06-05T12:27:27Z

@erikayasuda @ZStriker19 @brettlangdon let me know if you'd like any changes to this approach

pr-commenter · 2024-06-05T13:18:24Z

Benchmarks

Benchmark execution time: 2024-06-07 14:16:41

Comparing candidate commit ade7bec in PR branch emmett.butler/min_versions with baseline commit c035b91 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 221 metrics, 9 unstable metrics.

brettlangdon

lgtm, but I am worried about including our test dependencies, and it isn't clear to me how to identify those from our riotfile.py 🤔

min_compatible_versions.csv

scripts/min_compatible_versions.py

min_compatible_versions.csv

ignore riot environments that do not run pytest. this seems like a reasonable proxy for ignoring packages that are required only for tests

This pull request adds "guardrails" to the "library injection" process. These are early exit conditions from the instrumentation process intended to avoid sending any traces when undefined behavior is likely. The code makes this determination on the basis of software versions present in the application environment, both of Python packages and the Python runtime itself.

The biggest risk here is that instrumentation is disabled when it's not intended to be. I think existing tests in `tests/lib-injection` cover this pretty well. There's a new test added that verifies instrumentation was cancelled when an unsupported package version is present.

Contains changes from #9418

Related RFC: "[RFC] One Step Guardrails"

## Checklist

- [x] minimum package version checks
- [x] Testing
- [x] replace envvars with inject_force
- [x] figure out what to use instead of pkg_resources
- [x] replace local file path with `DD_TELEMETRY_FORWARDER_PATH`
- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included in the PR
- [x] Risks are described (performance impact, potential for breakage, maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Emmett Butler <723615+emmettbutler@users.noreply.github.com>
Co-authored-by: Emmett Butler <emmett.butler321@gmail.com>
check in requirements_script

5ade18a

emmettbutler added the changelog/no-changelog A changelog entry is not required for this PR. label May 28, 2024

emmettbutler requested a review from a team as a code owner May 28, 2024 21:09

emmettbutler requested review from erikayasuda, brettlangdon and ZStriker19 May 28, 2024 21:09

brettlangdon reviewed May 28, 2024

all_requirements.py (outdated, resolved)

all_requirements.py (outdated, resolved)

emmettbutler added 2 commits May 29, 2024 12:08

add mode that imports riot;

705b1ce

typing

remove ast mode

5083060

emmettbutler requested a review from brettlangdon May 29, 2024 19:10

brettlangdon reviewed May 29, 2024

all_requirements.py (outdated, resolved)

all_requirements.py (outdated, resolved)

emmettbutler added 2 commits May 29, 2024 12:25

rename

0c651ab

commit generated file

ae4450b

emmettbutler requested a review from brettlangdon May 29, 2024 19:28

proper riotfile import

5a1546c

brettlangdon reviewed May 30, 2024

scripts/min_compatible_versions.py (outdated, resolved)

min_compatible_versions.csv (outdated, resolved)

emmettbutler mentioned this pull request May 31, 2024

feat(onboarding): early exit conditions in lib-injection #9323

Merged

22 tasks

include specifier info in file

ee1233a

emmettbutler requested a review from brettlangdon May 31, 2024 15:08

lint

1d40cde

emmettbutler enabled auto-merge (squash) June 4, 2024 18:12

Merge branch 'main' into emmett.butler/min_versions

dbae008

brettlangdon reviewed Jun 6, 2024

min_compatible_versions.csv (outdated, resolved)

exclude direct installs from github

f46eb77

emmettbutler requested a review from brettlangdon June 7, 2024 12:10

brettlangdon reviewed Jun 7, 2024

min_compatible_versions.csv (outdated, resolved)

github-advanced-security bot found potential problems Jun 7, 2024

scripts/min_compatible_versions.py (fixed)

brettlangdon reviewed Jun 7, 2024

min_compatible_versions.csv (outdated, resolved)

include pyproject.toml dependencies

ade7bec

ignore riot environments that do not run pytest. this seems like a reasonable proxy for ignoring packages that are required only for tests

emmettbutler requested a review from brettlangdon June 7, 2024 13:42

github-actions bot mentioned this pull request Jun 11, 2024

feat(onboarding): early exit conditions in lib-injection [backport 2.10] #9512

Merged

22 tasks

emmettbutler closed this Jun 11, 2024

auto-merge was automatically disabled June 11, 2024 14:13
Pull request was closed

emmettbutler deleted the emmett.butler/min_versions branch June 11, 2024 14:13

emmettbutler mentioned this pull request Sep 9, 2024

feat(onboarding): early exit conditions in lib-injection [backport 2.9] #10563

Merged

22 tasks
