Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(features) Use dataclasses for flagpole instead of pydantic #75859

Merged
merged 9 commits into from
Aug 15, 2024

Conversation

markstory
Copy link
Member

These changes switch flagpole from using pydantic as the base classes to dataclasses from stdlib. During the pydantic 2 upgrade the time to parse features shot way up which was unexpected. We have also not been as impressed with feature flag match times and suspected that pydantic might be contributing overhead.

The changes of this pull request re-implement flagpole with basic python dataclasses. The new implementation has reduced time to build feature flags, which should help improve the overall performance of our feature flagging. Improvements were measured both with cProfile, and local mini-benchmarks. (scripts provided below)

cProfile results

The cprofile script builds a feature 1000 times and collects profiling data from those operations.

current master (pydantic)

         9004 function calls in 1.125 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.125    1.125 <string>:1(<module>)
     1000    1.123    0.001    1.124    0.001 __init__.py:117(from_feature_dictionary)
        1    0.001    0.001    1.125    1.125 flagpole-profile:21(main)
     2000    0.001    0.000    0.001    0.000 typing.py:2424(get_origin)
        1    0.000    0.000    1.125    1.125 {built-in method builtins.exec}
     6000    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

after (dataclasses)

         41004 function calls in 0.009 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.010    0.010 <string>:1(<module>)
     3000    0.000    0.000    0.000    0.000 <string>:2(__init__)
     1000    0.001    0.000    0.009    0.000 __init__.py:128(from_feature_dictionary)
     1000    0.001    0.000    0.008    0.000 __init__.py:134(<listcomp>)
     2000    0.002    0.000    0.004    0.000 conditions.py:193(_condition_from_dict)
     3000    0.002    0.000    0.007    0.000 conditions.py:217(from_dict)
     3000    0.000    0.000    0.004    0.000 conditions.py:219(<listcomp>)
     2000    0.000    0.000    0.000    0.000 enum.py:1093(__new__)
     2000    0.000    0.000    0.000    0.000 enum.py:1255(value)
     2000    0.000    0.000    0.000    0.000 enum.py:193(__get__)
     2000    0.000    0.000    0.001    0.000 enum.py:686(__call__)
        1    0.000    0.000    0.010    0.010 flagpole-profile:21(main)
        1    0.000    0.000    0.010    0.010 {built-in method builtins.exec}
     1000    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    19000    0.001    0.000    0.001    0.000 {method 'get' of 'dict' objects}

While significantly more functions were called the overall runtime is much better.

flagpole-profile script
#!/usr/bin/env python

from sentry.runner import configure
from flagpole import Feature as FlagpoleFeature
import cProfile

configure()

feature_names = (
    # Largest feature flag is 2688 bytes
    "organizations:user-feedback-ui",
    # 404 bytes is around median
    "organizations:performance-chart-interpolation",
    # 97 bytes is smallest feature.
    "organizations:new-weekly-report",
)

feature_config = {}


def main():
    feature_name = "organizations:user-feedback-ui"
    # feature_name = "organizations:performance-chart-interpolation"
    # feature_name = "organizations:new-weekly-report"
    for _ in range(1000):
        FlagpoleFeature.from_feature_dictionary(feature_name, feature_config[feature_name])


if __name__ == "__main__":
    from sentry import options

    for feature_name in feature_names:
        feature_config[feature_name] = options.get(f"feature.{feature_name}")

    cProfile.run('main()')

Micro benchmark

In the microbenchmark I looked at 3 feature flags (the max, approximate median and smallest features). Each feature would be parsed 10000 times and I looked at the min, mean and max duration for building a Feature object from dictionary data loaded from options seeded with the current feature flag inventory.

before (pydantic)

Results for organizations:user-feedback-ui

option load_time 0.0018911361694335938

build_time min 0.0004580021
build_time max 0.0011467934
build_time mean 0.0004902341

RSS memory usage 272400384

Results for organizations:performance-chart-interpolation

option load_time 0.0030400753021240234

build_time min 0.0000336170
build_time max 0.0001301765
build_time mean 0.0000367536

RSS memory usage 272400384

Results for organizations:new-weekly-report

option load_time 0.0022568702697753906

build_time min 0.0000057220
build_time max 0.0000231266
build_time mean 0.0000069423

RSS memory usage 272400384

after (dataclasses)

Results for organizations:user-feedback-ui

option load_time 0.0033750534057617188

build_time min 0.0000026226
build_time max 0.0000209808
build_time mean 0.0000032377

RSS memory usage 276054016

Results for organizations:performance-chart-interpolation

option load_time 0.0033571720123291016

build_time min 0.0000016689
build_time max 0.0000610352
build_time mean 0.0000028541

RSS memory usage 276054016

Results for organizations:new-weekly-report

option load_time 0.003008127212524414

build_time min 0.0000000000
build_time max 0.0000047684
build_time mean 0.0000007447

RSS memory usage 276070400
flagpole-timing script
#!/usr/bin/env python

from sentry.runner import configure
from flagpole import Feature as FlagpoleFeature
import gc
import statistics
import time
import psutil

configure()


def main():
    from sentry import options
    gc.disable()

    feature_names = (
        # Largest feature flag is 2688 bytes
        "organizations:user-feedback-ui",
        # 404 bytes is around median
        "organizations:performance-chart-interpolation",
        # 97 bytes is smallest feature.
        "organizations:new-weekly-report",
    )
    for feature_name in feature_names:
        load_start = time.time()
        option_val = options.get(f"feature.{feature_name}")
        load_end = time.time()

        build_durations = []
        for _ in range(0, 10000):
            build_start = time.time()
            FlagpoleFeature.from_feature_dictionary(feature_name, option_val)
            build_end = time.time()
            build_durations.append(build_end - build_start)

        load_time = load_end - load_start
        rss_mem = psutil.Process().memory_info().rss

        print("")
        print(f"Results for {feature_name}")
        print("")
        print(f"option load_time {load_time}")
        print("")
        print("build_time min", "{:.10f}".format(min(build_durations)))
        print("build_time max", "{:.10f}".format(max(build_durations)))
        print("build_time mean", "{:.10f}".format(statistics.mean(build_durations)))
        print("")
        print(f"RSS memory usage {rss_mem}")


if __name__ == "__main__":
    main()

Again we see significant improvements in runtime without any impact in memory usage.

What we lose?

While moving to dataclasses gives us some gains in performance it has a few drawbacks:

  • The dataclass implementation isn't as typesound as pydantic would be. Each and every property is not validated for type correctness.
  • Updating the jsonschema will be manual going forward. Pydantic has a great integration with jsonschema that allows you to generate jsonschema documents from objects. Dataclasses do not.
  • Drift between the schema and implementation could occur. Because the jsonschema and code are not connected they could drift in the future.

Early timing measurements are looking promising.
This schema was generated with pydantic and will be used in CI to
validate that schema and dataclasses are kept compatible, and could
be part of a validation flow for flagpole.
@markstory markstory requested review from GabeVillalobos and a team August 8, 2024 19:13
@github-actions github-actions bot added Scope: Frontend Automatically applied to PRs that change frontend components Scope: Backend Automatically applied to PRs that change backend components labels Aug 8, 2024
Copy link
Contributor

github-actions bot commented Aug 8, 2024

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

Copy link

codecov bot commented Aug 8, 2024

Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time.

❌ Failed Test Results:

Completed 21719 tests with 14 failed, 21504 passed and 201 skipped.

View the full list of failed tests

pytest

  • Class name: tests.flagpole.test_conditions.TestContainsConditions
    Test name: test_does_contain
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:104: in test_does_contain
    condition = ContainsCondition(property="foo", value="bar")
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestContainsConditions
    Test name: test_does_not_contain
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:120: in test_does_not_contain
    condition = ContainsCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestContainsConditions
    Test name: test_invalid_property_provided
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:136: in test_invalid_property_provided
    condition = ContainsCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestContainsConditions
    Test name: test_missing_context_property
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:149: in test_missing_context_property
    condition = ContainsCondition(property="foo", value="bar")
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestEqualsConditions
    Test name: test_equality_type_mismatch_strings
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:193: in test_equality_type_mismatch_strings
    condition = EqualsCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestEqualsConditions
    Test name: test_is_equal_case_insensitive
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:183: in test_is_equal_case_insensitive
    condition = EqualsCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestEqualsConditions
    Test name: test_is_equal_string
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:165: in test_is_equal_string
    condition = EqualsCondition(property="foo", value=value)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestEqualsConditions
    Test name: test_is_not_equals
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:175: in test_is_not_equals
    condition = EqualsCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestInConditions
    Test name: test_invalid_property_value
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:75: in test_invalid_property_value
    condition = InCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestInConditions
    Test name: test_is_in
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:33: in test_is_in
    condition = InCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestInConditions
    Test name: test_is_in_case_insensitivity
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:65: in test_is_in_case_insensitivity
    condition = InCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestInConditions
    Test name: test_is_in_numeric_string
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:50: in test_is_in_numeric_string
    condition = InCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestInConditions
    Test name: test_is_not_in
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:57: in test_is_not_in
    condition = InCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m
  • Class name: tests.flagpole.test_conditions.TestInConditions
    Test name: test_missing_context_property
    Flags:
    • backend

    #x1B[1m#x1B[31mtests/flagpole/test_conditions.py#x1B[0m:91: in test_missing_context_property
    in_condition = InCondition(property="foo", value=values)
    #x1B[1m#x1B[31mE TypeError: ConditionBase.__init__() missing 1 required positional argument: 'operator'#x1B[0m

"""The value to compare against the condition's evaluation context property."""

operator: str = dataclasses.field(default="")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this, but it is less gross than having to always provide operator to every condition constructor throughout tests.

Copy link
Member

@GabeVillalobos GabeVillalobos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits, but changes look good. Nice work!

description="A brief description or identifier for the segment"
def condition_from_dict(data: Mapping[str, Any]) -> ConditionBase:
operator_kind = ConditionOperatorKind(data.get("operator", "invalid"))
condition_cls = OPERATOR_LOOKUP[operator_kind]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to raise explicitly here if we get an invalid operator type? Otherwise, won't this be an attribute error on NoneType?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make the raise explicit. Currently it would raise KeyError which isn't as informative as it could be.

Comment on lines +84 to +87
with pytest.raises(ConditionTypeMismatchException):
not_condition.match(
context=EvaluationContext({"foo": attr_val}), segment_name="test"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main reason I went with the assertion util function approach before was because it made it explicit which invalid input failed when modifying condition logic. This is mostly fine, although it does make debugging just a bit harder.

@markstory markstory merged commit 79409ca into master Aug 15, 2024
49 checks passed
@markstory markstory deleted the flagpole-dataclasses branch August 15, 2024 15:01
@github-actions github-actions bot locked and limited conversation to collaborators Aug 31, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Scope: Backend Automatically applied to PRs that change backend components Scope: Frontend Automatically applied to PRs that change frontend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants