Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: versioned deployments for UDFs #25121

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open

feat: versioned deployments for UDFs #25121

wants to merge 26 commits into from

Conversation

aspicer
Copy link
Contributor

@aspicer aspicer commented Sep 21, 2024

Requires: https://github.com/PostHog/posthog-cloud-infra/pull/3059

Problem

We want to start rolling out UDFs to customers.

Currently, if there is a change in the schema of the UDFs and posthog is rolled back, UDFs will not work anymore.

Changes

Create a versioning system for UDFs. Every time we make a schema change or a breaking functionality change, we need to create a new version.

The process is documented at the top of udf_versioner.py but I will copy it here for posterity and ease:

For revertible cloud deploys:

  1. Edit and develop using the top level functions inside of user_scripts, along with user_defined_function.xml inside of docker/clickhouse
  2. Increment the version in udf_versioner.py and run that file every time you make breaking changes to UDFs (likely involving type definitions).
  3. After running this, you have to copy the user_defined_function.xml file in the newly created version folder (e.g. user_scripts/v4/user_defined_function.xml) to the posthog-cloud-infra repo and deploy it
  4. After that deploy goes out, it is safe to land and deploy the changes to the posthog repo
    If deploys aren't seamless, look into moving the action that copies the user_scripts folder to the clickhouse cluster earlier in the deploy process

This ends up creating a user_script directory that looks like the following:

user_scripts/
    aggregate_funnel.py
    [ all the python files ]
    aggregate_funnel_trends.py
    v0/
        aggregate_funnel.py
        [ all the python files ]
        aggregate_funnel_trends.py    
        user_defined_function.xml
    v1/
        aggregate_funnel.py
        [ all the python files ]
        aggregate_funnel_trends.py    
        user_defined_function.xml

The user_defined_function.xml file of each version contains all of the signatures versions before it.

Finally, the mapping code is updated to look like this:
image

Does this work well for both Cloud and self-hosted?

No impact for self-hosted or dev, only cloud.

How did you test this code?

Ran the script, checked the output.

Updated CI to use the cloud versioned config and functions for testing.

Deploy Plan

  1. Deploy the new user_defined_function.xml file with the v0 definitions to cloud-infra
  2. Canary deploy this branch, checking dev to make sure these functions work before rolling out.
  3. Check to see if there's any disruption of service with this deploy and fix that before rolling out further

@aspicer aspicer requested review from a team and skoob13 September 21, 2024 04:06
@aspicer aspicer marked this pull request as ready for review September 21, 2024 04:16
Copy link
Contributor

@skoob13 skoob13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I left a few suggestions, but nothing that stops it from merging.

# 3. Copy the `user_defined_function.xml` file in the newly created version folder (e.g. `user_scripts/v4/user_defined_function.xml`) to the `posthog-cloud-infra` repo and deploy it
# 4. After that deploy goes out, it is safe to land and deploy the changes to the `posthog` repo
# If deploys aren't seamless, look into moving the action that copies the `user_scripts` folder to the clickhouse cluster earlier in the deploy process
UDF_VERSION = 0 # Last modified by: @aspicer, 2024-09-20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do versioning here by an analogy with Django migrations? Django has latest_migration.manifest that contains the latest versions. Additionally, the version name is not unique, so if someone doesn't change a comment after the version, it won't be caught by git unless there are merge conflicts in files. Hash suffix/automated comment makes more sense here. It is probably not a big deal right now since not many folks work with UDFs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, latest_user_defined_function.xml should catch such cases. I suppose the most significant breaking change here is a difference in function definitions, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, absolutely. This should get caught in user_defined_function.xml inside of the docker folder. It definitely won't merge without conflict if two people have changed the signatures differently. I think this is okay, if it becomes an issue, we can add something like Django!

@@ -0,0 +1,70 @@
import argparse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be a Django management script?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any good reason to add that dependency, since these files have to run without django and it's really a completely separate subsystem.

@aspicer
Copy link
Contributor Author

aspicer commented Sep 23, 2024

I added a test that fails if latest_user_defined_function.xml isn't deployed to the cloud.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants