Using upstream OTel auto instrumentation and minimize otel_wrapper #158

NathanielRN · 2021-10-21T05:27:20Z

Description

@wangzlei and I worked together to figure out how to mostly auto-instrument Lambda functions using the upstream opentelemetry-instrument auto-instrumentation package.

We made two major updates:

Update the PYTHONPATH so we can call `opentelemetry-instrument`

The auto-instrumentation package counts on 2 things:

the opentelemetry package
locating all the python packages the user wants to instrument

Both of these require us to modify the PYTHONPATH environment variable right away in otel-instrument. AWS Lambda will add the correct paths, but it does so too late: only once it calls its originally intended entry point, python3 /var/runtime/bootstrap.py.
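As a rough illustration of the PYTHONPATH step described above, the sketch below prepends extra directories to PYTHONPATH in a copy of the environment before anything else is launched. The helper name `with_prepended_pythonpath` and the example directories are assumptions for illustration, not code from this PR.

```python
import os


def with_prepended_pythonpath(env, extra_dirs):
    """Return a copy of env whose PYTHONPATH starts with extra_dirs.

    Hypothetical helper: the name and signature are illustrative only.
    """
    updated = dict(env)
    existing = updated.get("PYTHONPATH", "")
    parts = list(extra_dirs)
    if existing:
        # Keep whatever was already on the path, after the new entries.
        parts.append(existing)
    updated["PYTHONPATH"] = os.pathsep.join(parts)
    return updated


# Assumed example directories: the layer packages and the function code.
example_env = with_prepended_pythonpath(
    {"PYTHONPATH": "/var/runtime"}, ["/opt/python", "/var/task"]
)
```

Working on a copy keeps the sketch easy to test; a real launcher script would presumably apply the same change to `os.environ` before handing control to the next process.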

Update the `otel_wrapper.py` to only call the `AwsLambdaInstrumentor`

All of the instrumentation is done in sitecustomize.py by opentelemetry-instrument. However, the way the Lambda Handler is imported in the AWS Lambda bootstrap.py file, which runs after sitecustomize.py, CLEARS any instrumentation done on lambda_function.lambda_handler. That's why we keep otel_wrapper.py around: bootstrap.py can call it, do any destructive imports it needs to do, and then otel_wrapper.py calls AwsLambdaInstrumentor.instrument() explicitly. This way we know the Lambda function is instrumented, and we can import the Lambda Handler and give bootstrap.py a handler that we know is instrumented.
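The ordering described above can be sketched without any OpenTelemetry dependency. In this minimal sketch, `load_handler_after_instrumenting` is a hypothetical helper and the `instrument` callback stands in for a call such as `AwsLambdaInstrumentor().instrument()`. It only illustrates the idea of instrumenting first and then importing the handler named by a `module.function` string.

```python
import importlib


def load_handler_after_instrumenting(handler_path, instrument):
    """Run instrument(), then import and return the handler function.

    handler_path uses the "module.function" form, for example
    "lambda_function.lambda_handler". Both this helper and the
    instrument callback are illustrative assumptions.
    """
    module_name, function_name = handler_path.rsplit(".", 1)
    instrument()  # stand-in for AwsLambdaInstrumentor().instrument()
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


calls = []
# json.dumps is used only as a stand-in for a real Lambda handler.
handler = load_handler_after_instrumenting(
    "json.dumps", lambda: calls.append("instrumented")
)
```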

Future Work

We would ideally like to modify bootstrap.py, or investigate how we can get rid of otel_wrapper.py and have all the instrumentation finished in sitecustomize.py, so that bootstrap.py can just import the normal Lambda handler named by the _HANDLER environment variable without destroying the import at all.

Fixes #152

anuraaga · 2021-10-21T05:36:26Z

python/src/otel/otel_sdk/otel-instrument

+#   know where the OpenTelemetry Python packages are and can add them to the
+#   PYTHONPATH.
+
+LAMBDA_LAYER_PKGS_DIR = os.path.abspath(


I think this file would be located at /var/task - shouldn't this explicitly reference /opt/python which while not an environment variable is a well defined path for lambda?

I think this file would be located at /var/task

Didn't you also say this in #157

Note, the file structure is important. Lambda does not seem to allow loading the handler if it is placed at a subdirectory such as opentelemetry.instrumentation.lambda.handler, when inside a layer it must be at the top level.

so this file has to be moved to the top level as we do in the build:

opentelemetry-lambda/python/src/otel/Dockerfile

Line 15 in 71221ee

mv /build/python/otel-instrument /build/otel-instrument && \

shouldn't this explicitly reference /opt/python which while not an environment variable is a well defined path for lambda?

Yeah we can do that, it was just annoying to test locally because ../python always existed but /opt/python did not. I'll move it back though because that explains the intent well.

anuraaga · 2021-10-21T05:36:38Z

python/src/otel/otel_sdk/otel-instrument


-# the path to the interpreter and all of the originally intended arguments
-args = sys.argv[1:]
+AWS_LAMBDA_FUNCTION_NAME = "AWS_LAMBDA_FUNCTION_NAME"


We'd expect these to actually refer to a value, not just the name itself. I'd just inline

We could do a different name for the constant if you think that's better? I think this pattern is a good idea though because we reference this variable in 3 different places and that's 3 different chances to have a typo (as has happened to me before).

It's also pretty normal in the upstream to do this? https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-api/src/opentelemetry/environment_variables.py

Renaming to something else seems like it reduces readability of the code that uses the variable. Using variables that refer to names, instead of values, also reduces readability though. I disagree with the upstream pattern here.

For reference, readability always needs to trump writability: typos during writing are still sort of OK for readability's sake. In many cases it results in duplicate docs, etc. that get out of sync, but it is what it is. In this case, readability suffers from the constants, since you have to jump to the constant definition to know the value every time. Using the string has only positive impacts on readability, though.
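For readers weighing the two positions, the difference can be seen side by side. This is only an illustrative sketch: `function_name_via_constant` and `function_name_inline` are hypothetical names, and the sample environment is made up.

```python
# Style 1: a module-level constant that holds the variable's *name*.
AWS_LAMBDA_FUNCTION_NAME = "AWS_LAMBDA_FUNCTION_NAME"


def function_name_via_constant(env):
    # A typo in the constant's identifier fails loudly with NameError,
    # but the reader has to look up the constant to see the string.
    return env.get(AWS_LAMBDA_FUNCTION_NAME)


def function_name_inline(env):
    # Style 2: the string is visible at the call site, but a typo in it
    # would silently return None.
    return env.get("AWS_LAMBDA_FUNCTION_NAME")


# Made-up environment used only to exercise both styles.
sample_env = {"AWS_LAMBDA_FUNCTION_NAME": "my-function"}
```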

anuraaga · 2021-10-21T05:38:14Z

python/src/otel/otel_sdk/otel-instrument

+
+# - Set the service name
+
+service_name_resource_attribute = "service.name"


This seems to not respect the user's service.name

Actually just use OTEL_SERVICE_NAME instead, respecting the default if already set

https://github.com/open-telemetry/opentelemetry-python/blob/18b5cb046bebab0c70841c662268bb7980c2fd5e/opentelemetry-sdk/src/opentelemetry/sdk/environment_variables.py#L356
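A minimal sketch of this suggestion, assuming the Lambda function name is the desired default: `setdefault` only writes `OTEL_SERVICE_NAME` when the user has not already set it. The helper name `apply_default_service_name` and the sample values are assumptions for illustration.

```python
def apply_default_service_name(env):
    """Default OTEL_SERVICE_NAME to the Lambda function name, if present.

    Hypothetical helper: env is any mutable mapping, such as os.environ.
    """
    function_name = env.get("AWS_LAMBDA_FUNCTION_NAME")
    if function_name is not None:
        # setdefault leaves a user-provided OTEL_SERVICE_NAME untouched.
        env.setdefault("OTEL_SERVICE_NAME", function_name)
    return env


# The user already chose a service name: it must survive.
user_set = apply_default_service_name(
    {"AWS_LAMBDA_FUNCTION_NAME": "my-function", "OTEL_SERVICE_NAME": "checkout"}
)
# No service name set: fall back to the function name.
defaulted = apply_default_service_name({"AWS_LAMBDA_FUNCTION_NAME": "my-function"})
```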

anuraaga · 2021-10-21T05:40:23Z

python/src/otel/otel_sdk/otel-instrument

+
+environ.setdefault(OTEL_PROPAGATORS, "tracecontext,b3,xray")
+
+# - Disable Lambda instrumentation because we will instrument when we know it


We should just remove the auto instrumentation entry point from setup.cfg instead as it can't really be used

I think this is the right idea, will do that!

anuraaga · 2021-10-21T05:42:16Z

python/src/otel/otel_sdk/otel-instrument

+
+environ[OTEL_PYTHON_DISABLED_INSTRUMENTATIONS] = "aws_lambda"
+
+# - Use a wrapper because instrumentations don't last across parent -> child


I don't think we need this comment; we've already found enough confusion on the topic of parent / child.

Thanks for this, yes it is confusing, removing it and explaining it better.

anuraaga · 2021-10-21T05:44:35Z

python/src/otel/otel_sdk/otel-instrument

+environ["_HANDLER"] = "otel_wrapper.lambda_handler"
+
+# - Call the upstream auto instrumentation script
+


Strongly recommend rewriting this file into a shell script. It solidifies two facts

It's supposed to only do very simple things like setting environment variables, no business logic can happen in here

There is no parent python process in our lambda handler, the fact that this file is python I feel can confuse a reader into wondering why we don't load instrumentation here (I think you tried this in a PR once :P). A shell script makes obvious this isn't the app, it's just configuring the environment

There's been a lot of talk at the python SIG about moving all bash scripts to python. Python does scripts really well!

It's supposed to only do very simple things like setting environment variables, no business logic can happen in here

One huge benefit is being able to import other Python packages to GET environment variable constants, which, like you said, is the point of this file. Then we can be sure updates get picked up. (Although they are spec-defined, things in the instrumentation section are still changing.)

All in all, a Python script should be easier for Python developers to maintain.

There is no parent python process in our lambda handler, the fact that this file is python I feel can confuse a reader into wondering why we don't load instrumentation here (I think you tried this in a PR once :P). A shell script makes obvious this isn't the app, it's just configuring the environment

Lol yes I did but I learned a lot about the Lambda instrumentation in #152 🥲 I added an explicit comment at the top description. I feel like that was what the file was missing and with this it shouldn't confuse future readers.

anuraaga · 2021-10-21T05:45:53Z

python/src/otel/otel_sdk/otel_wrapper.py

-            configured = entry_point.name
-        except Exception as exc:  # pylint: disable=broad-except
-            logger.debug("Configuration of %s failed", entry_point.name)
+AwsLambdaInstrumentor().instrument()


When upstreaming, though initially probably a PR for this repo since it's easier to test I guess, strongly recommend providing the original handler as a parameter to this function. That's all it should take to make the instrumentation work both manually, if a user wants to, or automatically from the lambda layer. We just want to decouple ORIG_HANDLER as it's a mysterious dependency between the two components.

I'd recommend sending a PR that only adds the AwsLambdaInstrumentor file and has unit tests that exercise the instrumentation manually first, to keep the scope small and easy to review.

I realize the instrumentation currently seems to handle this by falling back to _HANDLER when ORIG_HANDLER isn't set. It works OK, but less coupling will make the parts easier to follow. Instrumentation is about instrumenting the handler; the wrapper is about setting up the instrumentation automatically in the runtime. The separation will improve the code.
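One way to picture the suggested decoupling, as a sketch only: the handler path is accepted as an explicit argument, and the environment lookup (with the ORIG_HANDLER to _HANDLER fallback mentioned above) is used only when no argument is given. `resolve_handler_path` is a hypothetical helper, not part of any existing instrumentation API.

```python
def resolve_handler_path(env, explicit_handler=None):
    """Pick the handler path: explicit argument, ORIG_HANDLER, then _HANDLER.

    Hypothetical helper illustrating the fallback order only.
    """
    if explicit_handler is not None:
        # Manual instrumentation: the caller says which handler to wrap.
        return explicit_handler
    if "ORIG_HANDLER" in env:
        # Layer setup: the wrapper saved the user's handler here.
        return env["ORIG_HANDLER"]
    return env["_HANDLER"]


# Made-up environment resembling what the layer would set up.
layer_env = {
    "ORIG_HANDLER": "lambda_function.lambda_handler",
    "_HANDLER": "otel_wrapper.lambda_handler",
}
```

With the explicit argument, the instrumentation no longer needs to know that ORIG_HANDLER exists; only the wrapper would read it and pass the value along.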

I'd recommend sending a PR that only adds the AwsLambdaInstrumentor file and has unit tests that exercise the instrumentation manually first, to keep the scope small and easy to review.

Will definitely keep this in mind as I ask for feedback from the Python SIG. The actual code change in open-telemetry/opentelemetry-python-contrib#739 is small though and when I brought it up last week they said they would be able to help me review it! It's inflated by the LICENSE.

The tests already give an example of BOTH auto instrumentation AND manual instrumentation. Auto instrumentation is a super important part of this product, so I feel reluctant to commit it later when we already have tests that can give us confidence the script is working as intended.

(I put in a lot of effort into #150 to make sure the tests use the script because it helps give us confidence that instrumentation works as intended in Lambda).

NathanielRN force-pushed the make-exec-script-call-auto-instrument-script branch from 4b147b1 to a270074 Compare October 21, 2021 05:35

Using upstream OTel auto instrumentation and minimize otel_wrapper

46110d2

NathanielRN force-pushed the make-exec-script-call-auto-instrument-script branch from a270074 to 46110d2 Compare October 21, 2021 05:36

anuraaga reviewed Oct 21, 2021


Update comments and delete unnecessary code

4cb25c1

anuraaga closed this Oct 21, 2021



