Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using upstream OTel auto instrumentation and minimize otel_wrapper #158

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
167 changes: 139 additions & 28 deletions python/src/otel/otel_sdk/otel-instrument
Original file line number Diff line number Diff line change
@@ -1,44 +1,155 @@
#!/usr/bin/env python3
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0

from os import environ, system
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
`otel-instrument`

This script configures and sets up OpenTelemetry Python with the values we
expect will be used by the common user. It does this by setting the environment
variables OpenTelemetry uses, and then initializing OpenTelemetry using the
`opentelemetry-instrument` auto instrumentation script from the
`opentelemetry-instrumentation` package.

Additionally, this configuration assumes the user is using packages conforming
to the `opentelemetry-instrumentation` and `opentelemetry-sdk` specifications.

DO NOT use this script for anything else besides SETTING ENVIRONMENT VARIABLES.
The final `execl` command starts a new python process which replaces the current
one. One environment variables are preserved.

Usage
-----
In the configuration of an AWS Lambda function with this file at the
root level of a Lambda Layer:

.. code::

AWS_LAMBDA_EXEC_WRAPPER = /opt/otel-instrument

"""

import os
import sys
from os import environ, execl

# the path to the interpreter and all of the originally intended arguments
args = sys.argv[1:]
AWS_LAMBDA_FUNCTION_NAME = "AWS_LAMBDA_FUNCTION_NAME"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd expect these to actually refer to a value, not just the name itself. I'd just inline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do a different name for the constant if you think that's better? I think this pattern is a good idea though because we reference this variable in 3 different places and that's 3 different chances to have a typo (as has happened to me before).

It's also pretty normal in the upstream to do this? https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-api/src/opentelemetry/environment_variables.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming to something else seems like it reduces readability of the code that uses the variable. Using variables that refer to names, instead of values, also reduces readability though. I disagree with the upstream pattern here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, readability always needs to trump writability - typos during writing is still sort of OK for readability sakes. In many cases it results in duplicate docs, etc that get out of sync but it is what it is. In this case, readability suffers from the constants since then you have to jump to the constant definition to know the value every time - using the string has only positive impacts on readibility though

PYTHONPATH = "PYTHONPATH"
LAMBDA_RUNTIME_DIR = "LAMBDA_RUNTIME_DIR"

# enable OTel wrapper
environ["ORIG_HANDLER"] = environ.get("_HANDLER")
environ["_HANDLER"] = "otel_wrapper.lambda_handler"
# Update the python paths for packages with `sys.path` and `PYTHONPATH`

# config default traces exporter if missing
environ.setdefault("OTEL_TRACES_EXPORTER", "otlp_proto_grpc_span")
# - Expect this file to be at the Lambda Layer root. Then, we know where the
# OpenTelemetry Python packages are and can add them to the PYTHONPATH.
#
# See more:
# https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html#configuration-layers-path

# set service name
if environ.get("OTEL_RESOURCE_ATTRIBUTES") is None:
environ["OTEL_RESOURCE_ATTRIBUTES"] = "service.name=%s" % (
environ.get("AWS_LAMBDA_FUNCTION_NAME")
)
elif "service.name=" not in environ.get("OTEL_RESOURCE_ATTRIBUTES"):
environ["OTEL_RESOURCE_ATTRIBUTES"] = "service.name=%s,%s" % (
environ.get("AWS_LAMBDA_FUNCTION_NAME"),
environ.get("OTEL_RESOURCE_ATTRIBUTES"),
)
LAMBDA_LAYER_PKGS_DIR = os.path.abspath(os.path.join(os.sep, "opt", "python"))

# - Set Lambda Layer python packages in PYTHONPATH so `opentelemetry-instrument`
# script can find them (it needs to find `opentelemetry` find the auto
# instrumentation `run()` method)

if PYTHONPATH not in environ:
environ[PYTHONPATH] = LAMBDA_LAYER_PKGS_DIR
elif LAMBDA_LAYER_PKGS_DIR not in environ[PYTHONPATH]:
environ[PYTHONPATH] += os.pathsep + LAMBDA_LAYER_PKGS_DIR

# - Set Lambda runtime python packages in PYTHONPATH so
# `opentelemetry-instrument` script can find them during auto instrumentation
# and instrument them.

if PYTHONPATH not in environ:
environ[PYTHONPATH] = os.environ[LAMBDA_RUNTIME_DIR]
if os.environ[LAMBDA_RUNTIME_DIR] not in environ[PYTHONPATH]:
environ[PYTHONPATH] += os.pathsep + os.environ[LAMBDA_RUNTIME_DIR]

# Configure OpenTelemetry Python with environment variables

# - Set Lambda Layer python packages in current python path so we can find them
# right away in this script

if LAMBDA_LAYER_PKGS_DIR not in sys.path:
sys.path.append(LAMBDA_LAYER_PKGS_DIR)

# TODO: Remove if sdk support resource detector env variable configuration.
from opentelemetry.environment_variables import (
OTEL_PROPAGATORS,
OTEL_TRACES_EXPORTER,
)
from opentelemetry.sdk.environment_variables import (
OTEL_RESOURCE_ATTRIBUTES,
OTEL_SERVICE_NAME,
)

# - Set the default Trace Exporter

environ.setdefault(OTEL_TRACES_EXPORTER, "otlp_proto_grpc_span")

# - Set the service name

environ.setdefault(OTEL_SERVICE_NAME, environ.get(AWS_LAMBDA_FUNCTION_NAME))

# - Set the Resource Detectors (Resource Attributes)
#
# TODO: waiting on OTel Python support for configuring Resource Detectors from
# an environment variable. Replace the bottom code with the following when
# this is possible.
#
# environ["OTEL_RESOURCE_DETECTORS"] = "aws_lambda"
#
lambda_resource_attributes = (
"cloud.region=%s,cloud.provider=aws,faas.name=%s,faas.version=%s"
% (
environ.get("AWS_REGION"),
environ.get("AWS_LAMBDA_FUNCTION_NAME"),
environ.get(AWS_LAMBDA_FUNCTION_NAME),
environ.get("AWS_LAMBDA_FUNCTION_VERSION"),
)
)
environ["OTEL_RESOURCE_ATTRIBUTES"] = "%s,%s" % (
lambda_resource_attributes,
environ.get("OTEL_RESOURCE_ATTRIBUTES"),
)

# start the runtime with the extra options
system(" ".join(args))
if OTEL_RESOURCE_ATTRIBUTES not in environ:
environ[OTEL_RESOURCE_ATTRIBUTES] = lambda_resource_attributes
else:
environ[OTEL_RESOURCE_ATTRIBUTES] = "%s,%s" % (
lambda_resource_attributes,
environ.get(OTEL_RESOURCE_ATTRIBUTES),
)

# - Set the default propagators

environ.setdefault(OTEL_PROPAGATORS, "tracecontext,b3,xray")

# - Use a wrapper because AWS Lambda's `python3 /var/runtime/bootstrap.py` will
# use `imp.load_module` to load the function from the `_HANDLER` environment
# variable. This RELOADS the module and REMOVES any instrumentation patching
# done earlier. So we delay instrumentation until `boostrap.py` imports
# `otel_wrapper.py` at which we know the patching will be picked up.
#
# See more:
# https://docs.python.org/3/library/imp.html#imp.load_module

environ["ORIG_HANDLER"] = environ.get("_HANDLER")
environ["_HANDLER"] = "otel_wrapper.lambda_handler"

# - Call the upstream auto instrumentation script

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strongly recommend rewriting this file into a shell script. It solidifies two facts

  1. It's supposed to only do very simple things like setting environment variables, no business logic can happen in here
  2. There is no parent python process in our lambda handler, the fact that this file is python I feel can confuse a reader into wondering why we don't load instrumentation here (I think you tried this in a PR once :P). A shell script makes obvious this isn't the app, it's just configuring the environment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's been a lot of talk at the python SIG about moving all bash scripts to python. Python does scripts really well!

  1. It's supposed to only do very simple things like setting environment variables, no business logic can happen in here

One huge benefit is being able to import other python packages to GET environment variable constants which like you said is the point of this file. Then we can be sure updates get picked up. (Although they are spec defined things in the instrumentation section are still changing).

All in all a python script should be easier to maintain by a python script.

  1. There is no parent python process in our lambda handler, the fact that this file is python I feel can confuse a reader into wondering why we don't load instrumentation here (I think you tried this in a PR once :P). A shell script makes obvious this isn't the app, it's just configuring the environment

Lol yes I did but I learned a lot about the Lambda instrumentation in #152 🥲 I added an explicit comment at the top description. I feel like that was what the file was missing and with this it shouldn't confuse future readers.

executable = sys.argv[1]

execl(
executable,
executable,
os.path.join(LAMBDA_LAYER_PKGS_DIR, "bin", "opentelemetry-instrument",),
*sys.argv[1:],
)
86 changes: 2 additions & 84 deletions python/src/otel/otel_sdk/otel_wrapper.py
Original file line number Diff line number Diff line change
@@ -1,101 +1,19 @@
import logging
import os

from importlib import import_module
from pkg_resources import iter_entry_points

from opentelemetry.instrumentation.dependencies import get_dist_dependency_conflicts
from opentelemetry.instrumentation.aws_lambda import AwsLambdaInstrumentor
from opentelemetry.environment_variables import OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
from opentelemetry.instrumentation.distro import BaseDistro, DefaultDistro

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# TODO: waiting OTel Python supports env variable config for resource detector
# from opentelemetry.resource import AwsLambdaResourceDetector
# from opentelemetry.sdk.resources import Resource
# resource = Resource.create().merge(AwsLambdaResourceDetector().detect())
# trace.get_tracer_provider.resource = resource

def _load_distros() -> BaseDistro:
for entry_point in iter_entry_points("opentelemetry_distro"):
try:
distro = entry_point.load()()
if not isinstance(distro, BaseDistro):
logger.debug(
"%s is not an OpenTelemetry Distro. Skipping",
entry_point.name,
)
continue
logger.debug(
"Distribution %s will be configured", entry_point.name
)
return distro
except Exception as exc: # pylint: disable=broad-except
logger.debug("Distribution %s configuration failed", entry_point.name)
return DefaultDistro()

def _load_instrumentors(distro):
package_to_exclude = os.environ.get(OTEL_PYTHON_DISABLED_INSTRUMENTATIONS, [])
if isinstance(package_to_exclude, str):
package_to_exclude = package_to_exclude.split(",")
# to handle users entering "requests , flask" or "requests, flask" with spaces
package_to_exclude = [x.strip() for x in package_to_exclude]

for entry_point in iter_entry_points("opentelemetry_instrumentor"):
if entry_point.name in package_to_exclude:
logger.debug(
"Instrumentation skipped for library %s", entry_point.name
)
continue

try:
conflict = get_dist_dependency_conflicts(entry_point.dist)
if conflict:
logger.debug(
"Skipping instrumentation %s: %s",
entry_point.name,
conflict,
)
continue

# tell instrumentation to not run dep checks again as we already did it above
distro.load_instrumentor(entry_point, skip_dep_check=True)
logger.info("Instrumented %s", entry_point.name)
except Exception as exc: # pylint: disable=broad-except
logger.debug("Instrumenting of %s failed", entry_point.name)

def _load_configurators():
configured = None
for entry_point in iter_entry_points("opentelemetry_configurator"):
if configured is not None:
logger.warning(
"Configuration of %s not loaded, %s already loaded",
entry_point.name,
configured,
)
continue
try:
entry_point.load()().configure() # type: ignore
configured = entry_point.name
except Exception as exc: # pylint: disable=broad-except
logger.debug("Configuration of %s failed", entry_point.name)
AwsLambdaInstrumentor().instrument()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When upstreaming, though initially probably a PR for this repo since it's easier to test I guess, strongly recommend providing the original handler as a parameter to this function. That's all it should take to make the instrumentation work both manually, if a user wants to, or automatically from the lambda layer. We just want to decouple ORIG_HANDLER as it's a mysterious dependency between the two components.

I'd recommend sending a PR that only adds AwsLambdaInstrumenter file and has unit tests that exercise the instrumentation manually first to keep the scope small and easy to review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize currently the instrumentation seems to handle by falling back to _HANDLER when ORIG_HANDLER isn't set. It works ok but less coupling will make the parts easier to follow. Instrumentation is about instrumenting the handler, wrapper is about setting up the instrumentation automatically in the runtime. The separation will improve the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend sending a PR that only adds AwsLambdaInstrumenter file and has unit tests that exercise the instrumentation manually first to keep the scope small and easy to review.

Will definitely keep this in mind as I ask for feedback from the Python SIG. The actual code change in open-telemetry/opentelemetry-python-contrib#739 is small though and when I brought it up last week they said they would be able to help me review it! It's inflated by the LICENSE.

The tests already give an example of BOTH auto instrumentation AND manual instrumentation. Auto Instrumentation is a super important part to this product so I feel reluctant to commit it later when we already have tests that can give us confidence the scripts is working as intended.

(I put in a lot of effort into #150 to make sure the tests use the script because it helps give us confidence that instrumentation works as intended in Lambda).



def modify_module_name(module_name):
"""Returns a valid modified module to get imported"""
return ".".join(module_name.split("/"))


class HandlerError(Exception):
pass

distro = _load_distros()
distro.configure()
_load_configurators()
_load_instrumentors(distro)
# TODO: move to python-contrib
AwsLambdaInstrumentor().instrument(skip_dep_check=True)

path = os.environ.get("ORIG_HANDLER", None)
if path is None:
Expand Down