healthcheckextension: Add an ability to delay collector shutdown after HC NotReady state

Adds a new optional `shutdown_delay_duration` parameter. It defaults to 0, which means no delay.
This is useful in Kubernetes environments to delay shutdown, giving the pod readiness probe time to detect the NotReady state and move traffic away.
Yuri Kotov committed Aug 7, 2024
1 parent 3cadc39 commit 55e0c5a
Showing 7 changed files with 28 additions and 1 deletion.
4 changes: 3 additions & 1 deletion extension/healthcheckextension/README.md
@@ -30,7 +30,8 @@ The following settings are required:

- `endpoint` (default = localhost:13133): Address to publish the health check status. For full list of `ServerConfig` refer [here](https://github.com/open-telemetry/opentelemetry-collector/tree/main/config/confighttp). You can temporarily disable the `component.UseLocalHostAsDefaultHost` feature gate to change this to 0.0.0.0:13133. This feature gate will be removed in a future release.
- `path` (default = "/"): Specifies the path to be configured for the health check server.
- `response_body` (default = ""): Specifies a static body that overrides the default response returned by the health check service.
- `shutdown_delay_duration` (default = 0): Specifies a sleep interval for setting NotReady status. This is useful in k8s env to delay shutdown, allowing pod readiness to detect NotReady state and move traffic.

Example:

@@ -44,6 +45,7 @@ extensions:
      cert_file: "/path/to/cert.crt"
      key_file: "/path/to/key.key"
    path: "/health/status"
    shutdown_delay_duration: 2s
```
The full list of settings exposed for this exporter is documented [here](./config.go)
4 changes: 4 additions & 0 deletions extension/healthcheckextension/config.go
@@ -37,6 +37,10 @@ type Config struct {

	// CheckCollectorPipeline contains the list of settings of collector pipeline health check
	CheckCollectorPipeline checkCollectorPipelineSettings `mapstructure:"check_collector_pipeline"`

	// ShutdownDelayDuration represents the sleep interval for setting NotReady status
	// This is useful in k8s env to delay shutdown, allowing pod readiness to detect NotReady state and move traffic
	ShutdownDelayDuration time.Duration `mapstructure:"shutdown_delay_duration"`
}

var _ component.Config = (*Config)(nil)
11 changes: 11 additions & 0 deletions extension/healthcheckextension/config_test.go
@@ -6,6 +6,7 @@ package healthcheckextension
import (
	"path/filepath"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
@@ -59,6 +60,16 @@ func TestLoadConfig(t *testing.T) {
			id:          component.NewIDWithName(metadata.Type, "invalidpath"),
			expectedErr: errInvalidPath,
		},
		{
			id: component.NewIDWithName(metadata.Type, "shutdowndelay5s"),
			expected: &Config{
				ServerConfig:           confighttp.NewDefaultServerConfig(),
				Path:                   "/",
				ResponseBody:           nil,
				CheckCollectorPipeline: defaultCheckCollectorPipelineSettings(),
				ShutdownDelayDuration:  5 * time.Second,
			},
		},
	}
	for _, tt := range tests {
		t.Run(tt.id.String(), func(t *testing.T) {
1 change: 1 addition & 0 deletions extension/healthcheckextension/factory.go
@@ -33,6 +33,7 @@ func createDefaultConfig() component.Config {
		},
		CheckCollectorPipeline: defaultCheckCollectorPipelineSettings(),
		Path:                   "/",
		ShutdownDelayDuration:  0,
	}
}

1 change: 1 addition & 0 deletions extension/healthcheckextension/factory_test.go
@@ -24,6 +24,7 @@ func TestFactory_CreateDefaultConfig(t *testing.T) {
		},
		CheckCollectorPipeline: defaultCheckCollectorPipelineSettings(),
		Path:                   "/",
		ShutdownDelayDuration:  0,
	}, cfg)

	assert.NoError(t, componenttest.CheckConfigStruct(cfg))
6 changes: 6 additions & 0 deletions extension/healthcheckextension/healthcheckextension.go
@@ -8,6 +8,7 @@ import (
	"errors"
	"fmt"
	"net/http"
	"time"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/extension"
@@ -91,6 +92,11 @@ func (hc *healthCheckExtension) Ready() error {

func (hc *healthCheckExtension) NotReady() error {
	hc.state.Set(healthcheck.Unavailable)

	if hc.config.ShutdownDelayDuration > 0 {
		time.Sleep(hc.config.ShutdownDelayDuration)
	}

	return nil
}

2 changes: 2 additions & 0 deletions extension/healthcheckextension/testdata/config.yaml
@@ -28,3 +28,5 @@ health_check/invalidpath:
    enabled: false
    interval: "5m"
    exporter_failure_threshold: 5
health_check/shutdowndelay5s:
  shutdown_delay_duration: 5s
