New gcp_stackdriver_logs sink #572

Closed
derekperkins opened this issue Jul 3, 2019 · 7 comments
Labels
  • domain: data model (Anything related to Vector's internal data model)
  • domain: logs (Anything related to Vector's log events)
  • type: feature (A value-adding code addition that introduces new functionality)

Comments

@derekperkins

https://github.com/fluent/fluent-bit-docs/blob/master/output/stackdriver.md

@binarylogic binarylogic added domain: logs needs: requirements Needs a list of requirements before work can begin labels Aug 27, 2019
@binarylogic
Contributor

@bruceg I've assigned this issue to you. I think it makes sense to start by doing some research and replying here with a spec (example). 👍

@binarylogic binarylogic added this to the Initial GCP support milestone Sep 7, 2019
@bruceg
Member

bruceg commented Oct 24, 2019

stackdriver sink

The stackdriver sink feeds log events to the Google Cloud Platform's Stackdriver logging service. Records will be transmitted as JSON objects to preserve all data from the source and transforms. As Stackdriver supports batch submission (up to 1,000 entries per message) we should make use of this.
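
As a rough illustration of the batching described above, the following Python sketch splits events into chunks of at most 1,000 and wraps each chunk in a body shaped like the Logging API's entries.write request. The helper names, the "global" monitored resource type, and the sample events are assumptions made for illustration; they are not part of the proposed sink.

```python
from typing import Any, Dict, Iterator, List

MAX_ENTRIES_PER_REQUEST = 1000  # batch limit quoted in the spec above


def chunk(events: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `events` holding at most `size` items."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def build_write_request(project_id: str, log_id: str,
                        events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap events in an entries.write-style body, one jsonPayload per event."""
    return {
        "logName": f"projects/{project_id}/logs/{log_id}",
        "resource": {"type": "global"},  # assumed resource type for the sketch
        "entries": [{"jsonPayload": event} for event in events],
    }


def build_requests(project_id: str, log_id: str,
                   events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one request body per batch of at most MAX_ENTRIES_PER_REQUEST events."""
    return [
        build_write_request(project_id, log_id, batch)
        for batch in chunk(events, MAX_ENTRIES_PER_REQUEST)
    ]
```

Sending each event as a jsonPayload keeps every field from the source and transforms intact, which is the point of transmitting records as JSON objects.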

Config File (example)

[sinks.my_sink_id]
  # REQUIRED - General
  type = "stackdriver" # must be: "stackdriver"
  inputs = ["input1"] # must have an input source

  # REQUIRED - stackdriver

  # The log stream is identified by a project ID and log ID
  project_id = "MY-PROJECT-ID"
  log_id = "THIS-LOG-ID"

  # OPTIONAL - stackdriver

  # The path to the GCP credentials downloaded from the cloud console.
  # If unset, use $GOOGLE_APPLICATION_CREDENTIALS
  credentials_path = "/path/to/credentials.json"

  batch_timeout = SECS
  batch_size = BYTES
  request_in_flight_limit = COUNT
  request_timeout_secs = SECS
  request_rate_limit_duration_secs = SECS
  request_rate_limit_num = COUNT
  request_retry_attempts = COUNT
  request_retry_backoff_secs = SECS

Requirements

  • Access credentials read from Google IAM JSON files
  • Configurable project and log identifiers
  • Configurable batching and request handling through the Tower request settings (in-flight limit, timeout, rate limit, retries)
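
The credentials requirement and the $GOOGLE_APPLICATION_CREDENTIALS fallback from the config comments could be sketched as follows. This is a minimal Python sketch, not the sink's implementation: the function names are hypothetical, and the check for client_email and private_key assumes the usual layout of a service account key file.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional


def resolve_credentials_path(configured: Optional[str],
                             env: Mapping[str, str] = os.environ) -> str:
    """Prefer the configured path, else fall back to the environment variable."""
    if configured:
        return configured
    fallback = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not fallback:
        raise ValueError(
            "credentials_path is unset and GOOGLE_APPLICATION_CREDENTIALS is empty"
        )
    return fallback


def load_service_account(path: str) -> Dict[str, Any]:
    """Read the JSON key file and check the fields this sketch relies on."""
    with open(path, encoding="utf-8") as handle:
        info = json.load(handle)
    # Assumed required fields; a real key file carries more than these two.
    missing = [key for key in ("client_email", "private_key") if key not in info]
    if missing:
        raise ValueError(f"credentials file is missing fields: {missing}")
    return info
```

Failing early when neither source yields a path gives a clearer error at startup than a rejected request later.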

@bruceg bruceg removed the needs: requirements Needs a list of requirements before work can begin label Oct 24, 2019
@binarylogic
Contributor

binarylogic commented Oct 25, 2019

Looks great! Only 3 comments:

  1. It appears Stackdriver accepts logs, metrics, exceptions, and more. Do you think we'd create separate sinks for each? Each data type is a separate endpoint, correct? If so, I'd prefer that we name this sink gcp_stackdriver_logs.

  2. Are there potentially other authentication schemes for this? I ask because of #1084 (Rename basic_auth namespace to auth), and I'm wondering whether it makes sense to be consistent here as well. Ex:

    auth.strategy = "credentials_file"
    auth.credentials_file = "path/to/credentials.json"

    vs

    credentials_file = "path/to/credentials.json"

  3. Would it be better to rename log and project to log_id and project_id? The latter seems to be more clear.
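
To make the two layouts from item 2 concrete, here is a small Python sketch that accepts either the namespaced auth.* form or the flat credentials_file form and reduces both to one structure. The function name and the error messages are hypothetical, and "credentials_file" is the only strategy assumed to exist.

```python
from typing import Any, Dict


def normalize_auth(sink_config: Dict[str, Any]) -> Dict[str, str]:
    """Return {"strategy", "credentials_file"} from either config layout."""
    nested = sink_config.get("auth")
    if nested is not None:
        # Namespaced layout: auth.strategy plus auth.credentials_file
        strategy = nested.get("strategy")
        if strategy != "credentials_file":
            raise ValueError(f"unsupported auth.strategy: {strategy!r}")
        path = nested.get("credentials_file")
    else:
        # Flat layout: a single top-level credentials_file key
        strategy = "credentials_file"
        path = sink_config.get("credentials_file")
    if not path:
        raise ValueError("credentials_file is required")
    return {"strategy": strategy, "credentials_file": path}
```

The namespaced form costs one extra key today but leaves room for other strategies without renaming existing options.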

@bruceg
Member

bruceg commented Oct 25, 2019

  1. I've not seen any reference to metrics or exceptions in the stackdriver documentation. There is a brief mention of "events" on the info page. The API documentation shows a single hook for writing data, and it is specifically labelled "write logs". I would think the pubsub system can handle both, but that's obviously a separate system. Am I missing something?

  2. From my reading, I think all GCP server-to-server authentication uses OAuth2, which is a public-key based system. As such, a credentials structure is needed, which is downloaded from the GCP console as a JSON file. I agree it would be good to normalize it into a consistent scheme.

  3. The latter does seem clearer, yes. Also, the parent may be an organization ID, a billing account ID, or a folder ID (with a distinct log name for each), but I figured that for an MVP we should stick with just the most common case, the project ID.
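
The distinct log names mentioned in item 3 follow one pattern per parent type in the Logging API. A minimal Python sketch, assuming a hypothetical parent_kind option that the MVP would leave fixed at "project":

```python
from urllib.parse import quote

# Resource-name prefixes used by the Logging API for each kind of parent.
PARENT_PREFIXES = {
    "project": "projects",
    "organization": "organizations",
    "billing_account": "billingAccounts",
    "folder": "folders",
}


def log_name(parent_kind: str, parent_id: str, log_id: str) -> str:
    """Build the full log name; the log ID is URL-encoded, including slashes."""
    try:
        prefix = PARENT_PREFIXES[parent_kind]
    except KeyError:
        raise ValueError(f"unknown parent kind: {parent_kind!r}") from None
    encoded_log_id = quote(log_id, safe="")
    return f"{prefix}/{parent_id}/logs/{encoded_log_id}"
```

With the values from the example config, log_name("project", "MY-PROJECT-ID", "THIS-LOG-ID") yields projects/MY-PROJECT-ID/logs/THIS-LOG-ID.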

@derekperkins
Author

Here are links to Stackdriver error reporting and metrics docs. We use errors but not metrics.
https://cloud.google.com/error-reporting/reference/
https://cloud.google.com/monitoring/api/ref_v3/rest/

Yes, the service account JSON file is the preferred way to authenticate.
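
For context on how the service account file feeds the OAuth2 flow discussed earlier in the thread, the sketch below builds the JWT claim set that a server-to-server client signs and exchanges for an access token. It is an illustration under assumptions: the function names are hypothetical, the logging.write scope is assumed to be the one the sink would request, and the RS256 signature is deliberately left out because it needs a cryptography library.

```python
import base64
import json
import time
from typing import Any, Dict, Optional

LOGGING_WRITE_SCOPE = "https://www.googleapis.com/auth/logging.write"
DEFAULT_TOKEN_URI = "https://oauth2.googleapis.com/token"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(service_account: Dict[str, Any], now: Optional[float] = None,
                 lifetime_secs: int = 3600) -> Dict[str, Any]:
    """Claim set naming the service account, the scope, and the token endpoint."""
    issued_at = int(time.time()) if now is None else int(now)
    return {
        "iss": service_account["client_email"],
        "scope": LOGGING_WRITE_SCOPE,
        "aud": service_account.get("token_uri", DEFAULT_TOKEN_URI),
        "iat": issued_at,
        "exp": issued_at + lifetime_secs,
    }


def unsigned_jwt(claims: Dict[str, Any]) -> str:
    """Return 'header.claims'; the signature over this string is omitted here."""
    header = {"alg": "RS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    return ".".join(segments)
```

A real client would sign the returned string with the private_key from the JSON file and post the result to the token endpoint.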

@bruceg
Member

bruceg commented Oct 30, 2019

Ah, I see, I had misread the metrics as something to be read from Stackdriver rather than written to it. In any case, Stackdriver handles metrics and time series data quite differently from logs, so they would probably warrant a different sink.

@binarylogic binarylogic changed the title New stackdriver sink New gcp_stackdriver_logs sink Dec 3, 2019
@bruceg
Member

bruceg commented Jan 28, 2020

Closed by #1555

@bruceg bruceg closed this as completed Jan 28, 2020
@binarylogic binarylogic added type: feature A value-adding code addition that introduces new functionality. and removed type: new feature labels Jun 16, 2020
@binarylogic binarylogic added domain: logs Anything related to Vector's log events domain: data model Anything related to Vector's internal data model and removed event type: log labels Aug 6, 2020