feat(new sink): Initial gcp_stackdriver_logging sink implementation #1555

Conversation
6f3640d to 6b8615c
Looks like a good first pass, excited to get this merged. Just a few documentation comments I had.
examples = ["vector-123456"]
description = """\
The billing account ID to which to publish logs.
Exactly one of `billing_account_id`, `folder_id`, `log_id`, or `organization_id` must be set.\
To clarify: a user cannot supply multiple of these attributes; they must supply exactly one? If that's the case, I wonder if it would make more sense to have a single polymorphic attribute:
[sinks.stackdriver]
type = "gcp_stackdriver_logging"
billing.type = "account"
billing.id = "12345"
Just food for thought. I don't completely understand what these attributes are used for.
The log name is one of `projects/[PROJECT_ID]/logs/[LOG_ID]`, `folders/[FOLDER_ID]/logs/[LOG_ID]`, etc. Exactly one of the four attributes must be supplied to construct the log name. I see, however, that I made a mistake in which four they are; the set should include `project_id` instead of `log_id`.
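For illustration, the "exactly one" constraint and the log-name assembly could be sketched roughly like this (a hypothetical helper using the attribute names from the config discussion above; this is not the sink's actual code):

```rust
// Hypothetical sketch: build a Stackdriver log name from exactly one of
// the four scoping attributes. Field and path prefixes follow the
// discussion above; this is illustrative, not the real implementation.
fn log_name(
    project_id: Option<&str>,
    billing_account_id: Option<&str>,
    folder_id: Option<&str>,
    organization_id: Option<&str>,
    log_id: &str,
) -> Result<String, String> {
    match (project_id, billing_account_id, folder_id, organization_id) {
        (Some(id), None, None, None) => Ok(format!("projects/{}/logs/{}", id, log_id)),
        (None, Some(id), None, None) => Ok(format!("billingAccounts/{}/logs/{}", id, log_id)),
        (None, None, Some(id), None) => Ok(format!("folders/{}/logs/{}", id, log_id)),
        (None, None, None, Some(id)) => Ok(format!("organizations/{}/logs/{}", id, log_id)),
        // Zero or more than one attribute set: reject the config.
        _ => Err("exactly one of project_id, billing_account_id, folder_id, \
                  or organization_id must be set"
            .into()),
    }
}

fn main() {
    assert_eq!(
        log_name(Some("vector-123456"), None, None, None, "app").unwrap(),
        "projects/vector-123456/logs/app"
    );
    assert!(log_name(None, None, None, None, "app").is_err());
}
```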
required = "true"
examples = ["global", "gce_instance"]
description = """\
The monitored resource type. \
I'm not sure I completely understand what this is. Could you explain a little further? What does `global` mean in this context? Are there docs we could link to that further describe what this is?
Yeah, understanding this requires a bit of knowledge of how Stackdriver organizes logs. It was a bit confusing to me at first too. The docs are here: https://cloud.google.com/logging/docs/reference/v2/rest/v2/MonitoredResource
common = false
examples = ["vector-123456"]
description = """\
The log ID to which to publish logs.
Can we link to docs or help the user understand where they can obtain this ID? Or should it be common sense for a Stackdriver user?
It might be common sense, not sure. It is not an ID that is retrieved from GCP; rather, it is something the user would generate based on what they are logging from. That is, you can specify anything and Stackdriver will file it away.
common = false
examples = ["vector-123456"]
description = """\
The folder ID to which to publish logs.
Can we link to docs or help the user understand where they can obtain this ID? Or should it be common sense for a Stackdriver user?
request_retry_attempts = 9223372036854775807
request_retry_initial_backoff_secs = 1
request_retry_max_duration_secs = 10
request_in_flight_limit = 5
Is this aligned with API limits or just an arbitrary limit?
Most are arbitrary, but the rate limit and batch size are from the API docs: https://cloud.google.com/logging/quotas#log-limits
gcp_stackdriver_logging sink implementation
044653a to e8560ae
This looks great! I think the way you did the splice was really smart!
I'd like to see the two config changes before we merge.
Another thing is, maybe we want to create a `src/sinks/gcp` folder that contains the util file, the Stackdriver sink, and the Pub/Sub sink. What do you think?
src/sinks/gcp_stackdriver_logging.rs (Outdated)
logs.pop(); // Strip the trailing comma

let mut body = wrapper.clone().into_bytes();
body.splice(wrapper_splice..wrapper_splice, logs);
This is a cool solution! I've been thinking all week about how I want to accomplish this, and I think this is as close to a good solution as we'll get.
I have some ideas of how to improve this but won't block this PR on it.
Frankly, I thought this was a huge hack, but if you like it, I'm happy. Best would of course be some kind of JSON serializer that could do incremental work, but it's not obvious how that might work in practice.
I am of course curious what could be done to improve on it.
Yeah, I've been mulling over this problem for a while, since it's a major part of making HTTP sinks easy to write. I plan on seeing if I can clean up this hack next week. I think you have basically the gist of it. We could possibly write a custom body type for HTTP that has slots for three slices, so we don't have to do any double allocations like what is happening now. The other idea is to use `Bytes` for the "static" bytes and arc-clone them rather than re-copying the vec.

But I think for right now this is a good hack, and it's not messy either, so I'm happy to keep it for now.
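The splice trick under discussion can be sketched in a self-contained form like this (a simplified illustration; the wrapper layout, variable names, and splice index are assumed for the example, not copied from the actual sink):

```rust
// Simplified sketch of the body-splice technique: serialize a JSON
// wrapper once, then insert the pre-encoded log entries between the
// brackets of its empty "entries" array. Illustrative only.
fn main() {
    // Pre-serialized wrapper with an empty entries array.
    let wrapper = String::from(r#"{"logName":"x","entries":[]}"#);
    // Index just past the `[` of "entries":[], where entries belong.
    let splice_at = wrapper.find("[]").unwrap() + 1;

    // Each encoded entry carries a trailing comma so they concatenate.
    let mut logs: Vec<u8> = br#"{"a":1},{"b":2},"#.to_vec();
    logs.pop(); // Strip the trailing comma

    let mut body = wrapper.clone().into_bytes();
    // Empty target range: insert the entries without replacing anything.
    body.splice(splice_at..splice_at, logs);

    assert_eq!(
        String::from_utf8(body).unwrap(),
        r#"{"logName":"x","entries":[{"a":1},{"b":2}]}"#
    );
}
```

The double allocation mentioned above is visible here: `wrapper.clone().into_bytes()` copies the wrapper for every batch, and `splice` shifts the tail bytes to make room, which is what a three-slot body or `Bytes`-based approach would avoid.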
I think this also could use a rebase against master.
It would make some sense. If we're going to do that, we should probably do the same with the AWS sinks too, which makes it a "code debt" kind of thing. At least it's a separate PR.
Signed-off-by: Bruce Guenter <bruce@timber.io>
Signed-off-by: Bruce Guenter <bruce@timber.io>
1b6c94c to 1b84780
Signed-off-by: Bruce Guenter <bruce@timber.io>
LGTM! Great work!
Signed-off-by: Bruce Guenter <bruce@timber.io>
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
@bruceg I'm writing documentation on GCP authentication. Do you think it makes sense to support the
Also, it appears the
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
I had thought the
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
Signed-off-by: Bruce Guenter <bruce@timber.io>
Nice work!
Adds a sink to support GCP Stackdriver Logging