AggregateError error message is ignored during processing #62299

Open
3 tasks done
kamilogorek opened this issue Dec 22, 2023 · 14 comments

Comments

@kamilogorek
Contributor

Is there an existing issue for this?

How do you use Sentry?

Sentry Saas (sentry.io)

Which SDK are you using?

@sentry/browser

SDK Version

7.91.0

Framework Version

No response

Link to Sentry event

No response

SDK Setup

No response

Steps to Reproduce

function foo () {
  const rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')]
  throw new AggregateError(rejections, 'wat')
}

try {
  foo()
} catch(e) {
  Sentry.captureException(e)
}

Expected Result

When creating a new AggregateError instance, it accepts an optional 2nd argument, which should be used as the error message per https://tc39.es/ecma262/multipage/fundamental-objects.html#sec-aggregate-error

Currently LinkedErrors will ignore that property and use the message from the first child instead.
In the case above, wat should be the message for the root error.
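For reference, a quick check in plain JavaScript (no Sentry involved) of what the constructor exposes: the second argument ends up on `message`, and the wrapped errors live on `errors`. The variable names are only for illustration.

```javascript
// Plain JavaScript, no Sentry involved: inspect what AggregateError exposes.
const rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];
const aggregate = new AggregateError(rejections, 'wat');

// Message of the root error (the optional 2nd constructor argument).
const rootMessage = aggregate.message;
// Messages of the wrapped child errors, in the order they were passed in.
const childMessages = aggregate.errors.map((e) => e.message);

console.log(rootMessage, childMessages);
```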

Actual Result

First child error is used as the source of the message.

image

@lforst
Member

lforst commented Dec 27, 2023

Thanks @kamilogorek for the hint! <3

As for solving this issue, I think this needs to be fixed in Relay or the Product. There is some mechanism that is deciding what to render as the error message - I currently don't know what that mechanism is. The SDK, as of now, is sending an array of errors (exception.values) that point to each other via exception_ids and parent_ids to achieve a "hierarchy" between errors. The AggregateError also carries an is_exception_group: true flag.

The above is all defined in this RFC: https://github.com/getsentry/rfcs/blob/760467b85dbf86bd8b2b88d2a81f1a258dc07a1d/text/0079-exception-groups.md

To make solving this a bit easier for the teams that pick this up, here is an example event: https://sentry-sdks.sentry.io/issues/4765181401/?project=4505391490007040&query=is%3Aunresolved&referrer=issue-stream&statsPeriod=30d&stream_index=1

Here is that event's payload:

{"event_id":"50104445519745998f51064855b0165b","project":4505391490007040,"release":"c784c20456edf2a1807df6d988991dcbabfbeb06","dist":null,"platform":"javascript","message":"","datetime":"2023-12-27T08:34:11+00:00","tags":[["browser","Chrome 120.0.0"],["browser.name","Chrome"],["device","Mac"],["device.family","Mac"],["environment","production"],["handled","yes"],["level","error"],["mechanism","generic"],["os","Mac OS X >=10.15.7"],["os.name","Mac OS X"],["release","c784c20456edf2a1807df6d988991dcbabfbeb06"],["user","ip:84.115.220.159"],["url","http://localhost:3000/"]],"_metrics":{"bytes.ingested.event":3244,"bytes.stored.event":14770},"breadcrumbs":{"values":[{"timestamp":1703666047.316,"type":"default","category":"sentry.transaction","level":"info","message":"4288946784ac465b9c9b90e8fde241a4","event_id":"4288946784ac465b9c9b90e8fde241a4"}]},"contexts":{"AggregateError":{"type":"AggregateError"},"browser":{"name":"Chrome","version":"120.0.0","type":"browser"},"device":{"family":"Mac","model":"Mac","brand":"Apple","type":"device"},"os":{"name":"Mac OS X","version":">=10.15.7","type":"os"},"trace":{"trace_id":"211079cebf7848b1a81e05ce30ebde4e","span_id":"ac065b2420163ecb","status":"unknown","type":"trace"}},"culprit":"browserError(src/browser/index)","debug_meta":{"images":[{"code_file":"http://localhost:3000/script.js","debug_id":"5b128576-a8d0-4dd3-a86d-a4795d822f83","type":"sourcemap"}]},"environment":"production","errors":[{"type":"js_no_source","symbolicator_type":"missing_source","url":"http://localhost:3000/"}],"exception":{"values":[{"type":"Error","value":"Message 3","stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","module":"<unknown module>","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"browserError","module":"src/browser/index","filename":"../../src/browser/index.ts","abs_path":"http://localhost:3000/src/browser/index.ts","lineno":24,"colno":71,"pre_context":["  
debug: true,","  normalizeDepth: 3,","});","","function browserError() {"],"context_line":"  const rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","post_context":["  // @ts-ignore","  throw new AggregateError(rejections, 'wat');","}","","function serverError() {"],"in_app":true,"data":{"sourcemap":"http://localhost:3000/script.js.map","resolved_with":"debug-id","symbolicated":true}}]},"raw_stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"Object.browserError","filename":"/script.js","abs_path":"http://localhost:3000/script.js","lineno":12801,"colno":73,"pre_context":["      tracesSampleRate: 1.0,","      debug: true,","      normalizeDepth: 3","    });","    function browserError() {"],"context_line":"      var rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","post_context":["      // @ts-ignore","      throw new AggregateError(rejections, 'wat');","    }","    function serverError() {","      fetch('error');"],"in_app":true}]},"mechanism":{"type":"onerror","handled":false,"source":"errors[2]","exception_id":3,"parent_id":0}},{"type":"Error","value":"Message 2","stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","module":"<unknown module>","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"browserError","module":"src/browser/index","filename":"../../src/browser/index.ts","abs_path":"http://localhost:3000/src/browser/index.ts","lineno":24,"colno":47,"pre_context":["  debug: true,","  normalizeDepth: 3,","});","","function browserError() {"],"context_line":"  const rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","post_context":["  // @ts-ignore","  throw new AggregateError(rejections, 'wat');","}","","function serverError() 
{"],"in_app":true,"data":{"sourcemap":"http://localhost:3000/script.js.map","resolved_with":"debug-id","symbolicated":true}}]},"raw_stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"Object.browserError","filename":"/script.js","abs_path":"http://localhost:3000/script.js","lineno":12801,"colno":49,"pre_context":["      tracesSampleRate: 1.0,","      debug: true,","      normalizeDepth: 3","    });","    function browserError() {"],"context_line":"      var rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","post_context":["      // @ts-ignore","      throw new AggregateError(rejections, 'wat');","    }","    function serverError() {","      fetch('error');"],"in_app":true}]},"mechanism":{"type":"chained","handled":true,"source":"errors[1]","exception_id":2,"parent_id":0}},{"type":"Error","value":"Message 1","stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","module":"<unknown module>","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"browserError","module":"src/browser/index","filename":"../../src/browser/index.ts","abs_path":"http://localhost:3000/src/browser/index.ts","lineno":24,"colno":23,"pre_context":["  debug: true,","  normalizeDepth: 3,","});","","function browserError() {"],"context_line":"  const rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","post_context":["  // @ts-ignore","  throw new AggregateError(rejections, 'wat');","}","","function serverError() 
{"],"in_app":true,"data":{"sourcemap":"http://localhost:3000/script.js.map","resolved_with":"debug-id","symbolicated":true}}]},"raw_stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"Object.browserError","filename":"/script.js","abs_path":"http://localhost:3000/script.js","lineno":12801,"colno":25,"pre_context":["      tracesSampleRate: 1.0,","      debug: true,","      normalizeDepth: 3","    });","    function browserError() {"],"context_line":"      var rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","post_context":["      // @ts-ignore","      throw new AggregateError(rejections, 'wat');","    }","    function serverError() {","      fetch('error');"],"in_app":true}]},"mechanism":{"type":"chained","handled":true,"source":"errors[0]","exception_id":1,"parent_id":0}},{"type":"AggregateError","value":"wat","stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","module":"<unknown module>","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"browserError","module":"src/browser/index","filename":"../../src/browser/index.ts","abs_path":"http://localhost:3000/src/browser/index.ts","lineno":26,"colno":9,"pre_context":["});","","function browserError() {","  const rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","  // @ts-ignore"],"context_line":"  throw new AggregateError(rejections, 'wat');","post_context":["}","","function serverError() {","  
fetch('error');","}"],"in_app":true,"data":{"sourcemap":"http://localhost:3000/script.js.map","resolved_with":"debug-id","symbolicated":true}}]},"raw_stacktrace":{"frames":[{"function":"HTMLButtonElement.onclick","filename":"http://localhost:3000/","abs_path":"http://localhost:3000/","lineno":10,"colno":52,"in_app":true},{"function":"Object.browserError","filename":"/script.js","abs_path":"http://localhost:3000/script.js","lineno":12803,"colno":13,"pre_context":["      normalizeDepth: 3","    });","    function browserError() {","      var rejections = [new Error('Message 1'), new Error('Message 2'), new Error('Message 3')];","      // @ts-ignore"],"context_line":"      throw new AggregateError(rejections, 'wat');","post_context":["    }","    function serverError() {","      fetch('error');","    }","    function dialog() {"],"in_app":true}]},"mechanism":{"type":"generic","handled":true,"is_exception_group":true,"exception_id":0}}]},"fingerprint":["{{ default }}"],"grouping_config":{"enhancements":"eJybzDRxc15qeXFJZU6qlZGBkbGugaGuoeEEAHJMCAM","id":"newstyle:2023-01-11"},"hashes":["fddb684c0e48458f93686e9a757d1a90"],"ingest_path":[{"version":"23.12.1","public_key":"XE7QiyuNlja9PZ7I9qJlwQotzecWrUIN91BAO7Q5R38"}],"key_id":"3206523","level":"error","location":"../../src/browser/index.ts","logger":"","main_exception_id":1,"metadata":{"display_title_with_tree_label":false,"filename":"../../src/browser/index.ts","function":"browserError","in_app_frame_mix":"in-app-only","type":"Error","value":"Message 1"},"nodestore_insert":1703666055.618679,"received":1703666051.991679,"request":{"url":"http://localhost:3000/","headers":[["User-Agent","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"]]},"scraping_attempts":[{"details":"Can't connect to restricted host 
localhost","reason":"invalid_host","status":"failure","url":"http://localhost:3000/"},{"status":"not_attempted","url":"http://localhost:3000/script.js"},{"status":"not_attempted","url":"http://localhost:3000/script.js.map"}],"sdk":{"name":"sentry.javascript.browser","version":"7.90.0","integrations":["InboundFilters","FunctionToString","TryCatch","Breadcrumbs","GlobalHandlers","LinkedErrors","Dedupe","HttpContext","BrowserTracing","ExtraErrorData"],"packages":[{"name":"npm:@sentry/browser","version":"7.90.0"}]},"timestamp":1703666051.843,"title":"Error: Message 1","type":"error","user":{"ip_address":"84.115.220.159","geo":{"country_code":"AT","city":"Vienna","subdivision":"Vienna","region":"Austria"}},"version":"7"}

And here is the envelope item the SDK sent for that event:

{"exception":{"values":[{"type":"Error","value":"Message 3","stacktrace":{"frames":[{"filename":"http://localhost:3000/","function":"HTMLButtonElement.onclick","in_app":true,"lineno":10,"colno":52},{"filename":"http://localhost:3000/script.js","function":"Object.browserError","in_app":true,"lineno":12801,"colno":73}]},"mechanism":{"type":"onerror","handled":false,"source":"errors[2]","exception_id":3,"parent_id":0}},{"type":"Error","value":"Message 2","stacktrace":{"frames":[{"filename":"http://localhost:3000/","function":"HTMLButtonElement.onclick","in_app":true,"lineno":10,"colno":52},{"filename":"http://localhost:3000/script.js","function":"Object.browserError","in_app":true,"lineno":12801,"colno":49}]},"mechanism":{"type":"chained","handled":true,"source":"errors[1]","exception_id":2,"parent_id":0}},{"type":"Error","value":"Message 1","stacktrace":{"frames":[{"filename":"http://localhost:3000/","function":"HTMLButtonElement.onclick","in_app":true,"lineno":10,"colno":52},{"filename":"http://localhost:3000/script.js","function":"Object.browserError","in_app":true,"lineno":12801,"colno":25}]},"mechanism":{"type":"chained","handled":true,"source":"errors[0]","exception_id":1,"parent_id":0}},{"type":"AggregateError","value":"wat","stacktrace":{"frames":[{"filename":"http://localhost:3000/","function":"HTMLButtonElement.onclick","in_app":true,"lineno":10,"colno":52},{"filename":"http://localhost:3000/script.js","function":"Object.browserError","in_app":true,"lineno":12803,"colno":13}]},"mechanism":{"type":"generic","handled":true,"is_exception_group":true,"exception_id":0}}]},"level":"error","platform":"javascript","request":{"url":"http://localhost:3000/","headers":{"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 
Safari/537.36"}},"event_id":"50104445519745998f51064855b0165b","timestamp":1703666051.843,"environment":"production","release":"c784c20456edf2a1807df6d988991dcbabfbeb06","sdk":{"integrations":["InboundFilters","FunctionToString","TryCatch","Breadcrumbs","GlobalHandlers","LinkedErrors","Dedupe","HttpContext","BrowserTracing","ExtraErrorData"],"name":"sentry.javascript.browser","version":"7.90.0","packages":[{"name":"npm:@sentry/browser","version":"7.90.0"}]},"breadcrumbs":[{"timestamp":1703666047.316,"category":"sentry.transaction","event_id":"4288946784ac465b9c9b90e8fde241a4","message":"4288946784ac465b9c9b90e8fde241a4"}],"contexts":{"trace":{"trace_id":"211079cebf7848b1a81e05ce30ebde4e","span_id":"ac065b2420163ecb"},"AggregateError":{}},"debug_meta":{"images":[{"type":"sourcemap","code_file":"http://localhost:3000/script.js","debug_id":"5b128576-a8d0-4dd3-a86d-a4795d822f83"}]}}
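To make the hierarchy in that payload easier to see, here is a rough sketch that rebuilds the tree from `exception_id` / `parent_id`. `buildExceptionTree` is a made-up helper name, not something the SDK or Sentry provides, and the sample values are trimmed from the envelope item above (stack traces omitted).

```javascript
// Hypothetical helper (not part of any Sentry SDK): rebuild the hierarchy
// from the flat exception.values list using exception_id / parent_id.
function buildExceptionTree(values) {
  const nodes = new Map();
  for (const v of values) {
    nodes.set(v.mechanism.exception_id, { type: v.type, value: v.value, children: [] });
  }
  let root = null;
  for (const v of values) {
    const node = nodes.get(v.mechanism.exception_id);
    const parentId = v.mechanism.parent_id;
    if (parentId === undefined) {
      root = node; // no parent_id: treated here as the root of the group
    } else {
      nodes.get(parentId).children.push(node);
    }
  }
  return root;
}

// Trimmed copy of exception.values from the envelope item above.
const envelopeValues = [
  { type: 'Error', value: 'Message 3', mechanism: { exception_id: 3, parent_id: 0 } },
  { type: 'Error', value: 'Message 2', mechanism: { exception_id: 2, parent_id: 0 } },
  { type: 'Error', value: 'Message 1', mechanism: { exception_id: 1, parent_id: 0 } },
  { type: 'AggregateError', value: 'wat', mechanism: { is_exception_group: true, exception_id: 0 } },
];

const tree = buildExceptionTree(envelopeValues);
console.log(tree.value, tree.children.map((child) => child.value));
```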

Suggestion

If we detect that there is an Exception Group in an event, we prioritize the message of the error that has is_exception_group: true. If there is no such flag on an error, we fall back to the previous logic, because in the case of "linked errors" without exception groups, we still want to have the message of the "first" error that happened.
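Roughly, the suggestion could look like the sketch below. This is only an illustration in JavaScript, not the actual server code: `pickTitleException` is a made-up name, and the default fallback ("use the last entry in the list") is an assumption standing in for whatever the previous logic really is.

```javascript
// Hypothetical sketch of the suggested rule; the real decision happens server-side.
// `fallback` stands in for the previous logic; "last entry" is an assumption.
function pickTitleException(values, fallback = (vals) => vals[vals.length - 1]) {
  const groups = values.filter(
    (v) => v.mechanism !== undefined && v.mechanism.is_exception_group === true
  );
  if (groups.length === 0) {
    return fallback(values); // plain linked errors: keep the old behaviour
  }
  // Prefer a group without a parent_id (the root), otherwise take the first group found.
  return groups.find((g) => g.mechanism.parent_id === undefined) || groups[0];
}

// Trimmed values from the example event (stack traces omitted).
const groupValues = [
  { type: 'Error', value: 'Message 3', mechanism: { exception_id: 3, parent_id: 0 } },
  { type: 'Error', value: 'Message 2', mechanism: { exception_id: 2, parent_id: 0 } },
  { type: 'Error', value: 'Message 1', mechanism: { exception_id: 1, parent_id: 0 } },
  { type: 'AggregateError', value: 'wat', mechanism: { is_exception_group: true, exception_id: 0 } },
];

console.log(pickTitleException(groupValues).value);
```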

@lforst lforst transferred this issue from getsentry/sentry-javascript Dec 27, 2023
@getsantry
Contributor

getsantry bot commented Dec 27, 2023

Assigning to @getsentry/support for routing ⏲️

@getsantry
Contributor

getsantry bot commented Dec 27, 2023

Routing to @getsentry/product-owners-issues for triage ⏲️

@getsantry
Contributor

getsantry bot commented Dec 27, 2023

Routing to @getsentry/product-owners-settings-relay for triage ⏲️

@getsantry
Contributor

getsantry bot commented Dec 27, 2023

Routing to @getsentry/product-owners-settings-relay for triage ⏲️

@jjbayer
Member

jjbayer commented Jan 3, 2024

This seems to happen because the main_exception_id is set, which should only happen if there's only one distinct exception in the group:

# We'll also set the main_exception_id, which is used in the extract_metadata function
# in src/sentry/eventtypes/error.py - which will ensure the issue is titled by this
# item rather than the exception group.
if len(distinct_top_level_exceptions) == 1:
    main_exception = distinct_top_level_exceptions[0]
    event.data["main_exception_id"] = main_exception.mechanism.exception_id
    return list(get_first_path(main_exception))

There might be a bug in the logic that identifies the main exception.

@getsantry
Contributor

getsantry bot commented Jan 3, 2024

Routing to @getsentry/product-owners-issues for triage ⏲️

@lforst
Member

lforst commented Jan 3, 2024

@jjbayer Dayum that could be it! Thanks for the investigation!

@lforst
Member

lforst commented Jan 3, 2024

One thing though... the SDK doesn't set that. Does Sentry default to setting it to 1?

@jjbayer
Member

jjbayer commented Jan 4, 2024

the SDK doesn't set that. Does Sentry default to setting it to 1?

It's a server-side helper flag that only ever gets set in the code I linked above, IIUC. If the flag is not set, we default to the last exception:

# If the event data has been marked with a main_exception_id, then we should be able to
# find the exception with the matching metadata.exception_id and use that one.
# This can be the case for some exception groups.
# Otherwise, the default behavior is to use the last one in the list.
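As a rough JavaScript rendering of what that comment describes (the real code is Python in the Sentry backend, and `selectTitleException` is a made-up name), the selection would go something like this. Run against a trimmed copy of the example event, it picks "Message 1" when `main_exception_id` is 1 and the AggregateError when the flag is absent.

```javascript
// Hypothetical JavaScript rendering of the selection described in the comment above.
// The actual implementation is Python in the Sentry backend and may differ in detail.
function selectTitleException(eventData) {
  const values = eventData.exception.values;
  const mainId = eventData.main_exception_id;
  if (mainId !== undefined) {
    const match = values.find((v) => v.mechanism && v.mechanism.exception_id === mainId);
    if (match) {
      return match;
    }
  }
  // Default behaviour: the last exception in the list.
  return values[values.length - 1];
}

// Trimmed copy of the stored example event (stack traces omitted).
const storedEvent = {
  main_exception_id: 1,
  exception: {
    values: [
      { type: 'Error', value: 'Message 3', mechanism: { exception_id: 3, parent_id: 0 } },
      { type: 'Error', value: 'Message 2', mechanism: { exception_id: 2, parent_id: 0 } },
      { type: 'Error', value: 'Message 1', mechanism: { exception_id: 1, parent_id: 0 } },
      { type: 'AggregateError', value: 'wat', mechanism: { is_exception_group: true, exception_id: 0 } },
    ],
  },
};

// Same event without the server-side flag.
const { main_exception_id: _ignored, ...eventWithoutFlag } = storedEvent;

console.log(selectTitleException(storedEvent).value, selectTitleException(eventWithoutFlag).value);
```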

@malwilley
Member

Thanks everyone for the help debugging! I'll add this to our backlog and notify the people working on grouping improvements so they can prioritize as needed.

@lobsterkatie
Member

lobsterkatie commented Feb 29, 2024

This is related to, but not the same as, #59679 and, more generally, #64088. In those cases, we have chained errors rather than aggregate errors, and the problem is that we've been labeling linked errors as aggregate errors, and the main/top error has therefore been getting ignored. getsentry/sentry-javascript#10850 fixes that, but only just got merged and hasn't yet been released.

And @jjbayer, yeah, all of the above matches what I found here. It also matches what's in the RFC. So I think this is indeed a backend fix, rather than an SDK fix as the above was.

From your comment above:

This seems to happen because the main_exception_id is set, which should only happen if there's only one distinct exception in the group
There might be a bug in the logic that identifies the main exception.

I'm not sure it's a bug - it seems to match the RFC - but that approach makes the assumption that the root doesn't contain meaningful information, and here that's backfiring. Specifically, @kamilogorek (BTW, hey, Kamil! 🙂), in your example above, if your three child errors were separate events, they'd all fall into the same group (matching stacktraces), so for our purposes they count as only a single distinct exception, which in turn lets us ignore the root error. The only way we'd not ignore the root is if we couldn't collapse all the child exceptions into one, at which point it's sort of like, ugh, I guess we have to use the root, 'cause we can't decide between the kids... - in other words, we don't want to consider the root, but we will if we have no other choice. (The full logic, from which Joris quoted above, is here.)
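To spell that decision out, here is a very simplified JavaScript sketch. Everything in it is an assumption for illustration: `chooseMainException` is a made-up name, and `groupingKey` stands in for the real grouping logic (which is Python and considerably more involved than comparing one string per child).

```javascript
// Hypothetical sketch of the "ignore the root unless the kids disagree" rule.
// `groupingKey` is a stand-in for real grouping; here it is just a string per child.
function chooseMainException(root, children, groupingKey) {
  const distinctKeys = new Set(children.map(groupingKey));
  if (distinctKeys.size === 1) {
    // All children would land in the same group: title the issue by a child.
    return children[0];
  }
  // Children can't be collapsed into one: fall back to the root.
  return root;
}

const rootError = { type: 'AggregateError', value: 'wat' };

// Same function, file and line for every child, as in Kamil's example.
const matchingChildren = [
  { value: 'Message 1', stackKey: 'browserError@../../src/browser/index.ts:24' },
  { value: 'Message 2', stackKey: 'browserError@../../src/browser/index.ts:24' },
  { value: 'Message 3', stackKey: 'browserError@../../src/browser/index.ts:24' },
];

console.log(chooseMainException(rootError, matchingChildren, (c) => c.stackKey).value);
```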

Now, in some cases, that's a perfectly reasonable assumption. Run the following and you will indeed get a root error worth ignoring:

await Promise.any(
  [1, 2, 3].map((num) => {
    return Promise.reject(new Error(`Rejected promise #${num}`));
  })
);

which results in

AggregateError: All promises were rejected {
  [errors]: [
    Error: Rejected promise #1
      at index.js:5:31
      at Array.map (<anonymous>)
      at ...
    Error: Rejected promise #2
      at index.js:5:31
      at Array.map (<anonymous>)
      at ...
    Error: Rejected promise #3
      at index.js:5:31
      at Array.map (<anonymous>)
      at ...
  ]
}

The question is, how do we tell the difference between that case and the case where ignoring the root is the wrong choice? And how common is that second case? I'm not sure I know the answer to either question, especially a platform-independent answer.

Kamil, or anyone else, thoughts?

@kamilogorek
Contributor Author

in your example above, were your three child errors separate events, they'd all fall into the same group (matching stacktraces), which then makes them count as only a single distinct exception...

In this specific example yes, but imagine something like this:

async function deleteIntegration(integration) {
  await runTransaction(integration, [
    cleanupConnections,
    removeAuthorization,
    notifyUser,
  ])
}

const deleteIntegrationResults = await Promise.allSettled(
  integrations.map(async (integration) => deleteIntegration(integration))
)

const deleteIntegrationRejections = deleteIntegrationResults
  .filter(isRejected)
  .map((r) => r.reason)

if (deleteIntegrationRejections.length) {
  Sentry.captureException(new AggregateError(deleteIntegrationRejections, `Failed to delete integrations ${integration.id} for user ${user.id}`))
}

It can break on any of 3 operations inside the transaction, and those should not be grouped together.
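`isRejected` isn't defined in the snippet above; presumably it is a small predicate along these lines. A minimal sketch, with `collectRejections` and the `tasks` parameter being made-up names for illustration:

```javascript
// Assumed shape of the isRejected helper used above: keeps only the
// 'rejected' entries returned by Promise.allSettled.
const isRejected = (result) => result.status === 'rejected';

// Hypothetical wrapper: run every task and return the rejection reasons.
async function collectRejections(tasks) {
  const results = await Promise.allSettled(tasks.map((task) => task()));
  return results.filter(isRejected).map((r) => r.reason);
}

// Usage: one task succeeds, one fails.
collectRejections([
  async () => 'ok',
  async () => {
    throw new Error('cleanupConnections failed');
  },
]).then((reasons) => console.log(reasons.map((e) => e.message)));
```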
Tbh I don't know the answer myself either off the top of my head.

Hey 👋

@lobsterkatie
Member

Hmmm, yeah. That's a perfectly valid use case, and you're right that we don't handle it.

As a workaround for specific errors, I think it should work to use the project-level fingerprint rules to both fix the grouping and give it the right title. So, for your example above, it'd be

message:"Failed to delete integrations * for user *" -> {{ message }} title="{{ message }}"

(Or you could do ... -> {{ default }} title="{{ message }}" if you're happy with the grouping and only want to fix the title.)

That said, doing it that way could get to be a pain if you've got lots of different errors like this, plus most users have no idea it's even an option.

A (slightly) better way would be for our backend to recognize a new this_is_a_meaningful_aggregate_error_so_please_actually_use_it_when_grouping_and_titling_the_issue flag, which one could set as capture context directly in the captureException call. (We'd obviously think of a better name.) So you'd do

Sentry.captureException(
  new AggregateError(
    deleteIntegrationRejections,
    `Failed to delete integrations ${integration.id} for user ${user.id}`
  ),
  {
    mechanism: {
      this_is_a_meaningful_aggregate_error_so_please_actually_use_it_when_grouping_and_titling_the_issue: true,
    },
  }
);

Even better would be for the SDK to recognize a manual call to captureException with an instance of AggregateError having a non-empty message and set the flag for you. (There was talk of making the SDK able to distinguish between manual and internal captureException calls (I actually played around with it a bit shortly before I got transferred), but IDK if that ever happened.)
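The message half of that check could be as small as the sketch below. `isAggregateErrorWithMessage` is a made-up name, nothing like it exists in the SDK as far as this thread shows, and it deliberately does not try to tell manual captureException calls apart from internal ones.

```javascript
// Hypothetical predicate: is this an AggregateError carrying a non-empty message?
// It does not distinguish manual from internal captureException calls.
function isAggregateErrorWithMessage(error) {
  return (
    error instanceof AggregateError &&
    typeof error.message === 'string' &&
    error.message.trim() !== ''
  );
}

console.log(
  isAggregateErrorWithMessage(new AggregateError([new Error('child')], 'wat')),
  isAggregateErrorWithMessage(new AggregateError([new Error('child')])),
  isAggregateErrorWithMessage(new Error('wat'))
);
```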

Anyway, for now I think the fingerprint rules are your best bet.
