Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable duplicate/unused diagnostic code detection #19624

Closed
ghost opened this issue Dec 7, 2014 · 11 comments
Closed

Re-enable duplicate/unused diagnostic code detection #19624

ghost opened this issue Dec 7, 2014 · 11 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints

Comments

@ghost
Copy link

ghost commented Dec 7, 2014

It was disabled in #19362. It'd be good to have it back one way or another.

@ghost ghost added the A-diagnostics Area: Messages for errors, warnings, and lints label Dec 7, 2014
@ghost
Copy link
Author

ghost commented Dec 10, 2014

My preferred approach would be to have each crate maintain its own set of diagnostics. All crates would continue to use the same universal naming scheme (EXXXX) and warn on unused codes. The main crate (rustc) would at compile-time cross-check all diagnostic sets from child crates for duplicate codes.

@ghost
Copy link
Author

ghost commented Dec 10, 2014

The difficulty with the above approach is when adding a new diagnostic, you'll have to inspect all sets for the lowest available code.

An alternative is to have a separate crate for the diagnostics that all other crates depend on. All crates would be able to do the "is this code registered?" sanity check but only rustc would warn on unused and duplicate codes.

@nikomatsakis
Copy link
Contributor

Some thoughts from IRC:

  1. One easy option is to use the tidy script to do a check across all crates. This would require making the E123 syntax something more greppable as part of the span_err macro (e.g., #E123#).
  2. Another option is to have the crates have individual tables then get reconciled at runtime (or compilation time). This requires more work and I fear that we will add a crate and then forget to include that crate in the list of crates whose diagnostic tables must be reconciled.
  3. I don't want to namespace the E codes by crate because I want the freedom to change crates as we like.

@nikomatsakis
Copy link
Contributor

I was thinking more. I think that we should introduce an abstraction layer between the identifier that is present in the code and the error code that the user sees. I actually think this has the potential to solve all of these problems and then some more.

Here is how I envision it working:

In the main crate (probably driver), we compile a list of all the error codes. The list looks like this:

error_codes! {
...
E0123, ERR_MISMATCHED_TYPES, "some detailed comments";
...
}

then in the code instead of writing span_err!(...E0123...) we write span_err!(...ERR_MISMATCHED_TYPES...). By convention, whatever naming scheme we use for these codes (e.g., ERR_FOO) should be distinctive enough that we can easily find it using a simple regular expression. We then write a python script that scrapes these codes out of the librustc_* directories and checks them against the master list in driver. At runtime, the driver can install a master table in TLS (much as it does today) or the session or whatever, and we can translate from ERR_MISMATCHED_TYPES to E0123.

Advantages of this scheme:

  • Python script can easily check conformance, no need to write a complicated cross-crate plugin.
  • Merge conflicts are easily resolved: right now it's a pain because you have to update the diagnostics.rs file (where the merge conflict occurs) but also all uses of the code.
  • You can report the same conceptual error from multiple spots in the code without contortions or warning (right now you get a warning about using the same code E123 twice -- but if codes were more descriptive, that wouldn't be a bad thing to do).

Now, we could arguably go one step further and just NOT USE NUMERIC CODES at all. It's not clear what advantage numeric codes offer over some distinct keyword like ERR_MISMATCHED_TYPES. In that case, we could remove all the cross-crate checking completely since we wouldn't expect overlap between identifiers, and we can just check crates locally, though we still need some way to compile a master list, so we may still want to use roughly the scheme I described here. @brson you may have an opinion about numeric vs descriptive codes.

One thing I might like is that we do offer a mode for using the descriptive names rather than the numeric code in the output (perhaps a -Z flag). The reason for this is that I would like all the tests in src/test/compile-fail to eventually use //~ ERROR ERR_MISMATCHED_TYPES rather than the actual string. This would make them much less fragile to refactoring error messages. We could use (and I have been using whenever possible) //~ ERROR E0123 but the problem is that this is prone to merge conflicts when numbers clash.

@brson
Copy link
Contributor

brson commented Jan 16, 2015

I'm working on a quick patch that fixes registration according to the existing design.

I agree it would probably be best to keep the metadata all in one crate. The biggest complication I see to that design is that it will be harder to statically determine whether error codes are being duplicated across crates. If we had to we could have a dynamic check when rustc starts up though.

As to whether it would be best to get rid of numbers, the downside there is that arbitrary-length strings take up a lot of room in error messages.

@brson
Copy link
Contributor

brson commented Jan 16, 2015

I guess english-language error codes could be a potential i18n issue as well, though I'm not sure if compiler errors are ever internationalized.

@brson
Copy link
Contributor

brson commented Jan 16, 2015

If we used a tidy script to dig out all the error codes (if they had a more distinctive syntax) then we could get rid of the giant registration list. One of the purposes of the big gigantic list of error codes was to ensure that none ever get reused, so if we got rid of the list then it would be more tempting to go back and take one of the retired codes. I'm thinking there may be room in the design for a list of retired codes that perhaps the tidy script could check as well.

@brson
Copy link
Contributor

brson commented Jan 16, 2015

Here's my revised proposal, which is substantially similar to niko's

  • Get rid of the __diagnostic_used and __register_diagnostic builtins that try to enforce sanity at compile time, as well as __build_diagnostics_array. All this logic will be handled through either a tidy script or normal macros.
  • Change the current invocation syntax to make the error codes more distinctive, like %E0000, which appears nowhere in the existing source.
  • Add a single file in the driver that includes two macro invocations:
    • error_codes! { } - basic metadata per Niko. There would be a comment above it saying to never remove an existing code.
    • extended_errors! { } - optional detailed information
  • Add a tidy script that ensures that each of the %ENNNN strings is unique, that they all exist in 'error_codes'.
  • Update the span_err etc macros to accept either the numeric codes or the descriptive codes.

brson added a commit to brson/rust that referenced this issue Jan 17, 2015
brson added a commit to brson/rust that referenced this issue Jan 20, 2015
alexcrichton added a commit to alexcrichton/rust that referenced this issue Jan 21, 2015
This does the bare minimum to make registration of error codes work again. After this patch, every call to `span_err!` with an error code gets that error code validated against a list in that crate and a new tidy script `errorck.py` validates that no error codes are duplicated globally.

There are further improvements to be made yet, detailed in rust-lang#19624.

r? @nikomatsakis
@nikomatsakis
Copy link
Contributor

fwiw I like the numbers in terms of presenting to end-users, I just would like to make dealing with merge conflicts easier when rebasing

@nikomatsakis
Copy link
Contributor

idea that was proposed: get a random unused number?

@steveklabnik
Copy link
Member

Triage: @jonathandturner tells me this is back on now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints
Projects
None yet
Development

No branches or pull requests

3 participants