Selectively ignore certain failures? #1

Open
mahmoud opened this issue Apr 13, 2020 · 5 comments

Comments

@mahmoud

mahmoud commented Apr 13, 2020

Hey Cam,

Been using the link checker for a while on the APA, and I was curious: would it be possible to have a configuration file listing expected failures? A few links are behind Cloudflare (which blocks the checker's traffic), and there are some other expected failures too.

Thanks in advance!

@cam-barts
Owner

@mahmoud Hey!

I am glad you have been using the link checker. I have some ideas that I am planning on implementing, one of them being a configuration that lists links you don't want to check. I was also going to try to build in an option where it generates a JSON report instead of counting as a failed action. I have mocked out how I'd like the report to look, but unfortunately I've been pretty busy and haven't been able to put any time into this recently. Good news is, I'll be on vacation next week and have already set aside some time to give this some TLC. Since you're the only known consumer so far, I was going to try to make changes as transparent to you as possible, and I'd love your feedback!

@mahmoud
Author

mahmoud commented Apr 13, 2020

Sounds good, will look forward to turning that ❌ to a ✔️ next week :)

@mahmoud
Author

mahmoud commented May 5, 2020

Hi @cam-barts!

So I'm taking the new action out for a spin, and it's good progress.

Here's what I'm looking at:

{
    "./README.md": {
        "http://die-offenbachs.homelinux.org:48888/hg/eric": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101532"
        },
        "http://sunflower-fm.org/": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101555"
        },
        "http://supervisord.org/": {
            "code": 404,
            "reason": "",
            "time": "2020-05-04T09:17:42.101570"
        },
        "https://coala.io/": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101525"
        },
        "https://docs.securedrop.org/": {
            "code": 521,
            "reason": "",
            "time": "2020-05-04T09:17:42.101578"
        },
        "https://gitweb.gentoo.org/proj/portage.git": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101585"
        },
        "https://pypi.org/project/beancount": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101563"
        },
        "https://pypi.org/project/howdoi": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101510"
        },
        "https://pypi.org/project/magic-wormhole": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101479"
        },
        "https://pypi.org/project/prosopopee": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101517"
        },
        "https://pypi.org/project/taguette": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101548"
        },
        "https://pypi.org/project/visidata": {
            "code": "Error (Likely 404)",
            "reason": "",
            "time": "2020-05-04T09:17:42.101501"
        }
    }
}

As you can see I reformatted the JSON output (probably adding an indent=2 to the json.dumps() is a good idea for readability).
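
For reference, a minimal sketch of the `indent=2` suggestion using the standard `json` module. The `report` dict here is a hypothetical stand-in shaped like the output above (one entry kept for brevity), not the action's actual data structure.

```python
import json

# Hypothetical report, shaped like the output above (one entry kept for brevity).
report = {
    "./README.md": {
        "https://docs.securedrop.org/": {
            "code": 521,
            "reason": "",
            "time": "2020-05-04T09:17:42.101578",
        }
    }
}

# indent=2 pretty-prints nested objects; sort_keys keeps the URL order stable between runs.
pretty = json.dumps(report, indent=2, sort_keys=True)
print(pretty)
```

The pretty-printed string still parses back to the same data, so anything consuming the report should be unaffected.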

What I see is:

  1. Some errors I definitely want to ignore, like the 521 error code. Probably not high priority to code around that; 521 is enough to tell me it's still there :)
  2. A lot of PyPI errors, which makes me think we may be hitting them too hard (there are a lot of PyPI links in the README). The action definitely seems faster (~12 minutes), but that comes with consequences.
  3. A lot of "likely 404s" that actually load when I visit them.
  4. A genuine missing link! Which I then fixed. Success! 🎉
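
If the PyPI failures in item 2 really do come from too many simultaneous requests (an assumption, nothing above confirms it), one possible mitigation is to cap concurrency per hostname. A rough sketch with `asyncio`; `check_all`, `check_one`, and `MAX_PER_HOST` are hypothetical names, and `check_one` stands in for whatever coroutine actually performs the request.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit

MAX_PER_HOST = 2  # assumed cap; would need tuning per site


async def check_all(urls, check_one, max_per_host=MAX_PER_HOST):
    """Run check_one(url) for every URL, with at most max_per_host in flight per hostname."""
    # One semaphore per hostname, created lazily inside the running event loop.
    limits = defaultdict(lambda: asyncio.Semaphore(max_per_host))

    async def guarded(url):
        host = urlsplit(url).hostname or ""
        async with limits[host]:  # wait here while the host is at its cap
            return url, await check_one(url)

    pairs = await asyncio.gather(*(guarded(u) for u in urls))
    return dict(pairs)  # duplicate URLs collapse into one entry
```

Links on different hosts still run in parallel, so only the heavily repeated hosts would slow down.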

Anyways, I'm mostly curious what we should do about items 2 and 3 above. Maybe it's worth including the exception repr or something?

@cam-barts
Owner

Hey @mahmoud,
I'll definitely add the indentation to the JSON dump. That's a great idea that I simply didn't think of, because I was using VS Code, which does that for me haha.
I can take a list of ignored errors from another env variable, which seems to be a standard way to pass parameters into actions (other than secrets).
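
A minimal sketch of how that could look, assuming a comma-separated variable. The name `IGNORED_CODES` and both helper functions are made up for illustration and are not part of the action.

```python
import os


def load_ignored_codes(var_name="IGNORED_CODES", environ=os.environ):
    """Parse a comma-separated env var such as "521,403" into a set of ints."""
    raw = environ.get(var_name, "")
    # int() tolerates surrounding whitespace; a non-numeric entry raises ValueError.
    return {int(part) for part in raw.split(",") if part.strip()}


def drop_ignored(report, ignored_codes):
    """Return a copy of the report without entries whose code is in ignored_codes."""
    return {
        path: {
            url: info
            for url, info in links.items()
            if info["code"] not in ignored_codes
        }
        for path, links in report.items()
    }
```

With `IGNORED_CODES=521` set in the workflow, the securedrop entry above would be filtered out, while string codes such as "Error (Likely 404)" would stay in the report.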
"Likely 404" is a catch-all error that pops up if the request library (aiohttp in this case) for whatever reason doesn't actually get a response code, and it doesn't really make a whole lot of sense now that we are outputting errors into JSON like this. I'll go back in, be more granular with the error handling, and pass the actual error messages into the JSON output.
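
Roughly what more granular handling might look like, as a sketch only. `check_link` and `fetch_status` are hypothetical names, `fetch_status(url)` stands in for the real HTTP call, and only standard-library exception classes are used here; the real client's own exception classes (aiohttp's, for example) would need their own branches.

```python
import asyncio
from datetime import datetime


async def check_link(url, fetch_status):
    """Return a report entry for url; fetch_status(url) is a stand-in for the real HTTP call."""
    entry = {"code": None, "reason": "", "time": datetime.now().isoformat()}
    try:
        entry["code"] = await fetch_status(url)
    except asyncio.TimeoutError as exc:
        # Checked first: on newer Pythons TimeoutError is also an OSError.
        entry["code"] = "Timeout"
        entry["reason"] = repr(exc)
    except OSError as exc:  # DNS failures, refused connections, and similar
        entry["code"] = "Connection error"
        entry["reason"] = repr(exc)
    except Exception as exc:  # anything else: keep the details instead of guessing "404"
        entry["code"] = type(exc).__name__
        entry["reason"] = repr(exc)
    return entry
```

The `reason` field, which is empty in the report above, then carries the exception repr that was asked for.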
The "likely 404" bucket probably also includes timeout errors. I have a short timeout on the requests to help speed up the action. I can try an incremental timeout, at the cost of performance.
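
One way an incremental timeout could be sketched: retry the same link with progressively longer limits, so fast links stay fast and only slow ones pay the extra cost. The function name, the `fetch_status` stand-in, and the timeout values are all assumptions.

```python
import asyncio


async def fetch_with_growing_timeout(url, fetch_status, timeouts=(2, 5, 15)):
    """Try fetch_status(url) with each timeout in turn; re-raise the last timeout if all fail."""
    if not timeouts:
        raise ValueError("timeouts must contain at least one value")
    last_error = None
    for timeout in timeouts:
        try:
            # wait_for cancels the attempt and raises TimeoutError once the limit passes.
            return await asyncio.wait_for(fetch_status(url), timeout=timeout)
        except asyncio.TimeoutError as exc:
            last_error = exc  # too slow: retry with the next, longer timeout
    raise last_error
```

In the worst case a dead link costs the sum of all timeouts (22 seconds with the values above), which is the performance trade-off mentioned.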
I am also likely to add some more verbose output to the action. If for whatever reason a user is watching it run, going 12 minutes with no output can be nerve-racking.
I'll hopefully be able to get some of this done today, but I am back at work now after my vacation, so I might have to put this on my schedule for later in the week.
As always, I appreciate the feedback!

@mahmoud
Author

mahmoud commented May 6, 2020

All that sounds great! Just one bit of info, re: the env var. It might be useful, but not required at this point from my perspective. It's just one URL that has a 521 code. Looking forward to retries/looser timeouts, and more output of all sorts. Thanks!

This was referenced Oct 24, 2020
kapremom added a commit that referenced this issue Oct 24, 2020