Use requests_html to implement ComposerPopularityFeature #1018

Closed


jacobtylerwalls
Member

Fixes #907
This had stopped working because Google's response to the static User-Agent we were sending lacked result counts.

I was looking into this package for another reason today and realized it would solve this problem. It rotates user agents. It's owned by PSF. If you don't want to add it to requirements.txt then we can fiddle with the CI script to still have the test run on CI, or we can just deprecate the feature.
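A minimal sketch of the idea, not the actual patch: fetch the search page through `requests_html` and pull a result count out of the page text. The names `parse_result_count` and `fetch_result_count` are hypothetical, and the "About N results" wording is an assumption about Google's markup, which can change or be omitted entirely (as happened with the static User-Agent).

```python
import re
from typing import Optional

# Assumed wording of the result-count line, e.g. "About 1,230,000 results".
# Google's actual markup may differ; this pattern is illustrative only.
_RESULT_COUNT = re.compile(r'(?:About\s+)?(\d[\d,.]*)\s+results?', re.IGNORECASE)


def parse_result_count(page_text: str) -> Optional[int]:
    """Return the result count found in page_text, or None if absent."""
    match = _RESULT_COUNT.search(page_text)
    if match is None:
        return None
    # Drop thousands separators before converting to int.
    digits = re.sub(r'\D', '', match.group(1))
    return int(digits) if digits else None


def fetch_result_count(query: str) -> Optional[int]:
    """Hypothetical helper: run a search and parse the count from the page."""
    # Third-party dependency, imported lazily so the parser works without it.
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get('https://www.google.com/search', params={'q': query})
    return parse_result_count(response.html.text)
```

Returning `None` rather than 0 when no count is present would let a caller tell "no count in the response" apart from "zero results".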

@jacobtylerwalls jacobtylerwalls changed the title Use html_response to implement ComposerPopularityFeature Use requests_html to implement ComposerPopularityFeature May 13, 2021
@coveralls

coveralls commented May 13, 2021


Coverage increased (+0.02%) to 92.286% when pulling d84e6de on jacobtylerwalls:popularity into f4c1f1e on cuthbertLab:master.

@jacobtylerwalls
Member Author

#846 imports requests, so we should decide in tandem whether to use both packages. I tried this patch with requests instead of requests_html: no dice. I got back the same response from Google as with the status quo. Maybe it's a security-through-obscurity thing, where Google is more inclined to filter and doctor requests coming from requests than from requests_html.

@mscuthbert
Member

I'm not so thrilled to put a project with only 9 GitHub stars into the requirements.txt for music21. It means that a project with thousands of users is dependent on one with only a few dozen to implement proper security features on something extremely...

Never mind -- a Google search put me on a fork of the project; the main repository has 11k stars. But its sub-requirements are still huge:

# What packages are required for this module to be executed?
REQUIRED = [
    'requests', 'pyquery', 'fake-useragent', 'parse', 'beautifulsoup4', 'w3lib', 'pyppeteer>=0.0.14'
]

BS4 is a big file in itself. I think better just to remove the Feature.

@jacobtylerwalls jacobtylerwalls deleted the popularity branch May 13, 2021 19:29
Development

Successfully merging this pull request may close these issues.

Composer popularity feature always returns 0
3 participants