Downloads from the arXiv are marked as HTML files instead of PDF #4913

Closed
1 task done
tobiasdiez opened this issue Apr 23, 2019 · 5 comments · Fixed by #4965 or #11797
Labels
bug, entry-editor

Comments

@tobiasdiez
Member

JabRef version: latest on Windows

Steps to reproduce the behavior:

  1. Import an entry from the arXiv, say 1207.0408v1 (using Library > New Entry > ArXiv)
  2. Go to File tab, right-click link, and select Download
  3. The file is downloaded successfully, but has the file extension html instead of pdf. Renaming the file results in a readable PDF, though.
tobiasdiez added the bug and entry-editor labels Apr 23, 2019
jabesse added a commit to benjagooder/jabref that referenced this issue Apr 25, 2019
jabesse added a commit to benjagooder/jabref that referenced this issue Apr 25, 2019
…nstead of PDF

may have found where to edit
jabesse added a commit to benjagooder/jabref that referenced this issue Apr 25, 2019
…xed.

After a file was added to a library, attempting to edit the file link
in the General tab would throw an exception. It was fixed by only
allowing an edit if a file was downloaded.
jabesse added a commit to benjagooder/jabref that referenced this issue Apr 25, 2019
possibly located source of error
This method could be recognizing the ArXiv PDFs as URLs.
Siedlerchr added a commit that referenced this issue May 11, 2019
If we already have a filetype we use that instead of relying on the autodetection


Fixes #4913
Siedlerchr added a commit that referenced this issue May 18, 2019
* Fix downloading pdf produces html as extension

If we already have a filetype we use that instead of relying on the autodetection


Fixes #4913

* add relativize if not a URL

* Create test for download pdf document
heavy mocking and refactoring of ExternalFileType

TODO : Cleanup

* refactor and simplify test
fix checkstyle
fail test on exception
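
The idea stated in the commit message ("If we already have a filetype we use that instead of relying on the autodetection") can be sketched roughly as follows. This is only an illustration under assumptions, not JabRef's actual code: the class `FileTypeChooser`, both method names, and the small MIME table are hypothetical.

```java
import java.util.Locale;
import java.util.Optional;

final class FileTypeChooser {

    /** Hypothetical autodetection: maps a Content-Type header value to an extension. */
    static Optional<String> extensionFromMimeType(String mimeType) {
        if (mimeType == null) {
            return Optional.empty();
        }
        // Drop parameters such as "; charset=utf-8" before comparing.
        String base = mimeType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        switch (base) {
            case "application/pdf":
                return Optional.of("pdf");
            case "text/html":
                return Optional.of("html");
            default:
                return Optional.empty();
        }
    }

    /** Prefers the file type already stored with the link; autodetection is only a fallback. */
    static String chooseExtension(String knownFileType, String detectedMimeType) {
        if (knownFileType != null && !knownFileType.isBlank()) {
            return knownFileType.trim().toLowerCase(Locale.ROOT);
        }
        return extensionFromMimeType(detectedMimeType).orElse("");
    }
}
```

With this ordering, a link already marked as PDF keeps the pdf extension even if the server answers with an HTML content type; detection is consulted only when no type is known.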
@Siedlerchr
Member

Thank you for reporting this issue. We think that this is already fixed in our development version, and consequently the change will be included in the next release.

We would like to ask you to use a development build from https://builds.jabref.org/master and report back if it works for you. Please remember to make a backup of your library before trying out this version.

@wenjie-yin

Hi, I am still experiencing the same issue using JabRef-5.16-portable_windows from https://builds.jabref.org/master, following manual installation. Could you advise on what the problem might be?

koppor reopened this Sep 18, 2024
@Siedlerchr
Member

@wenjie-yin I cannot reproduce this. Can you tell me how you import the arXiv entry? E.g., via New Entry, via search, or directly in the entry editor by clicking on download somewhere?

@wenjie-yin

> @wenjie-yin I cannot reproduce this. Can you tell me how you import the arXiv entry? E.g., via New Entry, via search, or directly in the entry editor by clicking on download somewhere?

It was by clicking the Firefox plug-in on an arxiv.org/abs/xxxx.xxxxx page.

@Siedlerchr
Member

@wenjie-yin As a workaround, you can paste the arXiv URL directly into the main table (this works with a DOI as well), and the entry will be imported correctly.

Siedlerchr added a commit that referenced this issue Sep 19, 2024
github-merge-queue bot pushed a commit that referenced this issue Sep 20, 2024
* fix arxiv html download redirect

Fixes #4913

* Fix catch indents

* Add redirect test case

* Use Optionals

* Class-global Unirest config

* Manual handling of redirect

* Fix conditions

* Simplify code

* Use Wiremock instead of real endpoint

* Improve test names

* Fix condition

* fix condition

* wiremock with head request as well

* use path instead of file

* try with static unirest config

* max retries

* fix head

* no body for head

---------

Co-authored-by: Subhramit Basu Bhowmick <subhramit.bb@live.in>
Co-authored-by: Oliver Kopp <kopp.dev@gmail.com>
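
The commit bullets above mention manual redirect handling, HEAD requests, and a retry limit. A minimal sketch of that general technique, under assumptions, might look like the following. It is not the merged code (which, per the bullets, is built on Unirest and tested with WireMock): the `HeadResponse` type, the `resolveContentType` method, the limit of 5 hops, and the injected `head` function standing in for a real HTTP client are all hypothetical.

```java
import java.net.URI;
import java.util.Optional;
import java.util.function.Function;

final class RedirectFollower {

    /** Hypothetical stand-in for the interesting parts of a HEAD response. */
    static final class HeadResponse {
        final int status;
        final String location;    // Location header, or null
        final String contentType; // Content-Type header, or null

        HeadResponse(int status, String location, String contentType) {
            this.status = status;
            this.location = location;
            this.contentType = contentType;
        }
    }

    static final int MAX_REDIRECTS = 5; // assumed limit, chosen for illustration

    static boolean isRedirect(HeadResponse response) {
        int s = response.status;
        boolean redirectStatus = s == 301 || s == 302 || s == 303 || s == 307 || s == 308;
        return redirectStatus && response.location != null;
    }

    /**
     * Follows redirects by hand and returns the Content-Type of the final target,
     * or empty if the chain is longer than MAX_REDIRECTS or no type is reported.
     */
    static Optional<String> resolveContentType(String url, Function<String, HeadResponse> head) {
        String current = url;
        for (int hops = 0; hops <= MAX_REDIRECTS; hops++) {
            HeadResponse response = head.apply(current);
            if (!isRedirect(response)) {
                return Optional.ofNullable(response.contentType);
            }
            // Location may be relative, so resolve it against the current URL.
            current = URI.create(current).resolve(response.location).toString();
        }
        return Optional.empty();
    }
}
```

The point of following the chain before deciding on a file type is that the first response of a redirecting link may describe an intermediate page rather than the PDF that is eventually downloaded.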
4 participants