Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deactivate fulltextsearch as default #9491

Open
teertinker opened this issue Dec 21, 2022 · 13 comments · Fixed by #9527
Open

Deactivate fulltextsearch as default #9491

teertinker opened this issue Dec 21, 2022 · 13 comments · Fixed by #9527

Comments

@teertinker
Copy link
Contributor

teertinker commented Dec 21, 2022

Currently, a fresh installed JabRef has fulltextsearch option enabled.
Suggestion: Set the default option to deactivated

Explanation:

  1. The fulltextsearch option does not seem to be documented in the help files (https://docs.jabref.org/finding-sorting-and-cleaning-entries/search)
  2. The fulltextsearch option indexes and searches files beyond the core program and the loaded libraries. A standard setting that is confined to the core of the program is way more intuitive.
  3. Searching a large library takes some time and makes the program unresponsive for a short period. Users who are not aware of the feature might think the program is just slow. This happend to me as a long time user of JabRef. I am still stuck at 4.3. due to performance issues in previous versions. After trying the most recent 5.8 I thought there are still perfomance issues, because searching took way to long for my database. I simply was not aware, that the attached files are search as well.
@koppor
Copy link
Member

koppor commented Jan 2, 2023

  • Disable file search as default
    grafik
  • Button state should be persisted across JabRef sessions (preferences!)
  • Deactivate full text index as default
    grafik
  • On file search button click
    • If full text index is deactivated:
      • Ask user if they want to activate the full text index?
      • If yes: Activate and start indexing
      • If not: button is deactivated
    • Note that we do not intend to support search based on old index

This refs koppor#96

@ThiloteE
Copy link
Member

ThiloteE commented Jan 2, 2023

Metaissue: #8906

@koppor
Copy link
Member

koppor commented Jan 2, 2023

We cannot implement the full solution ("On file search button click"...), because #8963 touches nearly all code places which we would touch here, too. Therefore, as first step, we disable the search in files and building the full text index (PR #9516).

We accept (for now), that "drive-by" users wanting to have a full-fledged, pre-configured bibliographic library management tool need to configure JabRef for their use case.

@tobiasdiez
Copy link
Member

I'm against this. Fulltext search is a very convenient feature for most users. If there are performance issues, then these should be fixed instead of being hidden under the carpet.

@ThiloteE
Copy link
Member

ThiloteE commented Jan 3, 2023

This is a controversial change, because there does not seem to be a "right" solution. Gains in terms of performance for users with large databases will be a loss in terms of convenience for users with small databases. Fulltext search as it is currently implemented does not work without creating an index.

  1. Since we are talking about a fresh installation of JabRef, we can assume that:

    JabRef does not automatically open a large library upon fresh installation. Users choose to open a large library.
    --> hence, experienced users of JabRef do have the opportunity to disable full-text indexing in the preferences BEFORE they open a large library.

  2. There is a difference between indexing and file search.

    • indexing:

      1. full indexing is only done once for each library AND when migrating to a new version of Lucene
      2. adding new entries and attaching files to the library automatically triggers them being added to the index.
      3. Subset of users that are currently negatively affected by full-text indexing are
        • Long time users of JabRef with large libraries that were taken by surprise after upgrading to JabRef 5.4
        • New users with large libraries that will want to transfer to or at least try JabRef and immediately open a large library, then are taken by surprise.
        • All users of JabRef with large libraries that update to a newer version of JabRef with an updated version of Lucene. Nowadays, almost every new version of JabRef comes with a new updated version of Lucene, hence this is a periodic, cyclical problem.
        • Users that attach a lot of files to their library at once/simultaneously.
      4. Subset of users that are NOT negatively affected:
        • All users of JabRef that are in the process of slowly attaining a large library and that are using a single version of JabRef for a long time. Once the initial library has been indexed completely, every new entry and new attached file will be added to the index separately, so it is not necessary to trigger full indexing again.
    • file search:

      • every search takes time.
      • if full-text search is activated by default (and users fail to deactivate it), it will lead to permanent performance loss, whenever the search bar is used

In my opinion:

  • disable "file search" by default.
    • because that is a win win and whoever wants to search through attached files can very easily activate it in the search bar. No need to go to hidden preferences to activate it.
  • do not disable indexing for now
    • disabling indexing is a choice between increased performanc for non-users of file search and convenience of users of file search. Disabling indexing will lead to other new inconveniences, because enabling it is hidden in the preferences and users will have to search for it.

    • It can be expected of experienced users with large libraries to look out for performance issues: they will go the extra step and check their preferences to find potential bottlenecks.

    • Conversely, going through the preferences to activate a feature they thought would work out of the box, but, which does not and they are not even notified about it, cannot be expected, expecially when users are new and have small libraries.

    • Disabling indexing should only be done, once the following situation can be prevented/resolved:

      1. indexing is disabled
      2. users clicks file search button in the search bar
      3. File search will not work.
      4. Users are not notified that file search does not work.
      5. Users will not know that they first have to enable indexing for file search to work.
      6. Users will report that file search is broken

      Ideas about how to resolve:

      • implement a popup when clicking on file search button, notifying users about the need to enable indexing and its negative effects on performance, when used on huge libraries with many attached files.
      • automatically enable indexing, once users click on the file search button.
    • avoid potential interferences with the switch to Lucene backend Lucene search backend #8963. Re-evaluate once this pull-request has been merged.

@tobiasdiez
Copy link
Member

Can we not run the indexing in a low-impact mode, ie only using max say 30% of the CPU in a low prio thread?

Do we have data on how much longer the fulltext search is taking? I thought once lucene built the index, querying is really fast and cheap.

@koppor
Copy link
Member

koppor commented Jan 3, 2023

DevCall decision: We keep the full text indexing as default "true". We hope that our friends of JabRef know the tweaks they need to do to get JabRef running smoothly.

Maybe #8701 could be a solution: Store in the library whether the full text indexer should run for this particular library.

@wujastyk
Copy link

wujastyk commented Jan 4, 2023

JabRef 6.0--2023-01-04--06771c8
Linux 5.15.0-56-generic amd64
Java 19.0.1
JavaFX 19+11

I'm not a developer, merely a user. I found JabRef now makes my system grind almost to a halt, and I've had to kill it several times. Almost by chance, I found this discussion about switching off full text indexing. My life has now returned to normal. But I must say in the strongest terms that I think full text indexing should be off by default. I agree with the points made by @ThiloteE in the issuecomment above.

@teertinker
Copy link
Contributor Author

Dear developers thank you, for the instant and elaborate discussion. I do understand the concern about disabling indexing and fulltext search. May be the problem does not affect many users. In my case the problem occured while trying to switch to a new version (large databases without fulltext-index got loaded automatically). While I was aware, that an index is created at first startup (because of the circle) - I was not aware, that each search runs through all my attached pdfs. Now, since I know it, I get the logic. But it probably kept me from updating to the most recent version from more than three years. Every single release I tried and considered the new version to slow in comparison to 4.x. I really like the suggestion from: #9491 (comment)

@koppor
Copy link
Member

koppor commented Jan 6, 2023

My life has now returned to normal.

Yeah!

But I must say in the strongest terms that I think full text indexing should be off by default. I agree with the points made by @ThiloteE in the issuecomment above.

However, these two statements contradict. You say: turn it off - Thilo says: keep it on and ask power users to turn it off manually.

@ThiloteE
Copy link
Member

ThiloteE commented Jan 6, 2023

Thank you for the feedback everybody. Yes, Koppor is right. I would like to emphasize though: I only strongly advocate for keeping it enabled FOR NOW. I have listed multiple reasons for it in my comment. I am open to disabling indexing, once there is a solution to following problem:

1. indexing is disabled
2. users clicks file search button in the search bar
3. File search will not work.
4. Users are not notified that file search does not work.
5. Users will not know that they first have to enable indexing for file search to work.
6. Users will report that file search is broken

As I have seen in #9516, there already might be a fix for it (i have not tested this pr yet), but i think that one will have to wait until #8963 has been merged to avoid code conflicts, so let's just give it some time for now.

@AEgit
Copy link

AEgit commented Feb 23, 2023

I was about to comment, that I would also be in favour of changing the default to switching off full text indexing. But when looking through the JabRef documentation I found the following (https://docs.jabref.org/faq#q-i-have-a-huge-library.-what-can-i-do-to-mitigate-performance-issues):

Q: I have a huge library. What can I do to mitigate performance issues?
A: Check your configuration. Disable some or all of following preferences:
disable fulltext index (Options → Preferences → Linked files → Fulltext Index → ...)
disable time stamps (Options → Preferences → General → Time Stamp → ...)
disable field formatters (Library → Library Properties → Saving → Save actions → ...)
disable autosave (Options → Preferences → File → Saving → ...)
disable count of items in group (Options → Preferences → Groups → ...)

Any preference that has the potential to affect all your entries at once is worth inspecting.

This is perfect and really helps users with large databases. Maybe it could be made more prominent, but on the other hand, it is exactly where you would expect to find this kind of suggestions.

@koppor
Copy link
Member

koppor commented Jan 29, 2024

Current way: Implement the question on the button (#9491 (comment)). This will be a dialog asking the user with some nice text.

Dialog

Active fulltext search. Note that this requires preparation time.

[Active fulltext search] [Cancel]

@Siedlerchr Siedlerchr removed this from the 6.0 milestone Sep 18, 2024
@Siedlerchr Siedlerchr added this to the 6.0-alpha milestone Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment