-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to reindex the database and remove events. #49
Conversation
This will allow the index to be searched by the sender or timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks great to me! 😁
- Do we want all the non-implemented features as well
Yes, I think at least this batch seems reasonable to add. It seems quite reasonable to want to support these extra query filters, even if we might not add it to Riot right away.
- Since re-indexing requires events to be de-serialized and
neon-serde
can't be used for this; do we want to make this feature optional since it requiresserde_json
.
I would think all clients would want to be able to support re-indexing, so I would say it's fine to be required until there's a problem... I assume it works fine with Riot?
src/database/static_methods.rs
Outdated
|
||
transaction.execute("DELETE from events WHERE event_id == ?1", &[&event_id])?; | ||
transaction.execute( | ||
"INSERT OR IGNORE INTO undeleted_events (event_id) VALUES (?1)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, does undeleted_events
mean "events to be deleted"? If so, "undeleted" is confusing to me as I would assumed it means "restored" or "brought back from the trash".
If my assumption is correct, maybe use pending_deletion_events
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is events to be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to pending_deletion_events
* // reindex the database | ||
* await recovery.reindex(); | ||
*/ | ||
class SeshatRecovery extends seshat.SeshatRecovery { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have feedback after seeing how you plan to handle the recovery path in Riot, but at an abstract level it seems okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit implements support for it in Riot: element-hq/element-web@921d734.
I didn't notice any delays at startup, the thing seems to be reasonably quick but do note that this was using a 3k events large database.
It does work fine with Riot, the only thing that worries me is a bit of code/dependency bloat. |
Co-Authored-By: J. Ryan Stinnett <jryans@gmail.com>
Co-Authored-By: J. Ryan Stinnett <jryans@gmail.com>
This is a bit of a big one and it essentially implements two features.
While developing support to remove events from the database/index it has come to my attention that documents can only be deleted by a field if the field is indexed.
Tantivy has no way to find a document by a field if the field isn't indexed (doh).
This in turn requires a schema change on our part since we really need to delete events by event id, which is unique and guarantees that we're only going to remove this one intended event.
Our current Tantivy schema is lacking some more definitions to support the following use cases:
While this PR doesn't implement all of the things that are needed to support those features it does make the required Tantivy schema changes and bumps the database version.
To allow breaking schema changes without the need to delete the whole database and re-download the complete room history we introduce a
RecoveryDatabase
, this database acts as a read only database that allows the index to be deleted while keeping the SQLite database in tact.The
RecoveryDatabase
can then proceed to create a new index and fetch events out of the database for them to be re-added to the index.As already mentioned, support to remove events from the database and index is included here as well. Removing an event requires a Tantivy commit, so they need to go through the flow control logic the same way as newly added events.
Some things to consider during review:
neon-serde
can't be used for this; do we want to make this feature optional since it requiresserde_json
.