Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Malformed DB using fts5 #31

Closed
theboolean opened this issue Oct 10, 2023 · 26 comments · Fixed by #36
Closed

Malformed DB using fts5 #31

theboolean opened this issue Oct 10, 2023 · 26 comments · Fixed by #36

Comments

@theboolean
Copy link

Hi!
I think I found a blocking bug for @vlcn.io/crsqlite-wasm when using fts5.

Basically after some INSERT I receive this error and the db becomes unusable:

Error: database disk image is malformed
    at check (sqlite-api.js:866:1)
    at Object.step (sqlite-api.js:699:1)
    at async TX.statements (TX.ts:152:1)

I saw this can be mitigated by wrapping the INSERT into a transaction, but still if you perform lot of transactions this happens again.

Minimal code to reproduce:

    const sqlite = await initWasm()
    const db = await sqlite.open('sqlite_test.db')
    await db.exec('CREATE VIRTUAL TABLE IF NOT EXISTS test_fts USING fts5(fieldA, fieldB, fieldC, fieldD, fieldE, id UNINDEXED, prefix=2, detail=none);')

    for (let i = 0; i < 1000; i++) {
      await db.exec(`INSERT INTO test_fts (fieldA, fieldB, fieldC, fieldD, fieldE, id) VALUES ('test', 'test', 'test', 'test', 'test', '${i}');`)
        .catch(err => {
          console.error(err, i)
        })
    }

to me it always fail during iteration 877.

I also created a basic create-react-app project that demonstrates this, code lives here.

@theboolean
Copy link
Author

Forgot to say: tested in Chrome and Firefox on Mac OS

@tantaman
Copy link
Contributor

Thanks for the report. It may be a problematic interaction with the indexeddb vfs. I should release a variant of cr-sqlite that lets the user choose their virtual filesystem so we can track these things down.

@theboolean
Copy link
Author

FYI I tried to set the DB in memory and the error didn't occur, so you're right: it must be something related to indexeddb. Did you manage to reproduce the bug on your side?

@AlexErrant
Copy link
Contributor

I found a blocking bug

In the interest of getting you unblocked, have you tried inserting multiple values at a time, e.g.

INSERT INTO test_fts (
  fieldA, fieldB, fieldC, fieldD, fieldE, 
  id
) 
VALUES 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  );

I do this in my own project here in batches of 1000 and was able to add ~35k rows without database disk image is malformed (I really should use a transaction though). Schema here - I only have 3 FTS columns though compared to your example's 5. If you want to test my code I can upload those rows somewhere.

@theboolean
Copy link
Author

Unfortunately, this is not a workaround we can implement with our use case. We perform lot of updates after the first db population, and eventually it will corrupt

@tantaman
Copy link
Contributor

tantaman commented Oct 16, 2023

@theboolean - would switching to OPFS be acceptable for your use case? I think the OPFS VFS should handle this fine since it is less temperamental than indexeddb. I'll need to expose OPFS for you -- it currently isn't exposed by the cr-sqlite wrappers over wa-sqlite.

@rhashimoto - just an fyi. I assume this is in wa-sqlite itself but haven't stripped out cr-sqlite layers to confirm.

@rhashimoto
Copy link

Will take a look, but I've never used FTS (it's not in the wa-sqlite default build) and I'm deep into something else right now.

@rhashimoto
Copy link

@theboolean If you run your test with Dev Tools open is there anything interesting printed to the console when it fails?

@theboolean
Copy link
Author

@tantaman I would like to use indexeddb because of wider browser support, but I can give OPFS a try to compare performances.

@rhashimoto the only error I receive is the one I pasted in the OP, but I added the repo that can reproduce the issue to codesandbox, so you can try by yourself.

Thank you guys for the interest in this issue 🙏🏻

@theboolean
Copy link
Author

@tantaman is exposing OPFS a simple change to do?

@tantaman
Copy link
Contributor

@theboolean - should be pretty straightforward. Would be a matter of exposing an option for the user to supply a VFS class to initWasm

See these lines:

sqlite3.vfs_register(
new IDBBatchAtomicVFS("idb-batch-atomic", { durability: "relaxed" })
);

export default async function initWasm(
locateWasm?: (file: string) => string
): Promise<SQLite3> {
if (api != null) {
return api;
}
const wasmModule = await SQLiteAsyncESMFactory({
locateFile(file: string) {
if (locateWasm) {
return locateWasm(file);
}
return new URL("crsqlite.wasm", import.meta.url).href;
},
});
const sqlite3 = SQLite.Factory(wasmModule);
sqlite3.vfs_register(
new IDBBatchAtomicVFS("idb-batch-atomic", { durability: "relaxed" })
);

@rhashimoto
Copy link

Reproduced in wa-sqlite. It fails with SQLITE_CORRUPT on iteration 877, exactly as reported.

@rhashimoto
Copy link

It's a bug in IDBBatchAtomicVFS. It also reveals a likely bug in SQLite, without which it probably wouldn't have shown up here. It's still a critical bug in IDBBatchAtomicVFS, though, as I think it can be exposed in other ways. I hope to merge the fix tomorrow-ish.

@tantaman
Copy link
Contributor

Appreciate it.

@theboolean - I'll roll a new release to update to https://github.com/rhashimoto/wa-sqlite/releases/tag/v0.9.8 which will resolve this issue.

@rhashimoto
Copy link

@theboolean What a great bug report, including a compact reproduction! This exposed not one but two quite subtle bugs, one in SQLite which is possibly the best tested library on the planet (and one in wa-sqlite which is...not). The bug was fixed in SQLite only hours after submission, and was considered important enough to bypass the code freeze for their imminent release.

@tantaman
Copy link
Contributor

tantaman commented Nov 5, 2023

@theboolean, @iansinnott - do you guys need a fix immediately or can you wait till the v0.16 release?

@theboolean
Copy link
Author

@tantaman I think it makes sense to merge wa-sqlite v0.9.8 as a patch and then releasing v0.16. What do you think?

@iansinnott
Copy link

@theboolean, @iansinnott - do you guys need a fix immediately or can you wait till the v0.16 release?

@tantaman no rush, thanks for looking into this

@tantaman
Copy link
Contributor

tantaman commented Nov 6, 2023

This'll fix it -- #36

but I'll need to publish a new release for everyone.

@theboolean
Copy link
Author

Hi @tantaman! Touching base just to check if there are any updates on this :)

@tantaman
Copy link
Contributor

@theboolean - I'll be publishing a new release today. Luckily it'll be compatible with the current release as we decided to revert some breaking changes.

@tantaman
Copy link
Contributor

re-opening until the release is actually published.

@tantaman tantaman reopened this Nov 27, 2023
@tantaman
Copy link
Contributor

Almost there. Need to get this pulled in -- #38

@tantaman tantaman reopened this Nov 27, 2023
@tantaman
Copy link
Contributor

tantaman commented Nov 27, 2023

New release published. You should be all set. Note that I've published these under the next tag until I've had a chance to roll them into the various examples and demo apps to further verify the builds.

@theboolean
Copy link
Author

Thank you @tantaman, @rhashimoto, and all the people involved in fixing this 🙏🏻

@tantaman
Copy link
Contributor

tantaman commented Dec 1, 2023

v0.16.0-next.2 is pretty solid at this point. There were a few bugs in v0.16.0-next.0 that are now fixed.

Will be making next.2 the stable release soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants