Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python API for CAGRA+HNSW #246

Open
wants to merge 14 commits into
base: branch-24.10
Choose a base branch
from

Conversation

divyegala
Copy link
Member

No description provided.

@divyegala divyegala added feature request New feature or request non-breaking Introduces a non-breaking change labels Jul 24, 2024
@divyegala divyegala self-assigned this Jul 24, 2024
@divyegala divyegala requested review from a team as code owners July 24, 2024 01:13
@divyegala divyegala changed the title Python API for CAGRA+HNSW Python and C API for CAGRA+HNSW Jul 24, 2024
@jameslamb
Copy link
Member

@divyegala I observed that same pip devcontainer build failure on an unrelated PR: #247 (comment)

So I suspect it's unrelated to your changes here.

@divyegala
Copy link
Member Author

@jameslamb I suspect it's related to this PR rapidsai/raft#2346

What I cannot figure out is why conda-cpp builds pass (new conda packages released?) but wheel builds fail. cuVS wheel builds pick up libraft wheel builds?

@vyasr
Copy link
Contributor

vyasr commented Jul 24, 2024

Maybe conda packages weren't out yet when the builds failed? Wheel builds will pull the source of raft to rebuild the static lib (unless you changed something in how cuvs wheels specifically are built, but I don't see any indication of that in the CMake).

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. Mostly a couple small documentation-related things to polish this off and then I think it's ready to merge.

@@ -416,6 +428,32 @@ cuvsError_t cuvsCagraSerialize(cuvsResources_t res,
cuvsCagraIndex_t index,
bool include_dataset);

/**
* Save the CAGRA index to file in hnswlib format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a note here that this must be loaded by the (patched) hnswlib wrappers inside cuVS and it can't just be loaded with hnswlib?

Loads base-layer-only hnswlib index from file, which was originally
saved as a built CAGRA index.

Saving / loading the index is experimental. The serialization format is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you are alluding to it in this description but we should also be very explicit and mention that the serialized index can only be loaded with the functions in cuVS because the hnswlib version underneath is patched to be able to search the converted CAGRA graph.

We recently had 2 different partners try to do this conversion themselves and they didn't realixe they couldn't just load this into hnswlib proper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right above, the note is quite clear:

The loaded index is immutable and can only
    be searched by the hnswlib wrapper in cuVS, as the format is not
    compatible with the original hnswlib.

can only be searched by the hnswlib wrapper in cuVS, as the format is not compatible with the original hnswlib.

@divyegala divyegala changed the title Python and C API for CAGRA+HNSW Python API for CAGRA+HNSW Sep 16, 2024
wrapper in cuVS, as the format is not compatible with the original
hnswlib.

Saving / loading the index is experimental. The serialization format is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include the note here about how even though this format closely matches hnswlib, it is only intended to be used by cuVS hnsw apis.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That note is already present right above

    The saved index is immutable and can only be searched by the hnswlib
    wrapper in cuVS, as the format is not compatible with the original
    hnswlib.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp feature request New feature or request non-breaking Introduces a non-breaking change Python
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

4 participants