Merge branch 'branch-24.10' into fea-persistent-cagra
achirkin committed Sep 10, 2024
2 parents f9ee7c7 + bdea78e commit 5bc2982
Showing 36 changed files with 2,217 additions and 59 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pr.yaml
@@ -50,7 +50,7 @@ jobs:
with:
build_type: pull-request
enable_check_symbols: true
- symbol_exclusions: (void (thrust::|cub::)|_ZN\d+raft_cutlass)
+ symbol_exclusions: (void (thrust::|cub::)|raft_cutlass)
conda-python-build:
needs: conda-cpp-build
secrets: inherit
2 changes: 1 addition & 1 deletion .github/workflows/test.yaml
@@ -23,7 +23,7 @@ jobs:
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
enable_check_symbols: true
- symbol_exclusions: (void (thrust::|cub::)|_ZN\d+raft_cutlass)
+ symbol_exclusions: (void (thrust::|cub::)|raft_cutlass)
conda-cpp-tests:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-24.10
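The two workflow hunks above loosen the `symbol_exclusions` pattern from `_ZN\d+raft_cutlass` to plain `raft_cutlass`. The sketch below illustrates the practical difference with `std::regex`. It assumes the symbol checker applies the pattern with search (not full-match) semantics, and the mangled names in the usage note are made up for illustration; neither is taken from the commit.

```cpp
#include <regex>
#include <string>

// Patterns copied verbatim from the workflow files, before and after the change.
const char* const kOldExclusions = R"re((void (thrust::|cub::)|_ZN\d+raft_cutlass))re";
const char* const kNewExclusions = R"re((void (thrust::|cub::)|raft_cutlass))re";

// Returns true if `pattern` matches anywhere inside `symbol`.
// Assumption: the real checker searches within the symbol name rather than
// requiring the whole name to match.
bool is_excluded(const std::string& pattern, const std::string& symbol)
{
  const std::regex re(pattern);  // default ECMAScript grammar understands \d
  return std::regex_search(symbol, re);
}
```

With a hypothetical symbol such as `_ZN12raft_cutlass4gemmEv`, both patterns match. With `_ZN4cuvs12raft_cutlass4gemmEv`, where `raft_cutlass` is nested inside another namespace, only the new pattern matches, because the old one requires `raft_cutlass` to follow the first length prefix after `_ZN` directly.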
5 changes: 5 additions & 0 deletions .gitignore
@@ -28,6 +28,11 @@ bench/ann/data
temporary_*.json
rust/target/
rust/Cargo.lock
+ rmm_log.txt
+
+ ## example notebooks
+ notebooks/simplewiki-2020-11-01-nq-distilbert-base-v1.pt
+ notebooks/data/

## scikit-build
_skbuild
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -97,7 +97,7 @@ repos:
hooks:
- id: check-json
- repo: https://github.com/rapidsai/pre-commit-hooks
- rev: v0.3.1
+ rev: v0.4.0
hooks:
- id: verify-copyright
files: |
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-118_arch-aarch64.yaml
@@ -38,7 +38,7 @@ dependencies:
- make
- nccl>=2.9.9
- ninja
- - numpy>=1.23,<2.0a0
+ - numpy>=1.23,<3.0a0
- numpydoc
- nvcc_linux-aarch64=11.8
- openblas
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -38,7 +38,7 @@ dependencies:
- make
- nccl>=2.9.9
- ninja
- - numpy>=1.23,<2.0a0
+ - numpy>=1.23,<3.0a0
- numpydoc
- nvcc_linux-64=11.8
- openblas
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-125_arch-aarch64.yaml
@@ -35,7 +35,7 @@ dependencies:
- make
- nccl>=2.9.9
- ninja
- - numpy>=1.23,<2.0a0
+ - numpy>=1.23,<3.0a0
- numpydoc
- openblas
- pre-commit
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-125_arch-x86_64.yaml
@@ -35,7 +35,7 @@ dependencies:
- make
- nccl>=2.9.9
- ninja
- - numpy>=1.23,<2.0a0
+ - numpy>=1.23,<3.0a0
- numpydoc
- openblas
- pre-commit
47 changes: 47 additions & 0 deletions conda/environments/bench_ann_cuda-118_arch-aarch64.yaml
@@ -0,0 +1,47 @@
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- conda-forge
- nvidia
dependencies:
- benchmark>=1.8.2
- c-compiler
- clang-tools=16.0.6
- clang==16.0.6
- click
- cmake>=3.26.4,!=3.30.0
- cuda-nvtx=11.8
- cuda-profiler-api=11.8.86
- cuda-python>=11.7.1,<12.0a0
- cuda-version=11.8
- cudatoolkit
- cxx-compiler
- cython>=3.0.0
- dlpack>=0.8,<1.0
- gcc_linux-aarch64=11.*
- glog>=0.6.0
- h5py>=3.8.0
- hnswlib=0.6.2
- libcublas-dev=11.11.3.6
- libcublas=11.11.3.6
- libcurand-dev=10.3.0.86
- libcurand=10.3.0.86
- libcusolver-dev=11.4.1.48
- libcusolver=11.4.1.48
- libcusparse-dev=11.7.5.86
- libcusparse=11.7.5.86
- matplotlib
- nccl>=2.9.9
- ninja
- nlohmann_json>=3.11.2
- nvcc_linux-aarch64=11.8
- openblas
- pandas
- pylibraft==24.10.*,>=0.0.0a0
- pyyaml
- rmm==24.10.*,>=0.0.0a0
- sysroot_linux-aarch64==2.17
name: bench_ann_cuda-118_arch-aarch64
47 changes: 47 additions & 0 deletions conda/environments/bench_ann_cuda-118_arch-x86_64.yaml
@@ -0,0 +1,47 @@
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- conda-forge
- nvidia
dependencies:
- benchmark>=1.8.2
- c-compiler
- clang-tools=16.0.6
- clang==16.0.6
- click
- cmake>=3.26.4,!=3.30.0
- cuda-nvtx=11.8
- cuda-profiler-api=11.8.86
- cuda-python>=11.7.1,<12.0a0
- cuda-version=11.8
- cudatoolkit
- cxx-compiler
- cython>=3.0.0
- dlpack>=0.8,<1.0
- gcc_linux-64=11.*
- glog>=0.6.0
- h5py>=3.8.0
- hnswlib=0.6.2
- libcublas-dev=11.11.3.6
- libcublas=11.11.3.6
- libcurand-dev=10.3.0.86
- libcurand=10.3.0.86
- libcusolver-dev=11.4.1.48
- libcusolver=11.4.1.48
- libcusparse-dev=11.7.5.86
- libcusparse=11.7.5.86
- matplotlib
- nccl>=2.9.9
- ninja
- nlohmann_json>=3.11.2
- nvcc_linux-64=11.8
- openblas
- pandas
- pylibraft==24.10.*,>=0.0.0a0
- pyyaml
- rmm==24.10.*,>=0.0.0a0
- sysroot_linux-64==2.17
name: bench_ann_cuda-118_arch-x86_64
43 changes: 43 additions & 0 deletions conda/environments/bench_ann_cuda-125_arch-aarch64.yaml
@@ -0,0 +1,43 @@
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- conda-forge
- nvidia
dependencies:
- benchmark>=1.8.2
- c-compiler
- clang-tools=16.0.6
- clang==16.0.6
- click
- cmake>=3.26.4,!=3.30.0
- cuda-cudart-dev
- cuda-nvcc
- cuda-nvtx-dev
- cuda-profiler-api
- cuda-python>=12.0,<13.0a0
- cuda-version=12.5
- cxx-compiler
- cython>=3.0.0
- dlpack>=0.8,<1.0
- gcc_linux-aarch64=11.*
- glog>=0.6.0
- h5py>=3.8.0
- hnswlib=0.6.2
- libcublas-dev
- libcurand-dev
- libcusolver-dev
- libcusparse-dev
- matplotlib
- nccl>=2.9.9
- ninja
- nlohmann_json>=3.11.2
- openblas
- pandas
- pylibraft==24.10.*,>=0.0.0a0
- pyyaml
- rmm==24.10.*,>=0.0.0a0
- sysroot_linux-aarch64==2.17
name: bench_ann_cuda-125_arch-aarch64
43 changes: 43 additions & 0 deletions conda/environments/bench_ann_cuda-125_arch-x86_64.yaml
@@ -0,0 +1,43 @@
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- conda-forge
- nvidia
dependencies:
- benchmark>=1.8.2
- c-compiler
- clang-tools=16.0.6
- clang==16.0.6
- click
- cmake>=3.26.4,!=3.30.0
- cuda-cudart-dev
- cuda-nvcc
- cuda-nvtx-dev
- cuda-profiler-api
- cuda-python>=12.0,<13.0a0
- cuda-version=12.5
- cxx-compiler
- cython>=3.0.0
- dlpack>=0.8,<1.0
- gcc_linux-64=11.*
- glog>=0.6.0
- h5py>=3.8.0
- hnswlib=0.6.2
- libcublas-dev
- libcurand-dev
- libcusolver-dev
- libcusparse-dev
- matplotlib
- nccl>=2.9.9
- ninja
- nlohmann_json>=3.11.2
- openblas
- pandas
- pylibraft==24.10.*,>=0.0.0a0
- pyyaml
- rmm==24.10.*,>=0.0.0a0
- sysroot_linux-64==2.17
name: bench_ann_cuda-125_arch-x86_64
2 changes: 2 additions & 0 deletions conda/recipes/cuvs/meta.yaml
@@ -69,6 +69,8 @@ requirements:
- libcuvs {{ version }}
- python x.x
- rmm ={{ minor_version }}
+ - cuda-python
+ - numpy >=1.23,<3.0a0

tests:
requirements:
2 changes: 2 additions & 0 deletions cpp/CMakeLists.txt
@@ -424,6 +424,8 @@ add_library(
src/selection/select_k_float_int64_t.cu
src/selection/select_k_float_uint32_t.cu
src/selection/select_k_half_uint32_t.cu
+ src/stats/silhouette_score.cu
+ src/stats/trustworthiness_score.cu
)

target_compile_definitions(cuvs PRIVATE "CUVS_EXPLICIT_INSTANTIATE_ONLY")
121 changes: 121 additions & 0 deletions cpp/include/cuvs/stats/silhouette_score.hpp
@@ -0,0 +1,121 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cuvs/distance/distance.hpp>
#include <raft/core/device_mdspan.hpp>
#include <raft/core/resources.hpp>

#include <optional>

namespace cuvs {
namespace stats {

/**
* @defgroup stats_silhouette_score Silhouette Score
* @{
*/
/**
* @brief main function that returns the average silhouette score for a given set of data and its
* clusterings
* @param[in] handle: raft handle for managing expensive resources
* @param[in] X_in: input matrix Data in row-major format (nRows x nCols)
* @param[in] labels: the pointer to the array containing labels for every data sample (length:
* nRows)
* @param[out] silhouette_score_per_sample: optional array populated with the silhouette score
* for every sample (length: nRows)
* @param[in] n_unique_labels: number of unique labels in the labels array
* @param[in] metric: Distance metric to use. Euclidean (L2) is used by default
* @return: The silhouette score.
*/
float silhouette_score(
raft::resources const& handle,
raft::device_matrix_view<const float, int64_t, raft::row_major> X_in,
raft::device_vector_view<const int, int64_t> labels,
std::optional<raft::device_vector_view<float, int64_t>> silhouette_score_per_sample,
int64_t n_unique_labels,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded);

/**
* @brief function that returns the average silhouette score for a given set of data and its
* clusterings
* @param[in] handle: raft handle for managing expensive resources
* @param[in] X: input matrix Data in row-major format (nRows x nCols)
* @param[in] labels: the pointer to the array containing labels for every data sample (length:
* nRows)
* @param[out] silhouette_score_per_sample: optional array populated with the silhouette score
* for every sample (length: nRows)
* @param[in] n_unique_labels: number of unique labels in the labels array
* @param[in] batch_size: number of samples per batch
* @param[in] metric: the numerical value that maps to the type of distance metric to be used in
* the calculations
* @return: The silhouette score.
*/
float silhouette_score_batched(
raft::resources const& handle,
raft::device_matrix_view<const float, int64_t, raft::row_major> X,
raft::device_vector_view<const int, int64_t> labels,
std::optional<raft::device_vector_view<float, int64_t>> silhouette_score_per_sample,
int64_t n_unique_labels,
int64_t batch_size,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded);

/**
* @brief main function that returns the average silhouette score for a given set of data and its
* clusterings
* @param[in] handle: raft handle for managing expensive resources
* @param[in] X_in: input matrix Data in row-major format (nRows x nCols)
* @param[in] labels: the pointer to the array containing labels for every data sample (length:
* nRows)
* @param[out] silhouette_score_per_sample: optional array populated with the silhouette score
* for every sample (length: nRows)
* @param[in] n_unique_labels: number of unique labels in the labels array
* @param[in] metric: the numerical value that maps to the type of distance metric to be used in
* the calculations
* @return: The silhouette score.
*/
double silhouette_score(
raft::resources const& handle,
raft::device_matrix_view<const double, int64_t, raft::row_major> X_in,
raft::device_vector_view<const int, int64_t> labels,
std::optional<raft::device_vector_view<double, int64_t>> silhouette_score_per_sample,
int64_t n_unique_labels,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded);

/**
* @brief function that returns the average silhouette score for a given set of data and its
* clusterings
* @param[in] handle: raft handle for managing expensive resources
* @param[in] X: input matrix Data in row-major format (nRows x nCols)
* @param[in] labels: the pointer to the array containing labels for every data sample (length:
* nRows)
* @param[out] silhouette_score_per_sample: optional array populated with the silhouette score
* for every sample (length: nRows)
* @param[in] n_unique_labels: number of unique labels in the labels array
* @param[in] batch_size: number of samples per batch
* @param[in] metric: the numerical value that maps to the type of distance metric to be used in
* the calculations
* @return: The silhouette score.
*/
double silhouette_score_batched(
raft::resources const& handle,
raft::device_matrix_view<const double, int64_t, raft::row_major> X,
raft::device_vector_view<const int, int64_t> labels,
std::optional<raft::device_vector_view<double, int64_t>> silhouette_score_per_sample,
int64_t n_unique_labels,
int64_t batch_size,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded);

} // namespace stats
} // namespace cuvs
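For readers unfamiliar with the metric these declarations expose, here is a minimal CPU sketch of the textbook silhouette score. It is not the cuVS implementation: `silhouette_score_reference` is a hypothetical name, it uses host `std::vector` storage instead of device views, and it hard-codes plain Euclidean distance, which may not correspond exactly to the header's default `L2Unexpanded` metric. For each sample, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the score is `(b - a) / max(a, b)`, with 0 assigned to samples in singleton clusters.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical CPU reference, not part of cuVS.
// X is row-major (n_rows x n_cols); labels[i] must lie in [0, n_unique_labels).
// Returns the mean silhouette score. If per_sample is non-null, it is resized
// to n_rows and filled with each sample's score.
double silhouette_score_reference(const std::vector<double>& X,
                                  std::size_t n_rows,
                                  std::size_t n_cols,
                                  const std::vector<int>& labels,
                                  std::size_t n_unique_labels,
                                  std::vector<double>* per_sample = nullptr)
{
  assert(X.size() == n_rows * n_cols && labels.size() == n_rows);
  assert(n_rows > 0 && n_unique_labels > 0);

  // Count the members of each cluster.
  std::vector<std::size_t> cluster_size(n_unique_labels);
  for (int l : labels) { ++cluster_size[static_cast<std::size_t>(l)]; }

  if (per_sample != nullptr) { per_sample->assign(n_rows, 0.0); }

  double total = 0.0;
  std::vector<double> dist_sum(n_unique_labels);
  for (std::size_t i = 0; i < n_rows; ++i) {
    // Sum of Euclidean distances from sample i to every cluster's members.
    std::fill(dist_sum.begin(), dist_sum.end(), 0.0);
    for (std::size_t j = 0; j < n_rows; ++j) {
      double sq = 0.0;
      for (std::size_t k = 0; k < n_cols; ++k) {
        const double d = X[i * n_cols + k] - X[j * n_cols + k];
        sq += d * d;
      }
      dist_sum[static_cast<std::size_t>(labels[j])] += std::sqrt(sq);
    }

    const auto own = static_cast<std::size_t>(labels[i]);
    double s       = 0.0;  // singleton clusters keep a score of 0
    if (cluster_size[own] > 1) {
      // The distance from i to itself is 0, so divide by (size - 1).
      const double a = dist_sum[own] / static_cast<double>(cluster_size[own] - 1);
      double b       = std::numeric_limits<double>::infinity();
      for (std::size_t c = 0; c < n_unique_labels; ++c) {
        if (c != own && cluster_size[c] > 0) {
          b = std::min(b, dist_sum[c] / static_cast<double>(cluster_size[c]));
        }
      }
      const double denom = std::max(a, b);
      if (std::isfinite(b) && denom > 0.0) { s = (b - a) / denom; }
    }
    if (per_sample != nullptr) { (*per_sample)[i] = s; }
    total += s;
  }
  return total / static_cast<double>(n_rows);
}
```

The sketch computes all pairwise distances, so it costs O(n_rows² · n_cols) time. The `batch_size` parameter of `silhouette_score_batched` in the header suggests that variant processes samples in chunks to limit memory use, though this diff does not show how.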