[FEA] Add multi-groups version of the DBSCAN #5412

georgeliu95 · 2023-05-10T08:53:12Z

This draft PR is for the discussion on implementation details.

Merge latest commits from rapidsai/cuml: branch-23.06

rapids-bot · 2023-05-10T08:53:16Z

Pull requests from external contributors require approval from a rapidsai organization member with write or admin permissions before CI can begin.

cjnolet · 2023-05-22T17:01:52Z

Hi @georgeliu95, thanks for contributing this feature. Can you please provide benchmarks to demonstrate the performance of this design compared to the existing dbscan implementation?

There's a lot of code in this PR taken and modified from the building blocks in RAFT and I'd like to keep the algorithms in cuml based on the reusable building blocks that have been centralized in RAFT. I think after benchmarks, a next step would be consolidating these back into RAFT. I'm also fine exposing these as multi-group building blocks if it makes sense.

Another note: we should prefer `raft::linalg::map_offset` over `thrust::for_each`. We centralize these operations behind RAFT APIs so we have more control over future changes and optimizations.
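
To illustrate the calling pattern, here is a minimal host-side sketch. The names `sketch::map_offset`, `mark_core_points`, `degrees`, and `min_pts` are hypothetical and invented for this example. The helper is a CPU stand-in that only mimics the index-to-value shape of the operation; it is not the RAFT API or its implementation.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical CPU stand-in: writes out[i] = op(i) for every index i.
template <typename T, typename Op>
void map_offset(std::vector<T>& out, Op op)
{
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = op(i);
  }
}

}  // namespace sketch

// Flags points whose neighbor count reaches min_pts (1 = core, 0 = not core).
std::vector<int> mark_core_points(const std::vector<int>& degrees, int min_pts)
{
  std::vector<int> is_core(degrees.size());
  sketch::map_offset(is_core,
                     [&](std::size_t i) { return degrees[i] >= min_pts ? 1 : 0; });
  return is_core;
}
```

The output buffer and the per-index functor are the only inputs, so the loop (or, on the device, the kernel launch) stays behind a single call that can be tuned in one place.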

georgeliu95 · 2023-06-01T06:32:46Z

Thank you very much for your help and useful advice, @cjnolet!
I have just added a benchmark for multi-group DBSCAN and switched from thrust::for_each to raft::linalg::map / raft::linalg::map_offset, as you suggested.
Note that running this benchmark directly with the cosine metric gives results that differ somewhat from the baseline because of #5360.

tfeher

Thanks @georgeliu95 for this work! It is great to have an efficient implementation for performing a batch of smaller, independent DBSCAN clusterings!

I see that you arranged the new implementation to have a similar code structure as the existing DBSCAN code in cuML, thank you for this effort! I still think that we can sacrifice some of this similarity to simplify the code, and I have left a few comments along this line.

The main issue, as @cjnolet has already mentioned, is consolidating the computational primitives in raft (epsilon neighborhood, coalesced reduction, adj_to_csr). It is clear that we need a modified version of these primitives for the multi group case, but there is a large overlap between the existing single group version and the proposed multi-group variants. The goal is to reduce code duplication and improve maintainability of the code. These changes could be done in separate smaller PRs for raft.

Apart from these issues, the implementation looks good overall, and I am excited to have this feature added to cuML!
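
For readers unfamiliar with the multi-group setting, the following host-side sketch shows one way a batch of independent groups could share packed buffers. It is an assumption-laden illustration, not the PR's code: `GroupLayout`, `make_group_layout`, and `eps_neighbors` are hypothetical names, the points are one-dimensional for brevity, and the inclusive `<=` comparison is a choice made for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout of several groups packed into shared buffers.
struct GroupLayout {
  std::vector<std::size_t> row_offset;  // index of each group's first point
  std::vector<std::size_t> adj_offset;  // index of each group's first adjacency entry
  std::size_t total_adj = 0;            // sum of n_g * n_g over all groups
};

// Exclusive scans over the group sizes give the start of every group.
GroupLayout make_group_layout(const std::vector<std::size_t>& group_sizes)
{
  GroupLayout layout;
  std::size_t rows = 0;
  for (std::size_t n : group_sizes) {
    layout.row_offset.push_back(rows);
    layout.adj_offset.push_back(layout.total_adj);
    rows += n;
    layout.total_adj += n * n;
  }
  return layout;
}

// Brute-force epsilon neighborhood on 1-D points. Only pairs inside the
// same group are compared, so each group gets its own dense n_g x n_g block.
std::vector<unsigned char> eps_neighbors(const std::vector<float>& x,
                                         const std::vector<std::size_t>& group_sizes,
                                         float eps)
{
  const GroupLayout layout = make_group_layout(group_sizes);
  std::vector<unsigned char> adj(layout.total_adj, 0);
  for (std::size_t g = 0; g < group_sizes.size(); ++g) {
    const std::size_t n    = group_sizes[g];
    const std::size_t base = layout.row_offset[g];
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        const float d = x[base + i] - x[base + j];
        adj[layout.adj_offset[g] + i * n + j] = (d * d <= eps * eps) ? 1 : 0;
      }
    }
  }
  return adj;
}
```

Because no pair crosses a group boundary, the adjacency storage grows with the sum of the squared group sizes rather than with the square of the total point count.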

cpp/examples/dbscan/mgrp_dbscan_example.cpp

cpp/include/cuml/cluster/dbscan.hpp

tfeher · 2023-05-22T20:23:52Z

cpp/include/cuml/cluster/dbscan.hpp

+ * @param[in] custom_workspace workspace buffer provided by user
+ * @param[in] custom_workspace_size required size of workspace buffer provided by user


Do we need the extra workspace argument, as opposed to using a pooling allocator (like below) and keeping allocations inside? I am not opposed to the current API, just curious whether we could simplify it without losing performance.

    rmm::mr::pool_memory_resource<rmm::mr::device_memory_resource> mr(rmm::mr::get_current_device_resource(), 1 << 30);
    rmm::mr::set_current_device_resource(&mr);
    ML::Dbscan::fit(...)

cpp/include/cuml/cluster/dbscan_api.h

tfeher · 2023-05-22T20:58:49Z

cpp/src/dbscan/dbscan.cuh

@@ -207,5 +211,223 @@ void dbscanFitImpl(const raft::handle_t& handle,
                              metric);
 }

+template <typename Index_ = int>


Please add a docstring.

tfeher · 2023-06-01T22:17:28Z

cpp/src/dbscan/multigroups/mgrp_accessor.cuh

+                                  this->n_groups * sizeof(Index_t),
+                                  cudaMemcpyDeviceToDevice,
+                                  stream));
+    h_adj_group_offset = new std::size_t[this->n_groups];


It is recommended to manage the lifetime of such an allocation with a vector object, e.g.:

Suggested change

- h_adj_group_offset = new std::size_t[this->n_groups];
+ h_adj_group_offset = raft::make_host_vector<size_t>(this->n_groups);
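
The ownership point can be sketched with the standard library alone. In this minimal example `std::vector` stands in for the RAFT host vector suggested above, and the class name `HostGroupOffsets` is hypothetical; only the member name is borrowed from the quoted diff.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical holder for per-group offsets on the host.
class HostGroupOffsets {
 public:
  // Copies n_groups offsets from src into storage owned by this object.
  HostGroupOffsets(const std::size_t* src, std::size_t n_groups)
    : h_adj_group_offset_(src, src + n_groups)
  {
  }

  std::size_t size() const { return h_adj_group_offset_.size(); }
  std::size_t operator[](std::size_t g) const { return h_adj_group_offset_[g]; }

 private:
  // The vector releases its memory in its destructor, so no delete[] is
  // needed and the compiler-generated copy and move operations are safe.
  std::vector<std::size_t> h_adj_group_offset_;
};
```

With a raw `new[]` the class would need a hand-written destructor plus copy and move operations to avoid leaks and double frees; an owning container removes that burden.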

cpp/src/dbscan/multigroups/mgrp_adjgraph.cuh

tfeher · 2023-06-01T22:40:09Z

cpp/src/dbscan/multigroups/mgrp_csr.cuh

+
+/**
+ * The implementation is based on
+ * https://github.com/rapidsai/raft/blob/branch-23.06/cpp/include/raft/sparse/convert/detail/adj_to_csr.cuh


Ideally the changes here should be added to raft to minimize code duplication.
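
As a reference point for what the primitive computes, here is a minimal single-group, host-side sketch of converting a dense boolean adjacency matrix to CSR. The names `Csr` and `dense_adj_to_csr` are hypothetical; this is neither the RAFT kernel nor the multi-group variant from this PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR container: row_ptr has n_rows + 1 entries.
struct Csr {
  std::vector<std::size_t> row_ptr;
  std::vector<std::size_t> col_ind;
};

// Converts a row-major dense adjacency matrix (nonzero = edge) to CSR.
Csr dense_adj_to_csr(const std::vector<unsigned char>& adj,
                     std::size_t n_rows,
                     std::size_t n_cols)
{
  Csr csr;
  csr.row_ptr.assign(n_rows + 1, 0);
  for (std::size_t i = 0; i < n_rows; ++i) {
    for (std::size_t j = 0; j < n_cols; ++j) {
      if (adj[i * n_cols + j] != 0) { csr.col_ind.push_back(j); }
    }
    // Row i ends where the column list currently ends.
    csr.row_ptr[i + 1] = csr.col_ind.size();
  }
  return csr;
}
```

A multi-group version would presumably apply the same conversion per group with per-group offsets into the packed buffers, which is where most of the overlap with the single-group code lies.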

tfeher · 2023-06-01T22:48:58Z

cpp/examples/dbscan/mgrp_dbscan_benchmark.cpp

We should refactor this into a gbench benchmark and add it to https://github.com/rapidsai/cuml/blob/branch-23.06/cpp/bench/sg/dbscan.cu

- some docstring modification
- remove dispatcher for `VertexDeg` and `AdjGraph`
- replace CUDA runtime API `cudaMemcpyAsync` with `raft::copy`
- correct the value of `accum_est_mem` in `mgrp_dbscan_scheduler`

Co-authored-by: Tamas Bela Feher <tfeher@nvidia.com>

dantegd · 2023-07-12T02:59:23Z

/ok to test

georgeliu95 and others added 3 commits April 25, 2023 02:15

Add multi-groups DBSCAN

29632c0

Update the year of copyright

3521e43

Merge pull request #1 from rapidsai/branch-23.06

860aaa3

Merge latest commits from rapidsai/cuml: branch-23.06

github-actions bot added CMake CUDA/C++ labels May 10, 2023

cjnolet assigned georgeliu95 May 31, 2023

georgeliu95 added 2 commits May 31, 2023 23:21

Add benchmark for multi-group DBSCAN

2cd482e

Replace thrust::for_each with raft::linalg::map and `raft::linalg::map_offset`

f7dbaa5

tfeher requested changes Jun 1, 2023

tfeher changed the base branch from branch-23.06 to branch-23.08 June 7, 2023 18:51

tfeher added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jun 7, 2023

georgeliu95 and others added 3 commits July 2, 2023 19:51

Merge branch 'rapidsai:branch-23.08' into fea-multi-groups-dbscan

22ffc07

[chore] add docstring and format code

49d063b

[feat] add test unit for multi-groups DBSCAN

c7b0ed0
