
Multidense vectors quantization #4202

Merged
merged 9 commits into from
May 28, 2024

Conversation


@IvanPleshkov IvanPleshkov commented May 8, 2024

Add quantization support for multivectors.

The main idea is to reuse the existing quantization integration for multivectors.

Encoded vectors are structs that implement the EncodedVectors trait:
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors.rs#L21

There are encoded-vector implementations for scalar, PQ, and binary quantization:
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors_u8.rs#L262
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors_pq.rs#L497
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors_binary.rs#L160

Multivector encoding aggregates one of these structures into a generic encoded type:
struct QuantizedMultivectorStorage<TEncodedQuery, QuantizedStorage: EncodedVectors<TEncodedQuery>>
https://github.com/qdrant/qdrant/blob/basic-multidense-vectors-quantization/lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs#L20
It also implements the EncodedVectors trait:
https://github.com/qdrant/qdrant/blob/basic-multidense-vectors-quantization/lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs#L136
This lets us reuse the quantization scorers and extend the QuantizedVectorStorage enum:
https://github.com/qdrant/qdrant/blob/basic-multidense-vectors-quantization/lib/segment/src/vector_storage/quantized/quantized_vectors.rs#L44

The encoded query type is Vec<TEncodedQuery>. This may be inefficient because the data is unflattened, but it is kept as-is in this PR.

The integration test covers SQ, PQ, BQ, HNSW search, and persistence.
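As a rough sketch of the shape of these types (illustrative names only; the real definitions are in the files linked above), the multivector wrapper pairs one flat quantized storage with per-point offsets:

```rust
// Minimal sketch, not the actual qdrant code: a trait standing in for the
// quantization crate's EncodedVectors.
trait EncodedVectors<TEncodedQuery> {
    fn encode_query(&self, query: &[f32]) -> TEncodedQuery;
    fn score_point(&self, query: &TEncodedQuery, i: u32) -> f32;
}

// Offset of one multivector: where its inner vectors start in the flat
// storage and how many inner vectors it has.
#[derive(Clone, Copy)]
struct MultivectorOffset {
    start: u32,
    count: u32,
}

// The multivector wrapper: one flat quantized storage holding all inner
// vectors of all points, plus per-point offsets into it.
struct QuantizedMultivectorStorage<TEncodedQuery, QuantizedStorage>
where
    QuantizedStorage: EncodedVectors<TEncodedQuery>,
{
    quantized_storage: QuantizedStorage,
    offsets: Vec<MultivectorOffset>,
    _phantom: std::marker::PhantomData<TEncodedQuery>,
}
```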

pub count: PointOffsetType,
}

pub struct QuantizedMultivectorStorage<TEncodedQuery, QuantizedStorage>
IvanPleshkov (Contributor Author):

The quantized multivector container. It aggregates SQ/PQ/BQ storage for all inner vectors and holds the offsets.

TEncodedQuery: Sized,
QuantizedStorage: EncodedVectors<TEncodedQuery>,
{
quantized_storage: QuantizedStorage,
IvanPleshkov (Contributor Author):

Quantized storage for all inner vectors

})
}

fn score_point_max_similarity(&self, query: &Vec<TEncodedQuery>, vector_index: u32) -> f32 {
IvanPleshkov (Contributor Author):

We duplicate the MaxSim metric here because it uses a different interface for metric calculation.
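The MaxSim metric itself can be sketched standalone (a plain-f32 illustration with a caller-supplied scorer closure; the real code scores through the quantized EncodedVectors interface instead):

```rust
// Sketch of MaxSim: for each query inner vector, take the best (maximum)
// similarity against any inner vector of the stored multivector, then sum
// those maxima. `score` stands in for the per-inner-vector scorer.
fn max_similarity(
    query_inner: &[Vec<f32>],
    stored_inner: &[Vec<f32>],
    score: impl Fn(&[f32], &[f32]) -> f32,
) -> f32 {
    query_inner
        .iter()
        .map(|q| {
            stored_inner
                .iter()
                .map(|v| score(q, v))
                .fold(f32::MIN, f32::max)
        })
        .sum()
}
```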

Member:

maybe we can have a comment mentioning that several implementations need to be kept in sync?

IvanPleshkov (Contributor Author):

added

)
}

fn encode_query(&self, query: &[f32]) -> Vec<TEncodedQuery> {
IvanPleshkov (Contributor Author):

EncodedVectors wasn't designed for multivectors, so the query here is flattened data. As with save/load, this requires refactoring.
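A hedged sketch of the workaround: the flattened query is split back into inner vectors of length `dim`, each encoded separately by the underlying dense encoder. (`dim` and `encode_one` are illustrative assumptions, not qdrant's actual signatures.)

```rust
// Sketch: split a flattened multivector query into inner vectors of length
// `dim` and encode each one with the underlying storage's query encoder,
// producing the Vec<TEncodedQuery> used as the multivector encoded query.
fn encode_multi_query<T>(
    flat_query: &[f32],
    dim: usize,
    encode_one: impl Fn(&[f32]) -> T,
) -> Vec<T> {
    assert_eq!(flat_query.len() % dim, 0, "query length must be a multiple of dim");
    flat_query.chunks_exact(dim).map(encode_one).collect()
}
```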

@@ -172,4 +190,68 @@ impl<'a> QuantizedScorerBuilder<'a> {
}
}
}

fn new_multi_quantized_scorer<TElement, TMetric, TEncodedQuery>(
IvanPleshkov (Contributor Author):

The same as new_quantized_scorer but for multivectors. It uses QuantizedQueryScorer::new_multi and QuantizedCustomQueryScorer::new_multi internally.

@@ -41,6 +48,23 @@ pub enum QuantizedVectorStorage {
PQMmap(EncodedVectorsPQ<QuantizedMmapStorage>),
BinaryRam(EncodedVectorsBin<ChunkedVectors<u8>>),
BinaryMmap(EncodedVectorsBin<QuantizedMmapStorage>),

ScalarRamMulti(
IvanPleshkov (Contributor Author):

Important part: the definition of the multivector quantized storages.
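The pattern can be sketched with placeholder types (not the real qdrant definitions): each dense variant gains a *Multi counterpart that wraps the same encoder in the multivector container.

```rust
// Illustrative stand-ins for the real encoder types.
struct EncodedScalar;                       // e.g. a scalar-quantized storage
struct MultiStorage<S>(S, Vec<(u32, u32)>); // encoder + (start, count) offsets

// Sketch of the enum extension: dense variants keep working as before,
// and the new *Multi variants carry the multivector wrapper.
enum QuantizedVectorStorageSketch {
    ScalarRam(EncodedScalar),
    ScalarRamMulti(MultiStorage<EncodedScalar>),
}

// Scorer construction can then dispatch uniformly over both families.
fn is_multi(storage: &QuantizedVectorStorageSketch) -> bool {
    matches!(storage, QuantizedVectorStorageSketch::ScalarRamMulti(_))
}
```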

// Config files
self.path.join(QUANTIZED_CONFIG_PATH),
// Storage file
self.path.join(QUANTIZED_DATA_PATH),
// Meta file
self.path.join(QUANTIZED_META_PATH),
]
];
if self.is_multivector() {
IvanPleshkov (Contributor Author):

One more file with the offsets in the multivector case.

@@ -226,6 +302,151 @@ impl QuantizedVectors {
Ok(quantized_vectors)
}

fn create_multi_impl<
IvanPleshkov (Contributor Author):

Multivector version of Self::create_impl. It first encodes the data the same way as regular dense vectors.

let vector_parameters =
Self::construct_vector_parameters(distance, dim, inner_vectors_count);

let quantized_storage = match quantization_config {
IvanPleshkov (Contributor Author):

First, encode all flattened inner vectors as a regular dense storage.

.collect_vec();

// convert into multivector quantized storage
let quantized_storage = match quantized_storage {
IvanPleshkov (Contributor Author):

Second, convert the inner-vectors storage into a multivector quantized storage.
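The two phases can be sketched standalone (`encode_dense` is a stand-in for the SQ/PQ/BQ encoding step, and `MultiOffset` is an illustrative type; the real code lives in create_multi_impl):

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct MultiOffset {
    start: u32,
    count: u32,
}

// Sketch of the two-phase construction: flatten all inner vectors while
// recording per-point offsets, then quantize the flat list as a regular
// dense storage and pair it with the offsets.
fn create_multi<S>(
    multivectors: &[Vec<Vec<f32>>],
    encode_dense: impl Fn(Vec<Vec<f32>>) -> S,
) -> (S, Vec<MultiOffset>) {
    // Phase 1: flatten inner vectors, building one offset per point.
    let mut flat = Vec::new();
    let mut offsets = Vec::with_capacity(multivectors.len());
    for mv in multivectors {
        offsets.push(MultiOffset {
            start: flat.len() as u32,
            count: mv.len() as u32,
        });
        flat.extend(mv.iter().cloned());
    }
    // Phase 2: encode the flat list as dense storage (SQ/PQ/BQ in qdrant).
    (encode_dense(flat), offsets)
}
```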

multi_vector_config,
))
}
QuantizedVectorStorage::ScalarRamMulti(_) => unreachable!(),
IvanPleshkov (Contributor Author):

unreachable! is fine here because we previously constructed a regular dense quantization for all inner vectors.

} else {
QuantizedVectorStorage::ScalarMmap(
EncodedVectorsU8::<QuantizedMmapStorage>::load(
let quantized_store = if let Some(multivector_config) =
IvanPleshkov (Contributor Author):

Add loading for the multivector case. The dense case is unchanged; the multivector case is the same but with different types.

@IvanPleshkov IvanPleshkov marked this pull request as ready for review May 23, 2024 11:15
Distance::Dot,
128, // dim
32, // ef
5., // min_acc out of 100
Member:

why are some of those min_acc so low?

IvanPleshkov (Contributor Author):

Binary quantization works badly on random data. The combination of HNSW + BQ + Discovery + random data gives a very low but non-zero accuracy.

Member:

this is not discovery, is it?

IvanPleshkov (Contributor Author):

Increased the accuracy threshold. 25 is large enough to show that it works and low enough to be sure this test won't be flaky; the test shows accuracy of about 33.

}

fn sames_count(a: &[Vec<ScoredPointOffset>], b: &[Vec<ScoredPointOffset>]) -> usize {
a[0].iter()
agourlay (Member) commented May 23, 2024:

We ask for the top 5 in the test but then just compare the first result between plain and indexed.

I find this implementation a bit misleading given the name.
Were the results really bad for the rest of the top results?

IvanPleshkov (Contributor Author):

Search with the exact flag uses the original vectors instead of the quantized ones. That's why I compare HNSW search with plain search: it's a comparison between quantized and non-quantized data.

offsets_path: &Path,
) -> OperationResult<()> {
let offsets_serialized = bincode::serialize(&self.offsets).map_err(|_| {
OperationError::service_error("Cannot serialize quantized multivector offsets")
Member:

maybe we can have a log::error with the actual error

IvanPleshkov (Contributor Author):

I don't think serialization of Vec<_> will produce unexpected errors. This error message will appear in the logs anyway.

let mut file = File::open(offsets_path)?;
let mut offsets_serialized = Vec::new();
file.read_to_end(&mut offsets_serialized)?;
let offsets = bincode::deserialize(&offsets_serialized).map_err(|_| {
Member:

same comment about logging

IvanPleshkov (Contributor Author):

I guess the error message will be shown in the logs anyway. I added the specific filename to the error message.
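For reference, one possible bincode-free encoding of the offsets (a sketch under assumptions, not qdrant's actual on-disk format; the last commit in this PR drops bincode): each offset is stored as two little-endian u32s, so a malformed length can be detected and reported with context such as the file path.

```rust
// Sketch: serialize offsets as pairs of little-endian u32s (start, count).
fn serialize_offsets(offsets: &[(u32, u32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(offsets.len() * 8);
    for &(start, count) in offsets {
        out.extend_from_slice(&start.to_le_bytes());
        out.extend_from_slice(&count.to_le_bytes());
    }
    out
}

// Sketch: deserialize, returning a descriptive error on malformed input.
fn deserialize_offsets(bytes: &[u8]) -> Result<Vec<(u32, u32)>, String> {
    if bytes.len() % 8 != 0 {
        // In real code, include context (e.g. the offsets file path) here.
        return Err("corrupted quantized multivector offsets data".to_string());
    }
    Ok(bytes
        .chunks_exact(8)
        .map(|c| {
            let start = u32::from_le_bytes(c[0..4].try_into().unwrap());
            let count = u32::from_le_bytes(c[4..8].try_into().unwrap());
            (start, count)
        })
        .collect())
}
```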

match query {
QueryVector::Nearest(vector) => {
let query_scorer = QuantizedQueryScorer::<TElement, TMetric, _, _>::new_multi(
vector.try_into()?,
Member:

maybe a nice explicit call to try_from to help code navigation

IvanPleshkov (Contributor Author):

fixed

agourlay (Member):

Is it possible already to write a short integration test in our openapi folder to make sure everything is wired properly?

new multiquantization scorers

encode query

maxsim for quantized vectors

remove obsolete todo

reuse existing scorers

create multivector quantized storage

save load offsets

add test

fix vectors count

temporarily disable test while debugging

fix tests

fix build

less static lifetimes

less static lifetimes

fix build
@timvisee timvisee force-pushed the basic-multidense-vectors-quantization branch from a93b189 to 3dd7465 Compare May 27, 2024 11:16
IvanPleshkov (Contributor Author):

Is it possible already to write a short integration test in our openapi folder to make sure everything is wired properly?

Will do as a separate PR

@IvanPleshkov IvanPleshkov merged commit 93ed4ab into dev May 28, 2024
17 checks passed
@IvanPleshkov IvanPleshkov deleted the basic-multidense-vectors-quantization branch May 28, 2024 10:36
timvisee pushed a commit that referenced this pull request May 28, 2024
* quantized multivector definition

* fix build after rebase

* add persistence test

* fix codespell

* increase accuracy in tests

* review remarks

* add comment references

* are you happy codespell

* don't use bincode