
Add ability to extract encoder embedding. #1604

Open
wants to merge 4 commits into master

Conversation

colinator

Adds the ability to get the encoder output. No other top-level methods expose ggml_tensors though - I'm not sure that's cool. It seemed the quickest way. What I'm doing, eventually, is this:

ggml_tensor * tensor = whisper_get_encoder_embedding(ctx);
std::vector<float> tensor_data(ggml_nelements(tensor));
// copy the encoder output from backend memory into a host-side buffer
ggml_backend_tensor_get(tensor, tensor_data.data(), 0, ggml_nbytes(tensor));

... so maybe it should expose a method that returns or populates a vector, instead of returning a ggml_tensor? What do you think?
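For context, a minimal sketch of how the tensor-returning variant can be wrapped into a vector-populating helper on the caller side (the wrapper name is hypothetical and not part of this PR's diff):

#include <vector>

#include "ggml.h"
#include "ggml-backend.h"
#include "whisper.h"

// Hypothetical helper: copy the encoder output (as exposed by this PR) out of
// backend memory (CPU, CUDA, Metal, ...) into a host-side std::vector<float>.
static std::vector<float> get_encoder_embedding_vec(struct whisper_context * ctx) {
    ggml_tensor * tensor = whisper_get_encoder_embedding(ctx); // accessor proposed in this PR
    std::vector<float> data(ggml_nelements(tensor));
    ggml_backend_tensor_get(tensor, data.data(), 0, ggml_nbytes(tensor));
    return data;
}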

Added methods to get encoder embedding
Added implementations of methods to get encoder embedding.
Collaborator

@bobqianic bobqianic left a comment


Style

Owner

@ggerganov ggerganov left a comment


so maybe it should expose a method that returns or populates a vector, instead of returning a ggml_tensor?

Yes, populating a buffer with the data would be more portable compared to returning ggml_tensor:

WHISPER_API void whisper_get_encoder_embedding(float * buffer);

@bobqianic
Collaborator

so maybe it should expose a method that returns or populates a vector, instead of returning a ggml_tensor?

Yes, populating a buffer with the data would be more portable compared to returning ggml_tensor:

WHISPER_API void whisper_get_encoder_embedding(float * buffer);

This leads to the question: how can the user determine the size of the buffer? In this scenario, we end up with only a float pointer, which points to the buffer we've just filled in whisper_get_encoder_embedding.

@ggerganov
Owner

The buffer size can be determined using the model parameters:

whisper.cpp/whisper.h, lines 347 to 358 in 2623640:

WHISPER_API int whisper_model_n_vocab (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_audio_ctx (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_audio_state(struct whisper_context * ctx);
WHISPER_API int whisper_model_n_audio_head (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_audio_layer(struct whisper_context * ctx);
WHISPER_API int whisper_model_n_text_ctx (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_text_state (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_text_head (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_text_layer (struct whisper_context * ctx);
WHISPER_API int whisper_model_n_mels (struct whisper_context * ctx);
WHISPER_API int whisper_model_ftype (struct whisper_context * ctx);
WHISPER_API int whisper_model_type (struct whisper_context * ctx);
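Putting the two suggestions together, a rough sketch of how a caller could size and fill the buffer. This assumes the encoder output holds n_audio_ctx * n_audio_state floats and that the proposed function takes a whisper_context argument; both are assumptions, since the final signature has not been settled in this discussion:

#include <vector>

#include "whisper.h"

// Hypothetical usage of the proposed buffer-filling API.
static std::vector<float> get_encoder_embedding(struct whisper_context * ctx) {
    const size_t n = (size_t) whisper_model_n_audio_ctx(ctx)
                   * (size_t) whisper_model_n_audio_state(ctx); // assumed element count
    std::vector<float> buffer(n);
    whisper_get_encoder_embedding(ctx, buffer.data()); // assumed signature
    return buffer;
}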
