[Bug] wrong order of arguments in scaled_dot_product_attention__default #2746

Open
lilanxiao opened this issue Apr 22, 2024 · 0 comments
lilanxiao commented Apr 22, 2024

Checklist

  • I have searched related issues but cannot get the expected help.
  • I have read the FAQ documentation but cannot get the expected help.
  • The bug has not been fixed in the latest version.

Describe the bug

The order of the arguments scale and is_causal in the scaled_dot_product_attention__default function differs from the original PyTorch API in versions 2.1 and 2.2.

mmdeploy's version:

def scaled_dot_product_attention__default(query,
                                          key,
                                          value,
                                          attn_mask=None,
                                          dropout_p=0.,
                                          scale=None,
                                          is_causal=False):

The API in PyTorch 2.1 and 2.2:

def scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False, scale=None) -> torch.Tensor:
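For illustration, here is a minimal sketch (hypothetical names, not mmdeploy code) of how a caller that follows PyTorch's positional order ends up binding the value meant for is_causal to mmdeploy's scale parameter:

# Hypothetical stand-ins for the two signatures; only the parameter order matters.

def sdpa_pytorch_order(query, key, value, attn_mask=None, dropout_p=0.0,
                       is_causal=False, scale=None):
    # Parameter order of torch.nn.functional.scaled_dot_product_attention in 2.1/2.2.
    return {"is_causal": is_causal, "scale": scale}

def sdpa_mmdeploy_order(query, key, value, attn_mask=None, dropout_p=0.,
                        scale=None, is_causal=False):
    # Parameter order of the mmdeploy rewrite: scale and is_causal are swapped.
    return {"is_causal": is_causal, "scale": scale}

# A caller written against the PyTorch signature, passing is_causal positionally:
args = ("q", "k", "v", None, 0.0, True)  # ..., is_causal=True

print(sdpa_pytorch_order(*args))   # {'is_causal': True, 'scale': None}
print(sdpa_mmdeploy_order(*args))  # {'is_causal': False, 'scale': True}  <- scale is now a bool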

This can cause an error at this line because, when the arguments are passed positionally in the PyTorch order, the variable scale receives the value intended for is_causal and becomes True in some cases. Dividing by a boolean value is not allowed.
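A straightforward fix, assuming the rewrite only needs its trailing parameters reordered to match the upstream API, could look like this (a sketch, not an actual patch):

def scaled_dot_product_attention__default(query,
                                          key,
                                          value,
                                          attn_mask=None,
                                          dropout_p=0.,
                                          is_causal=False,
                                          scale=None):
    # Same parameter order as PyTorch 2.1/2.2, so positional calls bind correctly.
    ...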

Reproduction

  1. I cannot share my script due to company constraints, but it should not be needed because the mismatch is visible directly in the source code.
  2. No special modifications.

Environment

I don't think it's necessary.

Error traceback

No response
