[Docathon][Add CN Doc No.26] #6390

Closed
wants to merge 12 commits into from
1 change: 1 addition & 0 deletions docs/api/paddle/nn/Overview_cn.rst
@@ -199,6 +199,7 @@ Transformer related


" :ref:`paddle.nn.MultiHeadAttention <cn_api_paddle_nn_MultiHeadAttention>` ", "多头注意力机制"
" :ref:`paddle.nn.functional.scaled_dot_product_attention <cn_api_paddle_nn_functional_scaled_dot_product_attention>` ", "点乘注意力机制,并在此基础上加入了对注意力权重的缩放"
" :ref:`paddle.nn.Transformer <cn_api_paddle_nn_Transformer>` ", "Transformer 模型"
" :ref:`paddle.nn.TransformerDecoder <cn_api_paddle_nn_TransformerDecoder>` ", "Transformer 解码器"
" :ref:`paddle.nn.TransformerDecoderLayer <cn_api_paddle_nn_TransformerDecoderLayer>` ", "Transformer 解码器层"
42 changes: 42 additions & 0 deletions docs/api/paddle/nn/functional/scaled_dot_product_attention_cn.rst
@@ -0,0 +1,42 @@
.. _cn_api_paddle_nn_functional_scaled_dot_product_attention:

scaled_dot_product_attention
-------------------------------

.. py:function:: paddle.nn.functional.scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False, training=True, name=None)

The computation is defined as:

.. math::

    result = softmax\left(\frac{QK^{T}}{\sqrt{d}}\right)V

Here ``Q``, ``K`` and ``V`` are the three input tensors of the attention module. All three have the same shape, and ``d`` is the size of their last dimension.
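
Purely as an illustration of the formula above (this is not the kernel actually used by the API, and it ignores ``attn_mask``, ``dropout_p`` and ``is_causal``), the computation can be sketched with basic Paddle ops:

.. code-block:: python

    import paddle
    import paddle.nn.functional as F

    def naive_sdpa(query, key, value):
        # query/key/value: [batch_size, seq_len, num_heads, head_dim]
        d = query.shape[-1]
        # move num_heads in front of seq_len so matmul contracts over head_dim
        q = query.transpose([0, 2, 1, 3])
        k = key.transpose([0, 2, 1, 3])
        v = value.transpose([0, 2, 1, 3])
        scores = paddle.matmul(q, k, transpose_y=True) / (d ** 0.5)  # QK^T / sqrt(d)
        weights = F.softmax(scores, axis=-1)
        out = paddle.matmul(weights, v)
        # back to [batch_size, seq_len, num_heads, head_dim]
        return out.transpose([0, 2, 1, 3])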

.. warning::
    This API only supports inputs whose data type is float16 or bfloat16.


Parameters
::::::::::

- **query** (Tensor) - The query tensor of the attention module. A 4-D tensor with shape [batch_size, seq_len, num_heads, head_dim]. The data type can be float16 or bfloat16.
- **key** (Tensor) - The key tensor of the attention module. A 4-D tensor with shape [batch_size, seq_len, num_heads, head_dim]. The data type can be float16 or bfloat16.
- **value** (Tensor) - The value tensor of the attention module. A 4-D tensor with shape [batch_size, seq_len, num_heads, head_dim]. The data type can be float16 or bfloat16.
- **attn_mask** (Tensor, optional) - A float mask added to the attention scores, with the same data type as ``query``, ``key`` and ``value``. Default: None.
- **dropout_p** (float) - The dropout probability. Default: 0.0, i.e. no dropout is applied.
- **is_causal** (bool) - Whether to apply a causal mask. Default: False.
- **training** (bool) - Whether the module is in training mode. Default: True.
- **name** (str, optional) - Default: None. Normally there is no need to set this. For details, see :ref:`api_guide_Name`.


Returns
::::::::::

- ``out`` (Tensor): A 4-D tensor with shape ``[batch_size, seq_len, num_heads, head_dim]``. The data type can be float16 or bfloat16.
- ``softmax`` (Tensor): None if return_softmax is False.
Collaborator
This API does not have a return_softmax parameter, and it does not return softmax.


Code Example
::::::::::

COPY-FROM: paddle.nn.functional.scaled_dot_product_attention
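
The COPY-FROM directive above pulls the official example from the English docstring. As a supplementary sketch only, a call might look like the following (the shapes are hypothetical, and the float16 cast assumes a GPU device is available):

.. code-block:: python

    import paddle
    import paddle.nn.functional as F

    # Hypothetical shapes: batch_size=1, seq_len=128, num_heads=2, head_dim=16
    q = paddle.rand([1, 128, 2, 16]).astype("float16")
    k = paddle.rand([1, 128, 2, 16]).astype("float16")
    v = paddle.rand([1, 128, 2, 16]).astype("float16")

    out = F.scaled_dot_product_attention(q, k, v, attn_mask=None,
                                         dropout_p=0.0, is_causal=False)
    print(out.shape)  # [1, 128, 2, 16]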