[Bug fixes] update attribute map handler #4421
Conversation
Thanks for your contribution!
Codecov Report
@@ Coverage Diff @@
## develop #4421 +/- ##
===========================================
- Coverage 39.65% 39.62% -0.03%
===========================================
Files 433 433
Lines 60936 60983 +47
===========================================
+ Hits 24163 24167 +4
- Misses 36773 36816 +43
# do standard config map: there are some old-school pretrained-config not refactored.
config_dict = convert_to_legacy_config(cls.attribute_map, config_dict)

config_dict = flatten_model_config(config_dict)
if "model_type" in config_dict and hasattr(cls, "model_type") and config_dict["model_type"] != cls.model_type:
    logger.warning(
        f"You are using a model of type {config_dict['model_type']} to instantiate a model of type "
        f"{cls.model_type}. This is not supported for all configurations of models and can yield errors."
    )
Moved `convert_to_legacy_config` and `flatten_model_config` into `from_dict`, because `from_dict` is the function that `from_pretrained` calls.
value = config.pop(standard_field, None) or config.pop(paddle_field, None)
if value is not None:
    config[paddle_field] = value
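To illustrate the semantics of the changed lines above, here is a minimal sketch; `apply_attribute_map` is a hypothetical wrapper for this illustration, not PaddleNLP's actual API:

```python
def apply_attribute_map(attribute_map, config):
    """Rename standard (HF-style) fields to their paddle equivalents in place."""
    for standard_field, paddle_field in attribute_map.items():
        # Prefer the standard field when present, otherwise keep the paddle one.
        # Unlike the old code, nothing is written when neither field has a value,
        # so no stale `paddle_field: None` entries appear at the top level.
        # Caveat: `or` also skips falsy values such as 0 or False.
        value = config.pop(standard_field, None) or config.pop(paddle_field, None)
        if value is not None:
            config[paddle_field] = value
    return config

config = {"hidden_size": 768, "num_layers": 12}
apply_attribute_map({"hidden_size": "d_model"}, config)
# config is now {"d_model": 768, "num_layers": 12}
```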
Problem
The issue is that every value in `attribute_map` was mapped onto the top-level config, unconditionally, as `target_paddle_field: None`. As a result, parameters under `init_args` could not be mapped up correctly, e.g. `d_model`.
Why this change
- Old-style `model_config.json`:
{
"init_args": [
{
"tie_word_embeddings": false,
"pad_token_id": 0,
"bos_token_id": 0,
"eos_token_id": 1,
"vocab_size": 32128,
"d_model": 768,
"d_kv": 64,
"d_ff": 2048,
"num_layers": 12,
"num_decoder_layers": 12,
"num_heads": 12,
"relative_attention_num_buckets": 32,
"dropout_rate": 0.1,
"layer_norm_epsilon": 1e-06,
"initializer_factor": 1.0,
"feed_forward_proj": "gated-gelu",
"init_class": "T5Model"
}
],
"init_class": "T5ForConditionalGeneration"
}
In the old-style config file, the model parameters live under `init_args`. This module first calls `convert_to_legacy_config` recursively (depth-first) to map the model parameters, which may map `hidden_size` -> `d_model`.
However, once that pass finished, the old code set `d_model: None` at the root level, so `flatten_model_config` could not map the correct `d_model` from `init_args` back up.
The new approach fixes this problem.
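The legacy path described above can be sketched as follows. These are illustrative helpers written for this comment, not PaddleNLP's actual implementations; the key point is that the conversion no longer injects a root-level `d_model: None` that would shadow the value lifted out of `init_args`:

```python
def convert_to_legacy_config(attribute_map, config):
    # Recurse into init_args first (depth-first), as described above.
    if "init_args" in config:
        config["init_args"] = [
            convert_to_legacy_config(attribute_map, sub) if isinstance(sub, dict) else sub
            for sub in config["init_args"]
        ]
    for standard_field, paddle_field in attribute_map.items():
        # Only write the mapped field when a real value exists; the old code
        # wrote `paddle_field: None` unconditionally at the root.
        value = config.pop(standard_field, None) or config.pop(paddle_field, None)
        if value is not None:
            config[paddle_field] = value
    return config

def flatten_model_config(config):
    # Lift init_args entries to the top level; existing top-level keys win,
    # which is why a stale root-level `d_model: None` used to shadow the
    # real value coming from init_args.
    for sub in config.pop("init_args", []):
        if isinstance(sub, dict):
            for key, value in sub.items():
                config.setdefault(key, value)
    return config

legacy = {
    "init_args": [{"hidden_size": 768, "init_class": "T5Model"}],
    "init_class": "T5ForConditionalGeneration",
}
flat = flatten_model_config(convert_to_legacy_config({"hidden_size": "d_model"}, legacy))
# flat == {"init_class": "T5ForConditionalGeneration", "d_model": 768}
```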
- In the new-style config file `config.json`:
{
"architectures": [
"T5ForConditionalGeneration"
],
"bos_token_id": 0,
"d_ff": 2048,
"d_kv": 64,
"d_model": 768,
"dropout_rate": 0.1,
"enable_recompute": false,
"eos_token_id": 1,
"feed_forward_proj": "gated-gelu",
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"num_decoder_layers": 12,
"num_heads": 12,
"num_layers": 12,
"pad_token_id": 0,
"paddlenlp_version": null,
"relative_attention_max_distance": 128,
"relative_attention_num_buckets": 32,
"tie_word_embeddings": false,
"use_cache": true,
"vocab_size": 32128
}
Since there is no `init_args` key here, the model parameters can also be mapped correctly.
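For the new-style config there is no `init_args` level to flatten, so the same mapping applies directly at the top level. A minimal sketch, reusing the hypothetical helper from earlier in this thread with an assumed `attribute_map`:

```python
def apply_attribute_map(attribute_map, config):
    # Same pop-then-assign logic as the diff above (illustrative only).
    for standard_field, paddle_field in attribute_map.items():
        value = config.pop(standard_field, None) or config.pop(paddle_field, None)
        if value is not None:
            config[paddle_field] = value
    return config

new_style = {"model_type": "t5", "d_model": 768, "num_layers": 12}
apply_attribute_map({"hidden_size": "d_model"}, new_style)
# "d_model" is preserved: pop("hidden_size") yields None, pop("d_model")
# yields 768, and the value is written back under "d_model".
```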
lgtm
PR types
Bug fixes
PR changes
APIs
Description
Update attribute mapping in the `configuration_utils` module. Try to fix: #4384