Aware of compression ratio for unpack size limit #11337

Merged (1 commit) on Nov 29, 2022

weihanglo (Member) commented Nov 4, 2022

What does this PR try to resolve?

Cargo is now aware of the compression ratio and picks the larger of

  • the size of the .crate file times a fixed compression ratio (20:1)
  • a hard unpack size limit (512 MiB)

to determine the unpack size limit of a compressed .crate file.
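
The rule above can be sketched roughly as follows. This is a minimal illustration, not Cargo's actual code: the constant names and the function `unpack_limit` are hypothetical, and only the 20:1 ratio and the 512 MiB value come from the description.

```rust
/// Hard unpack size limit: 512 MiB (value taken from the PR description).
const HARD_LIMIT_BYTES: u64 = 512 * 1024 * 1024;
/// Fixed compression ratio of 20:1 (value taken from the PR description).
const MAX_RATIO: u64 = 20;

/// Returns the largest unpacked size allowed for a `.crate` file whose
/// compressed size is `compressed_bytes`.
/// Hypothetical helper, written for illustration only.
fn unpack_limit(compressed_bytes: u64) -> u64 {
    // saturating_mul avoids overflow for absurdly large inputs.
    let by_ratio = compressed_bytes.saturating_mul(MAX_RATIO);
    // The larger of the two candidates wins.
    by_ratio.max(HARD_LIMIT_BYTES)
}
```

Under these assumptions, a 1 MiB .crate file gets the 512 MiB limit (the ratio would only allow 20 MiB), while a 100 MiB .crate file gets 2000 MiB, because the ratio-based value is the larger one.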

How should we test and review this PR?

Get a debug build and tweak values of __CARGO_TEST_MAX_UNPACK_SIZE and __CARGO_TEST_MAX_UNPACK_RATIO if you want to manually test it.

Additional information

I've heard of #11151 and other use cases hitting the hard 512 MiB limit after we set it. A few weeks ago I posted a topic on Zulip to discuss adding registries.<regname>.max-unpack-size to configure the limit. After some investigation, I felt we could first simply add a hard ratio, as @arlosi suggested, without piling on code complexity.

A ratio of 20:1 should fit most cases. It should also cover other algorithms with higher compression ratios, such as lzma/xz and bzip2. I've listed a couple of references in the doc comment of fn max_unpack_size(…).

Here is data from Cargo's dependencies (size in bytes):

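|          | min  | max      | median | mean      | stddev     |
|----------|------|----------|--------|-----------|------------|
| unpacked | 4096 | 54267904 | 147200 | 713959.02 | 7807881.35 |
| packed   | 774  | 5106276  | 27934  | 316858.75 | 803932.06  |
| ratio    | 2.92 | 18.40    | 4.97   | 5.79      | 2.98       |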

Raw data:

crate,uncompressed,compressed,compression ratio,
winapi v0.3.9,7254016,1200382,6.04308961647209,
winapi-x86_64-pc-windows-gnu v0.4.0,54267904,2947998,18.4083924073219,
winapi-i686-pc-windows-gnu v0.4.0,52018176,2918815,17.8216762624558,
walkdir v2.3.2,125440,23516,5.33424051709474,
winapi-util v0.1.5,46592,10164,4.58402203856749,
same-file v1.0.6,48640,10183,4.77658843169989,
url v2.3.1,516096,72777,7.09147120656251,
percent-encoding v2.2.0,34304,10075,3.4048635235732,
idna v0.3.0,2286592,271128,8.43362544628367,
unicode-normalization v0.1.22,734720,122604,5.99262666797168,
tinyvec v1.6.0,424960,45991,9.24006870909526,
tinyvec_macros v0.1.0,8704,1817,4.79031370390754,
unicode-bidi v0.3.8,173568,36575,4.74553656869446,
form_urlencoded v1.1.0,31744,8734,3.63453171513625,
unicode-xid v0.2.4,86016,15352,5.6029181865555,
unicode-width v0.1.10,107008,18968,5.64150147617039,
toml_edit v0.15.0,707072,102015,6.93105915796697,
toml_datetime v0.5.0,41472,10594,3.91466868038512,
serde v1.0.147,527872,76697,6.88256385517035,
serde_derive v1.0.147,332288,54861,6.05690745702776,
syn v1.0.103,1970688,236495,8.33289498720903,
unicode-ident v1.0.5,301056,35455,8.49121421520237,
quote v1.0.21,146944,28030,5.24238316089904,
proc-macro2 v1.0.47,205824,41955,4.90582767250626,
kstring v2.0.0,121344,22063,5.49988668812038,
static_assertions v1.1.0,78336,18480,4.23896103896104,
itertools v0.10.5,618496,115354,5.36172131005427,
either v1.8.0,73728,15992,4.61030515257629,
indexmap v1.9.1,307200,54114,5.67690431311675,
hashbrown v0.12.3,618496,102968,6.00668168751457,
autocfg v1.1.0,56832,13272,4.28209764918626,
combine v4.6.6,755712,132428,5.70658773069139,
memchr v2.5.0,345600,65812,5.25132194736522,
bytes v1.2.1,297472,54857,5.42268078823122,
termcolor v1.1.3,92672,17242,5.37478250782972,
tempfile v3.3.0,143360,27578,5.19834650808616,
remove_dir_all v0.5.3,32256,9184,3.51219512195122,
redox_syscall v0.2.16,152576,24012,6.35415625520573,
bitflags v1.3.2,127488,23021,5.53790017809826,
libc v0.2.137,3608576,606185,5.95292856141277,
fastrand v1.8.0,49664,11369,4.36837012929897,
instant v0.1.12,31232,6128,5.09660574412533,
cfg-if v1.0.0,31232,7934,3.93647592639274,
tar v0.4.38,251392,49158,5.11395907075145,
filetime v0.2.18,83456,14622,5.7075639447408,
windows-sys v0.42.0,33236992,3006791,11.0539748190014,
windows_x86_64_msvc v0.42.0,4354560,659377,6.6040520066669,
windows_x86_64_gnullvm v0.42.0,3245568,357906,9.06821344151816,
windows_x86_64_gnu v0.42.0,10570752,692493,15.2647781277211,
windows_i686_msvc v0.42.0,4642304,717477,6.47031751540467,
windows_i686_gnu v0.42.0,10993664,728570,15.0893723321026,
windows_aarch64_msvc v0.42.0,4354560,659424,6.60358130732275,
windows_aarch64_gnullvm v0.42.0,3245568,357917,9.0679347446475,
strip-ansi-escapes v0.1.1,33792,8668,3.89847715736041,
vte v0.10.1,90112,24947,3.61213773199182,
vte_generate_state_changes v0.1.1,10240,2422,4.22791081750619,
utf8parse v0.2.0,41472,13392,3.09677419354839,
arrayvec v0.5.2,126464,27838,4.54285509016452,
snapbox v0.4.1,185856,33581,5.53455823233376,
yansi v0.5.1,77824,16525,4.70947049924357,
snapbox-macros v0.3.1,8192,1877,4.36441129461907,
similar v2.2.0,282624,50996,5.54208173190054,
normalize-line-endings v0.3.0,23552,5737,4.1052815060136,
dunce v1.0.3,30208,8035,3.75955196017424,
content_inspector v0.2.4,44032,11386,3.86720533989109,
concolor v0.0.9,42496,10225,4.1560880195599,
concolor-query v0.1.0,26624,7281,3.65664057135009,
atty v0.2.14,24576,5470,4.49287020109689,
hermit-abi v0.1.19,41472,9979,4.15592744764004,
shell-escape v0.1.5,26624,6847,3.88841828538046,
serde_json v1.0.87,760320,144383,5.26599391895168,
ryu v1.0.11,189952,47007,4.04093007424426,
itoa v1.0.4,43520,10601,4.10527308744458,
serde_ignored v0.1.5,66048,11858,5.56991060887165,
serde-value v0.7.0,56320,10249,5.49517026051322,
ordered-float v2.10.0,98304,15589,6.30598498941561,
num-traits v0.2.15,285696,49262,5.79952092891072,
semver v1.0.14,140800,29813,4.72277194512461,
rustfix v0.6.1,66048,17352,3.80636237897649,
log v0.4.17,193536,38028,5.08930261912275,
anyhow v1.0.66,249344,43770,5.69668722869545,
rustc-workspace-hack v1.0.0,4096,774,5.29198966408269,
pretty_env_logger v0.4.0,34816,8690,4.00644418872267,
env_logger v0.7.1,161792,32281,5.01198847619343,
regex v1.6.0,1012736,239329,4.23156408124381,
regex-syntax v0.6.27,1458176,297300,4.90472922973428,
aho-corasick v0.7.19,510464,113070,4.51458388608826,
humantime v1.3.0,76800,17020,4.5123384253819,
quick-error v1.2.3,74240,15066,4.92765166600292,
pathdiff v0.2.1,28672,7142,4.01456174740969,
os_info v3.5.1,147456,22593,6.52662329039968,
openssl v0.10.42,1132032,225875,5.01176314333149,
openssl-sys v0.9.77,395264,60799,6.50115955854537,
vcpkg v0.2.15,3131392,228735,13.6900430629331,
pkg-config v0.3.26,77824,18662,4.17018540349373,
openssl-src v111.24.0+1.1.1s,24458752,5106276,4.7899392825613,
cc v1.0.74,257024,59410,4.32627503787241,
jobserver v0.1.25,92160,21888,4.21052631578947,
openssl-macros v0.1.0,19968,5566,3.58749550844413,
once_cell v1.16.0,159232,32120,4.9574097135741,
foreign-types v0.3.2,28160,7504,3.75266524520256,
foreign-types-shared v0.1.1,19456,5672,3.43018335684062,
opener v0.5.0,52224,12350,4.22866396761134,
bstr v0.2.17,1814016,330350,5.49119418798244,
regex-automata v0.1.10,556032,114533,4.85477547955611,
lazy_static v1.4.0,40960,10443,3.92224456573781,
libgit2-sys v0.14.0+1.5.0,8022528,1740370,4.60966805909088,
libz-sys v1.1.8,7592448,2481844,3.05919630726186,
libssh2-sys v0.2.23,2957824,493516,5.99337002245115,
lazycell v1.3.0,54272,12502,4.34106542953127,
im-rc v15.1.0,818688,194077,4.21836693683435,
version_check v0.9.4,69632,14895,4.67485733467607,
typenum v1.15.0,229888,40741,5.64266954664834,
sized-chunks v0.6.5,223232,43628,5.11671403685706,
bitmaps v2.1.0,81920,16717,4.90040078961536,
rand_xoshiro v0.6.0,114176,17125,6.66721167883212,
rand_core v0.6.4,90624,22666,3.99823524221301,
ignore v0.4.18,250880,53174,4.71809530973784,
thread_local v1.1.4,57856,13106,4.41446665649321,
globset v0.4.9,109056,22929,4.75624754677483,
fnv v1.0.7,43008,11266,3.81750399431919,
crossbeam-utils v0.8.12,210944,41785,5.04831877467991,
humantime v2.1.0,77312,16749,4.61591736820109,
home v0.5.4,33792,8538,3.95783555867885,
hex v0.4.3,55808,13299,4.19640574479284,
glob v0.3.0,84992,18724,4.53920102542192,
git2-curl v0.16.0,32768,9289,3.52761330606093,
git2 v0.15.0,993280,198983,4.99178321766181,
openssl-probe v0.1.5,29184,7227,4.0381901203819,
curl v0.4.44,450560,91415,4.92873160859815,
socket2 v0.4.7,220672,44619,4.94569577982474,
schannel v0.1.20,202240,41579,4.86399384304577,
windows-sys v0.36.1,35942912,3347053,10.7386742904878,
windows_x86_64_msvc v0.36.1,4352000,661999,6.5740280574442,
windows_x86_64_gnu v0.36.1,13886976,790934,17.5576925508323,
windows_i686_msvc v0.36.1,4657152,724575,6.42742573232585,
windows_i686_gnu v0.36.1,14016000,818115,17.1320657853725,
windows_aarch64_msvc v0.36.1,4352000,661960,6.57441537253006,
curl-sys v0.4.59+curl-7.86.0,19558400,2996584,6.52689862857173,
libnghttp2-sys v0.1.7+1.45.0,13851136,4527090,3.05961136182404,
fwdansi v1.1.0,44544,8280,5.37971014492754,
flate2 v1.0.24,321024,70191,4.57357780912083,
miniz_oxide v0.5.4,242688,53485,4.53749649434421,
adler v1.0.2,50176,12778,3.92674910001565,
crc32fast v1.3.2,113152,38661,2.92677375132563,
env_logger v0.9.1,170496,33425,5.1008526551982,
clap v4.0.19,1238528,205060,6.0398322442212,
strsim v0.10.0,58368,11355,5.14029062087186,
clap_lex v0.3.0,38400,9671,3.97063385378968,
os_str_bytes v6.3.1,118784,22934,5.17938432022325,
miow v0.4.0,134656,27567,4.88468095911779,
windows-sys v0.28.0,23503872,3075898,7.64130410046107,
windows_x86_64_msvc v0.28.0,4401152,668950,6.57919425966066,
windows_x86_64_gnu v0.28.0,13673472,743221,18.3975856441085,
windows_i686_msvc v0.28.0,4710912,732280,6.43321133992462,
windows_i686_gnu v0.28.0,13804032,774446,17.8243957616154,
windows_aarch64_msvc v0.28.0,4400640,669636,6.57168969410247,
crypto-hash v0.3.4,40448,8102,4.9923475685016,
hex v0.3.2,34304,9053,3.78924113553518,
commoncrypto v0.2.0,14848,3009,4.93452974410103,
commoncrypto-sys v0.2.0,19456,4338,4.48501613646842,
core-foundation v0.9.3,149504,27059,5.52511179274918,
core-foundation-sys v0.8.3,110080,17519,6.28346366801758,
bytesize v1.1.0,43008,9370,4.58996798292423,
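
As a rough cross-check of the ratio column, the ratio is simply the unpacked size divided by the packed size. The sketch below recomputes it for three rows copied from the raw data and lists which crates would exceed a given ratio. The `Row` type and all function names are hypothetical, and the sample is deliberately small rather than the full table.

```rust
/// One row of the raw data: crate name, unpacked size, packed size (bytes).
struct Row {
    name: &'static str,
    unpacked: u64,
    packed: u64,
}

/// Compression ratio, defined as unpacked / packed.
fn ratio(row: &Row) -> f64 {
    row.unpacked as f64 / row.packed as f64
}

/// Median of a non-empty slice of values (sorts a copy).
fn median(values: &[f64]) -> f64 {
    let mut v = values.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).expect("no NaN expected"));
    let mid = v.len() / 2;
    if v.len() % 2 == 0 {
        (v[mid - 1] + v[mid]) / 2.0
    } else {
        v[mid]
    }
}

/// Names of the rows whose ratio exceeds `max_ratio`.
fn over_ratio(rows: &[Row], max_ratio: f64) -> Vec<&'static str> {
    rows.iter()
        .filter(|r| ratio(r) > max_ratio)
        .map(|r| r.name)
        .collect()
}

/// A small sample copied from the raw data above (not the full table).
fn sample_rows() -> Vec<Row> {
    vec![
        Row { name: "winapi-x86_64-pc-windows-gnu v0.4.0", unpacked: 54267904, packed: 2947998 },
        Row { name: "crc32fast v1.3.2", unpacked: 113152, packed: 38661 },
        Row { name: "serde v1.0.147", unpacked: 527872, packed: 76697 },
    ]
}
```

Calling `over_ratio(&sample_rows(), 20.0)` returns an empty list for this sample, which matches the observation that the largest ratio in the data (about 18.4) stays under 20:1.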

rustbot (Collaborator) commented Nov 4, 2022

r? @ehuss

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 4, 2022
fn max_unpack_size(size: u64) -> u64 {
    const SIZE_VAR: &str = "__CARGO_TEST_MAX_UNPACK_SIZE";
    const RATIO_VAR: &str = "__CARGO_TEST_MAX_UNPACK_RATIO";
    let max_unpack_size = if cfg!(debug_assertions) && std::env::var(SIZE_VAR).is_ok() {
Contributor (review comment):

Does the CFG mean that the test suite will only pass in debug mode?

Member Author (reply):

Yep, unfortunately.
If SIZE_VAR were exposed, this PR would become unnecessary 😆.
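
The debug-only gate discussed in this thread could be sketched like this. It is an illustration under assumptions, not the code in the PR: `effective_limit` and `limit_from_env` are hypothetical names, and falling back to the default when the value does not parse is a choice made here for simplicity.

```rust
/// Picks the size limit, honoring a test-only override only in debug builds.
/// `is_debug` stands in for `cfg!(debug_assertions)`, and `override_value`
/// for the contents of the environment variable, if it is set.
/// Hypothetical helper, for illustration only.
fn effective_limit(is_debug: bool, override_value: Option<&str>, default_limit: u64) -> u64 {
    if !is_debug {
        // Release builds ignore the override entirely.
        return default_limit;
    }
    // Assumption: an unparsable override silently falls back to the default.
    override_value
        .and_then(|raw| raw.parse::<u64>().ok())
        .unwrap_or(default_limit)
}

/// Reads the real build mode and environment, then defers to `effective_limit`.
fn limit_from_env(default_limit: u64) -> u64 {
    let raw = std::env::var("__CARGO_TEST_MAX_UNPACK_SIZE").ok();
    effective_limit(cfg!(debug_assertions), raw.as_deref(), default_limit)
}
```

Because the override is only read when `debug_assertions` is on, a test that relies on a tiny limit cannot take effect against a release build, which is why the test in this PR is itself restricted to debug builds.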

// Limit the test to debug builds so that `__CARGO_TEST_MAX_UNPACK_SIZE` will take effect.
#[cfg(debug_assertions)]
#[cargo_test]
fn reach_max_unpack_size() {

jonhoo (Contributor) commented Nov 4, 2022

Now that we have a ratio limit, why do we still have the MB limit? Especially given that we take the max of the two?

arlosi (Contributor) commented Nov 4, 2022

> Now that we have a ratio limit, why do we still have the MB limit? Especially given that we take the max of the two?

It's possible that some small crate contains a small amount of highly compressible data -- and that should be allowed. For example, if a 30 KB crate compresses to 1 KB, we don't want to block that.
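
The 30 KB example can be worked through with a small comparison of a ratio-only policy against the combined policy. This is a sketch with hypothetical function names, and whether the real check uses `<=` or `<` at the boundary is an assumption made here.

```rust
const HARD_LIMIT_BYTES: u64 = 512 * 1024 * 1024; // 512 MiB, from the PR description
const MAX_RATIO: u64 = 20; // 20:1, from the PR description

/// Would a ratio-only policy accept this crate?
/// Hypothetical policy, shown only for comparison.
fn allowed_ratio_only(packed: u64, unpacked: u64) -> bool {
    unpacked <= packed.saturating_mul(MAX_RATIO)
}

/// The policy described in this PR: the larger of the ratio-based limit
/// and the hard limit.
fn allowed_combined(packed: u64, unpacked: u64) -> bool {
    unpacked <= packed.saturating_mul(MAX_RATIO).max(HARD_LIMIT_BYTES)
}
```

With a packed size of 1 KB, the ratio alone would cap the unpacked size at 20 KB, so `allowed_ratio_only(1024, 30 * 1024)` is false, while `allowed_combined(1024, 30 * 1024)` is true because 30 KB is far below the 512 MiB limit.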

weihanglo (Member, Author) commented

Friendly ping @joshtriplett, as you're the original author of that CVE fix. Do you think this enhancement is safe enough in terms of security?

ehuss (Contributor) commented Nov 29, 2022

Discussed this in today's meeting. Looks good, thanks!

@bors r+

bors (Collaborator) commented Nov 29, 2022

📌 Commit de7cd31 has been approved by ehuss

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 29, 2022
bors (Collaborator) commented Nov 29, 2022

⌛ Testing commit de7cd31 with merge 0460192...

bors (Collaborator) commented Nov 29, 2022

☀️ Test successful - checks-actions
Approved by: ehuss
Pushing 0460192 to master...

@bors bors merged commit 0460192 into rust-lang:master Nov 29, 2022
@weihanglo weihanglo deleted the compression-ratio branch November 29, 2022 20:24
weihanglo added a commit to weihanglo/rust that referenced this pull request Dec 3, 2022
9 commits in e027c4b5d25af2119b1956fac42863b9b3242744..f6e737b1e3386adb89333bf06a01f68a91ac5306
2022-11-25 19:44:46 +0000 to 2022-12-02 20:21:24 +0000
- Refactor generate_targets into separate module (rust-lang/cargo#11445)
- Improve file found in multiple build targets warning (rust-lang/cargo#11299)
- Error when precise without -p flag (rust-lang/cargo#11349)
- Improve strategy for selecting targets to be scraped for examples (rust-lang/cargo#11430)
- Aware of compression ratio for unpack size limit (rust-lang/cargo#11337)
- Add test for rustdoc-map generation when using sparse registries (rust-lang/cargo#11403)
- Add error message when `cargo fix` on an empty repo (rust-lang/cargo#11400)
- Store the sparse+ prefix in the SourceId for sparse registries (rust-lang/cargo#11387)
- Update documentation for -Zrustdoc-scrape-examples in the Cargo Book (rust-lang/cargo#11425)
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 3, 2022
Update cargo

@ehuss ehuss added this to the 1.67.0 milestone Dec 14, 2022