-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Misreporting disk io (values inflated) #1305
Comments
Here is a screenshot from glances: https://imgur.com/a/El1w5St In this tests I created partitions on each nvme drive and wrote to the partitions. You can see that the partitions throughputs are much lower than what glances reports for the disks. Here is the io coutners for this instance:
|
Specifically what values are higher than expected? 5.x def disk_io_counters():
"""Return disk I/O statistics for every disk installed on the
system as a dict of raw tuples.
"""
# determine partitions we want to look for
def get_partitions():
partitions = []
with open_text("%s/partitions" % get_procfs_path()) as f:
lines = f.readlines()[2:]
for line in reversed(lines):
_, _, _, name = line.split()
if name[-1].isdigit():
# we're dealing with a partition (e.g. 'sda1'); 'sda' will
# also be around but we want to omit it
partitions.append(name)
else:
if not partitions or not partitions[-1].startswith(name):
# we're dealing with a disk entity for which no
# partitions have been defined (e.g. 'sda' but
# 'sda1' was not around), see:
# https://github.com/giampaolo/psutil/issues/338
partitions.append(name)
return partitions
retdict = {}
partitions = get_partitions()
with open_text("%s/diskstats" % get_procfs_path()) as f:
lines = f.readlines()
for line in lines:
# OK, this is a bit confusing. The format of /proc/diskstats can
# have 3 variations.
# On Linux 2.4 each line has always 15 fields, e.g.:
# "3 0 8 hda 8 8 8 8 8 8 8 8 8 8 8"
# On Linux 2.6+ each line *usually* has 14 fields, and the disk
# name is in another position, like this:
# "3 0 hda 8 8 8 8 8 8 8 8 8 8 8"
# ...unless (Linux 2.6) the line refers to a partition instead
# of a disk, in which case the line has less fields (7):
# "3 1 hda1 8 8 8 8"
# See:
# https://www.kernel.org/doc/Documentation/iostats.txt
# https://www.kernel.org/doc/Documentation/ABI/testing/procfs-diskstats
fields = line.split()
fields_len = len(fields)
if fields_len == 15:
# Linux 2.4
name = fields[3]
reads = int(fields[2])
(reads_merged, rbytes, rtime, writes, writes_merged,
wbytes, wtime, _, busy_time, _) = map(int, fields[4:14])
elif fields_len == 14:
# Linux 2.6+, line referring to a disk
name = fields[2]
(reads, reads_merged, rbytes, rtime, writes, writes_merged,
wbytes, wtime, _, busy_time, _) = map(int, fields[3:14])
elif fields_len == 7:
# Linux 2.6+, line referring to a partition
name = fields[2]
reads, rbytes, writes, wbytes = map(int, fields[3:])
rtime = wtime = reads_merged = writes_merged = busy_time = 0
else:
raise ValueError("not sure how to interpret line %r" % line)
if name in partitions:
ssize = get_sector_size(name)
rbytes *= ssize
wbytes *= ssize
retdict[name] = (reads, writes, rbytes, wbytes, rtime, wtime,
reads_merged, writes_merged, busy_time)
return retdict 3.x: def disk_io_counters():
"""Return disk I/O statistics for every disk installed on the
system as a dict of raw tuples.
"""
# man iostat states that sectors are equivalent with blocks and
# have a size of 512 bytes since 2.4 kernels. This value is
# needed to calculate the amount of disk I/O in bytes.
SECTOR_SIZE = 512
# determine partitions we want to look for
partitions = []
with open_text("%s/partitions" % get_procfs_path()) as f:
lines = f.readlines()[2:]
for line in reversed(lines):
_, _, _, name = line.split()
if name[-1].isdigit():
# we're dealing with a partition (e.g. 'sda1'); 'sda' will
# also be around but we want to omit it
partitions.append(name)
else:
if not partitions or not partitions[-1].startswith(name):
# we're dealing with a disk entity for which no
# partitions have been defined (e.g. 'sda' but
# 'sda1' was not around), see:
# https://github.com/giampaolo/psutil/issues/338
partitions.append(name)
#
retdict = {}
with open_text("%s/diskstats" % get_procfs_path()) as f:
lines = f.readlines()
for line in lines:
# http://www.mjmwired.net/kernel/Documentation/iostats.txt
fields = line.split()
if len(fields) > 7:
_, _, name, reads, _, rbytes, rtime, writes, _, wbytes, wtime = \
fields[:11]
else:
# from kernel 2.6.0 to 2.6.25
_, _, name, reads, rbytes, writes, wbytes = fields
rtime, wtime = 0, 0
if name in partitions:
rbytes = int(rbytes) * SECTOR_SIZE
wbytes = int(wbytes) * SECTOR_SIZE
reads = int(reads)
writes = int(writes)
rtime = int(rtime)
wtime = int(wtime)
retdict[name] = (reads, writes, rbytes, wbytes, rtime, wtime)
return retdict |
(sent from phone - please excuse my fat thumbs :D) Again, I didn't dig into glances to see which psutil api is being used. Also I'm not sure exactly how counters would be used to establish a throughput (would you need to make two calls and compute dx/dt?).
From the screenshot you can see that both read and write rates are off by factor of 8. By inspection of the snippets you provided, I believe nvme0n1 is incorrectly being classified as a partition:
Which means it will execute the final condition which is probably causing the inflation:
Both versions share the first issue but the effect of the final condition has structurally changed - could it also be a functional change? If so, it may explain the reason the two version have differing results. |
I can totally relate :D
It's
Correct.
That is also correct. There is a PR open for this (#1244). I am not sure why you think that could be the cause of the inflation though (it's probably my fault as I failed to understand your reasoning). |
Cause non-partitions are not expected to multiply rbytes and wbytes by ssize:
|
So |
Appears so... FYI, a disk without partitions can still have a read/write load. That is actually common for my daily work - so that 'bug' is a feature :b. |
Can you please paste the output of |
I can tomorrow when I have access to the instance. |
|
OK, |
I filed a bug on |
Btw, if a disk has partitions, I wouldn't expect any IO to the disk that wasn't through a partition. Before I was saying that I often have disks without partitions that are taking IO. (I realized I was vague before.) So I believe the current get partitions returns partitions of disks and disks without partitions, this sounds reasonable to me. One of the issues is nvme partitions aren't recognizing their parent disk (which I believe I can try to get iostat and counters for IO to a raw nvme (no partitions) on Monday if that would be useful. Currently not near a computer. |
This should now be fixed in #1313. Can you please use latest GIT version and check if it's OK? |
I'm applying the same load to both nvme0n1p1 and nvme0n2, notice that the load to the raw devices is still inflated by 8x. Also I'm not sure if you were expecting the root device for a partition to still appear in the output: taken from glances
Here is the result of calling counters
|
That is supposed to come directly from
"read_bytes" is the third column after the disk name (5699664). psutil extracts that and multiplies it by 512 (sector size).
Yes, that is expected. The difference introduced in #1313 is that when |
The sector size on these disks is 4096, that could be the source of the 8x skew. |
Interesting read: https://stackoverflow.com/a/38136179/1364745 Short answer, the sector size is hard-coded to 512B for legacy reasons. So the sector size diskstats is referring to is the 512B constant and not what the disk actually uses. |
How did you determine that? Can you paste the output of:
...and while you're at it also this again: ;-)
Interesting read indeed. I am not sure what to get from that though. It's still unclear whether it is advised to use 512 or try to determine the sector size. FWIW psutil currently tries to determine it: Line 255 in 45c0ebc
I'm not sure this is a reliable method though. Note: sudo fdisk -l is able to report sector size. My disks all have 512. How about yours?
|
I had done an fdisk earlier in this thread:
|
BTW, I do not have these instances at the moment as the project using them is current on hold. So I'm not able to run diskstats at the moment. Also, as mentioned in the initial description, the amplification issue was not present in 3.4.2 which had SECTOR_SIZE hard-coded to 512. https://github.com/giampaolo/psutil/blob/release-3.4.2/psutil/_pslinux.py#L726 |
OK, I think we finally got to the hang of it: we should always assume sector size = 512 instead of determining at runtime. |
Fixed. Thanks for helping debugging this. It was tricky. |
LGTM Noticed a typo in the next line in the history file:
"recognize" |
* origin/master: (182 commits) giampaolo#1394 / windows / process exe(): convert errno 0 into ERROR_ACCESS_DENIED; errno 0 occurs when the Python process runs in 'Virtual Secure Mode' pre-release fix win num_handles() test update readme fix giampaolo#1111: use a lock to make Process.oneshot() thread safe pdate HISTORY giampaolo#1373: different approach to oneshot() cache (pass Process instances around - which is faster) use PROCESS_QUERY_LIMITED_INFORMATION also for username() Linux: refactor _parse_stat_file() and return a dict instead of a list (+ maintainability) fix giampaolo#1357: do not expose Process' memory_maps() and io_counters() methods if not supported by the kernel giampaolo#1376 Windows: check if variable is NULL before free()ing it enforce lack of support for Win XP fix giampaolo#1370: improper usage of CloseHandle() may lead to override the original error code resulting in raising a wrong exception update HISTORY (Windows) use PROCESS_QUERY_LIMITED_INFORMATION access rights (giampaolo#1376) update HISTORY revert 5398c48; let's do it in a separate branch giampaolo#1111 make Process.oneshot() thread-safe sort HISTORY give CREDITS to @EccoTheFlintstone for giampaolo#1368 fix ionice set not working on windows x64 due to LENGTH_MISMATCH (giampaolo#1368) make flake8 happy give CREDITS to @amanusk for giampaolo#1369 / giampaolo#1352 and update doc Add CPU frequency support for FreeBSD (giampaolo#1369) giampaolo#1359: add test case for cpu_count(logical=False) against lscpu utility disable false positive mem test on travis + osx fix PEP8 style mistakes give credits to @koenkooi for giampaolo#1360 Fix giampaolo#1354 [Linux] disk_io_counters() fails on Linux kernel 4.18+ (giampaolo#1360) giampaolo#1350: give credits to @amanusk FreeBSD adding temperature sensors (WIP) (giampaolo#1350) pre release sensors_temperatures() / linux: convert defaultdict to dict fix giampaolo#1004: Process.io_counters() may raise ValueError fix giampaolo#1307: [Linux] disk_partitions() does not honour PROCFS_PATH refactor hasattr() checks as global constants giampaolo#1197 / linux / cpu_freq(): parse /proc/cpuinfo in case /sys/devices/system/cpu fs is not available fix giampaolo#1277 / osx / virtual_memory: 'available' and 'used' memory were not calculated properly travis / osx: set py 3.6 travis: disable pypy; se py 3.7 on osx skip test on PYPY + Travis fix travis fix giampaolo#715: do not print exception on import time in case cpu_times() fails. fix different travis failures give CREDITS for giampaolo#1320 to @truthbk [aix] improve compilation on AIX, better support for gcc/g++ + fix cpu metrics (giampaolo#1320) give credits to @alxchk for giampaolo#1346 (sunOS) Fix giampaolo#1346 (giampaolo#1347) giampaolo#1284, giampaolo#1345 - give credits to @amanusk Add parsing for /sys/class/thermal (giampaolo#1345) Fix decoding error in tests catch UnicodeEncodeError on print() use memory tolerance in occasionally failing test Fix random 0xC0000001 errors when querying for Connections (giampaolo#1335) Correct capitalization of PyPI (giampaolo#1337) giampaolo#1341: move open() utilities/wrappers in _common.py Refactored ps() function in test_posix (giampaolo#1341) fix giampaolo#1343: document Process.as_dict() attrs values giampaolo#1332 - update HISTORY make psutil_debug() aware of PSUTIL_DEBUG (giampaolo#1332) also include PYPY (or try to :P) travis: add python 3.7 build add download badge remove failing test assertions remove failing test make test more robust pre release pre release set version to 5.4.7 OSX / SMC / sensors: revert giampaolo#1284 (giampaolo#1325) setup.py: add py 3.7 fix giampaolo#1323: [Linux] sensors_temperatures() may fail with ValueError fix failing linux tests giampaolo#1321 add unit tests giampaolo#1321: refactoring make disk_io_counters more robust (giampaolo#1324) fix typo Fix DeprecationWarning: invalid escape sequence (giampaolo#1318) remove old test update is_storage_device() docstring fix giampaolo#1305 / disk_io_counters() / Linux: assume SECTOR_SIZE is a fixed 512 giampaolo#1313 remove test which no longer makes sense disk_io_counters() - linux: mimic iostat behavior (giampaolo#1313) fix wrong reference link in doc disambiguate TESTFN for parallel testing fix giampaolo#1309: add STATUS_PARKED constant and fix STATUS_IDLE (both on linux) give CREDITS to @sylvainduchesne for giampaolo#1294 retain GIL when querying connections table (giampaolo#1306) Update index.rst (giampaolo#1308) fix giampaolo#1279: catch and skip ENODEV in net_if_stat() appveyor: retire 3.5, add 3.7 revert file renaming of macos files; get them back to 'osx' prefix winmake: add upload-wheels cmd Rename OSX to macOS (giampaolo#1298) apveyor: reset py 3.4 and remove 3.7 (not available yet) try to fix occasional children() failure on Win: https://ci.appveyor.com/project/giampaolo/psutil/build/job/je3qyldbb86ff66h appveyor: remove py 3.4 and add 3.7 giampaolo#1284: give credits to @amanusk + some minor adjustments little refactoring Osx temps (giampaolo#1284) ...
Note: This could be an issue with glances.
Using glances to observe a system (ubuntu18.04 under test, I noticed that the disk throughput on 8 NVME devices were 8x larger than expected when compared with values from iostat. We then noticed that another instance running ubuntu16.04 was reporting the correct values. Glances versions differed on the two instances (2.11.1 vs 2.3) as well as psutil version (5.4.6 vs 3.4.2). Downgrading glances didn't fix the problem, but downgrading psutil did.
FYI, this only affected NVME disks, the regular drives reported the expected values. Also these are GCE instances.
I'm not sure what API glances is using, the following seems like a reasonable guess:
With psutil 3.4.2:
with psutil 5.4.6:
The text was updated successfully, but these errors were encountered: