fix(du): use getdents64 on Linux to avoid EOVERFLOW on 32-bit architectures #11902
mattsu2020 wants to merge 6 commits into uutils:main
…ctures
On 32-bit Linux (i686), du fails with "Value too large for defined data type" (EOVERFLOW) when reading directories. The root cause is nix::dir::Dir calling libc::readdir(), which uses a 32-bit d_ino on 32-bit glibc. Modern filesystems can return inode numbers exceeding 32 bits.
Replace nix::dir::Dir with rustix::fs::RawDir on Linux, which calls the getdents64 syscall directly with 64-bit d_ino/d_off. This fixes du and all other utilities using DirFd::read_dir() (rm, chmod, chown, install).
Closes uutils#11848
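The failure mode described in this commit message can be illustrated with a small std-only sketch. It is not glibc's code: `narrow_ino` is a hypothetical helper, and the value 75 for `EOVERFLOW` is hard-coded on the assumption of a Linux target. It only shows why a 64-bit inode number cannot always be stored in a 32-bit `d_ino` field.

```rust
use std::io;

/// Linux value of EOVERFLOW, hard-coded so the sketch needs no extra crates
/// (assumption: Linux target).
const EOVERFLOW: i32 = 75;

/// Hypothetical helper: narrow a 64-bit inode number into a 32-bit `d_ino`
/// field, reporting EOVERFLOW when the value does not fit.
fn narrow_ino(ino64: u64) -> io::Result<u32> {
    u32::try_from(ino64).map_err(|_| io::Error::from_raw_os_error(EOVERFLOW))
}
```

Calling `narrow_ino` with a value that fits in 32 bits returns it unchanged, while any value above `u32::MAX` yields an error whose `raw_os_error()` is `EOVERFLOW`. Reading entries with 64-bit fields throughout avoids the narrowing step entirely.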
std::io::Error implements From<rustix::io::Errno>, so use io::Error::from instead of the invalid `as i32` cast.
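The conversion pattern this commit relies on can be sketched without pulling in rustix. In the block below, `Errno` is a hypothetical stand-in newtype, not the real `rustix::io::Errno`, and `read_entry` is an invented call site. The point is only that a struct cannot be cast with `as i32`, whereas a `From` impl lets `io::Error::from` (or `?`) do the conversion.

```rust
use std::io;

/// Hypothetical stand-in for an errno newtype such as rustix::io::Errno.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Errno(i32);

impl Errno {
    /// Linux value of EOVERFLOW (assumption: Linux target).
    const OVERFLOW: Errno = Errno(75);
}

/// The impl lives on the target type: an io::Error is built from an Errno.
impl From<Errno> for io::Error {
    fn from(e: Errno) -> Self {
        io::Error::from_raw_os_error(e.0)
    }
}

/// Hypothetical call site: `?` applies the From impl, so no cast is needed.
fn read_entry(fail: bool) -> io::Result<u64> {
    let raw: Result<u64, Errno> = if fail { Err(Errno::OVERFLOW) } else { Ok(42) };
    Ok(raw?)
}
```

With this shape, writing `Errno::OVERFLOW as i32` would be rejected by the compiler as a non-primitive cast, while `io::Error::from(Errno::OVERFLOW)` compiles and preserves the raw OS error code.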
Merging this PR will improve performance by 40.59%
What happened?
Would you add a comment explaining why we cannot use rustix on other Unix systems?
@oech3 The benchmark contains 46897 system calls in HEAD but 70335 system calls in BASE. Since system calls are not instrumented in CodSpeed, be careful interpreting this result.
Perf is close to what we get with mimalloc. Does mimalloc also increase the number of syscalls, for the same reason as this?
70334 system calls with mimalloc (#11866), so no change.
@oech Could you run benchmarks with hyperfine?
// Helper function to read directory entries.
// On Linux, use rustix::fs::RawDir which calls getdents64 directly,
// avoiding EOVERFLOW on 32-bit architectures where libc readdir() uses
// 32-bit d_ino (Issue #11848).
In general please use full URL instead of (Issue #11848). But since this PR will close that issue I think you could remove the reference.
Co-authored-by: xtqqczze <45661989+xtqqczze@users.noreply.github.com>
Remove issue reference from comment explaining use of rustix::fs::RawDir on Linux.
Unable to generate the performance report. There was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists.
Summary
- Fix `du` producing "Value too large for defined data type" (EOVERFLOW) on 32-bit Linux (i686) when reading directories
- Replace `nix::dir::Dir` (which calls `libc::readdir()` with 32-bit `d_ino`) with `rustix::fs::RawDir` (which calls the `getdents64` syscall with 64-bit `d_ino`/`d_off`) on Linux/Android
- Also fixes the other utilities using `DirFd::read_dir()`: `rm`, `chmod`, `chown`, `install`

Root Cause
On 32-bit glibc, `libc::readdir()` uses `struct dirent` with 32-bit `d_ino`. Modern filesystems (XFS, Btrfs, ext4+inode64) can return inode numbers exceeding 2^32, causing `EOVERFLOW`.

Fix
Use `rustix::fs::RawDir` on Linux, which calls the `getdents64` syscall directly and always uses 64-bit directory entry fields. `rustix` with the `fs` feature is already an unconditional dependency of `uucore`, so no new dependencies are needed.

On non-Linux Unix (macOS, BSDs), the existing `nix::dir::Dir` implementation is retained.

Test
- `cargo check -p uucore` passes
- `cargo check -p uu_du` passes
- `cargo test -p uu_du`: all tests pass
- `cargo test -p uu_rm`: all tests pass
- `cargo test -p uu_chmod`: all tests pass
- `cargo test -p coreutils -- du`: all 112 du integration tests pass

Closes #11848
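To make the 64-bit directory entry fields concrete, here is a minimal std-only sketch that walks a buffer laid out like the kernel's `linux_dirent64` records (8-byte `d_ino`, 8-byte `d_off`, 2-byte `d_reclen`, 1-byte `d_type`, then a NUL-terminated name). It is not the code from this PR and does not use rustix. `Entry`, `parse_dirents`, and `make_record` are hypothetical names, and the buffer is built by hand rather than filled by a real `getdents64` call.

```rust
/// One parsed directory entry (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
struct Entry {
    ino: u64,
    file_type: u8,
    name: Vec<u8>,
}

/// Fixed header size: d_ino (8) + d_off (8) + d_reclen (2) + d_type (1).
const HEADER_LEN: usize = 19;

/// Parse a buffer of linux_dirent64-style records.
/// Returns None if a record is truncated or has an invalid length.
fn parse_dirents(buf: &[u8]) -> Option<Vec<Entry>> {
    let mut entries = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        let rec = &buf[pos..];
        if rec.len() < HEADER_LEN {
            return None;
        }
        // d_ino is always 64 bits wide, so large inode numbers fit.
        let ino = u64::from_ne_bytes(rec[0..8].try_into().ok()?);
        // rec[8..16] holds d_off, which this sketch does not need.
        let reclen = u16::from_ne_bytes(rec[16..18].try_into().ok()?) as usize;
        let file_type = rec[18];
        if reclen < HEADER_LEN || reclen > rec.len() {
            return None;
        }
        // The name is NUL-terminated and may be followed by padding.
        let name_field = &rec[HEADER_LEN..reclen];
        let end = name_field.iter().position(|&b| b == 0).unwrap_or(name_field.len());
        entries.push(Entry { ino, file_type, name: name_field[..end].to_vec() });
        pos += reclen;
    }
    Some(entries)
}

/// Build one record for demonstration, padded to a multiple of 8 bytes.
fn make_record(ino: u64, file_type: u8, name: &[u8]) -> Vec<u8> {
    let reclen = (HEADER_LEN + name.len() + 1 + 7) / 8 * 8;
    let mut rec = Vec::with_capacity(reclen);
    rec.extend_from_slice(&ino.to_ne_bytes());
    rec.extend_from_slice(&0i64.to_ne_bytes());
    rec.extend_from_slice(&(reclen as u16).to_ne_bytes());
    rec.push(file_type);
    rec.extend_from_slice(name);
    rec.resize(reclen, 0);
    rec
}
```

Feeding `parse_dirents` a record built with an inode number above `u32::MAX` returns that number intact, which is the property the fix depends on. A real reader would obtain the buffer from the syscall (for example through `rustix::fs::RawDir`) instead of `make_record`.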