Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 13 additions & 5 deletions src/uu/split/src/filenames.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,12 @@ use uucore::display::Quotable;
use uucore::error::{UResult, USimpleError};
use uucore::translate;

// This is an allocation guard, not a semantic suffix limit. 4096 is around the
// common full-path limit and is already far above typical filename-component
// limits, so realistic suffix lengths stay accepted while pathological values
// cannot request attacker-sized filename buffers.
const MAX_SUFFIX_LENGTH: usize = 4096;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 4096 ?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I picked 4096 as a conservative allocation guard, not as a semantic suffix limit.

The failure mode here is that split -a <huge> flows into filename construction and can request an absurd allocation before the filename could ever be useful. 4096 is around the common full-path limit on Linux and is already far above typical filename-component limits such as 255 bytes. Since the suffix is appended to the output prefix to form one filename component, values in the thousands are already beyond realistic portable usage, while still keeping the guard loose enough that we do not reject any plausible manual -a value.

So the tradeoff was: make the cap high enough to avoid changing normal behavior, but low enough that pathological inputs cannot allocate attacker-sized strings. If you would rather make the bound stricter and tie it to a component-oriented limit instead, I can switch this to a NAME_MAX-style value such as 255, or rename/comment it more explicitly as an allocation guard.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - my point was more: it should be documented in the code :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I updated the code comment so it explains that 4096 is an allocation guard rather than a semantic suffix limit, and documents the path-limit / filename-component tradeoff directly next to the constant.

That should make the choice reviewable from the code without needing the PR discussion context.


/// The format to use for suffixes in the filename for each output chunk.
#[derive(Clone, Copy)]
pub enum SuffixType {
Expand Down Expand Up @@ -175,11 +181,13 @@ impl Suffix {
let (mut length, is_length_cmd_opt) =
if let Some(v) = matches.get_one::<String>(OPT_SUFFIX_LENGTH) {
// suffix length was specified in command line
(
v.parse::<usize>()
.map_err(|_| SuffixError::NotParsable(v.to_owned()))?,
true,
)
let parsed_length = v
.parse::<usize>()
.map_err(|_| SuffixError::NotParsable(v.to_owned()))?;
if parsed_length > MAX_SUFFIX_LENGTH {
return Err(SuffixError::NotParsable(v.to_owned()));
}
(parsed_length, true)
} else {
// no suffix length option was specified in command line
// set to default value
Expand Down
104 changes: 67 additions & 37 deletions src/uu/split/src/number.rs
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ impl Number {
#[allow(dead_code)]
fn digits(&self) -> Vec<u8> {
match self {
Self::FixedWidth(number) => number.digits.clone(),
Self::FixedWidth(number) => number.digits(),
Self::DynamicWidth(number) => number.digits(),
}
}
Expand Down Expand Up @@ -188,69 +188,94 @@ impl Display for Number {
#[derive(Clone)]
pub struct FixedWidthNumber {
radix: u8,
digits: Vec<u8>,
width: usize,
current: usize,
}

impl FixedWidthNumber {
/// Instantiate a number of the given radix and width.
pub fn new(radix: u8, width: usize, mut suffix_start: usize) -> Result<Self, Overflow> {
let mut digits = vec![0_u8; width];

for i in (0..digits.len()).rev() {
let remainder = (suffix_start % (radix as usize)) as u8;
suffix_start /= radix as usize;
digits[i] = remainder;
if suffix_start == 0 {
break;
}
}
if suffix_start == 0 {
Ok(Self { radix, digits })
pub fn new(radix: u8, width: usize, suffix_start: usize) -> Result<Self, Overflow> {
if fits_in_width(radix, width, suffix_start) {
Ok(Self {
radix,
width,
current: suffix_start,
})
} else {
Err(Overflow)
}
}

fn digits(&self) -> Vec<u8> {
fixed_width_digits(self.radix, self.width, self.current)
}

/// Increment this number.
///
/// This method adds one to this number. If incrementing this
/// number would require more digits than are available with the
/// specified width, then this method returns `Err(Overflow)`.
fn increment(&mut self) -> Result<(), Overflow> {
for i in (0..self.digits.len()).rev() {
// Increment the current digit.
self.digits[i] += 1;

// If the digit overflows, then set it to 0 and continue
// to the next iteration to increment the next most
// significant digit. Otherwise, terminate the loop, since
// there will be no further changes to any higher order
// digits.
if self.digits[i] == self.radix {
self.digits[i] = 0;
} else {
break;
}
}

// Return an error on overflow, which is signified by all zeros.
if self.digits == vec![0; self.digits.len()] {
Err(Overflow)
} else {
let next = self.current.checked_add(1).ok_or(Overflow)?;
if fits_in_width(self.radix, self.width, next) {
self.current = next;
Ok(())
} else {
Err(Overflow)
}
}
}

impl Display for FixedWidthNumber {
fn fmt(&self, f: &mut Formatter) -> fmt::Result {
for d in &self.digits {
f.write_char(map_digit(self.radix, *d))?;
let radix = self.radix as usize;
let mut remaining = self.current;
let mut significant_digits = Vec::new();

while remaining > 0 {
significant_digits.push((remaining % radix) as u8);
remaining /= radix;
}

for _ in significant_digits.len()..self.width {
f.write_char(map_digit(self.radix, 0))?;
}

for d in significant_digits.into_iter().rev() {
f.write_char(map_digit(self.radix, d))?;
}
Ok(())
}
}

fn fits_in_width(radix: u8, width: usize, value: usize) -> bool {
let radix = radix as usize;
let mut remaining = value;

for _ in 0..width {
remaining /= radix;
if remaining == 0 {
return true;
}
}

remaining == 0
}

fn fixed_width_digits(radix: u8, width: usize, value: usize) -> Vec<u8> {
let radix = radix as usize;
let mut remaining = value;
let mut digits = Vec::with_capacity(width);

for _ in 0..width {
digits.push((remaining % radix) as u8);
remaining /= radix;
}

digits.reverse();
digits
}

/// A positional notation representation of a number of dynamically growing width.
///
/// The digits are represented as a [`Vec<u8>`] with the most
Expand Down Expand Up @@ -571,6 +596,11 @@ mod tests {
assert_eq!(format!("{}", num(0xf).unwrap()), "23");
}

#[test]
fn test_fixed_width_number_large_width_does_not_allocate() {
assert!(FixedWidthNumber::new(26, usize::MAX, 0).is_ok());
}

#[test]
fn test_dynamic_width_number_start_suffix() {
fn num(n: usize) -> Result<Number, Overflow> {
Expand Down
8 changes: 8 additions & 0 deletions tests/by-util/test_split.rs
Original file line number Diff line number Diff line change
Expand Up @@ -914,6 +914,14 @@ fn test_suffix_length_req() {
.stderr_only("split: the suffix length needs to be at least 2\n");
}

#[test]
fn test_large_suffix_length_is_rejected() {
new_ucmd!()
.args(&["-a", "66542562175252"])
.fails_with_code(1)
.stderr_only("split: invalid suffix length: '66542562175252'\n");
}

#[test]
fn test_verbose() {
new_ucmd!()
Expand Down
Loading