Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EncoderWriter::write_all leads to WriteZero error #148

Closed
paolobarbolini opened this issue Nov 6, 2020 · 8 comments
Closed

EncoderWriter::write_all leads to WriteZero error #148

paolobarbolini opened this issue Nov 6, 2020 · 8 comments

Comments

@paolobarbolini
Copy link

paolobarbolini commented Nov 6, 2020

Using base64::write::EncoderWriter with a writer that accepts very limited writes leads to EncoderWriter::write_all calls failing.

To reproduce (playground link)

extern crate base64; // 0.13.0

use std::io::{self, Write};

/// A Writer that limits how much can be written in a single call
struct ShortWriter<'a, W> {
    writer: &'a mut W,
    max_write_len: usize,
}

impl<'a, W> Write for ShortWriter<'a, W>
where
    W: Write,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let write_len = std::cmp::min(self.max_write_len, buf.len());
        let written = self.writer.write(&buf[..write_len])?;
        // do some other stuff...
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.writer.flush()
    }
}

fn main() {
    let contents = vec![b'A'; 100];

    let mut vec = Vec::with_capacity(8196);
    {
        let writer_ = ShortWriter {
            writer: &mut vec,
            max_write_len: 76,
        };
        let mut writer = base64::write::EncoderWriter::new(writer_, base64::STANDARD);
        writer.write_all(&contents).unwrap();
    }
    
    assert_eq!(vec, base64::encode(contents).into_bytes());
}

To reproduce 2 (expanded the default Write::write_all implementation) (playground link)

Click to expand!
extern crate base64; // 0.13.0

use std::io::{self, Write};

/// A Writer that limits how much can be written in a single call
struct ShortWriter<'a, W> {
    writer: &'a mut W,
    max_write_len: usize,
}

impl<'a, W> Write for ShortWriter<'a, W>
where
    W: Write,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let write_len = std::cmp::min(self.max_write_len, buf.len());
        let written = self.writer.write(&buf[..write_len])?;
        // do some other stuff...
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.writer.flush()
    }
}

fn main() {
    let contents = vec![b'A'; 100];

    let mut vec = Vec::with_capacity(8196);
    {
        let writer_ = ShortWriter {
            writer: &mut vec,
            max_write_len: 76,
        };
        let mut writer = base64::write::EncoderWriter::new(writer_, base64::STANDARD);
        // writer.write_all(&contents).unwrap();
        
        let mut buf: &[u8] = contents.as_ref();
        while !buf.is_empty() {
            println!("writing");
            match writer.write(buf) {
                Ok(0) => {
                    // this should never happen
                    println!("ignoring zero write");
                }
                Ok(n) => {
                    println!("written len: {}", n);
                    buf = &buf[n..];
                },
                Err(ref e) if e.kind() == std::io::ErrorKind::Interrupted => {}
                Err(e) => panic!("err: {}", e),
            }
        }
    }
    
    assert_eq!(vec, base64::encode(contents).into_bytes());
}

Expected behavior

Encoding succeeds, and the contents of vec are the same I would get if I were to call base64::encode(contents).

Observed behavior

writer.write_all(&contents) fails with Custom { kind: WriteZero, error: "failed to write whole buffer" }

Environment

rust: stable
base64: 0.13.0

@marshallpierce
Copy link
Owner

At first glance this appears to be the issue I reported in rust a while back: rust-lang/rust#56889.

tl;dr Write::write_all erroneously (IMO) interprets an Ok(0) as "no more writes are ever possible". If you do your own loop akin to write_all without that particular issue, does this still happen? write_all is small, so your own implementation is not out of the question:

    fn write_all(&mut self, mut buf: &[u8]) -> Result<()> {
        while !buf.is_empty() {
            match self.write(buf) {
                Ok(0) => {
                    return Err(Error::new(ErrorKind::WriteZero, "failed to write whole buffer"));
                }
                Ok(n) => buf = &buf[n..],
                Err(ref e) if e.kind() == ErrorKind::Interrupted => {}
                Err(e) => return Err(e),
            }
        }
        Ok(())
    }

@marshallpierce
Copy link
Owner

Feel free to re-open if something new comes to light, or if that rust bug is ever fixed 😢

@ajtribick
Copy link
Contributor

I ended up spending quite a bit of time trying to figure out what's going on when I started randomly getting hit by this. It would be useful to add that EncoderWriter breaks the current implementation of write_all in the documentation, and the suggested mitigation for this.

@marshallpierce
Copy link
Owner

It's documented at https://github.com/marshallpierce/rust-base64/blob/master/src/write/encoder.rs#L243, but I'm open to PRs to explain it differently, and I agree that it would be nice to offer a suggested mitigation.

@ajtribick
Copy link
Contributor

Unfortunately docs.rs doesn't make the documentation on implemented traits particularly obvious, so will see about adding this to the main documentation of the struct as this is something that would be pretty important for users to know about. In my case I worked around it by adding a wrapper with write_all detecting multiple consecutive returns of Ok(0), but not sure how reliable that is as a way to distinguish the buffer being flushed vs EOF.

@unixzii
Copy link

unixzii commented Dec 20, 2023

Hi @marshallpierce, I'm facing the same problem and came across this issue. As rust-lang/rust#107200 mentioned, write method will not have to represent only one attempt to write to any wrapped object. So do we have any consideration to mitigate the impact of returning Ok(0)?

IMHO, as an adapter, write should only return Ok(0) when the underlying object also returned Ok(0) (since it typically means the object is no longer able to accept bytes). I also agree that write should not behave like write_all, at least from the perspective of semantics. Based on this, can we flush the buffer without returning Ok(0) directly?

For example, change the code below:

if self.output_occupied_len > 0 {
    let current_len = self.output_occupied_len;
    return self
        .write_to_delegate(current_len)
        // did not read any input
        .map(|_| 0);
}

to this:

while !self.has_spare_capacity_to_encode_more() {
    let current_len = self.output_occupied_len;
    let res = self.write_to_delegate(current_len)?;
    if res == 0 {
        return Ok(0);
    }
}

We even don't need to flush all the contents in output buffer, since we don't want it to behave like write_all. This is done only to consume at least one byte from the input buffer, so that it can work well with write_all.

@marshallpierce
Copy link
Owner

I wonder if such a change would require an MSRV update? Or maybe most users don't care. Technically the https://github.com/rust-lang/rust/pull/107200/files / https://github.com/rust-lang/rust/pull/114897/files combo only applies to 1.73+.

@unixzii
Copy link

unixzii commented Mar 9, 2024

I don't know. But maybe some crates will rely on the behaviors described in the doc to work correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants