Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Malformed data being sent on Windows #60

Closed
martinzak-zaber opened this issue Apr 2, 2019 · 5 comments · Fixed by #147
Closed

Malformed data being sent on Windows #60

martinzak-zaber opened this issue Apr 2, 2019 · 5 comments · Fixed by #147

Comments

@martinzak-zaber
Copy link
Contributor

martinzak-zaber commented Apr 2, 2019

Hello,

We have been using this library for some time and we have encountered an issue which is very difficult to reproduce and debug. The problem is caused by the Read function in the windows implementation:

func (port *windowsPort) Read(p []byte) (int, error) {
	var readed uint32
	params := &dcb{}
	ev, err := createOverlappedEvent()
	if err != nil {
		return 0, err
	}
	defer syscall.CloseHandle(ev.HEvent)
	for {
		err := syscall.ReadFile(port.handle, p, &readed, ev)
		switch err {
		case nil:
			// operation completed successfully
		case syscall.ERROR_IO_PENDING:
			// wait for overlapped I/O to complete
			if err := getOverlappedResult(port.handle, ev, &readed, true); err != nil {
				return int(readed), err
			}
		default:
			// error happened
			return int(readed), err
		}

		if readed > 0 {
			return int(readed), nil
		}
		if err := resetEvent(ev.HEvent); err != nil {
			return 0, err
		}

		// At the moment it seems that the only reliable way to check if
		// a serial port is alive in Windows is to check if the SetCommState
		// function fails.

		getCommState(port.handle, params)
		if err := setCommState(port.handle, params); err != nil {
			port.Close()
			return 0, err
		}
	}
}

If there is no traffic on the port this function calls setCommState every 1 second. If there is also an ongoing Write operation at the same time, the transmitted data gets malformed and the other side receives gibberish. The race condition requires very precise timing to be reproduced. We are very confident that the issue is not caused by our code.

As this library is used by your customers I thought this may be quite critical information for you. We have spent around two days hunting this down. Feel free to contact me for more details.

Thank you for implementing this awesome library.

@cmaglie
Copy link
Member

cmaglie commented May 8, 2020

Hi @martinzak-zaber, thanks for the bug report, and sorry for the long delay.

May you provide an example program to help to reproduce the issue? Writing in a tight loop may and reading from another goroutine may be sufficient to trigger the issue?

I guess you already tried to remove the getCommState/setCommState piece and this did solve the issue for you?

@martinzak-zaber
Copy link
Contributor Author

Hello @cmaglie,

I, unfortunately, don't have the testing program anymore. It was also a part of a bigger project. But I do recall that we were running an intense stress test. Basically writing and reading from the port at the same time many times. It was slightly unusual because our protocol is request/response so most of the time there is no overlap between writing and reading. We had to be bashing it with multiple requests at the same time.

In our fork, We have removed the get/set commState entirely. I don't recall what it does to detecting whether the port is alive. But one use case we test for is ripping out the FTDI cable and for that, we get an error. So it works for us without the call.

Feel free to check out our merge requests: https://github.com/martinzak-zaber/go-serial/pulls These are all the fixes and changes we have made for our fork.

Thank you again for creating the library!

palazzol added a commit to palazzol/go-serial that referenced this issue Nov 14, 2022
Remove unnecessary code, originally for checking for disconnects when using non-overlapped I/O.  getOverlappedEvent() returns all required error codes.  Calling getCommState()/setCommState() every second in some applications contributes to other downstream issues.  For examples:
bugst#60 (comment)
arduino/arduino-ide#375
cmaglie pushed a commit to cmaglie/go-serial that referenced this issue Nov 24, 2022
The old code is no more needed when using overlapped I/O,
getOverlappedEvent() returns all required error codes.

Calling getCommState()/setCommState() every second in some
applications contributes to other downstream issues. For examples:

bugst#60 (comment)
arduino/arduino-ide#375
@palazzol
Copy link
Contributor

Hi @cmaglie,

FWIW - the link cited in this comment seems to be gone, but the fixes described seem to be here - if you are looking for them:
https://github.com/zabertech/go-serial

Hello @cmaglie,

I, unfortunately, don't have the testing program anymore. It was also a part of a bigger project. But I do recall that we were running an intense stress test. Basically writing and reading from the port at the same time many times. It was slightly unusual because our protocol is request/response so most of the time there is no overlap between writing and reading. We had to be bashing it with multiple requests at the same time.

In our fork, We have removed the get/set commState entirely. I don't recall what it does to detecting whether the port is alive. But one use case we test for is ripping out the FTDI cable and for that, we get an error. So it works for us without the call.

Feel free to check out our merge requests: https://github.com/martinzak-zaber/go-serial/pulls These are all the fixes and changes we have made for our fork.

Thank you again for creating the library!

@martinzak-zaber
Copy link
Contributor Author

I am sorry, we have moved the library under company GitHub.
https://github.com/zabertech/go-serial
We have made dozen or so fixes to the library over the years (see commits). This library is an important dependency of our software.

Thank you again to the authors of the library.

@cmaglie
Copy link
Member

cmaglie commented Nov 29, 2022

Thank you everyone for the feedback and the fix, I've finally managed to reproduce the bug and confirm the solution. 👍🏼

@bugst bugst locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants