Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle long lines #59

Merged
merged 20 commits into from
Dec 14, 2020
Merged

Handle long lines #59

merged 20 commits into from
Dec 14, 2020

Conversation

Mortal
Copy link
Contributor

@Mortal Mortal commented Mar 29, 2017

I've found two test cases that cause the current aiosmtpd to fail with an unhandled exception. The issue is that StreamReader.readline raises a ValueError when its buffer overflows. The issue can arise both when reading a command line and reading a text line in DATA.

This PR sets a lower limit on StreamReader (1000 bytes per RFC 5321 4.5.3.1.6. + 1 byte for a leading transparent dot) to make the buffer overflow more frequent, and then it adds more complicated handling of StreamReader.readuntil in command line reading and DATA line reading. I couldn't get it working with StreamReader.readline, but since its implementation simply calls readuntil and masks some exceptions, it seems like we should use readuntil anyway.

I'm not sure exactly what to do in case of an early EOF -- should the code simply send some low-level SMTP error back to the client and return? Right now my implementation is (too) lenient about EOFs (materializing as asyncio.IncompleteReadError exceptions).

@Mortal Mortal force-pushed the handle_long_lines branch 3 times, most recently from efd0709 to 65f1429 Compare April 4, 2017 02:12
@Mortal
Copy link
Contributor Author

Mortal commented Apr 4, 2017

After digging into the circumstances surrounding early EOF or connection reset from client I have found that cancelling the _handle_client coroutine on connection reset is the sanest thing to do, and then simply catching CancelledError at the appropriate points in _handle_client.

Currently the Python 3.4 build fails with stalling on Travis in controller.rst. I don't know why that is.

I've managed to add tests that keep coverage at 100% which was not an easy task. However, most of the tasks merely consist of opening an SMTP session and closing/resetting at odd times without asserting that the output of aiosmtpd is anything useful -- it only ensures that no exception is raised.

@Mortal Mortal force-pushed the handle_long_lines branch from 65f1429 to f7d0cf0 Compare April 4, 2017 03:22
@Mortal
Copy link
Contributor Author

Mortal commented Apr 19, 2017

This PR conflicts with #70 as they get rid of self._connection_closed in two different ways.

@kozzztik
Copy link
Contributor

kozzztik commented Apr 20, 2017

I dont see conflict here. Your PR completely remove self._connection_closed usage, but #70 replace it with self.transport. You can still not use it, even it is replaced )

warsaw added a commit that referenced this pull request Apr 26, 2017
Lift some additional (non-line-length) tests from #59.
warsaw added a commit that referenced this pull request Apr 27, 2017
* Test cases for connection problems during readline().

Also, add some much improved debugging options for when the test suite is run
with the -E option.

Update the documentation.

* Some lines won't be covered in Python 3.4

* Refactor test class.

Lift some additional (non-line-length) tests from #59.

* Remove redundant test cases.

Closes #62
@warsaw
Copy link
Contributor

warsaw commented May 8, 2017

In the interest of getting 1.0 before Pycon, and since this branch has conflicts, I'm going to defer it past 1.0

@warsaw warsaw added this to the post-1.0 milestone May 8, 2017
@Mortal Mortal force-pushed the handle_long_lines branch from f7d0cf0 to de3ad56 Compare June 24, 2017 19:19
@Mortal
Copy link
Contributor Author

Mortal commented Jun 24, 2017

Rebased on top of #115.

@waynew
Copy link
Collaborator

waynew commented Jun 16, 2018

@Mortal do you have any interest in continuing this PR? There's apparently a conflict that needs to be resolved.

@Mortal Mortal force-pushed the handle_long_lines branch from 9ae0353 to 702b2dd Compare June 17, 2018 10:10
@Mortal
Copy link
Contributor Author

Mortal commented Jun 17, 2018

@waynew I have rebased onto master and squashed all commits into a single commit with a proper commit message.

Copy link
Collaborator

@waynew waynew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd like to see this setting made configurable, It would also be great to have some documentation that cites https://www.ietf.org/rfc/rfc2821.txt

@Mortal
Copy link
Contributor Author

Mortal commented Jun 24, 2018

I'll try to have a look at making the limit configurable on Tuesday.

@waynew
Copy link
Collaborator

waynew commented Jun 1, 2019

@Mortal any update on this PR, too?

@Mortal
Copy link
Contributor Author

Mortal commented Jun 2, 2019

As with the other PR, [help-wanted] is probably the best way forward.

@pepoluan
Copy link
Collaborator

I'm still thinking about the necessity of limiting StreamReader to N bytes (1001 in the PR).

Can we be sure that the underlying TCP stack will not buffer things until 1 page (4096 octets) and deliver things in bulk?

Yes, I know of the PSH flag, but IIRC it was never a requirement ("MUST"), just a recommendation ("SHOULD") for TCP stack implementation to actually deliver the data to the app when PSH flag is received.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Mortal
Copy link
Contributor Author

Mortal commented Nov 17, 2020

Hi @pepoluan, thanks for picking up this project! I don't really have the headspace these days to keep working on aiosmtpd, sorry. Can you confirm that the notification from CLAassistant is legit? I tried googling and I couldn't find any news anywhere that aiosmtpd specifically or aio-libs generally have started using CLAs. If you can confirm then I'll gladly sign a CLA.

@pepoluan
Copy link
Collaborator

Hi @Mortal , sorry for not answering sooner. This week had been a bit hectic in my life...

The CLA is legit, set by the maintainers of the aio-libs project, however there seems to be some... opinion about it. It's being discussed in #208 .

I myself am personally neutral about it, and I also personally think there's nothing wrong with the words, but it's still an interesting discussion over there (with links to other comments in other PRs).

I won't stop you from signing the CLA, but at the moment I also am not requiring anyone to sign the CLA. Just do what you think is right 😉

(For the record, I have already signed it.)

So the response sender will send a response based on first non-nominal
event.

A test case has been added, and a previous test case had comments added
as well to explain what's happening.

(Also changed that other test case's name to more properly reflect
what's being tested)
These are all faster, tested with lots of timeit's.

In relation to aio-libs#149 ... well, there's no relation, actually. `data` is
an internal temporary structure and after removal of "transparent dots"
gets flattened into bytes anyways. The visible external structs are
`original_content` and `content`, and their data types remain the same.
Copy link
Collaborator

@pepoluan pepoluan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments on some new stuff

if text.startswith(b'.'):
data[i] = text[1:]
content = original_content = EMPTYBYTES.join(data)
del text[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeit shows that doing it this way is 10%-20% faster. Not only that no copy is being done, also we can do away with counter increment of enumerate.

@pepoluan
Copy link
Collaborator

pepoluan commented Dec 2, 2020

Okay, I've tested on my systems (Windows, WSL, FreeBSD, and various Ubuntu), and they all pass with flying colors.

So I'm un-draft-ing this for review & merge.

@pepoluan pepoluan marked this pull request as ready for review December 2, 2020 09:08
@pepoluan pepoluan requested a review from waynew December 2, 2020 09:15
Previous behavior _seems to be_ non-compliant with SMTP Server
specification. Although the rebuilt smtp_DATA collection loop has fixed
this non-compliance, we want to make sure that future changes will not
regress to non-compliance.

Also one test case has been moved to make the sequence more logical.
Somehow PyCharm's code inspection insists that self.handler.session is
of type Session defined somewhere else and not within the aiosmtpd.smtp
module.
Mostly to silence PyCharm's code inspection that complains of patch not
fulfilling enter_context's signature.

But the side effect is cleaner, simpler code.
This time, affects test_smtp.py
Because patching a variable does NOT return a value, so those @patch
lines do NOT need their respective args.

(Also, correct the ordering on args on some of the test cases)
# Conflicts:
#	aiosmtpd/docs/NEWS.rst
#	aiosmtpd/smtp.py
#	aiosmtpd/tests/test_smtp.py
Seems to be tox+unittest behavior to pass-through WARNING-level
notifications (results of `warnings.warn` or `Logger.warning`)

These warnings are harmless, but messes up tox+unittest output, so we
suppress them.
"alpha" should indicate only "stages" of implementation as we get nearer
to release.

NEWS.rst should instead show/summarize _all_ the new (and good) things
in release. Or, in other words, a sum total of all the "alpha"s and
"beta"s and "rc"s prior to release.

"aiosmtpd-next" stays because it's the 'date' of release. Kind of like
"HEAD" in git.
This second change is actually even more important,
standards-compliance-wise.
Copy link
Collaborator

@pepoluan pepoluan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments on the latest changes.

if text.startswith(b'.'):
data[i] = text[1:]
content = original_content = EMPTYBYTES.join(data)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If content is going to be overwritten anyways, why set its value here.

Moved actual setting-of-value inside the if branches.

result.append(sock.recv(1024))
else:
break
return b"".join(result)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need this func to perform low-level non-hanging send-and-receive for each message line during DATA phase.

@pepoluan
Copy link
Collaborator

pepoluan commented Dec 7, 2020

Testing progress on my systems:

  • Windows 10 (via PyCharm runner): (ALL)
  • Ubuntu 18.04 on WSL: (ALL) + pypy3-{nocov,cov}
  • FreeBSD 12.1 on VBOx: (ALL) + pypy3-nocov
  • Windows 10 (via PSCore 7.0.3): (ALL)
  • Windows 10 (Cygwin): qa,py36-{nocov,cov}

I think I'll skip testing on my VBox Ubuntu 18.04+20.04 combo. There should be no difference compared to the GitHub Actions ones.

@pepoluan
Copy link
Collaborator

pepoluan commented Dec 8, 2020

With your blessings @waynew , we can get this merged :-)

@pepoluan
Copy link
Collaborator

Come Monday I'll merge this if no objection.

Protocol violation (see #9 ) is a Serious Bug™ and should be fixed asap.

@pepoluan pepoluan merged commit 73b7284 into aio-libs:master Dec 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Protocol violation and message receive when exceeding data limit size
6 participants