Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STDIN fails on slow input #1472

Closed
arnested opened this issue May 16, 2017 · 28 comments
Closed

STDIN fails on slow input #1472

arnested opened this issue May 16, 2017 · 28 comments
Milestone

Comments

@arnested
Copy link
Contributor

I noticed that PHP CodeSniffer will fail if receiving the file on STDIN and the input on STDIN is slow.

I discovered this using gulp-phpcs but I'm able to reproduce the issue just by piping the input (and using pv(1) to slow down the pipe).

This might be related to (some of) #993.

Reproducing
We'll use the coding standard from PHP CodeSniffer itself and we'll also check a file from PHP CodeSniffer. The problem seems to appear at file sizes above 8192 bytes. Reporter.php is ~11.7 kB.

$ wget https://github.com/squizlabs/PHP_CodeSniffer/raw/master/phpcs.xml.dist
$ wget https://github.com/squizlabs/PHP_CodeSniffer/raw/master/src/Reporter.php

Just cat'ing the file on standard in succeeds as expected:

$ cat Reporter.php | phpcs -
.

Time: 147ms; Memory: 4Mb

Making the pipe slow using pv(1) will make the check fail.

$ pv -qL 3k Reporter.php | phpcs -
E


FILE: STDIN
----------------------------------------------------------------------
FOUND 1 ERROR AFFECTING 1 LINE
----------------------------------------------------------------------
 12 | ERROR | [x] Expected 1 newline at end of file; 0 found
----------------------------------------------------------------------
PHPCBF CAN FIX THE 1 MARKED SNIFF VIOLATIONS AUTOMATICALLY
----------------------------------------------------------------------

Time: 69ms; Memory: 4Mb

I have reproduced this both on my Macbook and a linux server.

I'm using "PHP_CodeSniffer version 3.0.0 (stable) by Squiz (http://www.squiz.net)." and also tried with current master (5847603) with same result.

@arnested
Copy link
Contributor Author

arnested commented May 16, 2017

Maybe it wasn't obvious that I think what happens is PHPCS will close the pipe after receiving the first 8192 bytes and only sniff that part of the file.

I considered introducing a rule saying "no PHP file must be larger than 8000 bytes" to avoid the problem and although smaller files should be a good thing I don't quite think it's a good solution 😄

@gsherwood
Copy link
Member

gsherwood commented May 16, 2017

I'm pretty sure that the only way to allow for slow STDIN is to increase the wait time between reads, which is here: https://github.com/squizlabs/PHP_CodeSniffer/blob/master/src/Config.php#L363

You can try tweaking that value if you want, but I have no idea how high it would need to be for your specific case. If you find a value and it happens to slow down normal STDIN usage, I could make that an option.

Any chance you could try tweaking that value?

@arnested
Copy link
Contributor Author

I'll try this out tomorrow and report back. Thank you for the feedback.

@arnested
Copy link
Contributor Author

arnested commented May 17, 2017

Raising the usleep() to 13000-14000 makes the error go away with the 3k limit from pv.

It also impacts the runtime for checking Reporter.php to 5-6 seconds compare to 68 ms with usleep(10).

That's not very usable but on the other hand the 3k limit is also very random.

On my "real life" example with gulp-phpcs I can get around the problem by raising the usleep to 350. That's more or less OK I think.

So one option would be to make the value configurable. But that doesn't smell right and would result in people having to tweek some "magic" value over and over again as performance characteristicas chance over time.

If you make the - option required for streaming on stdin (I think I read somewhere you considered that?) you can drop making the read non-blocking and avoid the usleep():

--- src/Config.php~	2017-05-04 02:33:04.000000000 +0200
+++ src/Config.php	2017-05-17 10:50:52.000000000 +0200
@@ -347,13 +347,12 @@
         }

         // Check for content on STDIN.
-        if ($checkStdin === true) {
+        if ($this->stdin === true) {
             $handle = fopen('php://stdin', 'r');
-            if (stream_set_blocking($handle, false) === true) {
+            if (stream_set_blocking($handle, true) === true) {
                 $fileContents = '';
                 while (($line = fgets($handle)) !== false) {
                     $fileContents .= $line;
-                    usleep(10);
                 }

                 stream_set_blocking($handle, true);

(there might be other gotchas in the above solution -- for now it's just an idea).

@arnested
Copy link
Contributor Author

Oh. - being required for stdin is actually in the release notes for 3.0.0a1.

@gsherwood
Copy link
Member

It's required if you want PHPCS to wait for STDIN when nothing is supplied. In version 2.x, PHPCS would wait forever so you could type content on the command line. In version 3, you need to use - to have it behave like that. By default, it will throw an error and say that no filenames were supplied.

@arnested
Copy link
Contributor Author

OK. My suggested fix would make - be always required for piping on stdin but will otherwise work with large files on slow pipes and still make you be able to type manually on stdin.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue May 17, 2017
Make it required to explicitly use `-` for reading from stdin.

Making it explicitly required to use will simplify the code by not
needing a lot of edge cases.

Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. This is avoided
by requiring `-`.

By going back to blocking read from stdin we can avoid the extra check
introduced in  for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#1472.
@arnested
Copy link
Contributor Author

I digged into why this part of the code came into being like it is and I believe a lot things can be better if we always require - when we want to read from stdin.

I submitted PR #1474 for this.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue May 17, 2017
Make it required to explicitly use `-` for reading from stdin.

Making it explicitly required to use will simplify the code by not
needing a lot of edge cases.

Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. This is avoided
by requiring `-`.

By going back to blocking read from stdin we can avoid the extra check
introduced in  for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#1472.
arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue May 17, 2017
Make it required to explicitly use `-` for reading from stdin.

Making it explicitly required to use will simplify the code by not
needing a lot of edge cases.

Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. This is avoided
by requiring `-`.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#1472.
@gsherwood
Copy link
Member

[ Repost from the PR ]

I don't want people to have to use - just to pipe content into PHPCS.

This was the easy way out I could have taken, but it forces people to have to know an obscure CLI arg to do what I consider pretty basic stuff. Having PHPCS sit and wait for STDIN while someone types content on the screen is a serious edge case, so I added a switch for that.

What I want to do is find a way to, without any special options, have people able to use PHPCS with commands like echo, cat, and xargs for checking content.

The way to do this is possibly to change the behaviour so that STDIN is only ever looked at if PHPCS can't find anything to check on the command line. This gets more complex because the coding standard itself (normally inside phpcs.xml) might define files and folders to check, so there are times when PHPCS has things to check, but they are all sourced from a ruleset. It would need to wait a short time to see if anything can be found on STDIN before deciding to go ahead and process whatever the ruleset said to process. That "short time" is obviously going to be seen as "lag", so it would need to be really short.

I'm going to close this PR because this isn't how I want PHPCS to work. It would be better to have a discussion on the main issue (issue #1472), so I'll repost my comment there as well.

@arnested
Copy link
Contributor Author

You have a valid point. Requiring - is obscured compared to usual unix commands.

I tried another go at this to without having to do the complex stuff checking if anything is defined on the command line or in phpcs.xml.

We keep the explicit - only for typing thing manually on stdout.

In all other cases we read from stdin if stdin is not a tty (that is the case handled by -) and stdin doesn't point to a EOF.

  • The current test cases still succeeds.
  • manual typing on stdin is still possible with -
  • cat files is still possible
  • echo code is still possible
  • find and xargs is still possible
  • and of course also specifying files on the command line is still possible (even combined with the above readings from stdin)

We can still avoid non-blocking read which means we can avoid the extra check for Windows, reading line-by-line and usleep. And we avoid the problem with slow data on stdin.

I think this covers all the use cases and problems described here and in #993.

diff --git a/src/Config.php b/src/Config.php
index 502ffaece..348bcc4dd 100644
--- a/src/Config.php
+++ b/src/Config.php
@@ -6,7 +6,7 @@
  * and provides functions to access data stored in config files.
  *
  * @author    Greg Sherwood <gsherwood@squiz.net>
- * @copyright 2006-2015 Squiz Pty Ltd (ABN 77 084 670 600)
+ * @copyright 2006-2017 Squiz Pty Ltd (ABN 77 084 670 600)
  * @license   https://github.com/squizlabs/PHP_CodeSniffer/blob/master/licence.txt BSD Licence
  */

@@ -318,11 +318,9 @@ class Config
             $this->dieOnUnknownArg = $dieOnUnknownArg;
         }

-        $checkStdin = false;
         if (empty($cliArgs) === true) {
             $cliArgs = $_SERVER['argv'];
             array_shift($cliArgs);
-            $checkStdin = true;
         }

         $this->restoreDefaults();
@@ -353,27 +351,22 @@ class Config
             } while ($currentDir !== '.' && $currentDir !== $lastDir);
         }//end if

-        // Check for content on STDIN.
-        if ($checkStdin === true) {
-            $handle = fopen('php://stdin', 'r');
-            if (stream_set_blocking($handle, false) === true) {
-                $fileContents = '';
-                while (($line = fgets($handle)) !== false) {
-                    $fileContents .= $line;
-                    usleep(10);
-                }
+        $handle = fopen('php://stdin', 'r');

-                stream_set_blocking($handle, true);
-                fclose($handle);
-                if (trim($fileContents) !== '') {
-                    $this->stdin        = true;
-                    $this->stdinContent = $fileContents;
-                    $this->overriddenDefaults['stdin']        = true;
-                    $this->overriddenDefaults['stdinContent'] = true;
-                }
+        // Check for content on STDIN.
+        if (($this->stdin === true) || ((posix_isatty($handle) === false) && (feof($handle) === false))) {
+            $fileContents = stream_get_contents($handle);
+
+            if (trim($fileContents) !== '') {
+                $this->stdin        = true;
+                $this->stdinContent = $fileContents;
+                $this->overriddenDefaults['stdin']        = true;
+                $this->overriddenDefaults['stdinContent'] = true;
             }
         }

+        fclose($handle);
+
     }//end __construct()

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue May 18, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue May 18, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@rbayliss
Copy link

I can confirm this is an issue, and has come up across several projects (also using gulp-phpcs). Any workaround for 2.x that doesn't require forking would be much appreciated.

@arnested
Copy link
Contributor Author

Hi @rbayliss, I don't think there are other workarounds besides writing smaller files...

I went ahead with my last suggestion and wrote some testcases for testing the CLI's stdin behaviour.

I have added tests

  • cat files on stdin
  • echo code on stdin
  • Use find and pipe result to xargs
  • Use pv to "slow cat" files on stdin

I haven't found a way to simulate typing on stdin (the case where stdin is a tty).

All cases (except slow cat) succeeds on current master and my suggest fix succeeds on all test cases (both the new one and existing ones).

Typing on stdin when explicitly selected (-) also works with my fix (manually tested).

Since I have received no other feedback on the suggested fix I'm going to open a pull request with the test cases and the suggested fix.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue May 27, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@synthetiv
Copy link

Just want to chime in that I've also been running into this issue using Atom and the linter-phpcs package. The common denominator here seems to be Node, so that's interesting, I guess. Currently I'm working around this issue by using a patched copy of PHPCS.

@arnested
Copy link
Contributor Author

arnested commented Jun 1, 2017

Thank you for confirming the problem @synthetiv. Any chance you'll be able to verify if my pull request also works for you?

@synthetiv
Copy link

Sure, I'll try it out later – right now the patched version I'm using is based on PHPCS version 2.9.1, because I mostly use the WordPress coding standards and they aren't compatible with 3.0 yet.

The patch I'm running now is basically a dumbed-down version of your fix:

diff --git a/CodeSniffer/CLI.php b/CodeSniffer/CLI.php
index 29403cf..ab2a885 100644
--- a/CodeSniffer/CLI.php
+++ b/CodeSniffer/CLI.php
@@ -404,18 +404,11 @@ public function getCommandLineValues()

         // Check for content on STDIN.
         $handle = fopen('php://stdin', 'r');
-        if (stream_set_blocking($handle, false) === true) {
-            $fileContents = '';
-            while (($line = fgets($handle)) !== false) {
-                $fileContents .= $line;
-                usleep(10);
-            }
-
-            stream_set_blocking($handle, true);
-            fclose($handle);
-            if (trim($fileContents) !== '') {
-                $this->values['stdin'] = $fileContents;
-            }
+        stream_set_blocking($handle, true);
+        $fileContents = stream_get_contents($handle);
+        fclose($handle);
+        if (trim($fileContents) !== '') {
+            $this->values['stdin'] = $fileContents;
         }

         return $this->values;

This always waits for STDIN, even if you specify a filename, so it's not a good general-purpose solution, but it's working fine for me.

Thank you for writing up this issue and coming up with a fix for it – it was really driving me crazy.

@arnested
Copy link
Contributor Author

arnested commented Jun 2, 2017

@gsherwood Are more releases planned from the 2.9 branch? If yes I will try to backport my fix from #1490.

@gsherwood
Copy link
Member

Are more releases planned from the 2.9 branch? If yes I will try to backport my fix from #1490.

Sorry, I missed this.

I'll only do serious bug fix releases on 2.9, so I wouldn't merge a core change like this in.

@gsherwood gsherwood added this to the 3.2.0 milestone Jun 13, 2017
@arnested
Copy link
Contributor Author

Seems reasonable.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Sep 26, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 12, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 22, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@gsherwood
Copy link
Member

gsherwood commented Oct 27, 2017

[Replicating a comment from the PR]

The posix_isatty function is not available on windows, so I imagine this would throw undefined function errors when run on windows. This change would also make the POSIX extension a new requirement for PHPCS.

At the very least, this needs some guard code and an alternative implementation that works without the extension to re-add support for STDIN. But that other implementation is going to suffer from the same issues that this PR is trying to fix, so the real solution is a different change that doesnt use posix functions.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 28, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@arnested
Copy link
Contributor Author

Thank you for the feedback.

Of course you are right about posix_isatty() not being available on Windows. I forgot to think about that.

I installed Windows/PHP in a Virtualbox and tried other workarounds. That discovered some inconsistensies between Windows native console app and Mingw/Cygwin-based terminal apps.

I have come up with a solution to replace the posix_isatty() call and have put it into a isStdinATTY() helper method.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 28, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()` and `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@arnested
Copy link
Contributor Author

arnested commented Oct 28, 2017

Btw. there are some failures with the test cases in master on Windows. I'll see if I can find the time this weekend to look into them too while I have a Windows environment up and running.

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 28, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()`, `tty`, and, `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 28, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()`, `tty`, and, `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Oct 28, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()`, `tty`, and, `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@jrfnl
Copy link
Contributor

jrfnl commented Oct 28, 2017

Btw. there are some failures with the test cases in master on Windows.

Interesting. Have you got some more details to share about this ?

arnested added a commit to arnested/PHP_CodeSniffer that referenced this issue Nov 2, 2017
Non-blocking reading from stdin was introduced in
80b156d because we had problems with
auto detecting whether we should read from stdin or not. Combining
`posix_isatty()`, `tty`, and, `feof()` can get us back to blocking read.

By going back to blocking read from stdin we can avoid the extra check
introduced in 23bd023 for Windows.

We can also avoid the "read by line" and "usleep" introduced
8d669c5 and
63fafe0. They where introduced trying
to solve what is possible the same problem still reported in squizlabs#1472.

Fixes squizlabs#993.
Fixes squizlabs#1472.
@gsherwood gsherwood reopened this Nov 29, 2017
@gsherwood
Copy link
Member

This was auto-closed after I merged a commit from @arnested. Re-opening so other people can test before release.

If anyone experiencing issues is able to try master and see if this change fixes things for you, I'd really appreciate it. I'll leave this open until the 3.2.0 release in case anyone has time.

@synthetiv
Copy link

I'm running master now (as of a few minutes ago); so far so good! I'll post here if I start having issues again.

@gsherwood
Copy link
Member

It's been a couple of weeks, so I'm going to close this as fixed. Thanks @synthetiv for testing. I've also tested this internally with a wrapper script we use for another product and it is working there as well.

@synthetiv
Copy link

Sounds good to me! I was on vacation last week & wasn't doing any PHP sniffing, but as of today it's still working well.

@gsherwood
Copy link
Member

The fix for this broke the PHPStorm plugin because it doesn't provide a terminal, but does provide a STDIN stream, but never passes anything through. So the process would hang forever. I've reinstated a timeout, but done it in a different way to ensure all the STDIN tests are still passing, and the plugin working again.

@arnested
Copy link
Contributor Author

FYI: I ran my "CLI tests" on master and they still work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants