
Open Changes and quickly clicking on Stage Changes on a large json file blocks / hangs for 10 seconds #40681

Closed
joshunger opened this issue Dec 22, 2017 · 27 comments
Assignees
Labels
bug Issue identified by VS Code Team member as probable bug macos Issues with VS Code on MAC/OS X perf verified Verification succeeded
Milestone

Comments

@joshunger
Contributor

  • VSCode Version: Version 1.19.1 (1.19.1)
  • OS Version: MacOS Sierra 10.12.6

Opening Changes and quickly clicking Stage Changes on a large JSON file blocks / hangs for ~10 seconds.

Steps to Reproduce:

(animated GIF: dec-21-2017 17-14-51)

  1. Create a large JSON file. In this example the file is 3.3 MB.
  2. Press ⌃⇧G to show Source Control.
  3. Quickly click on the file and then click Stage Changes.

EXPECTING: added in under 1 second
ACTUAL: takes ~10 seconds

If I repeat it a second time it adds in under 1 second.

Reproduces without extensions: Yes

@vscodebot vscodebot bot added the git GIT issues label Dec 22, 2017
@joshunger joshunger changed the title Open Changes and quickly clicking on Stage Changes on a large json file is blocked for 10 seconds Open Changes and quickly clicking on Stage Changes on a large json file blocks / hanges for 10 seconds Dec 22, 2017
@joshunger joshunger changed the title Open Changes and quickly clicking on Stage Changes on a large json file blocks / hanges for 10 seconds Open Changes and quickly clicking on Stage Changes on a large json file blocks / hangs for 10 seconds Dec 22, 2017
@joaomoreno
Member

joaomoreno commented Dec 22, 2017

How fast is `git add vendor-data.json` on the command line?

Also can you run everything again, but this time:

  • Open Developer: Show Running Extensions
  • Click the Start Extension Host Profile button at the top right (it's the circle icon)
  • Perform the ~10-second action
  • Stop profile

Which extension took up most of the CPU?

@joaomoreno joaomoreno added the info-needed Issue requires more information from poster label Dec 22, 2017
@joshunger
Contributor Author

@joaomoreno

```
🐥 $ time git add data/vendor-data.json

real	0m0.078s
user	0m0.031s
sys	0m0.042s
```

(screenshot attached)

@joaomoreno
Member

joaomoreno commented Jan 8, 2018

@joshunger I assume the operation is really fast if you don't click the file first? Can you share the file with us?

@jrieken @alexandrudima I suspect this 10s delay occurs due to sending the 3.3 MB text model over to the extension host. Any idea how I can confirm my suspicion?

@alexdima
Member

alexdima commented Jan 8, 2018

@joaomoreno I can only imagine reproducing it running out of sources and adding log statements at the right places. You never know, but I don't suspect that sending a 3.3 MB string between processes takes 10s.

@joshunger
Contributor Author

@joaomoreno you got it. It takes < 1 second if I don't click the file first.

This is an export of a Google Spreadsheet. Yes, I can share it, but not on this issue. Can I email you the file or share it via OneDrive? If so, what email? Or do you have any large Google Spreadsheet JSON exports?

@joshunger
Contributor Author

@joaomoreno @alexandrudima I sent you an email with the file.

@joaomoreno
Member

@joshunger Thanks for the file, got it. But I can't reproduce it over here. Are you sure this issue reproduces with `code --disable-extensions`?

@joshunger
Contributor Author

@joaomoreno did you check in the file, make tweaks to it, and then check in the changes? GIF of the repro:

(animated GIF: jan-09-2018 16-57-52)

@joaomoreno
Member

@joshunger This is a great bug! I was able to reproduce and did a very deep investigation with @alexandrudima.

We still don't have a fix but understand why this happens.

There is a lot of IPC communication back and forth between our renderer process and extension host process. This is implemented over UNIX domain sockets (macOS, Linux) or named pipes (Windows). It seems that, for some reason, on macOS, large data sent via this mechanism gets split into 8192-byte chunks. When sending this large JSON file back and forth, that ends up taking a LOT of chunks. In between each chunk, the reading process seems to yield to let other tasks run. All of this would be fine if it weren't for... tokenization.

This file takes quite a while to tokenize. That should still be fine, because our tokenization doesn't run in a single JavaScript frame; we yield often so that other tasks get to run.

This leads us to your bug. Tokenization yields often, but not often enough to let all those 8192-byte chunks be read, since there are so many of them.

Note that this bug doesn't reproduce on Linux or Windows. It also doesn't reproduce if we increase the timeout with which we yield from tokenization to ~50ms.

Again, great bug! We'll get back on it with a fix...

@joaomoreno joaomoreno added electron Issues and items related to Electron bug Issue identified by VS Code Team member as probable bug and removed git GIT issues info-needed Issue requires more information from poster labels Jan 10, 2018
@joaomoreno joaomoreno added this to the January 2018 milestone Jan 10, 2018
@joshunger
Contributor Author

joshunger commented Jan 10, 2018

Thanks! You're welcome! You should send out free stickers for top bugs; I need a VS Code sticker for my laptop 🤔 Haha. Well, the great part about VS Code is that it is really performant compared to Atom!

@alexdima alexdima self-assigned this Jan 10, 2018
@joaomoreno
Member

This reproduces in Node too.

I was able to debug Node 7.9.0 itself and trace it down to this line in libuv, where the write syscall on a large buffer can only write 8192 bytes at a time. 😢

It also reproduces on Node 8.


Here's a sample which shows the behaviour:

```js
const net = require('net');
const path = require('path');
const fs = require('fs');
const os = require('os');

const socketPath = path.join(os.tmpdir(), 'bugsocket');

try {
	fs.unlinkSync(socketPath);
} catch (err) {
	// noop
}

const buffer = Buffer.alloc(1024 * 1024);

const server = net.createServer(socket => {
	socket.write(buffer);
	socket.end();
	server.close();
});

server.listen(socketPath);

const socket = net.createConnection(socketPath);
const map = new Map();
socket.on('data', data => {
	map.set(data.byteLength, (map.get(data.byteLength) || 0) + 1);
});

socket.on('close', () => {
	for (const [size, count] of map) {
		console.log(`got ${count} blocks of ${size} bytes`);
	}
});
```

On macOS I get:

```
got 128 blocks of 8192 bytes
```

While on Linux I get:

```
got 14 blocks of 65536 bytes
got 4 blocks of 22656 bytes
got 1 blocks of 40448 bytes
```

I found another poor soul having this issue, with no answers: https://stackoverflow.com/questions/44026984/macos-sierra-increase-named-pipe-capacity

I can't find any information whatsoever on how to change this buffer size.

@joaomoreno
Member

joaomoreno commented Jan 11, 2018

I can reproduce it in C++ too:

```cpp
#include <iostream>
#include <vector>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <thread>

#define BUFFER_SIZE 1024*1024
#define FIFO_PATH "/Users/joao/Desktop/fifo"

void readFifo() {
    auto fd = open(FIFO_PATH, O_RDONLY);

    if (fd < 0) {
        std::cerr << "Failed to open fifo for reading" << std::endl;
        return;
    }

    std::cout << "Ready to read!" << std::endl;

    // heap-allocated to avoid overflowing the (smaller) thread stack
    std::vector<char> buffer(BUFFER_SIZE);
    auto total_bytes_read = 0;

    while (total_bytes_read < BUFFER_SIZE) {
        auto bytes_read = read(fd, buffer.data(), BUFFER_SIZE);
        if (bytes_read <= 0) break;
        std::cout << "Read " << bytes_read << " bytes" << std::endl;

        total_bytes_read += bytes_read;
    }

    close(fd);
}

int main(int argc, const char * argv[]) {
    unlink(FIFO_PATH);

    if (mkfifo(FIFO_PATH, 0666) != 0) {
        std::cerr << "Failed to create fifo" << std::endl;
        return 1;
    }

    std::thread reader(readFifo);

    auto fd = open(FIFO_PATH, O_WRONLY);

    if (fd < 0) {
        std::cerr << "Failed to open fifo for writing" << std::endl;
        return 1;
    }

    std::vector<char> buffer(BUFFER_SIZE, 'a');

    auto bytes_written = write(fd, buffer.data(), BUFFER_SIZE);

    reader.join();
    std::cout << "Wrote " << bytes_written << " bytes" << std::endl;

    unlink(FIFO_PATH);
    return 0;
}
```

Interestingly, the writer reports that it managed to write everything at once, though the reader was only getting 8192 bytes at a time.

@joshunger
Contributor Author

@joaomoreno is nodejs/node#12921 related?

@joaomoreno
Member

I've contacted the Darwin-kernel mailing list https://lists.apple.com/mailman/listinfo/darwin-kernel

@joaomoreno
Member

@joshunger Yeah, I was there already; I don't think it's related.

@joaomoreno joaomoreno added macos Issues with VS Code on MAC/OS X perf and removed electron Issues and items related to Electron labels Jan 26, 2018
@ghost

ghost commented Jan 26, 2018

@joaomoreno I'm very happy you're investigating this. Thank you for your time!

@joaomoreno
Member

Unfortunately I got no reply from the Darwin-kernel mailing list... Trying Darwin-userlevel and Filesystem-dev...

@joaomoreno
Member

One month later and no answer from the mailing lists. Went the Stack Overflow way: https://stackoverflow.com/questions/48945547/change-named-pipe-buffer-size-in-macos

@joaomoreno joaomoreno modified the milestones: February 2018, Backlog Feb 23, 2018
@ghost

ghost commented Feb 23, 2018

Thanks a lot, @joaomoreno. So, as far as I understand, you found a way to fix it?

EDIT: oh, my bad, you mean you created a topic on Stack Overflow to get help, since the Darwin lists are not answering your emails...

@alexdima alexdima added the help wanted Issues identified as good community contribution opportunities label May 3, 2018
@svipas
Contributor

svipas commented Jul 4, 2018

I'm on macOS and everything seems fine with Git, except with yarn.lock (I guess package-lock.json is the same): it takes about 10 seconds or even more to load it inside VS Code to show changes. I don't even want to see them because of this problem, but when I accidentally click on it, it hangs the whole Git view in VS Code. It's a little bit annoying... Maybe I can help somehow?

@joaomoreno In https://stackoverflow.com/questions/48945547/change-named-pipe-buffer-size-in-macos, Eljay's answer says: "You change the value by editing pipe.h and sys_pipe.c and recompiling the operating system's kernel." Which sounds very bad, so there's no solution for this? Does it only happen on macOS?

@joaomoreno
Member

Yup, it's a macOS issue. There appears to be no solution for this.

@svipas
Contributor

svipas commented Jul 4, 2018

But what about Git clients like GitKraken or Sourcetree, or even IDEs like WebStorm? They don't seem to have this issue; maybe it's possible to fix?

@joaomoreno
Member

They do not do things the same way we do... read this for an analysis: #40681 (comment)

@svipas
Contributor

svipas commented Jul 5, 2018

@joaomoreno thank you very much for explaining. I really appreciate your work. Sad; maybe we can expect a fix in a new macOS release...

@borekb

borekb commented Sep 14, 2018

Hi, #57697 was closed as a dupe of this. I'm not 100% sure it's the same, but when I accidentally click yarn.lock in the SCM panel, I can basically restart VS Code, as the wait times are typically between one and two minutes. My yarn.lock has ~3700 lines.

I observe this on both Mac and Windows. VS Code is not frozen; it is still responsive, but diffing does nothing.

Does the fact that it's reproducible on Windows as well suggest anything?

@alexdima
Member

@borekb I will reopen #57697; this issue is indeed Mac-specific.

@alexdima
Member

This will not get fixed upstream anytime soon, so I have pushed the following mitigations on our side:

  • yield from tokenization using setImmediate instead of setTimeout(0), which is needlessly delayed by ~4ms.
  • yield from tokenization every 1ms.
  • this makes the "wait" 20 times shorter, which is reasonable for large files, e.g. checker.ts.
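That yielding strategy can be sketched like this (a hypothetical helper, not the actual VS Code source; `tokenizeInBackground` and its parameters are made up for illustration): do a ~1ms slice of work, then yield with setImmediate, which runs right after pending I/O callbacks instead of being clamped to a minimum delay the way setTimeout(0) is.

```javascript
// Hypothetical illustration of time-budgeted cooperative tokenization.
function tokenizeInBackground(lines, tokenizeLine, done) {
	const results = [];
	let i = 0;

	function work() {
		const deadline = Date.now() + 1; // yield roughly every 1ms
		while (i < lines.length && Date.now() < deadline) {
			results.push(tokenizeLine(lines[i++]));
		}
		if (i < lines.length) {
			// setImmediate runs after pending I/O callbacks, so queued
			// socket chunks get read between slices; setTimeout(0) would
			// add a needless minimum delay on every yield.
			setImmediate(work);
		} else {
			done(results);
		}
	}

	setImmediate(work);
}

// Usage sketch: process each line without monopolizing the event loop.
tokenizeInBackground(['{', '  "a": 1', '}'], line => line.trim(), tokens => {
	console.log(tokens);
});
```

The 1ms budget is the key knob: smaller slices let more of the queued 8192-byte chunk reads interleave with tokenization.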

@alexdima alexdima removed the help wanted Issues identified as good community contribution opportunities label Oct 10, 2019
@alexdima alexdima modified the milestones: Backlog, October 2019 Oct 10, 2019
@roblourens roblourens added the verified Verification succeeded label Oct 30, 2019
@vscodebot vscodebot bot locked and limited conversation to collaborators Nov 25, 2019