Best way to avoid (sliced string)? #711
Comments
Not sure I understand what your question is. A sliced string itself is a small object: a pointer to the parent string plus an offset and a length. In that respect you shouldn't worry about seeing them show up in heap snapshots. Slices do, however, prevent the parent string from being reclaimed by the garbage collector. If that is your concern, see if |
@bnoordhuis However, I don't want to disable sliced strings across all of VS Code (for 99% of the code base, I fully agree, they are ignorable from a memory usage point of view). But I would like to avoid them in one specific place, when constructing a file in VS Code, so I need a localized solution. As with any small number, when multiplied by a large number, it yields impressive results. Avoiding sliced strings saves 36MB for a file with more than 3MM lines. I was wondering if there is something more efficient than Thank you! |
Right, I see. There are a number of operations that flatten strings -
|
There’s also https://github.com/davidmarkclements/flatstr – It’s a side effect of |
Thank you! ❤️ |
Hi, I'm working on VS Code (based on Electron), and I'm looking into improving our memory usage when dealing with large files in microsoft/vscode#30180.
Our buffer implementation is basically an array of lines. I am aware of the advantages and disadvantages of that, but I would still like to push it to its limits. Our file reading involves reading chunks and pushing those through iconv-lite to handle file encoding. Long story short, we have a bunch of ~64KB strings that we need to split into lines.

The fastest way (that doesn't involve a native C++ node module) I've found so far is using a simple str.split(/\r\n|\r|\n/). This works very well, but it ends up creating a (sliced string) for each line, all of which point to the parent chunk. When dealing with files of 3MM lines, these objects add up, and eliminating them can save a few extra tens of MB of memory.

Our current workaround to rid ourselves of the (sliced string) is here:

I don't know if the above takes advantage of string interning or if it is the most efficient way to do this short of writing a native node module.
Do you have any idea? Thank you.
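For readers following along, here is a minimal sketch of the split-then-copy idea described in this post. It is not the VS Code workaround, and the helper names `copyString` and `splitLinesDetached` are hypothetical. It assumes Node.js and copies each line by round-tripping it through a `Buffer`, so the result is decoded from fresh bytes instead of being a substring of the chunk. How V8 represents the resulting strings internally is an implementation detail: decoding as `utf16le` may produce two-byte strings, which could cost more memory than it saves for mostly-ASCII files, so any benefit should be checked with a heap snapshot.

```javascript
// Sketch only: copyString and splitLinesDetached are hypothetical names,
// not part of VS Code, Node.js, or V8.

// Round-trip through a Buffer so the returned string is decoded from a
// fresh copy of the UTF-16 code units instead of being taken as a
// substring of `s`.
function copyString(s) {
  return Buffer.from(s, 'utf16le').toString('utf16le');
}

// Split a decoded chunk on \r\n, \r, or \n, then replace each resulting
// line with a copy made by copyString.
function splitLinesDetached(chunk) {
  const lines = chunk.split(/\r\n|\r|\n/);
  for (let i = 0; i < lines.length; i++) {
    lines[i] = copyString(lines[i]);
  }
  return lines;
}

const lines = splitLinesDetached('first line\r\nsecond line\rthird line');
console.log(lines);
```

The copy costs an extra allocation and a decode per line, so it trades CPU time for the chance of letting each ~64KB parent chunk be collected.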