Lazy stat dates #12818
Linter: fs.js
266:2 error Missing semicolon semi
Perhaps if we used symbols instead of underscore-prefixed properties this wouldn't be semver-major?
It definitely needs further thinking 🤔 since all tests pass, and it is semantically equivalent...
@sciolist 1 test fails on Windows, test\parallel\test-fs-stat.js:
stating: c:\workspace\node-test-binary-windows\RUN_SUBSET\0\VS_VERSION\vcbt2015\label\win10\test\parallel\test-fs-stat.js
Stats { dev: 2329316316, mode: 16822, nlink: 1, uid: 0, gid: 0, rdev: 0, blksize: undefined, ino: 119908340078909120, size: 0, blocks: undefined, _atim_msec: 1493853008513, _mtim_msec: 1493853008513, _ctim_msec: 1493853008513, _birthtim_msec: 1492182300137 }
Stats { dev: 2329316316, mode: 33206, nlink: 1, uid: 0, gid: 0, rdev: 0, blksize: undefined, ino: 41939771530044640, size: 3910, blocks: undefined, _atim_msec: 1493852676143, _mtim_msec: 1493852676150, _ctim_msec: 1493852676150, _birthtim_msec: 1492182460203 }
isDirectory: false
isFile: true
isSocket: false
isBlockDevice: false
isCharacterDevice: false
isFIFO: false
isSymbolicLink: false
(node:168) [DEP0013] DeprecationWarning: Calling an asynchronous function without callback is deprecated.
LGTM if CI & CITGM are happy, and I’d prefer semver-major too
lib/fs.js
Outdated
this._atim_msec = atim_msec;
this._mtim_msec = mtim_msec;
this._ctim_msec = ctim_msec;
this._birthtim_msec = birthtim_msec;
We might as well make these public? ¯\_(ツ)_/¯ If not, I agree that using Symbols may be better.
I'd prefer the Symbol approach for these.
I would rather replace than add duplicate public properties.
@mscdex It would still be semver-major unless we can force the atime/mtime/etc properties to show up in
(compared to underscore-prefixed), so it's doable, but it does erase about half the performance improvement. The preferable approach would be just changing.. |
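To make the Symbol suggestion concrete, here is a minimal sketch of how symbol-keyed storage would hide the raw values from string-key enumeration. `SymbolStats`, `kAtimMsec`, and `kAtime` are hypothetical names used only for illustration; this is not the code in lib/fs.js.

```javascript
'use strict';

// Hypothetical symbol keys, invented for this sketch.
const kAtimMsec = Symbol('atim_msec');
const kAtime = Symbol('atime');

// Hypothetical stand-in for fs.Stats with a single time field.
function SymbolStats(atimMsec) {
  this.size = 0;
  this[kAtimMsec] = atimMsec; // raw milliseconds, hidden behind a symbol
  this[kAtime] = undefined;   // cache slot for the lazily created Date
}

Object.defineProperty(SymbolStats.prototype, 'atime', {
  configurable: true,
  enumerable: true,
  get() {
    let value = this[kAtime];
    if (value === undefined)
      value = this[kAtime] = new Date(this[kAtimMsec] + 0.5);
    return value;
  },
  set(value) { this[kAtime] = value; }
});

const stats = new SymbolStats(1493853008513);
const stringKeys = Object.keys(stats);
const symbolKeys = Object.getOwnPropertySymbols(stats);
const hasOwnAtime = Object.prototype.hasOwnProperty.call(stats, 'atime');

console.log(stringKeys);        // only the string-keyed own properties
console.log(symbolKeys.length); // the two symbol-keyed slots
console.log(hasOwnAtime);       // atime lives on the prototype, not the instance
```

In this sketch `Object.keys` no longer exposes underscore-prefixed names, but `atime` is still not an own property of the instance, which is the compatibility concern raised in the comment above.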
    return this._atime !== undefined ?
      this._atime :
      (this._atime = new Date(this._atim_msec + 0.5));
  },
Can eliminate some code duplication here by having a factory function for the getters... e.g.
function makeGetter(name, src) {
  return function() {
    // Return the cached Date, creating it from the raw milliseconds on first use.
    return this[name] !== undefined ?
      this[name] : (this[name] = new Date(this[src] + 0.5));
  };
}
// ...
Object.defineProperties(Stats.prototype, {
  atime: {
    configurable: true,
    enumerable: true,
    get: makeGetter('_atime', '_atim_msec'),
    // etc
  }
});
Another idea here would be to use another defineProperty within the getter to replace the value once the date is created...
var m = {};
Object.defineProperty(m, 'a', {
  configurable: true,
  enumerable: true,
  get() {
    const val = 1;
    delete this.a;
    Object.defineProperty(this, 'a', { configurable: true, enumerable: true, value: val });
    return val;
  }
});
The getter is called once, during which time it is deleted, and the static value is set... and will be returned every time after. Should have a slightly better performance profile.
@TimothyGu suggested that in the previous PR, but it wasn't performant enough for the "single access" case: #12607 (comment)
Ok, that's fine then. I'd still suggest the factory function to avoid the code duplication
Hey, yeah, I have tried these variations.
Unfortunately a single defineProperty is more expensive than 4 date creations, so you'd lose the benefits of the laziness if you need to access even a single date.
this[somevariable] is slow enough compared to this.mtime that the difference becomes pretty significant in a function like stat, which isn't very heavy to begin with. That drops accessing one time field twice (my test is if (stat.ctime.getTime() !== stat.ctime.getTime()) throw err;) to a 65% improvement over the non-lazy code, down from a 105% improvement if done the ugly way. But I can definitely change to that if it's preferable. I didn't write it this way because I think it's beautiful. :)
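For readers who want to try the access pattern described above, a rough micro-benchmark sketch follows. It is not the benchmark used in this PR: `EagerStats`, `LazyStats`, and `run` are hypothetical names, the objects are built in plain JavaScript rather than by the native stat binding, and the timings depend heavily on the V8 version, so it should not be expected to reproduce the percentages quoted.

```javascript
'use strict';

// Hypothetical stand-in for the non-lazy behaviour: all four Dates up front.
function EagerStats(ms) {
  this.atime = new Date(ms + 0.5);
  this.mtime = new Date(ms + 0.5);
  this.ctime = new Date(ms + 0.5);
  this.birthtime = new Date(ms + 0.5);
}

// Hypothetical stand-in for the lazy behaviour: Date created on first read.
function LazyStats(ms) {
  this._ctim_msec = ms;
  this._ctime = undefined;
}
Object.defineProperty(LazyStats.prototype, 'ctime', {
  configurable: true,
  enumerable: true,
  get() {
    return this._ctime !== undefined ?
      this._ctime :
      (this._ctime = new Date(this._ctim_msec + 0.5));
  }
});

// Runs the quoted access pattern: read one time field twice per object.
function run(Ctor, iterations) {
  let mismatches = 0;
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    const stat = new Ctor(1493853008513 + i);
    if (stat.ctime.getTime() !== stat.ctime.getTime()) mismatches++;
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return { mismatches, elapsedMs: Number(elapsedNs) / 1e6 };
}

const eagerResult = run(EagerStats, 1e5);
const lazyResult = run(LazyStats, 1e5);
console.log('eager ms:', eagerResult.elapsedMs.toFixed(2));
console.log('lazy  ms:', lazyResult.elapsedMs.toFixed(2));
```

Swapping the dot accesses in the getter for bracket accesses with a variable key is one way to compare against the factory-function variant discussed in this thread.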
@mscdex If you'd like to run the test suite on it, I can push in a change where we just change to.. |
My current preference is this: For v8.x and older: use symbols to hide the raw values better and use the getters to return For v9.x (or whatever future major version) and newer: remove the getters and change the symbols to normal properties, using the same names as the old |
Can you detail what changes ( As the PR currently stands, there are multiple issues I have with it. In fact, I think after resolving all the issues below it might just be better for us to keep the code as it is.
@TimothyGu FWIW in the past when working on a node addon I observed |
If the symbols usage really does cause that much of a performance regression, then I would rather not do this at all and just wait until v9.x or later to switch from |
Interesting. I've never really used it myself before so I'll take your word for it, though for this issue I'm really talking about the (new)
I'd be more comfortable if those properties are suffixed with |
I doubt it. It would probably be about the same I imagine. |
@mscdex why not add the raw number as full properties (no underscore) and have it |
@TimothyGu I agree that using symbols is usually not an issue, and in this case it's not the symbols that are a problem; it's using bracket notation to access the properties vs dot notation (using strings with brackets gives nearly exactly the same results).
Diff:
diff --git a/lib/fs.js b/lib/fs.js
index e834f25..d8acf2e 100644
--- a/lib/fs.js
+++ b/lib/fs.js
@@ -170,6 +170,16 @@ function isFd(path) {
return (path >>> 0) === path;
}
+const atim_msec_sym = Symbol('_atim_msec');
+const mtim_msec_sym = Symbol('_mtim_msec');
+const ctim_msec_sym = Symbol('_ctim_msec');
+const birthtim_msec_sym = Symbol('_birthtim_msec');
+
+const atime_sym = Symbol('_atime');
+const mtime_sym = Symbol('_mtime');
+const ctime_sym = Symbol('_ctime');
+const birthtime_sym = Symbol('_birthtime');
+
// Constructor for file stats.
function Stats(
dev,
@@ -196,54 +206,37 @@ function Stats(
this.ino = ino;
this.size = size;
this.blocks = blocks;
- this._atim_msec = atim_msec;
- this._mtim_msec = mtim_msec;
- this._ctim_msec = ctim_msec;
- this._birthtim_msec = birthtim_msec;
+ this[atim_msec_sym] = atim_msec;
+ this[mtim_msec_sym] = mtim_msec;
+ this[ctim_msec_sym] = ctim_msec;
+ this[birthtim_msec_sym] = birthtim_msec;
+ this[atime_sym] = undefined;
+ this[mtime_sym] = undefined;
+ this[ctime_sym] = undefined;
+ this[birthtime_sym] = undefined;
}
fs.Stats = Stats;
-Object.defineProperties(Stats.prototype, {
- atime: {
- configurable: true,
- enumerable: true,
- get() {
- return this._atime !== undefined ?
- this._atime :
- (this._atime = new Date(this._atim_msec + 0.5));
- },
- set(value) { return this._atime = value; }
- },
- mtime: {
- configurable: true,
- enumerable: true,
- get() {
- return this._mtime !== undefined ?
- this._mtime :
- (this._mtime = new Date(this._mtim_msec + 0.5));
- },
- set(value) { return this._mtime = value; }
- },
- ctime: {
- configurable: true,
- enumerable: true,
- get() {
- return this._ctime !== undefined ?
- this._ctime :
- (this._ctime = new Date(this._ctim_msec + 0.5));
- },
- set(value) { return this._ctime = value; }
- },
- birthtime: {
+function makeStatTime(cacheSym, msecSym) {
+ return {
configurable: true,
enumerable: true,
get() {
- return this._birthtime !== undefined ?
- this._birthtime :
- (this._birthtime = new Date(this._birthtim_msec + 0.5));
+ let value = this[cacheSym];
+ if (value === undefined) {
+ value = this[cacheSym] = new Date(this[msecSym] + 0.5);
+ }
+ return value;
},
- set(value) { return this._birthtime = value; }
- },
+ set(value) { return this[cacheSym] = value; }
+ };
+}
+
+Object.defineProperties(Stats.prototype, {
+ atime: makeStatTime(atime_sym, atim_msec_sym),
+ mtime: makeStatTime(mtime_sym, mtim_msec_sym),
+ ctime: makeStatTime(ctime_sym, ctim_msec_sym),
+ birthtime: makeStatTime(birthtime_sym, birthtim_msec_sym),
});
Stats.prototype.toJSON = function toJSON() {
After some testing, if I use a local variable to avoid hitting
I agree with all your 4 points, using underscores is ugly, Date should probably never have been used here. I would expect this PR as is to break a fair amount of user code. It'd be interesting to see if
I'd say adding properties throwing deprecation errors could work. Just removing the current properties would probably lead to a lot of problems as well.
I read through 100 or so instances; it's mainly cloning using https://www.npmjs.com/package/clone-stats. I did not see anything that would obviously break. We also ran CITGM, and it seemed fine (still digging through the logs).
👍
Is there a reason not to just add them in addition to the "legacy" Date values?
Because I don't think that having the duplicate data is useful. It especially doesn't make much sense if the plan is to eventually remove the |
@mscdex I really doubt it’s feasible to remove the |
@addaleax ok, what if we add doc and then runtime deprecations at some point to notify of the upcoming data type change? |
@mscdex We’d have to do it anyway, but I still think this would cause too much trouble :/ Sorry for being so negative here, I know “this won’t work” isn’t helpful, but I really think it won’t. |
I'm also pessimistic... that's why I'd rather have the new members, and compatibility lazy getters, it's a reasonable compromise.... We get the raw number, and don't pay for the |
FWIW I tried using ObjectTemplate for Stats objects, and indeed it's not very performant even compared to the status quo. |
This needs a rebase. 😄 |
rebased. So if this is going to happen, we'd need a way of keeping the fields around in The somewhat reasonable options I can think of:
It would be nice to have a real-world benchmark that's particularly stat-heavy, to see if any potential optimization seems worthwhile.
My intuition is that |
FWIW I already tried ObjectTemplate in https://gist.github.com/TimothyGu/c70e7cb557d1290b5c2150111eb95a01 with unsatisfactory results. Of course (generic) you are welcome to explore further but from @mscdex's experiences it might not be a fruitful venture. |
* convert ’ to ' to turn md file to ASCII Fixes: nodejs#8276 Refs: nodejs#12607 Refs: nodejs#12818 Refs: nodejs#13256
PR reopened after landed & reverted
PR-URL: nodejs#13173 Fixes: nodejs#8276 Refs: nodejs#12607 Refs: nodejs#12818 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Brian White <mscdex@mscdex.net>
I think it's pretty clear that at this point there isn't a way to do lazy dates without compromising compatibility or performance. I'll close this PR. @sciolist Thanks a lot for creating this PR and following through on our requests. Sorry it didn't work out in the end, but I guess we all learned something from this process 🥇
See #12607
In order to improve performance when using fs *stat-functions, we can delay creation of the 4 date-objects until they are used.
This is a semver-major change:
Object.keys(statResult), Object.getOwnPropertyNames/Descriptors(statResult) will no longer return atime, mtime, ctime, birthtime, and WILL return _atime, _mtime, _ctime, _birthtime, _atim_msec, _mtim_msec, _ctim_msec, _birthtim_msec
for(x in statResult){} will iterate over _atime, _mtime, _ctime, _birthtime, _atim_msec, _mtim_msec, _ctim_msec, _birthtim_msec
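The enumeration change described above can be illustrated with a small sketch. `EagerStats` and `LazyStats` are hypothetical stand-ins with a single time field, written for this example only; they assume, as a simplification, that the `_atime` cache slot is created on first access rather than in the constructor.

```javascript
'use strict';

// Hypothetical stand-in for the existing behaviour: the Date is an own property.
function EagerStats(atimMsec) {
  this.size = 0;
  this.atime = new Date(atimMsec + 0.5);
}

// Hypothetical stand-in for the lazy behaviour: raw milliseconds on the
// instance, Date created on first access through a prototype getter.
function LazyStats(atimMsec) {
  this.size = 0;
  this._atim_msec = atimMsec;
}

Object.defineProperty(LazyStats.prototype, 'atime', {
  configurable: true,
  enumerable: true,
  get() {
    return this._atime !== undefined ?
      this._atime :
      (this._atime = new Date(this._atim_msec + 0.5));
  },
  set(value) { this._atime = value; }
});

const eager = new EagerStats(1493853008513);
const lazy = new LazyStats(1493853008513);

const eagerKeys = Object.keys(eager);
const lazyKeysBefore = Object.keys(lazy);
const sameTime = lazy.atime.getTime() === eager.atime.getTime();
const lazyKeysAfter = Object.keys(lazy);

console.log(eagerKeys);      // own keys include 'atime'
console.log(lazyKeysBefore); // own keys include '_atim_msec' but not 'atime'
console.log(lazyKeysAfter);  // '_atime' appears once the getter has run
```

Code that clones a stat result by copying own keys would therefore see different property names in the lazy variant, even though reading `atime` directly yields the same timestamp.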
Benchmark
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
lib