-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize JSON #4077
Optimize JSON #4077
Conversation
f10138e
to
ba073f6
Compare
@@ -77,7 +77,7 @@ typedef unsigned char boolean; | |||
#define __JITTypes_h__ | |||
|
|||
// TODO: OOP JIT, how do we make this better? | |||
const int VTABLE_COUNT = 48; | |||
const int VTABLE_COUNT = 49; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh- didn't know this existed...is there a compile assert or test that exists to remind us to update this? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah there is a compile assert #Closed
lib/Runtime/Base/Constants.h
Outdated
@@ -163,6 +163,10 @@ namespace Js | |||
static const size_t GlobalCodeLength = _countof(_u("Global code")) - 1; | |||
static const size_t EvalCodeLength = _countof(_u("eval code")) - 1; | |||
static const size_t UnknownScriptCodeLength = _countof(_u("Unknown script code")) - 1; | |||
|
|||
static const charcount_t NullStringLength = _countof(_u("Null")) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above lengths are all size_t - since you're touching this code anyway, can you update those to be charcount_t too? #Resolved
@@ -538,27 +538,25 @@ namespace Js | |||
} | |||
|
|||
// If exoticToPrim is not undefined, then | |||
if (nullptr != exoticToPrim) | |||
Assert(nullptr != exoticToPrim); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a fail-fast? What guarantees that this is always not null? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is null then result is nullptr and it's not valid to pass nullptr to GetTypeId (prefast took issue with this due to some of the annotations I added).
If we get to this point, we will always have called JavascriptOperators::GetPropertyReference
for varMethod
, and we always assign exoticToPrim
= varMethod
. #Closed
lib/Runtime/Library/LazyJSONString.h
Outdated
Field(JSONProperty*) jsonContent; | ||
Field(const char16*) gap; | ||
|
||
DynamicObject* ParseObject(_In_ JSONObject* valueList) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: based on what these functions are doing, I'd prefer something like ReconstructVar/ReconstructObject/ReconstructArray for a slight improvement in readability #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree. We're not really parsing at all, are we? #Closed
{ | ||
const uint elementCount = valueList->Count(); | ||
PropertyIndex requestedInlineSlotCapacity = static_cast<PropertyIndex>(elementCount); | ||
DynamicObject* obj = this->GetLibrary()->CreateObject(true, requestedInlineSlotCapacity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: annotate true with the parameter name for readability #Resolved
lib/Runtime/Library/LazyJSONString.h
Outdated
String | ||
}; | ||
|
||
typedef DListCounted<JSONProperty, Recycler> JSONObject; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a dlist? Or would an slist suffice? Best I can tell, we always just iterate these lists from the beginning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And actually, I wonder why JSONObject and JSONArray need to be separate types- can JSONObject just be represented as JSONArray- they both look like a list of JSONProperties + Count, and you might be able to avoid a bunch of intermediate allocations
In reply to: 147304974 [](ancestors = 147304974)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be, but problem with our SLists is that we don't have any way to append. Maybe I'm missing something, but seems like you can only push and pop, which would cause the list to be reversed. I could use SList and then reverse it, but I thought that might be more expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I can know up front how many properties I am going to enumerate for Objects? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, forgot append semantics were missing in slists. I wonder how expensive doing two passes with the static enumerator is- if its not too expensive, you might get nice locality benefits by precomputing the size and allocating an array instead of a list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although if possible, I'd do it heuristically here- if the number of properties is small, allocate an array- if its large, use a list since it would likely be allocating in a bucket that is generally sparse and cause fragmentation
lib/Runtime/Library/LazyJSONString.h
Outdated
struct JSONProperty | ||
{ | ||
Field(JSONContentType) type; | ||
Field(const PropertyRecord*) propertyRecord; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need both the property record and property name? Can we live with just one? By my rough count, this structure is at 17 bytes which is unfortunate since it'll get allocated in the 32 byte bucket, and the wasted space can add up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right I can probably remove the PropertyRecord case without any real impact. I was overoptimizing for the Parse scenario.
LazyJSONString::Parse() const | ||
{ | ||
// If the gap is a non-space character, then parsing will be non-trivial transformation | ||
for (charcount_t i = 0; i < this->gapLength; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the reason behind this check but I have a couple of questions:
- If the gap is a non-space character, will JSON.parse ever succeed? If not, can we either throw a straight up syntax error, or maybe cache some position/token info and then throw?
- If the check here fails, rather than just return null, I would eagerly serialize the string here and throw away the metadata since that's no longer needed (actually, GetSz should probably throw away the intermediate metadata anyway since we dont need to extend the lifetime of those objects). We could even just move the check to the constructor (although delay serialization till Parse is called)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I was thinking you might be able to have something perverse like a gap of
{"a":"WHY"},
and so if you do serialize{b:"Because I can!"}
you get serialized string of
{
{"a":"WHY"},{"b": "Because I can!"}
}
2a. I considered throwing away the metadata if this fails but was lazy and thought it wouldn't happen in practice.
2b. I was wary of throwing away the metadata in case people call parse multiple times on the same string (or need the string and then later call parse on it). But could be that isn't a realistic concern. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yuck at 1. For 2, its probably good to release the metadata as soon as we know they're useless so that we don't waste memory. Also, would be nice to have a AutoPtr setting a flag indicating whether the complex parse failed or not and cache that- if it failed, we can throw the syntax error without having to retry the parse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like failing here would be so rare that storing any additional data about it is a waste of memory.
However, I will throw away the metadata after flattening in case of complex gap, because that seems like a much more realistic scenario.
bool Has(_In_ RecyclableObject* object) | ||
{ | ||
JSONObjectStack* stack = this; | ||
while (stack != nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Use list iterator macros? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a real list, so not sure what macros you mean?
In reply to: 147308869 [](ancestors = 147308869)
class JSONStringBuilder | ||
{ | ||
private: | ||
ScriptContext* scriptContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: all of these fields are consts? like char16* const buffer etc? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm filling the buffer, so that one isn't really a const. i can change some of these to be const, but tbh i find const members to be more effort then they are worth
In reply to: 147309108 [](ancestors = 147309108)
this->AppendJSONPropertyString(this->jsonContent); | ||
// Null terminate the string | ||
*this->currentLocation = _u('\0'); | ||
Assert(this->currentLocation == buffer + bufferLength - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is bufferLength
only ever used in this assert? Should it be covered by an #ifdef DEBUG
in that case?
this->SetStringGap(JavascriptString::FromVar(space)); | ||
break; | ||
case TypeIds_StringObject: | ||
this->SetStringGap(JavascriptConversion::ToString(space, this->scriptContext)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? Some experimentation seems to suggest that non-string non-number values here have no effect. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this corresponds to the third parameter of JSON.stringify(obj, replacer, space)
#Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nevermind; I misread this as the case for Object
, not StringObject
. #Closed
const char16* | ||
LazyJSONString::GetSz() | ||
{ | ||
const charcount_t allocSize = SafeSzSize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also check for whether m_pszValue
is already set from a previous call to this function, and just return that to avoid re-serializing repeatedly? Or does that already happen somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think GetSz() is generally called, and GetString() does this check, but there might be some cases where this could happen so I'll add in this check.
Var toJSON = nullptr; | ||
PolymorphicInlineCache* cache = this->scriptContext->Cache()->toJSONCache; | ||
PropertyValueInfo info; | ||
PropertyValueInfo::SetCacheInfo(&info, nullptr, cache, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like with your annotation changes SetCacheInfo
expects a non-null second parameter, so prefast is complaining #Resolved
} | ||
if (value == nullptr) | ||
{ | ||
Assert(holder != nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might need to be an AnalysisAssert
since prefast is complaining on the next line that holder
might be null. #Resolved
Field(JSONArray*) arr; | ||
}; | ||
|
||
JSONProperty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like clang is complaining about a lack of a copy constructor for JSONProperty
which is needed for its use in a DList
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically because of const-ness: candidate constructor (the implicit copy constructor) not viable: 1st argument ('const Js::JSONProperty') would lose const qualifier
@MikeHolman did you look at / do you remember the change in memory consumption ? |
b25030c
to
274a1e9
Compare
@obastemur Under what scenario are you interested? I didn't do that much memory testing, but in cases where JSON string is a temporary object it doesn't seem to make much difference in memory. |
0222743
to
e3870e2
Compare
@MikeHolman I didn't test it yet but the description says 3% acme-air win. Curious if number of allocations is the reason. |
bf47771
to
dfd0d38
Compare
@digitalinfinity I think you're right that we can do better than SList in some cases for objects, but maybe that is worth saving for a future checkin? |
9d12daf
to
8a4074f
Compare
lib/Common/DataStructures/DList.h
Outdated
@@ -594,6 +594,17 @@ class DListCounted : public DList<TData, TAllocator, RealCount> | |||
} \ | |||
} | |||
|
|||
#define FOREACH_DLISTCOUNTED_ENTRY(T, alloc, data, list) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this isn't used any more now that you switched to SLIST
s
Merge pull request #4077 from MikeHolman:lazyjson This change represents a major overhaul to JSON.stringify. The new architecture works as follows: We first calculate the length of the string we need and create a side data structure with the values we will need, but do not actually create any concat strings, etc. like we used to do. If/when the string is needed, we create a buffer using the length and metadata collected in the first pass, filling it element by element. If the string is passed to JSON.parse, we use our metadata to construct an object without needing to parse the string. By avoiding creation of all the intermediate concat strings, this makes JSON.stringify much more efficient. In micro-benchmarks I'm seeing JSON.stringify take around 40% less time than today. And in case the string is actually sent to JSON.parse, that now takes 70% less time. This improves Speedometer 2 geomean by about 3%. Also improves AcmeAir by about 3%.
This change represents a major overhaul to JSON.stringify.
The new architecture works as follows:
We first calculate the length of the string we need and create a side data structure with the values we will need, but do not actually create any concat strings, etc. like we used to do.
If/when the string is needed, we create a buffer using the length and metadata collected in the first pass, filling it element by element.
If the string is passed to JSON.parse, we use our metadata to construct an object without needing to parse the string.
By avoiding creation of all the intermediate concat strings, this makes JSON.stringify much more efficient. In micro-benchmarks I'm seeing JSON.stringify take around 40% less time than today.
And in case the string is actually sent to JSON.parse, that now takes 70% less time.
This improves Speedometer 2 geomean by about 3%. Also improves AcmeAir by about 3%.