2,889,352 events, 1,460,294 push events, 2,280,528 commit messages, 163,244,341 characters
Holy shit this one was annoying. Implementing 327 - Evaluating Simple C Expressions @ UVa. Closes #276.
Modified the way Feebas spawns in Route 119, based on ORAS. Thanks to Surskitty for the help adding a dup of the regular water tile and the inspiration to just go and do this. Really, the whole 6 random tiles gimmick freaking sucks and it doesn't even make sense.
Also added a special to manipulate the friendship value of a Pokémon in the heat of the moment.
god fucking shit i hate this i hate bloodsuckers i want this to end
Fuck You Twitter :-(
See #2639 and https://git.io/Jvhw6 for more info
Update index.html
love it when you forget 1 damn thing
thx to my super pc for crashing last night, love you baby girl
"9:30am. I am up. I had some inspiration. No, I am not ready to start. I am just a tiny bit short.
9:35am. Before I move on to NNs, I actually need to test some of my implicit policy averaging ideas in the tabular case. That is the only thing that will give me peace of mind. Also, instead of assuming it has not been done, I should look for CFR variants that have already done this.
https://papers.nips.cc/paper/2012/file/3df1d4b96d8976ff5986393e8767f5b2-Paper.pdf Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions
In this paper, we present a new MCCFR algorithm, Average Strategy Sampling (AS), that samples a subset of the player’s actions according to the player’s average strategy.
Yes, I've definitely been wondering whether it would be possible to sample according to the average strategy and how it would work.
I mean, I myself could conjecture that it would work, but I cannot build a foundation on dreams. Hard data is what a foundation is built on.
I'll look into average strategy sampling. After that, randomized linear algebra. I want to understand what happens when you optimize a noise vector. I'll revisit the AdaHessian paper as well. I do not understand why multiplying the Hessian with a random vector works.
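For reference, the identity behind the random-vector trick is Hutchinson's estimator: for z with i.i.d. +/-1 (Rademacher) entries, E[z * (Hz)] = diag(H) elementwise (and E[z^T H z] = tr(H)), because the off-diagonal terms H_ij z_i z_j vanish in expectation. A toy NumPy check of my own, not AdaHessian's actual code:

```python
import numpy as np

def hutchinson_diag(matvec, dim, n_samples=10000, seed=0):
    """Estimate diag(H) given only a Hessian-vector product `matvec`.

    E[z * (H @ z)] = diag(H) for Rademacher z, since the cross terms
    H_ij * z_i * z_j (i != j) average out to zero.
    """
    rng = np.random.default_rng(seed)
    est = np.zeros(dim)
    for _ in range(n_samples):
        z = rng.choice([-1.0, 1.0], size=dim)
        est += z * matvec(z)
    return est / n_samples

# sanity check against an explicit symmetric matrix
H = np.array([[2.0, 0.3], [0.3, -1.0]])
print(hutchinson_diag(lambda v: H @ v, 2))  # approx [2.0, -1.0]
```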
My 'key the data' idea is just something I thought up on the spur of the moment. If only I could go a tiny bit further, the foundation could be established.
I'll also need to test the duality gap method for optimizing GANs.
I do not really need a method that beats CFR+ in the tabular case. A constant factor here or there is not a big deal. It is more important for me that the algorithm translates cleanly to NNs. Regular CFR does not. But it is not that far off. I can imagine a variant being closer.
I need to find it.
9:50am. Though the GMN paper does not have samples, I doubt the authors are lying. They put a great deal of effort in elsewhere. Therefore, GAN training issues will be resolved sooner than I thought. This has large implications.
Things are moving forward. They are not the same as they were in 2018.
I need to work harder in order to find my power. Yesterday's break did me good. Let me start with average strategy sampling. I'll move from there.
Start!
10:05am. Using the average policy as the sampling policy would be really easy to implement. But this paper is a bit different from what I want. I want to see if I can get rid of the current policy and optimize the average one directly. Ultimately I am the one who is going to have to figure that out.
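Just to fix ideas, here is the simple version I have in mind for the tabular case (a sketch of my own, with a uniform exploration mix so no action's probability collapses to zero; not the paper's actual AS scheme):

```python
import numpy as np

def average_strategy_sampling_policy(strategy_sum: np.ndarray, eps: float = 0.05) -> np.ndarray:
    """Turn the cumulative strategy sums s(I, .) into a sampling distribution.

    Normalizing the sums gives the average strategy; mixing in eps of a
    uniform distribution keeps every action's probability positive, so the
    importance weights on the sampled counterfactual values stay bounded.
    """
    n = len(strategy_sum)
    total = strategy_sum.sum()
    avg = strategy_sum / total if total > 0 else np.full(n, 1.0 / n)
    return (1.0 - eps) * avg + eps / n

probs = average_strategy_sampling_policy(np.array([3.0, 1.0, 0.0]))
action = np.random.default_rng(0).choice(len(probs), p=probs)
```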
10:15am. 5/9. The algorithm here is a lot more complicated than I thought it would be.
I can't see how this would be doing average strategy sampling. It definitely feels like it is sampling the current policy... Ah, no wait. It is using the cumulative strategy s when calculating the sampling probability p. That is good.
10:30am. https://openreview.net/forum?id=YMsbeG6FqBU The Advantage Regret-Matching Actor-Critic
This rejected paper has some interesting discussion.
We have not compared our work with DREAM experimentally as DREAM is not an asymptotically consistent implementation of CFR: it does not preserve the convergence guarantees of MCCFR. DREAM would give biased results even in tabular settings.
I had no idea about this. Was this mentioned in the DREAM paper? I think not.
CFR calculates an average policy weighted by player i's reach probability only. Such averaging only works when the policy of an opponent is not changing in time.
10:55am. Had to take a short break. Let me take a look at this paper. It should be interesting.
A very similar problem applies when an average policy is calculated in section 5.3. CFR calculates an average policy weighted by player i's reach probability only. Such averaging only works when the policy of an opponent is not changing in time. Such a calculation was not problematic for DeepCFR, as external sampling emulates a time-independent uniform sampling policy. However, when outcome sampling is used, both policies within the replay memory get correlated. Technically, the DREAM paper averages the policy weighting it by the total reach and not only player i's reach, and thus is incompatible with CFR. This can be fixed by using appropriate importance weights, but this would lead to a lot of variance and has not been done in DREAM's paper.
Oh wow. So I did do the right thing by weighting by player i's reach and not doing the thing in the paper. I did not see any difference in variance when I tried it out though.
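For reference, the average being talked about here is the standard CFR one, weighted by player i's own reach only: \bar{\sigma}_i^T(I)(a) = \sum_t \pi_i^{\sigma^t}(I) \, \sigma_i^t(I)(a) / \sum_t \pi_i^{\sigma^t}(I), where \pi_i^{\sigma^t}(I) is player i's own contribution to the probability of reaching I under \sigma^t.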
We spoke to the original authors and realized that there was a misunderstanding on our part. We retract our claim of DREAM not being sound. Apologies to the authors and reviewers.
So it does the right thing after all?
11:10am. I'll have my work cut out for me. If I manage to make that NN agent by the time the yearly review comes around, I can count myself lucky. I'll do my best.
The problem is to define a model-free variant of CFR using only sampled trajectories and general (domain-independent) generalization via functional approximation, without the use of importance sampling commonly used in Monte Carlo CFR, as it can cause excessive variance in large domains.
Hmmm, yeah, maybe the importance sampling corrections I will be doing in the replay buffer will end up magnifying the variance.
Consider if I passed a one-hot vector like in the tabular setting. Then the combination of the optimizer input adjustments and the replay buffer resampling would end up multiplying in twice.
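A toy version of that worry, with made-up numbers: if a sample's importance correction w gets applied both by scaling the one-hot input and again as a replay-buffer weight, the gradient picks up w^2 instead of w:

```python
import numpy as np

w = 4.0                        # importance correction for one sampled state
x = np.zeros(8); x[3] = 1.0    # one-hot, tabular-style input

# for a linear layer, the gradient w.r.t. the active row scales with
# (input value) * (sample weight)
g_once  = (w * x) * 1.0        # correction applied in the input only
g_twice = (w * x) * w          # correction applied in input AND buffer weight
print(g_twice[3] / g_once[3])  # 4.0: an extra factor of w snuck in
```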
11:20am. No, the way I intended to reweight, dividing by the self player probability so it goes towards 1, is not right. I should be reweighting towards a uniform distribution after all. My own way of doing it, especially after factoring in the division I will do based on the inputs, would sharply amplify the non-root nodes. I thought about this months ago and came to the wrong conclusion after all.
Stability should come above all.
11:35am. My focus is low. I see that this method saves old weights. I do not like that.
11:45am. No, it has to be right. Maybe it is the initial node which needs to be adjusted. Forget that. I can work the weighting scheme out when the time comes.
https://openreview.net/forum?id=rJx4p3NYDB Lazy-CFR: fast and near-optimal regret minimization for extensive games with imperfect information
Let me just skim these few papers and then I'll take a look at regression CFR. Maybe it does something smart with noise.
The ARMAC paper would require a lot of time for me to reason it all out, and given that it does not do implicit policy averaging, I'll give it a pass.
11:50am. Ah, yes, in the probmods book, wasn't there a Bayesian thing for adapting to different distributions? I am hazy on the details. Well, never mind.
My main goal right now is to find some data to corroborate my hypothesis that adding noise will allow me to implicitly average policies.
11:55am. LazyCFR is complex. Forget it.
https://www.cs.cmu.edu/~sandholm/dynamicThresholding.aaai17.pdf Dynamic Thresholding and Pruning for Regret Minimization
Let me check out this paper.
https://arxiv.org/abs/1810.03063 Solving Large Sequential Games with the Excessive Gap Technique
There has been tremendous recent progress on equilibrium-finding algorithms for zero-sum imperfect-information extensive-form games, but there has been a puzzling gap between theory and practice. First-order methods have significantly better theoretical convergence rates than any counterfactual-regret minimization (CFR) variant. Despite this, CFR variants have been favored in practice. Experiments with first-order methods have only been conducted on small- and medium-sized games because those methods are complicated to implement in this setting, and because CFR variants have been enhanced extensively for over a decade they perform well in practice. In this paper we show that a particular first-order method, a state-of-the-art variant of the excessive gap technique---instantiated with the dilated entropy distance function---can efficiently solve large real-world problems competitively with CFR and its variants. We show this on large endgames encountered by the Libratus poker AI, which recently beat top human poker specialist professionals at no-limit Texas hold'em. We show experimental results on our variant of the excessive gap technique as well as a prior version. We introduce a numerically friendly implementation of the smoothed best response computation associated with first-order methods for extensive-form game solving. We present, to our knowledge, the first GPU implementation of a first-order method for extensive-form games. We present comparisons of several excessive gap technique and CFR variants.
I should look into what the excessive gap technique is.
12:05pm. The algorithm is complex. Skip.
12:30pm. Had to take a short break.
https://arxiv.org/abs/1608.04481 Lecture Notes on Randomized Linear Algebra
https://arxiv.org/abs/2002.01387 Randomized Numerical Linear Algebra - Foundations & Algorithms
My sense is that I won't get what I want simply by looking up CFR papers. For this subject, I really should ask Marc Lanctot if there has been work on implicit policy averaging.
But not just yet. I'll leave asking other people what could be dumb questions as the last resort.
Topics covered include norm estimation; matrix approximation by sampling; structured and unstructured random embeddings
Those last two seem like what I'd want!
12:40pm. Let me stop here for a bit. I think that first paper might be a bit out of date, so I'll go with the latter one. I'll dedicate myself to studying RNLA for a day or two. The way the Hessian trace is estimated in AdaHessian is something I want to study more in depth anyway, and the paper starts off with trace estimation. It is like it put all the things I am actually interested in right at the start. That is always a good sign.
I get the feeling that I mostly wasted the morning apart from discovering average strategy sampling in CFR. I am really happy that it works better than sampling the current policy.
In fact, if my ideas work out, average strategy sampling is something that will happen naturally in the NN setting.
Time for breakfast."
05/03/2021
You will break my heart if you take advantage of my kindness and open heart to make me regret sharing. Be kind and forgive my desire to display enough anti-patterns to make sure parents don't try to encourage their kids to be like me. Every time I was in their shoes I promised myself I would do everything I can to make sure that I display my love through my work but never let myself become a brand. It's hard enough having a 4 letter last name.
SIXTEN DEMO BUILD. Finished imp state graph, removed old/incomplete enemies from the field temporarily. Added the jankiest title screen of my goddamn life. Added two sample tracks. Hacksaw bandaided bounds into the game for Azumi. Fixed hitboxes flipping correctly for Azumi, might still be broken for enemies, but nobody will notice hopefully :)
Hair/beard control
Jesus holy Christ let this cursed shit be done this time
Lots of Bugfixes 🐜🕷🦟
- Safari seems to have some issues with service worker registrations. [shocker]
Sometimes, unpredictably, serviceWorker.getRegistration() or unregister() might take ages to complete (and evidently there is no built-in timeout mechanism), leaving users in limbo during updates if Safari casually decides to take the day off.
And since it’s an async task, and promises don’t have a built-in timeout mechanism, I had to add a promisified 15-second timeout using Promise.race.
It’s less than ideal, and it’s hacky, and I hate it. But come on Apple … please stop underfunding the Safari team… It’s an incredibly promising browser that is ruined by your ambition to become the first quadrillion dollar company.
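(The fix itself is JS Promise.race; below is just the shape of the workaround sketched in Python with asyncio.wait_for, using a hypothetical get_registration stand-in:)

```python
import asyncio

async def get_registration():
    """Hypothetical stand-in for serviceWorker.getRegistration(),
    which on Safari may simply never resolve."""
    await asyncio.sleep(3600)

async def get_registration_with_timeout(seconds: float = 15.0):
    # race the lookup against a timer instead of hanging forever
    try:
        return await asyncio.wait_for(get_registration(), timeout=seconds)
    except asyncio.TimeoutError:
        return None  # treat a hung lookup as "no registration"

# asyncio.run(get_registration_with_timeout())
```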
—
-
Fixes #116
-
Fixes #118
-
increased small font sized things in a few places
-
this should fix documentActionButtons hints not showing up. We somehow didn’t take a note for this – but iirc the overflow:hidden was there to prevent a layout problem (overscrolling in a specific scenario on a certain machine/OS, i.e. Safari on Mac or something Windows-specific). But overflow:hidden fucks up documentActionButtons’ tooltips, keeping them from showing.
-
A better document tooltip experience for mobile devices. Made it so that the mobile document tooltip is wider on devices with screens larger than 368px, and took out list item actions on devices with screens smaller than 368px. This way we get 2 super clean rows instead of 3 rows, since 3 rows make typing / styling very difficult on not-so-tall devices like the iPhone 6/7/8/11/12 mini etc.
-
update button better line height
-
better font rendering for menu & upgrade
-
mini fix to photos upgrade button
-
fixes pin sidebar icon position bug
-
fixes move button for safari
-
if there are <100 tagged search results, we show the photos themselves in results, and clicking on a photo now loads it by first loading the album.
-
We now pass “jpg” as fallback extension for photos
-
Don’t load album for search results if it’s already loaded
-
moved download one up on landing page
-
Won’t be checking ECONNABORTED / timeouts anymore
I remember(ed)
when I first went to school
They said, Don't be a joker, don't be a fool
Pledge your allegiance to the red, white and blue
Don't expect your country to do nothin' for you
They said your forefathers loved you but I only had one
And I watched him die in the heat of the sun
Suckin' up the bullets of a policeman's gun
There was nothin' I could do but stay away
In the U.S.A. you gotta take your chances
If you plan on stayin' free
They call this the land of the living
But they're trying to make a dead man out of me
I remember when I first hit the road
I was runnin' from the lenders of the money I owed
Came to Austin, heard a knock at my door
Crossed another border and I'm runnin' some more
They say they're gonna get me but it ain't happened yet
Burnin' my fingers on my last cigarette
It's time to remember time to forget
Nothin' I could do but get away
I remember all the stories I heard
How a man's supposed to be 'bout as free as a bird
My brother's in prison, my father's dead
Me, I'm tired of living with a price on my head
I wonder if there's a place to be
In the whole wide country for a fella like me
Name's in the paper, face on T.V.
Nothin' I can do but get away
1.2 - 'A happy little Family'
-
People are now less likely to marry their cousins
-
Improved code where I could
-
Tweaked some numbers
-
Prayed to the Machine Spirit that no bugs were introduced
- Note: making the cousin system work was a fucking pain in the ass, and it feels like I could still optimize it somehow more
nyaa: pup selector fixed, shows all 75 results per page now; elinks config better; fucking st scrollback again, scroll sucks ass; uhosts fixup
Fixes donators not being able to use the donation tab because kmc is a shitty coder, and oh my fucking god why are you so bad, i hate this, i hate savefiles, who came up with the idea of savefiles, there's like 30 migrations in them because WHY THE FUCK NOT, hi dad hi mom (#10750)
-
Reset donor hat and item because someone doesn't know how to code
-
Part 2
-
Update preferences_savefile.dm
Update base for Update on "[codegen] split out backend-specific information from NativeFunction in the model"
Data model change in the codegen, which splits backend-specific information out of NativeFunction
Currently in the codegen, native_functions.yaml has backend-specific information about each operator that is encoded directly into the data model, in the NativeFunction object. That's reasonable, since the native_functions.yaml is the source of truth for information about an operator, and the data model encodes that information into types.
Now that external backends can use the codegen though, that information is technically incomplete/inaccurate. In another PR, I tried patching the information on the NativeFunction object with the additional external information, by updating the dispatch entry to contain the external backend kernel name and dispatch key.
Instead, this PR tries to split out that information. The NativeFunction class contains all information about an operator from native_functions.yaml that's backend-independent and is known never to change regardless of what extra information backends provide. We also build up a backend "index", which is basically a mapping from [backend] -> [backend-specific-metadata]. Reading in an external backend yaml just involves updating that index with the new backend.
There were a few places where NativeFunction used the dispatch table directly, that I encoded as properties directly on the NativeFunction object (e.g. is_abstract). They were mostly around whether or not the operator has a composite kernel, which isn't something that's going to change for any external backends.
This has a few advantages:
- We can more easily re-use the existing logic in native_function.py and register_dispatch_key.py for both native and external backends, since they both involve a NativeFunction + a particular backend index.
- The data in the data model will be the same regardless of how the codegen is run. Running the codegen with a new external backend doesn't change the data inside of NativeFunction or an existing backend index. It just adds a new index for that backend.
- There are several codegen areas that don't care about backend-specific information: mostly the tracing and autograd codegen. We can reason about the codegen there more easily, knowing that backend-specific info is entirely uninvolved.
An alternative to this split would be to augment the NativeFunction objects with external backend information at the time that we create them. So the external codegen could read both native_functions.yaml and the external backend's yaml at the same time, and construct a NativeObject with a full dispatch table (including the XLA entry), and the correct setting of structured (taking into account both yamls). One disadvantage to this approach is that NativeFunction objects now contain different stuff depending on how you ran the codegen, and you have to make sure that any changes to the codegen can properly handle all the different variants.
Removed 3 classes, which were used by the external codegen:
- ExternalBackendFunction
- ExternalBackendFunctionsGroup
- ExternalBackendMetadata
And added two new ones:
- BackendIndex
- BackendMetadata
BackendIndex contains any info that's specific to that backend, plus a mapping from operator names to backend-specific metadata about the operator. One example of backend-specific info that's not operator-dependent is the fact that XLA prefers to implement functional kernels instead of out kernels (and so when they eventually mark an op as structured, they're going to mark the functional op and not the out op).
BackendMetadata contains info specific to an (operator, backend) pair. Right now, that's just (a) the name of the kernel, and (b) whether or not that operator is structured.
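Roughly, the shape of the new data model looks like this (a sketch; the field names are my approximation of the PR, not the literal class definitions):

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class BackendMetadata:
    # info specific to an (operator, backend) pair
    kernel: str        # name of the backend's kernel for this operator
    structured: bool   # does this backend implement the op as structured?

@dataclass(frozen=True)
class BackendIndex:
    dispatch_key: str         # e.g. "CPU", "CUDA", "XLA"
    external: bool            # out-of-tree backend?
    use_out_as_primary: bool  # e.g. XLA prefers functional over out= kernels
    index: Dict[str, BackendMetadata]  # operator name -> metadata

    def get_kernel(self, op_name: str) -> Optional[BackendMetadata]:
        return self.index.get(op_name)
```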
I wanted to get this PR up earlier so I could get feedback, but there are a few things I want to call out:
Dealing with structured.
This PR separates out the notion of structured into two bits of information:
- Does [operator] have a meta() function? This is backend-agnostic, and is represented by the structured property on NativeFunction, same as before. This is used, e.g., to decide what signatures to add to MetaFunctions.h.
- Does [operator, backend] have an impl() function? This is backend-dependent; even though technically all in-tree backends are forced to write impl() functions for an operator when we port the op to structured in native_functions.yaml, out-of-tree backends can decide to opt in independently. This is represented as a property on BackendMetadata. This is used in most other cases, e.g. in RegisterDispatchKey when we're deciding whether or not to gen a structured or unstructured wrapper.
I also baked is_structured_dispatch_key directly into each BackendIndex. So for operators marked "structured" in native_functions.yaml, their corresponding CPU/CUDA BackendIndex entries will be marked structured, and all others (except for potentially external backends) will not.
I ended up trying to deal with structured in this change since it's technically backend-dependent (XLA can opt kernels into structured separately from in-tree ops), but that may have been too ambitious: it's technically not relevant until we actually add support for structured external kernels. If it's not clear that this is the right path for dealing with structured and we want to push that off, I'm fine with backing out the bits of this PR that make structured backend-dependent. I don't see anything too controversial related to structured in the change, but I tried to call out any areas in the comments.
Localizing the fact that external backends follow Dispatcher convention.
Another thing that's sort of backend-specific that I didn't totally address in this PR is the fact that in-tree backends follow the Native API while external backends follow the Dispatcher API. I painted over that in native_functions.py by adding a helper, kernel_signature, that takes in a native function and gives you the "correct" signature for the specified backend: NativeSignature for in-tree backends, and DispatcherSignature for out-of-tree backends. In order to make that fully usable though, we'll need NativeSignature and DispatcherSignature to have matching interfaces. I didn't bother with that in this PR, which is why gen_external_aten_fallbacks.py still has a bunch of direct references to the dispatcher API. Thinking of adding it in a later PR, but wanted to see if anyone has other opinions.
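As a sketch, the helper presumably looks something like this (my approximation, reusing the codegen's existing NativeFunction / NativeSignature / DispatcherSignature types; not the PR's exact code):

```python
def kernel_signature(f: "NativeFunction", backend_index: "BackendIndex"):
    # in-tree backends write kernels against the native API;
    # out-of-tree backends write them against the dispatcher API
    if backend_index.external:
        return DispatcherSignature.from_schema(f.func)
    return NativeSignature(f.func)
```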
Maybe is_external() shouldn't even be a property on the BackendMetadata, and anything the codegen does that requires asking for that information should just be better abstracted away.
Thoughts on the BackendIndex / BackendMetadata breakdown.
One thing that's annoying right now is that to query for various pieces of metadata, you call helper functions like backend_index.structured(f), which queries that particular backend and tells you if that specific NativeFunctionGroup is structured for that backend. It has to return an Optional[bool] though, since you have to handle the case where that operator doesn't have a kernel for that backend at all. So users of those helpers end up with a bunch of optionals that they need to unpack, even if they know at some point that the result isn't None. I think it would be easier instead to just store the NativeFunction object as a field directly on the BackendMetadata. Curious if there are any other opinions on a better way to model it though.
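Concretely, the annoying pattern looks something like this (hypothetical call site, building on the sketch above):

```python
from typing import Optional

def structured(backend_index: BackendIndex, op_name: str) -> Optional[bool]:
    # None means "this backend has no kernel for the op at all"
    m = backend_index.get_kernel(op_name)
    return None if m is None else m.structured

cpu_index = BackendIndex(
    dispatch_key="CPU", external=False, use_out_as_primary=True,
    index={"add.out": BackendMetadata(kernel="add_out_cpu", structured=True)},
)

# the call site knows the kernel exists, but still has to unwrap the Optional:
s = structured(cpu_index, "add.out")
assert s is not None
if s:
    pass  # emit the structured wrapper
```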
[ghstack-poisoned]