-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLOWDER and tests #25
base: main
Are you sure you want to change the base?
Conversation
Pull Request Test Coverage Report for Build 11807958523Details
💛 - Coveralls |
can you describe what's the purpose of this PR? |
Co-authored-by: Adam Richie-Halford <richford@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. I implemented my own comments since I was in there changing other things anyway.
The test coverage indicates that there are some branches that are not tested in stopping.ts. But I don't think that's a blocker for this PR. I think it's ready for review by @AnyaWMa and to start implementing in ROAR-Letter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi I read through a few places, and added some questions. Many of them are probably due to my ignorance, but i just need to know more details how it works. Thank you!
items: Stimulus[], | ||
catNames: string[], | ||
delimiter: '.' | string, | ||
itemParameterFormat: 'symbolic' | 'semantic' = 'symbolic', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it accept: "a, b, guessing, and d"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using the prepareClowderCorpus
function, the parameters need to be consistent—either all symbolic (a, b, c, d) or all semantic (discrimination, difficulty, guessing, slipping). We can't mix and match between symbolic and semantic formats in the same input. The function uses the itemParameterFormat
option to convert everything to the desired format, but we need to pass one type for the input keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this requirement documented somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would suggest document it in README with sample code
} | ||
|
||
/** | ||
* Class implementing early stopping based on a plateau in standard error of measurement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this different from line 195?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it, do you mean we have repetitive classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one stops if the SEMeasurement plateaus, meaning that is fails to decrease for a certain number of trials (called patience
). E.g., if you have
trial number | SEMeasurement |
---|---|
1 | 1.0 |
2 | 0.8 |
3 | 0.6 |
4 | 0.4 |
5 | 0.4 |
6 | 0.4 |
7 | 0.4 |
If the patience
was set to 3, then the early stopping would have been triggered on trial 6 because the SE had failed to decrease for 3 consecutive trials.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it! thank you!
/** | ||
* Interface for input parameters to EarlyStopping classes. | ||
*/ | ||
export interface EarlyStoppingInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am confused how many combinations of early stopping criteria are available here? can you document this somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few combinations we can use for early stopping:
Logical Operations: We can choose between and, or, and only for combining multiple stopping criteria:
- and: All conditions need to be met to trigger stopping.
- or: Any one condition being met will trigger stopping.
- only: Only a specific condition is considered (requires you to specify the cat to evaluate).
Stopping Criteria Classes:
- StopAfterNItems: Stops after a specified number of items.
- StopOnSEMeasurementPlateau: Stops if the standard error (SE) of measurement remains stable (within a tolerance) for a specified number of items.
- StopIfSEMeasurementBelowThreshold: Stops if the SE measurement drops below a set threshold.
I added these on the readme file :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think providing sample code in README would be helpful.
beforeEach(() => { | ||
const clowderInput: ClowderInput = { | ||
cats: { | ||
cat1: { method: 'MLE', theta: 0.5 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is contained in the cat object? should cat be defined as a cat class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In reality, it is a Cat
instance. But for unit testing, it is enough to provide an object that "looks" like a Cat
instance.
|
||
const abilityPrior = normal(); | ||
|
||
export interface CatInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the difference between CatInput and Cat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CatInput is just a configuration interface that defines the setup options for creating a Cat, like method or theta. On the other hand, Cat is the actual class that manages the behavior and state.
Basically, CatInput is for setup, and Cat does the work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am confused. does it mean the user need to create CatInput instead of Cat?
if yes, can you add documentation in the README to explain how to use jsClowder
And will that conflict the use if user only wants to use jsCAT instead of jsClowder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we clarified this in our Slack huddle but I'll just document the conversation here. CatInput
is an interface defining the input format that Cat
expects when it is being instantiated. The user creates a new Cat
instance by passing in parameters that conform to the CatInput
interface. It's actually already defined in jsCat. Emily didn't add it in this PR. @AnyaWMa , you added it two years ago in this commit.
items: clowder.corpus[0], | ||
answers: 1, | ||
}); | ||
expect(nextItem).toBeDefined(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does "toBeDefined()" mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a Jest "matcher" used for unit testing https://jestjs.io/docs/expect#tobedefined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, thanks for explaining.
StopIfSEMeasurementBelowThreshold, | ||
StopIfSEMeasurementBelowThresholdInput, | ||
StopOnSEMeasurementPlateau, | ||
StopOnSEMeasurementPlateauInput, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is StopOnSEMeasurementPlateau?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In simpler terms, if the SE doesn't significantly change for a set number of items, the process stops early because it indicates that the ability estimate is no longer improving. It's used to avoid unnecessary trials once the measurement has plateaued.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment here: #25 (comment)
@@ -0,0 +1,739 @@ | |||
import { Cat } from '..'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have tests the cat will stop when item bank is used up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, all lines are covered now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is the test when item bank is used up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 303 is the test if you want a set number, if you don't add early stopping it will go trough all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And line 98 in src/__tests__/clowder.test.ts
tests that Clowder will return undefined when an item bank for a specified catToSelect
is used up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for lots of improvement! I understand the code better now, but I will still appreciate more detailed documentation in the README. Specficially, I want to know will jsClowder will conflict any use if the user will want to use jsCAT?
- Using **`or`** with `StopOnSEMeasurementPlateau` and `StopAfterNItems` allows early stopping if either condition is met. | ||
|
||
If you need more details or a specific example documented, feel free to ask! | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the document. can you give sample lines of code here?
@@ -42,22 +42,56 @@ const stimuli = [{difficulty: -3, item: 'item1'}, {difficulty: -2, item: 'item2 | |||
const nextItem = cat.findNextItem(stimuli, 'MFI'); | |||
``` | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add documentation about the accepted stimuli types?
|
||
const abilityPrior = normal(); | ||
|
||
export interface CatInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am confused. does it mean the user need to create CatInput instead of Cat?
if yes, can you add documentation in the README to explain how to use jsClowder
And will that conflict the use if user only wants to use jsCAT instead of jsClowder?
items: Stimulus[], | ||
catNames: string[], | ||
delimiter: '.' | string, | ||
itemParameterFormat: 'symbolic' | 'semantic' = 'symbolic', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would suggest document it in README with sample code
/** | ||
* Interface for input parameters to EarlyStopping classes. | ||
*/ | ||
export interface EarlyStoppingInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think providing sample code in README would be helpful.
@@ -42,22 +42,56 @@ const stimuli = [{difficulty: -3, item: 'item1'}, {difficulty: -2, item: 'item2 | |||
const nextItem = cat.findNextItem(stimuli, 'MFI'); | |||
``` | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more broadly, can we document and show sample code: how to set up a jsClowder, and some basic functions to run a clowder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for another round of clarification.
I am asking because I found in my current jsCAT: the desired zeta type is actually a mix of symbolic and semantic (a, difficulty, c, and d.). I am okay to just be all semantic or all symbolic, but I want to make sure all places that are hard-coded in the code will function as expected for this transition. Thanks!
// for mfi, we sort the arr by fisher information in the private function to select the best item, | ||
// and then sort by difficulty to return the remainingStimuli | ||
// for fixed, we want to keep the corpus order as input | ||
arr.sort((a: Stimulus, b: Stimulus) => a.difficulty! - b.difficulty!); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will anything break here because it asks for difficulty not b?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. Nine lines above, Emily ensures that the item array is in the semantic format using the fillZetaDefaults
method.
|
||
it.each` | ||
deepCopy | ||
${true} | ||
${false} | ||
`("correctly suggests the next item (closest method) with deepCopy='$deepCopy'", ({ deepCopy }) => { | ||
const expected = { nextStimulus: s5, remainingStimuli: [s4, s1, s3, s2] }; | ||
const received = cat1.findNextItem(stimuli, 'closest', deepCopy); | ||
expect(received).toEqual(expected); | ||
}); | ||
|
||
it.each` | ||
deepCopy | ||
${true} | ||
${false} | ||
`("correctly suggests the next item (mfi method) with deepCopy='$deepCopy'", ({ deepCopy }) => { | ||
const expected = { nextStimulus: s1, remainingStimuli: [s4, s5, s3, s2] }; | ||
const received = cat3.findNextItem(stimuli, 'MFI', deepCopy); | ||
expect(received).toEqual(expected); | ||
}); | ||
|
||
it.each` | ||
deepCopy | ||
${true} | ||
${false} | ||
`("correctly suggests the next item (middle method) with deepCopy='$deepCopy'", ({ deepCopy }) => { | ||
const expected = { nextStimulus: s1, remainingStimuli: [s4, s5, s3, s2] }; | ||
const received = cat5.findNextItem(stimuli, undefined, deepCopy); | ||
expect(received).toEqual(expected); | ||
}); | ||
|
||
it.each` | ||
deepCopy | ||
${true} | ||
${false} | ||
`("correctly suggests the next item (fixed method) with deepCopy='$deepCopy'", ({ deepCopy }) => { | ||
expect(cat8.itemSelect).toBe('fixed'); | ||
const expected = { nextStimulus: s1, remainingStimuli: [s2, s3, s4, s5] }; | ||
const received = cat8.findNextItem(stimuli, undefined, deepCopy); | ||
expect(received).toEqual(expected); | ||
}); | ||
|
||
it.each` | ||
deepCopy | ||
${true} | ||
${false} | ||
`("correctly suggests the next item (random method) with deepCopy='$deepCopy'", ({ deepCopy }) => { | ||
let received; | ||
const stimuliSorted = stimuli.sort((a: Stimulus, b: Stimulus) => a.difficulty! - b.difficulty!); // ask | ||
let index = Math.floor(rng() * stimuliSorted.length); | ||
received = cat4.findNextItem(stimuliSorted, undefined, deepCopy); | ||
expect(received.nextStimulus).toEqual(stimuliSorted[index]); | ||
|
||
for (let i = 0; i < 3; i++) { | ||
const remainingStimuli = received.remainingStimuli; | ||
index = Math.floor(rng() * remainingStimuli.length); | ||
received = cat4.findNextItem(remainingStimuli, undefined, deepCopy); | ||
expect(received.nextStimulus).toEqual(remainingStimuli[index]); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for more questions.
I just realized i haven't reviewed these new tests before.
What does the "deepcopy" do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deepCopy
is part of the original jsCat code. It dictates whether the input array should be deep copied before sorting in the findNextItem
method. While the deepCopy
parameter itself isn't new (it's been around since this commit, Emily noticed that it wasn't being tested, so now the unit tests test whether findNextItem
works as expected with both deepCopy=true
and deepCopy=false
.
private selectorMFI(inputStimuli: Stimulus[]) { | ||
const stimuli = inputStimuli.map((stim) => fillZetaDefaults(stim, 'semantic')); | ||
const stimuliAddFisher = stimuli.map((element: Stimulus) => ({ | ||
fisherInformation: fisherInformation(this._theta, fillZetaDefaults(element, 'symbolic')), | ||
...element, | ||
})); | ||
|
||
stimuliAddFisher.sort((a, b) => b.fisherInformation - a.fisherInformation); | ||
stimuliAddFisher.forEach((stimulus: Stimulus) => { | ||
delete stimulus['fisherInformation']; | ||
}); | ||
return { | ||
nextStimulus: stimuliAddFisher[0], | ||
remainingStimuli: stimuliAddFisher.slice(1).sort((a: Stimulus, b: Stimulus) => a.difficulty! - b.difficulty!), | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need a clarification here: the returned remainingStimuli will be semantic or symbolic, or it will be consistent with original zeta type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned stimuli will be semantic, in keeping with the legacy behavior of jsCat.
i approved the PR, but I will mention this version will break current use of jsCAT in apps. The structure of taking zetas is completely different, and the README is not valid too. it is important to let the users know (swr, vocab, comp, and maybe levante). Thank you! |
The next steps are: