sim verb
#simverb -SimLex had 999 word pairs scored acc. to similarity of the words in a pair -...wanted to do sth with words in a pair -decided let's just annotate a large dataset -did 3,500 verb pairs -result is here -idea is to assign a sim score to the 2 words in a pair -words like reply-respond 9.79 -ongoing debate what to do with antonyms -only 1 particular relation, similarity -sim. of antonyms is 'not similar': it's 0, not negative
Wicked means good, homonyms and non-unitary definitions. No good
-some previous sim eval sets -every set has problems -rare words dset has lots of rare words anno'd for rel -SimLex has only 222 verb pairs -eval. set should be rep of wide range of concepts from nat lang -targets in dset -consistent and reliable -people do job, native speakers understand what relation -gut feeling as native speakers -2 words rated 0 to 10 -able to understand instructions -c1 representative c2 clear c3 consistent -...SimLex had c2 and c3 -VerbNet is old style repo of verbs clustered acc to subcategorisation ... frames, ... temporal properties etc -...to cover wide range of verbs -of {?} only 60 {...?} -SimVerb-3500 large-scale anno'n of verbs -don't have time
-crowdsourcing with Prolific Academic as someone took issues with it at university -...annotators better than other platforms -...recommend company -acquired >65k ratings from 800 participants -post-proc. a long, hard job for Daniella but did in the end -native Eng speakers, test, if wrong disqualified -which of 3 pairs most easy -if wrong, don't take into account -removed suspicious rating patterns -finally 84% annotator rate -each of pairs has at least 10 accepted evaluations -final score is average but can always find all annotations -computed inter-annotator agreements
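The post-processing steps above (drop disqualified annotators, keep pairs with at least 10 accepted ratings, average the rest) can be sketched roughly as below. This is only a minimal illustration, not the authors' pipeline: the function name `aggregate`, the tuple layout and the example ratings are all assumptions made up for the sketch, and the check that decides who is disqualified is not shown.

```python
from collections import defaultdict
from statistics import mean

# From the notes: each pair ends up with at least 10 accepted ratings.
MIN_RATINGS = 10


def aggregate(ratings, disqualified):
    """Average accepted ratings per word pair.

    ratings: iterable of (annotator_id, (word1, word2), score) tuples
             (hypothetical layout, assumed for this sketch).
    disqualified: set of annotator ids whose ratings are thrown away.
    Returns {pair: mean score} for pairs with >= MIN_RATINGS accepted ratings.
    """
    accepted = defaultdict(list)
    for annotator, pair, score in ratings:
        if annotator in disqualified:
            continue  # failed a check, so none of their ratings count
        accepted[pair].append(score)
    return {
        pair: mean(scores)
        for pair, scores in accepted.items()
        if len(scores) >= MIN_RATINGS
    }


# Hypothetical example: 11 annotators rate one pair, one of them is disqualified.
example = [(f"a{i}", ("reply", "respond"), 10) for i in range(10)]
example.append(("bad", ("reply", "respond"), 0))
final_scores = aggregate(example, disqualified={"bad"})
```

Keeping the per-pair lists around (rather than only the mean) matches the note that the individual annotations stay available next to the final score.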
Surely you could find this with graphical models without this basic basic understanding of what it means to be synonymous, this is literally already well-annotated
-best results on state of the art surpassed by wsim ->question if useful on SimLex -divided into dev and testing [unavail on prev as too small] -bc of size can split and test -measure how difficult a section is to model -remains q, NLP mainly talking about similarity when talking about semantics -...eval should be interested in more -∴ HyperLex, graded lexical entailment: is X a type of Y -further, to what degree is X a Y -certain prototypical words will have high scores cf other words
Everything isn't a hierarchy, this silly idea
- scientism!
- "to what degree is chemistry a science"
-entailment relation
"Hypernymy, hyponymy?"
-didn't sample rand, did mult word pairs in mult rships -!!! -...{???}... -^^^ -...cite please
-had to be representative -measured if can observe any prototypicality in human judgements -when it comes to food people think sandwich, rice rather than oregano or rabbit or dinner -more prototypical animals than cats
Clearly suggestion bias issue in experiment not valid, or at least should refine what attention model is drawing out. It's not proto
-how well lex entailment correlates with similarity -some overlap -!not same -eval state o.t.a. sim models -excel, -! still score antonyms quite low -synonyms high in some dsets -some reasonable numbers -interanno agreement same as in SimVerb -best model eval gets 0.320 -...basically no machinery in NLP to tackle this -1 final thought: -we don't have really good intrinsic relation (EN?) eval sets (ltd: in size, focus, rel's covered etc) -altho it is getting better with LAMBADA
---Q&A--- -§why higher anno -=>maybe relied on ...standard Amazon Mechanical Turk workers -...bunch of features, Prolific Academic much better
-§aware of anyone trying to find out if hyponym or meronymy corresp to {...?} in embedding spaces -=>Hymerich? Shchuser?, -,trying to extract subspaces
-§curious:pairs were usually bare nouns and some hyponym -§§,considered using adj-noun cpd's? -§§red motorbike, vehicle yes -§§fake dollar, money no -=>is red bike a type of bike, non-trivial
-§visual model -=>visual models capable of entailment tasks -...3 or 4 models in paper on par with textual models esp with concrete concepts
-§from 0-10 how well defined is that as a question -§§...in sports well-defined models of matches -=>people better in ranking than assigning scores -,basically using Spearman's Correlation -§§take mean ranking and convert...? -=>>Y, take ranks, not actual scores, somehow smooth
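To make the "take ranks, not actual scores" answer concrete, here is a small pure-Python sketch of Spearman's rank correlation: both score lists are converted to ranks (ties get the average rank) and Pearson correlation is computed on the ranks. The function names and the example model scores are assumptions for illustration only; only the reply-respond rating of 9.79 comes from the notes.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation; assumes neither list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed scores."""
    return pearson(average_ranks(x), average_ranks(y))


# Human ratings on the 0-10 scale vs. scores from some model.
# Only 9.79 is from the notes; every other number is made up.
human = [9.79, 0.5, 5.0, 2.0]
model = [0.91, 0.10, 0.30, 0.45]
rho = spearman(human, model)
```

Because only the ordering matters, a model is not penalised for using a different scale (e.g. cosine in [-1, 1]) from the 0-10 human ratings.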
-§Frau in German: woman or wife -=>took off-the-shelf embeddings, Paragram, can see still captures something in entailment (HyperLex bar plot half size) -low entailment, synonyms high scores in both dsets -if eval. sim specialised embeds in entailment tasks...