Strange groupings with BERT embeddings, link to "foo.c" #8

Open
collinfb opened this issue Feb 4, 2021 · 0 comments

Doing some testing on Spanish, using the pattern "trav" to select frames, so we select Traversing.{en,es} and Intentionally_traversing.{en,es}. Using MUSE LU matching, with the threshold set at .226,
we get the following frames matching:
EN Intentionally_traversing, Passing, Redirecting, and Traversing, and
ES Motion_up_down, Traversing,
and the corresponding LU matches look reasonable.
Switching to LU matching by BERT annotation vectors, with the same threshold, we get these frame matches:
EN Intentionally_traversing and Traversing, and
ES Locale_by_use, Locative_relation, and Terrorism.
The matches between the EN frames and ES Locative_relation are at least vaguely sensible, since you could have phrases where they occur near each other, like "ascend.en en frente de.es" or "climb.en sobre.es".
The matches between the EN frames and ES Locale_by_use are stranger: ford.v.en (to cross a stream) - aeropuerto.n.es, and ascend.v.en - asentamiento.n.es (EN "a settlement"). And the one match to Terrorism.es shows all the EN LUs linking to ES "foo.c".
"foo.c" looks like a dummy LU put in as some kind of hack. But I have also seen (and can't find again right now) EN "if.v". "if" can never be a verb, so I suspect that's a similar case to "foo.c".
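One quick way to surface LUs that behave like "foo.c" is to look for target LUs that a large share of the source LUs link to. This is only a sketch: the function name `find_hub_targets`, the `min_share` cutoff, and the example pairs (including `climb.v.en`) are made up for illustration and are not taken from the actual match output.

```python
def find_hub_targets(matches, min_share=0.5):
    """Return {target_lu: share} for targets linked from at least
    min_share of all distinct source LUs.

    matches: iterable of (source_lu, target_lu) pairs (hypothetical format).
    """
    matches = list(matches)
    sources = {src for src, _ in matches}
    if not sources:
        return {}
    # Collect the distinct source LUs that link to each target LU.
    linked_from = {}
    for src, tgt in matches:
        linked_from.setdefault(tgt, set()).add(src)
    hubs = {}
    for tgt, srcs in linked_from.items():
        share = len(srcs) / len(sources)
        if share >= min_share:
            hubs[tgt] = share
    return hubs


# Illustrative pairs only, not real output from the matcher.
example_matches = [
    ("ford.v.en", "aeropuerto.n.es"),
    ("ascend.v.en", "asentamiento.n.es"),
    ("ford.v.en", "foo.c"),
    ("ascend.v.en", "foo.c"),
    ("climb.v.en", "foo.c"),
]
hubs = find_hub_targets(example_matches)
print(hubs)
```

A target that every source LU links to is more likely an artifact of the vectors (or a placeholder entry) than a real translation match.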
What do you think is going on with the BERT vectors?
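To check whether .226 means the same thing for both kinds of vectors, it may help to compare what fraction of all EN-ES LU pairs pass the threshold in each space. The sketch below assumes the threshold is applied to cosine similarity, which I have not confirmed; the function names and the 2-d toy vectors are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_lus(src, tgt, threshold):
    """Return (src_lu, tgt_lu, score) for every pair scoring >= threshold,
    highest score first. src and tgt map LU names to vectors."""
    pairs = []
    for s_name, s_vec in src.items():
        for t_name, t_vec in tgt.items():
            score = cosine(s_vec, t_vec)
            if score >= threshold:
                pairs.append((s_name, t_name, score))
    return sorted(pairs, key=lambda p: -p[2])


def pass_rate(src, tgt, threshold):
    """Fraction of all src x tgt pairs that pass the threshold."""
    total = len(src) * len(tgt)
    return len(match_lus(src, tgt, threshold)) / total if total else 0.0


# Toy vectors, NOT real MUSE or BERT embeddings.
en = {"ford.v.en": [1.0, 0.0], "ascend.v.en": [0.0, 1.0]}
es = {"aeropuerto.n.es": [1.0, 1.0], "asentamiento.n.es": [-1.0, 0.0]}

matched = match_lus(en, es, 0.226)
rate = pass_rate(en, es, 0.226)
print(matched, rate)
```

If the pass rate with the BERT vectors is much higher than with MUSE at the same threshold, the threshold probably needs to be tuned separately for each embedding type.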
