Should spaCy move to Universal Dependencies (controversy)? #13738
ivan-kleshnin asked this question in Help: Other Questions (unanswered)
The topic of UD has been raised multiple times, e.g. in #2485, but mostly as "How soon will spaCy switch to Universal Dependencies (UD)?" My question is different, however.
Should spaCy transition to Universal Dependencies? 🤔
So I've been comparing spaCy graphs with CoreNLP graphs for a while... I initially found that it's trivial to get from a matched token to its governing (main) verb (to check whether it's negated, its tense, etc.) in spaCy, and not so much in CoreNLP. Then I got the general feeling that the new approach would be harder and less performant to work with, at least for my tasks. And then I found this rabbit hole of UD vs DG (dependency grammar), a polarising topic amongst linguists.
For non-specialists, to simplify: UD puts semantics over grammar and DG puts grammar over semantics. Imagine a Python parser favoring semantics over syntax... Sounds disturbing.
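To make the traversal difference concrete, here is a minimal sketch on a toy sentence. Nothing in it is parser output: the `Tok` class, the helper names and both annotations are hand-written assumptions for illustration. The first annotation uses labels in the style of spaCy's English models, the second follows the UD convention of making the content word the head of a copular clause.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tok:
    i: int      # position in the sentence
    text: str
    pos: str    # coarse part-of-speech tag
    head: int   # index of the head token; the root points at itself
    dep: str    # dependency label

def build(rows):
    return [Tok(i, *row) for i, row in enumerate(rows)]

# "The cat is not on the mat", annotated by hand (hypothetical, not parser output).
# Function-word heads: the copula and the preposition govern.
FUNCTION_HEAD = build([
    ("The", "DET", 1, "det"), ("cat", "NOUN", 2, "nsubj"),
    ("is", "AUX", 2, "ROOT"), ("not", "PART", 2, "neg"),
    ("on", "ADP", 2, "prep"), ("the", "DET", 6, "det"),
    ("mat", "NOUN", 4, "pobj"),
])
# Content-word heads (UD style): "mat" is the root, "is" and "on" hang off it.
CONTENT_HEAD = build([
    ("The", "DET", 1, "det"), ("cat", "NOUN", 6, "nsubj"),
    ("is", "AUX", 6, "cop"), ("not", "PART", 6, "advmod"),
    ("on", "ADP", 6, "case"), ("the", "DET", 6, "det"),
    ("mat", "NOUN", 6, "root"),
])

def children(sent, tok):
    return [t for t in sent if t.head == tok.i and t.i != tok.i]

def governing_verb(sent, tok):
    """Climb head links until a VERB/AUX is found; None if the root is reached first."""
    cur = tok
    while cur.pos not in ("VERB", "AUX"):
        if cur.head == cur.i:
            return None
        cur = sent[cur.head]
    return cur

def copula_of(sent, tok):
    """Return the first child attached with the `cop` label, if any."""
    return next((c for c in children(sent, tok) if c.dep == "cop"), None)

def is_negated(sent, tok):
    """True if a negation marker is attached directly to `tok`."""
    return any(
        c.dep == "neg" or (c.dep == "advmod" and c.text.lower() in ("not", "n't"))
        for c in children(sent, tok)
    )

# Function-head tree: climb from "mat" to the verb, then ask the verb about negation.
verb = governing_verb(FUNCTION_HEAD, FUNCTION_HEAD[6])
print(verb.text, is_negated(FUNCTION_HEAD, verb))

# Content-head tree: climbing finds no verb above "mat"; the copula is a child,
# and the negation is attached to "mat" itself rather than to "is".
mat = CONTENT_HEAD[6]
print(governing_verb(CONTENT_HEAD, mat), copula_of(CONTENT_HEAD, mat).text,
      is_negated(CONTENT_HEAD, mat))
```

In the first tree, "climb until you hit a verb" answers the question. In the second, the same query has to look downwards for a `cop` child and read the negation off the noun, which is the kind of rewrite a scheme change would force on existing traversal code.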
Here's an authoritative paper with solid arguments against UD:
The status of function words in dependency grammar: A critique of Universal Dependencies
Most of all, I'm concerned that this topic is casually discussed in other threads, like it's not a big deal, just a matter of some corpus refactoring 😨 I'm not a linguist, but my engineering experience is enough to see that UD is a huge breaking change. It also revisits foundations of linguistics since, I dunno, 1980 for benefits mostly focused on language translation. Undoubtedly an important topic, but not all linguistics and NLP boils down to that.
A migration to spaCy v4, should it be UD-based, might be very hard for larger systems. I imagine a lot of graph-traversal algorithms would have to be revisited and replaced. One potential solution would be to support both approaches in parallel, but I'm not sure the amount of work for that is tolerable.
More resources:
Assessing Theoretical and Practical Issues of Universal Dependencies
UD are fundamentally flawed
As an outsider, I can't speak for trends. Maybe UD is clearly winning mindshare, so the matter is already decided in 2025.
But for me, at the moment, it doesn't look like that. It seems to be primarily pushed by Google and Stanford University.
Not sure if I should tag Matthew for this. Maybe it was discussed previously and I've simply failed to find the thread...