Interpretation of repeated infinitives? #1

Open
mitchblank opened this issue Dec 23, 2019 · 1 comment

@mitchblank

The french-verb-conjugation.csv file has places where the same infinitive (column 1) appears multiple times in the table:

$ cut -d, -f1 french-verb-conjugation.csv | grep . | sort | uniq -c | awk '$1>1' | wc -l
      14

There seem to be multiple sources of this.

First, there are 4 cases where 100% identical lines appear in the CSV file:

% sort < french-verb-conjugation.csv | uniq -c | sort -nr | grep -v ' 1 ' | cut -d, -f1
   2 tomber
   2 recroître
   2 dédoubler
   2 croître

Those are easy to ignore.
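For anyone who wants to drop those verbatim repeats before looking at the harder cases, here is a minimal Python sketch. The function name `drop_exact_duplicates` is my own, and the sample rows are shortened, made-up stand-ins for real CSV lines, not data copied from the file.

```python
def drop_exact_duplicates(lines):
    """Return the lines with later verbatim repeats removed, keeping first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        if line in seen:
            continue  # an identical line was already kept
        seen.add(line)
        unique.append(line)
    return unique


# Shortened illustrative rows (hypothetical, not the real file contents).
sample = [
    "tomber,tombant",
    "croître,croissant",
    "tomber,tombant",
]
print(drop_exact_duplicates(sample))
```

Unlike `sort | uniq`, this keeps the original row order, which may matter if the file's ordering carries meaning.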

The remaining ones are cases where the same infinitive appears more than once, but the conjugated forms on one of the lines carry a prefix. For example, there is a normal entry for pouvoir, plus another line that looks like the entry I would expect for repouvoir:

% grep '^pouvoir,' french-verb-conjugation.csv | cut -d, -f1-9
pouvoir,pouvant,pouvant,pu,avoir,peux,peux,peut,pouvons
pouvoir,repouvant,repouvant,repu,avoir,repeux,repeux,repeut,repouvons

There doesn't seem to be a separate entry for repouvoir in the CSV, so it looks as if the prefix was simply stripped from the infinitive form?

Counting pouvoir, that pattern seems to hold for four of the duplicated infinitives (including moudre, which appears as an infinitive 3 times):

  • clore -> forclore
  • éclore -> déclore
  • moudre -> émoudre, remoudre
  • pouvoir -> repouvoir

I am far from a native French speaker, but this looks strange to me. Other verbs with prefixes have the infinitive prefixed as well (there are over 300 re- infinitives in the file, for example), so I don't know why these repeat the unprefixed infinitive.
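The prefix pattern described above could be checked mechanically with something like the following Python sketch. It rests on an assumption of mine: if two rows share an infinitive and one row's second column ends with the other row's second column, the extra leading characters are a prefix that went missing from column 1. The function name `guess_stripped_prefixes` and the `form_col` parameter are hypothetical, and the sample rows are the two pouvoir lines quoted earlier, truncated to nine columns.

```python
import csv
import io
from collections import defaultdict


def guess_stripped_prefixes(rows, form_col=1):
    """Guess which prefixed infinitive a repeated-infinitive row may belong to.

    Assumption (not confirmed by the dataset): when one row's form in
    `form_col` ends with another row's form for the same infinitive, the
    extra leading characters are a prefix dropped from column 0.
    """
    by_infinitive = defaultdict(list)
    for row in rows:
        if len(row) > form_col and row[0]:
            by_infinitive[row[0]].append(row)

    guesses = []
    for infinitive, group in by_infinitive.items():
        for base in group:
            for other in group:
                base_form, other_form = base[form_col], other[form_col]
                if (base_form
                        and len(other_form) > len(base_form)
                        and other_form.endswith(base_form)):
                    prefix = other_form[: len(other_form) - len(base_form)]
                    guesses.append((infinitive, prefix + infinitive))
    return guesses


sample = """pouvoir,pouvant,pouvant,pu,avoir,peux,peux,peut,pouvons
pouvoir,repouvant,repouvant,repu,avoir,repeux,repeux,repeut,repouvons
"""
rows = list(csv.reader(io.StringIO(sample)))
print(guess_stripped_prefixes(rows))  # [('pouvoir', 'repouvoir')]
```

This is only a heuristic: it looks at a single column and would need a manual check of each guess before rewriting any infinitive in the file.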

Then there are 6 other cases where an infinitive appears twice with genuinely different data:

  • accroître
  • décroître
  • départir
  • faillir
  • parfumer
  • ressortir

Again, I don't know enough French to say whether these entries are correct (in the sense that there are two distinct verb conjugations, perhaps for reflexive vs. non-reflexive use?). However, take accroître as an example. The two entries in the CSV file basically differ on whether the past participle is accru or accrû. This appears to just be a genuine spelling controversy (the recent dictionaries I checked all give it as accrû, but I have a 1972 copy of Harrap's which lists it as accru). Other entries in the CSV file indicate this with a semicolon-delimited list of alternatives (e.g. paye;paie).
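If the accroître rows really are just spelling variants, one option would be to fold them into a single row using the same semicolon convention. The Python sketch below shows the idea; `merge_variant_rows` is a name I made up, and the two-column rows are abbreviated placeholders rather than the full CSV lines.

```python
def merge_variant_rows(first, second, sep=";"):
    """Merge two rows for one infinitive, joining differing cells with `sep`."""
    if len(first) != len(second) or first[0] != second[0]:
        raise ValueError("rows must share an infinitive and have equal length")
    merged = []
    for a, b in zip(first, second):
        if a == b:
            merged.append(a)
            continue
        # Keep every distinct spelling, the first row's variants first,
        # skipping blanks and spellings that are already listed.
        alternatives = []
        for part in (a + sep + b).split(sep):
            if part and part not in alternatives:
                alternatives.append(part)
        merged.append(sep.join(alternatives))
    return merged


# Abbreviated placeholder rows: infinitive and past participle only.
row_a = ["accroître", "accru"]
row_b = ["accroître", "accrû"]
print(merge_variant_rows(row_a, row_b))  # ['accroître', 'accru;accrû']
```

This would only be appropriate for true spelling alternatives; if the two rows describe distinct conjugations, they should stay separate.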
@W1Real

W1Real commented Jun 9, 2024

I wouldn't put much trust in this dataset; there is no information about how it was generated. There is a dataset out there created by a trusted French institute, but I can't recall the source right now. It would just need some reformatting.
