Process to avoid content overlap in original lessons in different languages #2141

Closed
jenniferisasi opened this issue May 25, 2021 · 37 comments · Fixed by #2394

Comments

@jenniferisasi
Contributor

jenniferisasi commented May 25, 2021

With more original lessons now arriving in our four languages, a new and exciting issue arises: how to avoid content/methodology overlap between new lessons and already published ones, which could be translated instead.

Originally, there was no such problem because EN lessons were being translated into ES|FR|PT with datasets or examples sometimes changed (localized). Now we are receiving new lessons that are not being translated (as of today) to EN, for example.

Examples of current potential overlap:

In the AGM meeting today (May 25, 2021) a few ideas came up:

  • Having a spreadsheet with available lessons;
  • @rivaquiroga suggested making it explicit in the author guidelines that authors should check the translation concordance to see which methods have already been published in any language.
  • How MEs or editors at large can stay aware of, and prevent, this overlap remains the bigger question, so as not to put further work on MEs in particular.
  • It might be a 'project' for the Global Team.

I'm adding a question:

  • what can we do to get people interested in translating between the rest of possible pairs of languages: ES:EN; ES:FR; ES:PT; FR:EN; FR:ES, and so forth?

Finally, we all realize/know that complete overlap is not going to happen, because many of the lessons are guided by their research question and a dataset or set of materials, or because working in a particular language might need a substantial modification of the method (e.g. for NLP analysis). However, in some cases we have instructions on how to use a standalone platform whose use is not tied to the language of the content (Map Warper, Omeka, etc.). In those cases, some of us believe, there is no reason for a new lesson, only for a translation, which would add great value for everyone and recognize that non-English speakers are also creators of knowledge/methods in DH.

@drjwbaker
Member

  • Having a spreadsheet with available lessons;
  • this is already a thing in our Translation concordance page but I am guessing it is not checked (given the idea)
  • @rivaquiroga suggested making it explicit in the author guidelines that authors should check the translation concordance to see which methods have already been published in any language.

I agree with @rivaquiroga on this.

@drjwbaker
Member

  • How MEs or editors at large can stay aware of, and prevent, this overlap remains the bigger question, so as not to put further work on MEs in particular.

Keep it simple? - e.g., MEs check their articles + concordance for potential overlap, and if they think there is overlap in another publication but aren't sure, write to them to discuss.

@drjwbaker
Member

drjwbaker commented May 26, 2021

what can we do to get people interested in translating between the rest of possible pairs of languages: ES:EN; ES:FR; ES:PT; FR:EN; FR:ES, and so forth?

This is the one about mission and ownership. Two observations:

  1. My understanding is that the 'missions' of the ES, FR and PT publications are to publish articles (in translation or otherwise) for their language audiences given the historic/systemic under-representations of digital history/humanities methods publications in those languages.

  2. My understanding is also that none of our publications have 'missions' to amplify historically/systematically under-represented non-Anglophone scholarly voices by translating their works submitted to PH into languages other than their first/working language, typically into English.

If I'm right in these observations, we need to attend to 2. This could be a role and a Ltd service with a clear 'for good' mission. I suspect this is a separate ticket. It could form part of our funding drive, tapping into the global/development philanthropic ambitions of large anglophone institutions.

@drjwbaker
Member

(and many many thanks @jenniferisasi for so skillfully summarising our discussion!)

@acrymble

If this were happening in biomedical sciences, or another history research journal, I think we'd expect the editor to be responsible - with the help of the peer reviewers - to make sure that tutorials made a substantial new contribution to knowledge.

When I submit an article I always have to put my contribution in the wider context of what we already know about that topic (the historiographical context). If we made that more central to our requirements, both authors and editors would get more used to checking the state of the field (including in other languages) and this problem might resolve itself?

We Anglophones are really not used to thinking about what's happening in other languages, so while we're very willing, we probably collectively need to be reminded more than everyone else. As all review tickets are open for public comment, if you do notice potential overlap with another language topic, it would be helpful to mention it on the ticket.

@drjwbaker
Copy link
Member

Agreed. But we can still help the reviewers by, say, pushing them to look at the concordance, as @rivaquiroga describes doing for authors, no?

@acrymble

Yes. That feels like a good training opportunity. Maybe we can host an upskilling workshop from time to time aimed at editors and best practices, with issues decided by the full team in advance based on need?

@drjwbaker
Member

And/or put - ~"look at the concordance" - in the review guidelines wrt assessing it for originality.

@DanielAlvesLABDH
Contributor

Hi to all! I agree that the need for a first check for overlapping lessons has to be made clearer, and I agree that some extra sentences or words in the translation, author, editor and reviewer guidelines will help. But that is also the task of the editor. A couple of months ago the PT team received a new lesson about Python. Although it was original and applied to a specific Brazilian online resource, there was some overlap with other lessons already published in EN and ES. The author agreed to reformulate the lesson.

@mariajoafana
Contributor

2. My understanding is also that none of our publications have 'missions' to amplify historically/systematically under-represented non-Anglophone scholarly voices by translating their works submitted to PH into languages other than their first/working language, typically into English.

I'm not sure that I'm getting your point here. We could say that tutorials written in languages other than English are not only aimed at amplifying those under-represented voices, but those tutorials can make substantial new contributions to the field and to global anglophone audiences and as such are worthy of translation as much as the English language tutorials are worthy of translation into other languages.

@drjwbaker
Member

those tutorials can make substantial new contributions to the field and to global anglophone audiences and as such are worthy of translation as much as the English language tutorials are worthy of translation into other languages.

Totally agree! My point is that none of our 4 publications currently have a mission to do just that. I'm suggesting we consider creating a role dedicated to doing that.

@jenniferisasi
Contributor Author

The situation that @DanielAlvesLABDH shares is both the issue at hand and the ideal solution: checking for overlap, asking the author to reformulate, and (I would add) citing the existing lesson. So that would be another note for the author guidelines when submitting a lesson.

@drjwbaker yes, we need to make this mission explicit. And to @acrymble's point, exactly: one is supposed to check for existing papers/research on what one is about to write about, but it doesn't hurt to remind people that said research might also already exist in other languages. I'd like to add that we are working in some of the top languages in the world, so Google Translate does wonders with them at this point in time.

@jenniferisasi
Contributor Author

This is a good problem to have, by the way.

@walshbr
Contributor

walshbr commented May 26, 2021

Just a note on "look at the concordance" in the guidelines. I like this in spirit, but the translation concordance would get you close but probably not quite serve as is for what you're describing. I'm imagining two situations, in particular (languages just sort of picked arbitrarily for the examples below):

  1. Author who does not speak French proposes a lesson on a particular tool in Spanish, is able to parse that a French lesson title at least has the same tool's name and so is likely to overlap. They'd have to then talk with the French language team (or we'd have to do so on their behalf) to figure out the degree to which there is overlap, because they wouldn't be able to determine on their own w/o knowing French.
  2. Lessons on methods or concepts more than tools pose a different problem. Something like…"Preserving Your Research Data". If that lesson had originated in French first as "Préserver ses données de recherche", it would be a heavier lift for an author who cannot read French to tell if their own proposed lesson on research data preservation will overlap. There's no shared tool or method named in the title that would draw the eye to the lesson that overlaps. Does that make sense? It's not as easy as just looking for the word "Twitter." I think to do it right you would want to have the titles at least translated across languages (regardless of whether or not the whole article has been translated) somewhere to help catch these kinds of cases, whereas now they only show up in the concordance if they have already been published. The concordance could maybe be a space for that conceptually, but implementing it might pose a couple design and technical challenges that would need to be thought through.

That'd be a bunch of translating work, of course, but it could help encourage the kind of awareness of activity across the journals you're describing.

@jenniferisasi
Contributor Author

jenniferisasi commented May 26, 2021

@walshbr, that's a good observation. I think point 1 is where we need to decide what the MEs do and what the role proposed by James covers, for example. There is already a case in the works, and @mariajoafana put a comment on the EN ticket for the editors.
On your second point, I think having the titles translated is a very good idea and not extremely difficult or a lot of work. I am already thinking of the technical solution: keep the link for lessons that are available, and show plain text with no link for those that are not translated (with an explanation of that above). It will be an overhaul of the translation concordance page, but doable. I will consult with the tech-team about that.
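
A minimal sketch of the rendering idea described above, assuming a hypothetical data shape in which every lesson has a title in each language and a URL only where that version is published. The names `render_cell` and `render_row`, the example URL, and the ES/PT titles are illustrative assumptions, not the existing concordance code or checked translations.

```python
# Hypothetical sketch: names, data shape, and URL are assumptions,
# not the site's actual concordance logic.
LANGS = ["en", "es", "fr", "pt"]

def render_cell(title, url=None):
    """Return a Markdown link if the lesson exists in this language, else plain text."""
    return f"[{title}]({url})" if url else title

def render_row(lesson):
    """Build one Markdown table row from a dict of {lang: (title, url_or_None)}."""
    cells = [render_cell(*lesson[lang]) for lang in LANGS]
    return "| " + " | ".join(cells) + " |"

example = {
    "en": ("Preserving Your Research Data", "https://example.org/en/preserving"),
    "es": ("Preservar tus datos de investigación", None),     # illustrative title
    "fr": ("Préserver ses données de recherche", None),
    "pt": ("Preservar os seus dados de investigação", None),  # illustrative title
}
print(render_row(example))
```

Only the published version is rendered as a link; the other three cells stay as plain text, which is what would let an author spot a likely overlap without the lesson having been translated yet.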

@walshbr
Contributor

walshbr commented May 26, 2021

Yep that was what I was thinking too @jenniferisasi. Though I'd have to look at the logic for the concordance to see how easy that is to slide in.

@jenniferisasi
Contributor Author

Adding to the translation concordance improvement: I would also add a link to the issue in ph-submissions for translations that are under review, to point at those lessons that are on their way. The links may have a different color, or we could add a parenthetical "(under review)" for those.

And then we would have to make that concordance more prominent on our site.

@jenniferisasi
Contributor Author

[Screenshot: draft translation concordance with all titles translated]

This is how a new translation concordance will look with all titles translated (I used Google Translate with no editing at all for FR and PT, so ignore any mistakes in there).

@DanielAlvesLABDH
Contributor

Thanks @jenniferisasi! I think this is a good solution. We can collaborate on the translation.

@drjwbaker
Member

drjwbaker commented May 28, 2021

So it looks like this ticket has a solution to the issue of overlap. I'll make a ticket for ownership of translation out of ES/FR/PT.

@jenniferisasi
Contributor Author

To clarify, that preview/image is a .md file that I typed by hand, whereas the current page is created automatically with a couple of lines of code. So we still need to decide on the easiest way of creating such a file with the least amount of work, and the tech team is on it.

@svmelton
Contributor

Thanks, all—I agree that this is a good problem to have! I love @jenniferisasi's concordance draft and agree that it would be useful to add this to the submission guidelines. It sounds like we're reaching a consensus: is there anything else we need to do to move this forward, or anything I'm missing?

@jenniferisasi
Contributor Author

Hey @svmelton, no need to do anything else here if you like the concordance and its future addition to the guidelines. But you might want to comment on #2143, as whether/how to accept translations into EN, and a pipeline for it, is a decision the EN team needs to make.

@acrymble

Can this conversation be summarized into any actions? Otherwise I fear it will just sit open.

@jenniferisasi
Contributor Author

jenniferisasi commented Jun 27, 2021

Let's try, but add if I'm missing anything:

  • 1. Create/edit concordance page to facilitate translations: existing lessons have a link, the rest show what would be the title in another language. @jenniferisasi via @programminghistorian/technical-team started to work on it, will require linguistic review
  • 2. Add a note on author guidelines to check for existing lessons in other languages (concordance would help)
  • 3. Plan on something to get people interested in translating other pairs of languages; maybe @programminghistorian/communication-team and @programminghistorian/global-team

@drjwbaker
Member

The third bullet here effectively closes/supersedes #2143. Linking here as it provides useful context/discussion.

@jenniferisasi
Contributor Author

For the record, I have started to work on item 1, creating the four-language titles in a Google Sheet, as it will be easier to share for language checks (and it's easy to transform into .md later if needed).

@anisa-hawes
Contributor

Hello @jenniferisasi. Would you like to share a link to the Google Sheets document you've created? If it's useful I can include it within the Minutes of today's Project Team Meeting. Do you envisage that this list of lesson titles/concordance document will eventually be part of the Wiki?

The other aspect which has been discussed above, is how this might be woven into our Lesson Proposal Form and Step 1 of the Author Guidelines. You mentioned this afternoon, that if authors find a lesson already exists on their chosen topic, we could use this as a springboard to suggest a new translation rather than a new original lesson.

@jenniferisasi
Contributor Author

Sure thing, here it is: ph-lesson-concordance

On point 2, I agree that a warning of sorts should go in the Author Guidelines. We could link to this spreadsheet once it is ready (and it will update continuously), or to the translation concordance page if we on the tech-team decide it is feasible to update it constantly. @Anisa-ProgHist the problem with both options is that the editors would have to add the new lesson and ask for its translation; on the spreadsheet, however, there is no need to do that via a pull request.

What hasn't been decided yet, I think, is what to do if an author brings up an overlapping lesson but cannot translate the existing lesson themselves. I think a clear statement in that regard should appear in the guidelines as well.

@anisa-hawes
Contributor

Hello @jenniferisasi! I'm keen to incorporate this concordance checking as a step within the Editorial Guidelines I am drafting.

My understanding from this thread, is that:

  • Authors/MEs could refer to it during the pre-submission / proposal phase
  • Editors could add a new row when they start working on a new lesson (this is the step I would like to add now)

Is the Google Sheet you created accessible to everyone in the Google Group?

I can't see the latest edit date... so I'm unsure how regularly it is being updated/used at the moment — Is this still an active document?

I think the steps towards closing this Issue will be:

  • Add this step to my draft of revised Editorial Guidelines
  • Link this document to Step 1: Proposing a New Lesson of the Author Guidelines, alongside a note asking authors to check whether we already have a lesson closely relating/possibly overlapping their chosen subject.

@jenniferisasi
Contributor Author

Hi @anisa-hawes, happy to hear this idea might be added to the editorial guidelines.

I haven't updated it since July 29th because it was not "approved" for use yet, while you/they figured out the new Editorial Guidelines and how to check for overlap. I can make a note to update it soon, and in the meantime I am also sharing the document with the editors and the PH gmail.

@anisa-hawes
Contributor

Thank you, @jenniferisasi. Let's work on this together!

@anisa-hawes
Contributor

anisa-hawes commented Nov 19, 2021

Hello all @programminghistorian/english-team @programminghistorian/spanish-team @programminghistorian/french-team @programminghistorian/portuguese-team,

@jenniferisasi and I have been working on the Lesson Concordance document and it is now ready to use.

  • We've linked all published lessons to the live website
  • We've linked all lessons in progress / under review to the relevant Issue in Submissions
  • We've created a 'key' so that lessons in progress / under review are marked with a ± plus-minus symbol, and retired lessons are marked with an * asterisk.
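
As a rough illustration of the key, here is a minimal sketch that appends the marker to a lesson title. The status names, the `label` function, and placing the marker after the title are assumptions made for this example; the actual document is a Google Sheet maintained by hand.

```python
# Hypothetical sketch of the key: status names and marker placement are assumptions.
MARKERS = {
    "published": "",      # live on the website, no marker
    "under_review": "±",  # in progress / under review
    "retired": "*",       # retired lesson
}

def label(title, status):
    """Return the title followed by the key's marker for its status, if any."""
    marker = MARKERS[status]
    return f"{title} {marker}" if marker else title

print(label("Préserver ses données de recherche", "under_review"))
```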

Our hope is that this document can be referred to by Authors and MEs during the pre-submission / proposal phase.

  • ACTION: Do you think we could add a link to the Concordance Document at Step 1 of the Author Guidelines?

As we revise the Editorial Guidelines, we could ask editors to create a new row whenever they start working on a new lesson, so that other teams know what is in the pipeline.

I'm happy to help keep this document up to date by regularly checking in to link new Issues in Submissions, and linking new lessons that I hear have been published when I am preparing our Newsletters.

--

After some conversations, we think that a Google Sheet is the most practical format, because it will allow each team to sort the data alphabetically in their own language. A markdown table wouldn't offer us this.

Please let me know if you have any suggestions for how this document could be better or easier to use.

Thank you to @jenniferisasi for translations, and thank you @DanielAlvesLABDH for checking the Lição-pt column. We've done our best with the leçon-fr column, so we hope you will forgive us for any small errors there.

@spapastamkou
Contributor

spapastamkou commented Nov 19, 2021 via email

@anisa-hawes
Contributor

Ah! Thank you, @spapastamkou! 🙂

@DanielAlvesLABDH
Contributor

Dear @anisa-hawes, I have made a revision with small changes. I agree with the idea of a link in the guidelines. Thanks for all the work!

@jenniferisasi
Contributor Author

Thanks for the update @anisa-hawes. I hope this eases the overlapping issue a bit, and at the same time it might prompt translations and new ideas.

@anisa-hawes anisa-hawes self-assigned this Nov 24, 2021