[PRE REVIEW]: FreeStylo: An easy-to-use stylistic device detection tool for stylometry #7443

Open
editorialbot opened this issue Nov 9, 2024 · 11 comments
Labels
HTML pre-review Python query-scope Submissions of uncertain scope for JOSS TeX Track: 5 (DSAIS) Data Science, Artificial Intelligence, and Machine Learning

Comments

@editorialbot

Submitting author: @fschncvg (Felix Schneider)
Repository: https://github.com/cvjena/freestylo
Branch with paper.md (empty if default branch):
Version: v0.5.0
Editor: Pending
Reviewers: Pending
Managing EiC: Chris Vernon

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/eb3cc3f453aabe48306c4e81f42a4133"><img src="https://joss.theoj.org/papers/eb3cc3f453aabe48306c4e81f42a4133/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/eb3cc3f453aabe48306c4e81f42a4133/status.svg)](https://joss.theoj.org/papers/eb3cc3f453aabe48306c4e81f42a4133)

Author instructions

Thanks for submitting your paper to JOSS @fschncvg. Currently, there isn't a JOSS editor assigned to your paper.

@fschncvg if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands

editorialbot added pre-review Track: 5 (DSAIS) Data Science, Artificial Intelligence, and Machine Learning labels Nov 9, 2024

@editorialbot

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot

Software report:

github.com/AlDanial/cloc v 1.90  T=0.04 s (1475.5 files/s, 194763.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                            22            188              0           3584
Python                          19            382            814           1042
JSON                             4              0              0            349
Markdown                         2             54              0            227
YAML                             1             12              3            112
TeX                              1              9              0             73
Bourne Shell                     2              2              0              7
TOML                             1              1              0              5
-------------------------------------------------------------------------------
SUM:                            52            648            817           5399
-------------------------------------------------------------------------------

Commit count by author:

    41	schneider
     4	Felix Schneider

@editorialbot

Paper file info:

📄 Wordcount for paper.md is 1853

✅ The paper includes a Statement of need section

@editorialbot

License info:

🟡 License found: GNU General Public License v3.0 (Check here for OSI approval)

@editorialbot

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.18653/v1/2021.latechclfl-1.11 is OK
- 10.14746/amup.9788323241775 is OK
- 10.5281/zenodo.1212303 is OK
- 10.18653/v1/2021.acl-demo.3 is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: Metaphor Detection for Low Resource Languages: Fro...
- No DOI given, and none found for title: Twenty-first century Corpus Workbench: Updating a ...
- No DOI given, and none found for title: Empirical research on association measures: The UC...
- No DOI given, and none found for title: NLTK: The Natural Language Toolkit

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None

@editorialbot

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot

Five most similar historical JOSS papers:

TRUNAJOD: A text complexity library to enhance natural language processing
Submitting author: @dpalmasan
Handling editor: @danielskatz (Active)
Reviewers: @mbdemoraes, @apiad
Similarity score: 0.6678

Arabica: A Python package for exploratory analysis of text data
Submitting author: @PetrKorab
Handling editor: @oliviaguest (Active)
Reviewers: @linuxscout, @amitkumarj441
Similarity score: 0.6554

Augmenty: A Python Library for Structured Text Augmentation
Submitting author: @KennethEnevoldsen
Handling editor: @arfon (Active)
Reviewers: @sap218, @wdduncan
Similarity score: 0.6359

Fast, Consistent Tokenization of Natural Language Text
Submitting author: @lmullen
Handling editor: @arfon (Active)
Reviewers: @arfon
Similarity score: 0.6347

textnets: A Python package for text analysis with networks
Submitting author: @jboynyc
Handling editor: @gkthiruvathukal (Active)
Reviewers: @sara-02, @tresoldi
Similarity score: 0.6316

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@crvernon

crvernon commented Nov 9, 2024

👋 @fschncvg - You mention the following previous publications:
Chiasmus Detection: https://aclanthology.org/2021.latechclfl-1.11/
Metaphor Detection: https://aclanthology.org/2022.mwe-1.11/
Additional detectors so far in this software package: Polysyndeton, Epiphora, Alliterative verse.
A tool for people in stylometry, literary scholars, computational linguists/NLP researchers.

Please clearly state how / if the current submission is different from the above. Thanks.

@fschncvg

The previous publications describe the base forms of the chiasmus and metaphor detectors. Both come with code to replicate the experiments, and someone proficient in programming could, with some work, probably use that code to detect those stylistic devices in other texts. This software instead provides a new implementation of the concept that can be used either as a Python library in other programs or directly by, e.g., literary scholars with no programming knowledge. The additional stylistic devices that can be detected (polysyndeton, epiphora, alliterative verse) do not rely on machine learning and have not been previously published by me. Since this software provides a growing, general collection of stylistic device detectors, it makes sense to include them, especially because they are interesting for various stylometric analysis tasks.

So the main difference is: the two previously published papers provide methods that are implemented in this software package. Additionally, this package provides three more methods to find other stylistic devices.
This software package provides a command line interface and a library for the (currently 5) methods, enabling their use both by people proficient in Python and by researchers with linguistic, stylometric and literary expertise but no programming knowledge.
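
To illustrate what a detector that does not rely on machine learning can look like, here is a minimal rule-based sketch for polysyndeton (one conjunction repeated several times within a sentence). It is not FreeStylo's implementation or API: the function name `find_polysyndeton`, the English conjunction list, the naive sentence splitting, and the `min_repeats` threshold are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical English conjunction list; a real detector would need
# language-specific lists and proper tokenization.
CONJUNCTIONS = {"and", "or", "nor", "but"}


def find_polysyndeton(text, min_repeats=3):
    """Return (sentence, conjunction, count) for each sentence in which a
    single conjunction occurs at least `min_repeats` times."""
    hits = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        counts = Counter(t for t in tokens if t in CONJUNCTIONS)
        for conjunction, n in counts.items():
            if n >= min_repeats:
                hits.append((sentence, conjunction, n))
    return hits


candidates = find_polysyndeton(
    "We ate and drank and sang and danced. It was late."
)
```

A counting heuristic like this only produces candidates; whether a hit is a deliberate stylistic device still needs a human reader or further filtering.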

Regarding the similarity to the previously published papers the bot linked: my software package detects stylistic devices. Those linked do not.

@dpalmasan published software that analyzes other aspects of text (e.g. discourse markers, emotions...) and could be used together with my software to gain information the other does not address. We both use spaCy for preprocessing, which makes interoperability easy. However, my package also provides a cltk backend for Middle High German, which can easily be extended to other classical languages supported by cltk.

@PetrKorab published software that analyzes time-series structured text (e.g. news articles, social media posts) for various properties, but not for stylistic devices. However, my software could be integrated into theirs to also support stylistic devices.

@KennethEnevoldsen provides software that augments text by, e.g., standardising spelling and grammar. It does not compute stylometric information such as the choice of words or of stylistic devices. It could perhaps be used as a preprocessing step for my software; however, it may destroy the stylistic devices that my software searches for.

@lmullen provides an R package for various tokenization tasks, but no stylistic device detection. It could be used as a preprocessing step, but since it is written in R, interoperability with my Python package is limited.

@jboynyc uses network analysis on collections of text to find out which words are used by different authors and how this connects and groups the authors. While it would be interesting to also do network analysis to find out how different authors use stylistic devices and how they are connected and grouped by that, their software does not use or find stylistic devices and provides a different service than mine.

@crvernon

@editorialbot query scope

Thank you @fschncvg, I am going to run this through scope review with our larger editorial board as well. I'll get back to you ASAP. Thanks!

@editorialbot

Submission flagged for editorial review.

editorialbot added the query-scope Submissions of uncertain scope for JOSS label Nov 15, 2024