Assignments for session 12 #41
Comments
The paper mentions the problem of conflating hate and offensive speech, whereas the dataset T1 has three labels (hate speech, offensive, and ordinary). I think the rules for distinguishing these classes are subjective and sensitive to many factors that could also change over time. What are the criteria that separate hate speech from offensive speech? Are cross-cultural differences taken into consideration? The paper also mentions the influence of imbalanced classes on classifier performance; could expanding the training data solve this problem in practice (in our case, by adding more hate speech comments)? A very important point that has been raised is the need to focus more on the datasets and on qualitative analysis rather than on the models. When we talk about YouTube, Facebook, or Twitter, how effective is the role of content moderators from this perspective?
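On the class-imbalance point, here is a minimal Python sketch (not from the paper) of what naive oversampling of the minority "hate" class would look like; the label encoding is an assumption, and duplicating existing comments is of course weaker than collecting genuinely new ones, which is what the question asks about:

```python
import numpy as np

# Hypothetical sketch: naive random oversampling of the "hate" class in a
# three-label corpus such as T1. The encoding (0 = ordinary, 1 = offensive,
# 2 = hate) is an assumption for illustration.
def oversample_hate(texts, labels, hate_label=2, seed=0):
    rng = np.random.default_rng(seed)
    texts, labels = np.asarray(texts, dtype=object), np.asarray(labels)
    hate_idx = np.flatnonzero(labels == hate_label)
    other_idx = np.flatnonzero(labels != hate_label)
    # Draw hate examples with replacement until the class sizes roughly match
    # (assumes the hate class is the minority, as in T1).
    extra = rng.choice(hate_idx, size=len(other_idx) - len(hate_idx), replace=True)
    keep = np.concatenate([other_idx, hate_idx, extra])
    rng.shuffle(keep)
    return texts[keep], labels[keep]
```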
Question 1: In section 2.1, the referenced paper reports that LSTM+GBDT gives better results than LSTM alone, but the authors obtained the opposite result, i.e. LSTM outperformed LSTM+GBDT. Why does the more complex model perform worse? Is that a problem of the model itself, or does it depend on the datasets used? A possible reason, I think, is that the model is overfitting.
Question 2: In section 3.3 the authors use the word-appending method for adversarial training. It is certainly useful to add common words to hate speech and make the dataset more general. On the other hand, if you add hate words to ordinary sentences, the whole text effectively turns into the "hate speech" class. Does that make sense? What is the point of adding hate words to normal speech? I personally don't think it helps to get a better result.
Question 3: In section 8 the authors suggest that we should focus more on the datasets instead of the models. What exactly should we do about that? Are there any concrete ideas?
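For concreteness on question 2, a rough Python sketch of the word-appending augmentation in the direction "benign words appended to hate speech"; the word list and label encoding are made-up placeholders, not taken from the paper:

```python
import random

# Hypothetical sketch of word-appending augmentation: benign "common" words are
# appended to hateful messages, and the altered copies (label unchanged) are
# added to the training data.
COMMON_WORDS = ["love", "happy", "thanks", "great"]

def append_common_words(text, n_words=2, seed=None):
    rng = random.Random(seed)
    return text + " " + " ".join(rng.choice(COMMON_WORDS) for _ in range(n_words))

def augment_training_data(texts, labels, hate_label=2):
    """Add word-appended copies of the hate samples, keeping their original label."""
    extra_texts = [append_common_words(t) for t, y in zip(texts, labels) if y == hate_label]
    extra_labels = [hate_label] * len(extra_texts)
    return list(texts) + extra_texts, list(labels) + extra_labels
```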
Question 1: What characterises offensive speech? What are the boundaries between offensive and hate speech?
Question 2: How can we construct datasets comprising a large variety of hate speech variants from diverse sources? Also, how do we label the data, given that offensive/hate speech is a subjective matter?
Question 3: In the introduction the authors claim that "hate speech detection is largely independent of model architecture." However, in section 4 they say that model selection influences performance in terms of attack resilience. How can these two statements both be true?
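On question 2, one common way to deal with label subjectivity is to have several annotators label the same comments and measure how much they agree. A small hypothetical sketch (the label lists are made up):

```python
from sklearn.metrics import cohen_kappa_score

# Placeholder annotations for the same seven comments (0 = ordinary,
# 1 = offensive, 2 = hate); the values are invented for illustration.
annotator_a = [0, 2, 1, 1, 2, 0, 1]
annotator_b = [0, 2, 1, 0, 1, 0, 1]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa between the two annotators: {kappa:.2f}")
# Low agreement signals that the labelling guidelines (e.g. the boundary
# between offensive and hateful) need sharpening before training models.
```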
1. How can we set clear boundaries between false positives and false negatives?
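One way to make this trade-off concrete: with a probabilistic classifier, the balance between false positives and false negatives follows from the decision threshold. An illustrative sketch with made-up scores and labels:

```python
import numpy as np

# Placeholder predictions: 1 = hate, 0 = not hate; scores are invented.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.4, 0.6, 0.3, 0.2, 0.55, 0.8, 0.1])

for threshold in (0.3, 0.5, 0.7):
    y_pred = (scores >= threshold).astype(int)
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
# Raising the threshold reduces false positives but increases false negatives,
# so the "boundary" is ultimately a policy choice, not a property of the model.
```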
1. How were the performances of the models finally evaluated? By humans cross-checking the results?
2. One result of this study is that all models are more or less equally "good" at classifying hate speech. Therefore, according to the authors, the focus of future research should be on the datasets instead of the models. How would that work? If all models have difficulties classifying new content, how can you improve hate speech classification by improving the dataset?
3. To what extent could transfer learning help to make the models more efficient?
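Regarding question 3, a minimal sketch of what transfer learning could look like here, assuming a Hugging Face transformer as the pretrained model (the model name and three-label setup are assumptions, not something the paper does):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical sketch: reuse an encoder pretrained on general English text and
# fine-tune it on a (small) labelled hate speech corpus with three classes.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Freeze the pretrained encoder so that only the new classification head is
# trained, which is cheaper and can work with relatively little labelled data.
for param in model.base_model.parameters():
    param.requires_grad = False

# A standard training loop over the labelled comments would then update only
# the classification head's weights.
```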
1. The paper raises the problem of properly distinguishing between the concepts of hate speech and offensive speech. Although a definition of the former was offered, a clear delimitation between these two concepts was not presented in the paper. It was also mentioned that, when the two-class models were tested on ordinary offensive speech, they proved susceptible to false positives. What are the criteria applied to distinguish between these two concepts? Can these criteria be considered cross-cultural? More examples of words that fall into each of the two groups would be helpful.
2. Regarding the classification of data into hate and offensive speech, it seemed to me that this is still a rather subjective matter, judging by the models of the different researchers used in the paper. How can data belonging to either of these two groups be labeled more efficiently?
3. The paper showed that appending words like "love" or "F" to a text directly affects the toxicity score of a sentence, leading to false positives or wrong predictions. How can a classifier be trained to correctly differentiate between the contexts in which such words ("love", "F", etc.) are used, so that, for example, non-hateful sentences containing an "F" word are not predicted to be hateful?
1. The authors showed that adversarial attack strategies are very efficient against hate speech classification models. They further discussed the positive impact of adversarial training in preventing misclassification of altered samples. Since one can only add adversarial training samples for known adversarial attack strategies, can you imagine other adversarial attacks than those described in the paper?
2. It was said that all models performed equally well when they were tested on data similar to what they were trained on. However, when the models were trained on one dataset (e.g. T1) and then tested on another (e.g. T2), the performance was massively reduced. Imagine one would combine the models (all trained on different data) by applying them separately to an arbitrary dataset and using a majority vote for the final classification. Do you think this would result in better classification accuracy?
3. Word-based and character-based approaches differ completely in their structure. Which of the two do you think is the more promising strategy for the future, and why?
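The ensemble idea in question 2, sketched in Python (hypothetical; `models` is assumed to be a list of classifiers with a scikit-learn-style `predict` method, each trained on a different dataset such as T1 or T2):

```python
from collections import Counter

def majority_vote(models, texts):
    """Classify each text by a majority vote over models trained on different datasets."""
    all_preds = [model.predict(texts) for model in models]
    voted = []
    for sample_preds in zip(*all_preds):
        # The most common label wins; ties are broken arbitrarily.
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted
```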
1. Taking into account the design of a model, what could be the reason for character-level features outperforming word-level ones?
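As an illustration of the difference (not the paper's exact models): the same linear classifier fed once with word n-grams and once with character n-grams. Character n-grams still match obfuscated tokens such as "hateee" or "h a t e", which is one plausible reason for their better robustness.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Word-level features: whole tokens and token bigrams.
word_model = make_pipeline(
    TfidfVectorizer(analyzer="word", ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)

# Character-level features: overlapping character 2- to 5-grams within word boundaries.
char_model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),
    LogisticRegression(max_iter=1000),
)

# word_model.fit(train_texts, train_labels)
# char_model.fit(train_texts, train_labels)
```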
Tommi Gröndahl, Luca Pajola, Mika Juuti, Mauro Conti, and N. Asokan. 2018. All You Need is "Love": Evading Hate Speech Detection. In Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security (AISec '18). Association for Computing Machinery, New York, NY, USA, 2–12. DOI:https://doi.org/10.1145/3270101.3270103
1.1 Discussion questions
Write down 3 questions that came up while reading the paper and that would be interesting to discuss in the next session. Post your questions on GitHub as comments under the assignment.
Time slots project presentation:
Find a slot to present your project in session 13 (11.02.) or 14 (18.02.):
https://docs.google.com/spreadsheets/d/1DdkST3KZV4x9D5nGsHgevIASmu_rFkK0Bx2r4AeBGPE/edit#gid=1895482106