Published in The 15th International Workshop on Semantic Evaluation (SemEval 2021), 2021
In this paper, we propose a system to resolve the task 5 in The 15th International Workshop on Semantic Evaluation (SemEva 2021): Toxic Spans Detection. Our method utilizes a pre-trained language model in toxic-domain and combines two approaches Self-training and Feature-based Learning to achieve a high F1-score of 70.77. Finally, we provide insights into the failure of the system and the task’s potential falsely-negative annotations issue with careful error analysis.
Recommended citation: @inproceedings{nguyen-etal-2021-nlp, title = "{S}-{NLP} at {S}em{E}val-2021 Task 5: An Analysis of Dual Networks for Sequence Tagging", author = "Nguyen, Viet Anh and Nguyen, Tam Minh and Quang Dao, Huy and Huu Pham, Quang", booktitle = "Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)", month = aug, year = "2021", address = "Online", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.semeval-1.120", doi = "10.18653/v1/2021.semeval-1.120", pages = "888--897", } https://aclanthology.org/2021.semeval-1.120/