Sitemap

A list of all the posts and pages found on the site. For you robots out there is an XML version available for digesting as well.

Page Not Found

Page not found. Your pixels are in another canvas.

Jupyter notebook markdown generator

Posts

Older Blog Posts

less than 1 minute read

Published: January 01, 2020

For the older blogs, please visit my page at Viblo. I had written all of those blogs while I had been working at Sun Asterisk.

activities

HUST - Volunteer Teaching 2015-2017

Published: January 01, 2015

Since I was a freshman, I spent a lot of time running volunteer classes in basic science subject like Advanced Math, Linear Algebra, and Physics for freshmen at Hanoi University of Science and Technology.

Vietnam Mobile Day 2019: Build your own artificial intelligence assistant similar to Google assistant

Published: June 06, 2019

At Vietnam Mobile Day 2019, I talked on the topic “Build your own artificial intelligence assistant similar to Google assistant”

Vietnam Frontier Summit 2019: OCR techniques for DX revolution

Published: October 06, 2019

At Vietnam Frontier Summit 2019, I talked on the topic “OCR techniques for DX revolution”

Soict Day 2019: AI in Electroencephalography techniques

Published: December 04, 2019

This is my talk at Hanoi University of Science and Technology. During the talk, I discussed the challenges and directions of our company in applying EEG technology in business.

RubikAI Series Talk: Moving AI from PoC Stage to Production

Published: December 07, 2019

This is the panel discussion where I and Mr. Nguyen Xuan Binh together to discuss the difficulties and challenges when moving an AI project from PoC to production.

Natural Language Processing with Deep Learning Course

Published: January 01, 2020

This is a course that I co-hosted with Techmaster Vietnam on Deep Learning and Natural Language Processing.

Natural Language Processing Course

Published: December 13, 2022

I conducted a short-term training program for the “10 Institute” under the Ministry of National Defense. Over one month, I assisted officers in building AI systems tailored to their unique challenges.

Advancing Intelligent Systems with Language Models: Automotive, Robotics, and Beyond

Published: December 22, 2024

I conducted a collaborative class with NTI-VietAI on the topic “Advancing Intelligent Systems with Language Models: Automotive, Robotics, and Beyond”. The session offered fresh perspectives on the applications of LLM in emerging industries such as electric vehicles and robotics.

honors

CMC AI Contest: Face recognition

AI challenge, CMC Global Company Limited., 2019

At CMC Contest 2019, my team won 2nd and 3rd place on the final leaderboard. Here is a short of my presentation :D.

The sixth International Workshop on Vietnamese Language and Speech Processing

NLP Workshop, VLSP, 2019

At VLSP 2019, my team gained:

the 3rd prize at Vietnamese Language and Speech Processing 2019 Text-to-Speech Challenge
the first prize at Vietnamese Language and Speech Processing 2019 Hate Speech Detection on Social Networks Challenge.

KSE 2021: Best Paper Award

Conference, KSE: Knowledge and Systems Engineering, 2019

At KSE 2020, we achieved the Best Paper Award Certificate with paper “From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection”.

Emotion Recognition Challenge 2019 (ERC2019)

AI challenge, HCMUS and VinTech City, 2019

Emotion Recognition Challenge 2019 is a contest on audio classification and sentiment analysis based on human voices. ERC2019 is co-organized by Vintech City and Vietnam National University Ho Chi Minh City - University of Science at Hochiminh city.

TensorFlow Developer Certificate

Certificate, Tensorflow, Google, 2020

After 2 hours of coding, I have achieved Tensorflow Developer Certificate on 01/06/2020.

The seventh international workshop on Vietnamese Language and Speech Processing (VLSP 2020)

NLP Workshop, VLSP, 2020

At VLSP 2020, my team gained:

the 3rd prize in the shared task of Vietnamese Relation Extraction
the 3rd prize in the shared task of Reliable Intelligence Identification on Vietnamese Social Network Sites
and won the first prize in the shared task of Vietnamese Universal Dependency Parsing.

Sun* Annual Awards 2020: Nominated for Talented Staff Awards

Awards, Sun Asterisk Inc., 2020

Nominated for Talented Staff Awards

ICDAR2021 Competition Multimodal Emotion Recognition on Comics scenes

AI challenge, ICDAR Workshop, 2021

CVPRW: The AI City Challenge Workshop @ CVPR 2021

AI challenge, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021

SemEval 2021: The 15th International Workshop on Semantic Evaluation

NLP Workshop, The 15th International Workshop on Semantic Evaluation, 2021

portfolio

NLP Research Team - VinBigData

Hanoi, Jan 25, 2024

VAT-VinBigData

Haiduong, Dec 24, 2022

Sun Asterisk AI Reseach Team

Hanoi, Oct 20, 2019

publications

Deep Neural Networks based Invisible Steganography for Audio-into-Image Algorithm

Published in The Global Conference on Consumer Electronics (GCCE 2019), 2019

Steganography is the science of concealing secret information inside usual forms of data. In this paper, the use of deep learning techniques to hide secret audio into the digital images is proposed. Extensive experiments are carried out with a set of 24K images and an audio dataset named VIVOS Corpus. Through experimental results, it has been confirmed that our method is more effective than traditional approaches. The integrity of both image and audio is well preserved while the length of the hidden audio is significantly improved.

Recommended citation: Q. P. Huu, T. H. Dinh, N. N. Tran, T. P. Van and T. T. Minh, "Deep Neural Networks based Invisible Steganography for Audio-into-Image Algorithm," 2019 IEEE 8th Global Conference on Consumer Electronics (GCCE), 2019, pp. 423-427, doi: 10.1109/GCCE46687.2019.9015498. https://ieeexplore.ieee.org/document/9015498

From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection

Published in The International Conference on Knowledge and Systems Engineering (KSE), 2020

Natural language processing (NLP) is a fast-growing field of artificial intelligence. Since the Transformer was introduced by Google in 2017, a large number of language models such as BERT, GPT, and ELMo have been inspired by this architecture. In this paper, we propose a pipeline to adapt the general-purpose RoBERTa language model to a specific text classification task: Vietnamese Hate Speech Detection. Our experiments proved that our proposed pipeline boosts the performance significantly, achieving a new state-of-the-art on Vietnamese Hate Speech Detection (HSD) campaign.

Recommended citation: Q. H. Pham, V. Anh Nguyen, L. B. Doan, N. N. Tran and T. M. Thanh, "From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection," 2020 12th International Conference on Knowledge and Systems Engineering (KSE), 2020, pp. 37-42, doi: 10.1109/KSE50997.2020.9287406. https://ieeexplore.ieee.org/document/9287406

Improving BERT-Based Noisy Text Classification with Knowledge of the Data domain

Published in The 6th Workshop on Noisy User-generated Text (WNUT 2020), 2020

This paper proposes an improved custom model for WNUT task 2: Identification of Informative COVID-19 English Tweet. We improve experiment with the effectiveness of fine-tuning methodologies for state-of-the-art language model RoBERTa. We make a preliminary instantiation of this formal model for the text classification approaches. With appropriate training techniques, our model is able to achieve 0.9218 F1-score on public validation set and the ensemble version settles at top 9 F1-score (0.9005) and top 2 Recall (0.9301) on private test set.

Recommended citation: @inproceedings{doan-bao-etal-2020-sunbear, title = "{S}un{B}ear at {WNUT}-2020 Task 2: Improving {BERT}-Based Noisy Text Classification with Knowledge of the Data domain", author = "Doan Bao, Linh and Nguyen, Viet Anh and Pham Huu, Quang", booktitle = "Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)", month = nov, year = "2020", address = "Online", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2020.wnut-1.73", doi = "10.18653/v1/2020.wnut-1.73", pages = "485--490", } https://aclanthology.org/2020.wnut-1.73/

S-NLP at SemEval-2021 Task 5: An Analysis of Dual Networks for Sequence Tagging

Published in The 15th International Workshop on Semantic Evaluation (SemEval 2021), 2021

In this paper, we propose a system to resolve the task 5 in The 15th International Workshop on Semantic Evaluation (SemEva 2021): Toxic Spans Detection. Our method utilizes a pre-trained language model in toxic-domain and combines two approaches Self-training and Feature-based Learning to achieve a high F1-score of 70.77. Finally, we provide insights into the failure of the system and the task’s potential falsely-negative annotations issue with careful error analysis.

Recommended citation: @inproceedings{nguyen-etal-2021-nlp, title = "{S}-{NLP} at {S}em{E}val-2021 Task 5: An Analysis of Dual Networks for Sequence Tagging", author = "Nguyen, Viet Anh and Nguyen, Tam Minh and Quang Dao, Huy and Huu Pham, Quang", booktitle = "Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)", month = aug, year = "2021", address = "Online", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.semeval-1.120", doi = "10.18653/v1/2021.semeval-1.120", pages = "888--897", } https://aclanthology.org/2021.semeval-1.120/

Exploiting Transfer Learning Models for Reliable Intelligence Identification on Vietnamese Social Network Sites

Published in The 7th International Workshop on Vietnamese Language and Speech Processing, 2021

The rapid growth of social media and misinformation posed challenges for authorities. As a result, determining the reliability of an article has become a crucial task. After various ablation studies, we propose a multi-input model that can effectively leverage both tabular metadata and post content for the task. Applying state-of-the-art fine-tuning techniques for the pre-trained component and training strategies for our complete model, we have achieved a 0.9462 ROC-score on the VLSP private test set.

Recommended citation: @inproceedings{thanh-van-2020-reintel, title = "{R}e{INTEL} Challenge 2020: Exploiting Transfer Learning Models for Reliable Intelligence Identification on {V}ietnamese Social Network Sites", author = "Thanh, Kim Nguyen Thi and Van, Kiet Nguyen", booktitle = "Proceedings of the 7th International Workshop on Vietnamese Language and Speech Processing", month = dec, year = "2020", address = "Hanoi, Vietnam", publisher = "Association for Computational Lingustics", url = "https://aclanthology.org/2020.vlsp-1.9", pages = "45--48", } https://aclanthology.org/2020.vlsp-1.9/

S-NLP at AI City Challenge Task 5 2021: Contrastive Learning for Natural Language-Based Retrieval

Published in The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021

AI City Challenge 2021 Task 5: The Natural Language-Based Vehicle Tracking is a Natural Language-based Vehicle Retrieval task, which requires retrieving a single-camera track using a set of three natural language descriptions of the specific targets. In this paper, we present our methods to tackle the difficulties of the provided task. Experiments with our approaches on the competitive dataset from AICity Challenge 2021 show that our techniques achieve Mean Reciprocal Rank score of 0.1701 on the public test dataset and 0.1571 on the private test dataset.

Recommended citation: @InProceedings{Nguyen_2021_CVPR, author = {Nguyen, Tam Minh and Pham, Quang Huu and Doan, Linh Bao and Trinh, Hoang Viet and Nguyen, Viet-Anh and Phan, Viet-Hoang}, title = {Contrastive Learning for Natural Language-Based Vehicle Retrieval}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops}, month = {June}, year = {2021}, pages = {4245-4252} } https://openaccess.thecvf.com/content/CVPR2021W/AICity/html/Nguyen_Contrastive_Learning_for_Natural_Language-Based_Vehicle_Retrieval_CVPRW_2021_paper.html

Quang Huu Pham