Classifying users with signs of ADHD from social media language: a comparison of TF-IDF and DistilBERT

Thị Vi Hài Nguyễn; Thị Huyền Trang Nguyễn

doi:10.62831/202608011


Thị Vi Hài Nguyễn: Ho Chi Minh City University of Technology (Trường Đại học Công nghệ Thành phố Hồ Chí Minh); Thị Huyền Trang Nguyễn: Ho Chi Minh City University of Industry and Trade (Trường Đại học Công Thương Thành phố Hồ Chí Minh)


Abstract

This study examines whether users showing signs of attention-deficit/hyperactivity disorder (ADHD) can be identified from their language on social media. The data come from the Twitter-STMHD dataset and are processed at the user level, yielding 1,999 samples after preprocessing. On this dataset, the study compares three models: Logistic Regression, Linear SVM, and DistilBERT. All three models distinguish the ADHD group from the control group, and Linear SVM performs best, with an Accuracy of 0.8733 and an F1-score of 0.8812. The results suggest that linguistic signals on social media can support ADHD screening but do not replace clinical diagnosis.
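The abstract names TF-IDF features and reports Accuracy and F1-score, but it does not describe the authors' feature pipeline or evaluation code. The following is therefore only a minimal sketch of the two ideas in plain Python: a smoothed TF-IDF weighting (one common variant, assumed here) and the accuracy and F1 computations for a binary task. The function names `tfidf_vectors` and `accuracy_and_f1`, the toy texts, and the toy labels are all hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (not confirmed by the source):
    weight = relative term frequency * (ln((1 + N) / (1 + df)) + 1).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1-score of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Hypothetical toy inputs, for illustration only (not study data).
toy_docs = ["a b", "a c"]
toy_vectors = tfidf_vectors(toy_docs)
toy_accuracy, toy_f1 = accuracy_and_f1([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

In this sketch a term that appears in every document gets the lowest weight, which is why rarer terms dominate the vector. In practice, vectors like these would feed a linear classifier such as Logistic Regression or Linear SVM, and the metrics would be computed on a held-out test split.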

Keywords

ADHD; social media language; TF-IDF; DistilBERT; Linear SVM; user classification



Section: Management - Administration

Issue: Tạp chí Công Thương (Industry and Trade Magazine), Scientific Research Results and Technology Applications, No. 8, April 2026
