(+66) 80 6789 276 • weerayut.b_s20@vistec.ac.th • weerayutbu.github.io
I am a fifth-year Ph.D. student at the Natural Language Processing and Representation Learning (NRL) Lab at VISTEC, Thailand, under the supervision of Assoc. Prof. Dr. Sarana Nutanong and co-supervision of Assoc. Prof. Dr. Attapol Rutherford.
My research focuses on information extraction tasks, Named Entity Recognition (NER), and Representation Learning. My work aims to address the challenges in NER, including limited resources for Thai NER, open-class problems involving unseen and long-tail entities, and multilingual and domain-specific settings. My co-authors and I have previously developed a Thai Fine-grained Nested NER dataset to bridge the gap between low-resource and high-resource languages. Additionally, we have explored few-shot learning techniques, leveraging large language models to generate relevant examples and enhance the effectiveness of few-shot NER.
Currently, I am focusing on creating a bilingual finance-NER dataset in Thai and English to study knowledge transfer from high-resource to low-resource languages.
Ph.D. in Information Science and Technology
Vidyasirimedhi Institute of Science and Technology (VISTEC), Aug 2020 - Present
GPA: 4.00/4.00
Relevant coursework: Natural Language Processing, Computational Machine Intelligence and Applications
B.Eng. in Computer Engineering
Rajamangala University of Technology Lanna (RMUTL-CM), Mar 2016 - Mar 2020
GPA: 3.62/4.00 (Top 1)
Relevant coursework: Data Structures and Algorithms, Operating Systems, Software Engineering
VISTEC, Rayong, Thailand: Research Assistant
Nov 2019 – Aug 2020
Thai Nested Named Entity Recognition Corpus.
Weerayut Buaphet, Can Udomcharoenchaikit, Peerat Limkonchotiwat, Attapol Rutherford, and Sarana Nutanong. 2022. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1473–1486, Dublin, Ireland. Association for Computational Linguistics.
Aug 2019 – May 2022
Mitigating Spurious Correlation in Natural Language Understanding with Counterfactual Inference.
Can Udomcharoenchaikit, Wuttikorn Ponwitayarat, Patomporn Payoungkhamdee, Kanruethai Masuk, Weerayut Buaphet, Ekapol Chuangsuwanich, and Sarana Nutanong. 2022. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11308–11321, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Dec 2022
Cross-Lingual Data Augmentation For Thai Question-Answering.
Parinthapat Pengpun, Can Udomcharoenchaikit, Weerayut Buaphet, and Peerat Limkonchotiwat. 2023. In Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP, pages 193–203, Singapore. Association for Computational Linguistics.
Dec 2023
Few-shot Named Entity Recognition
In this project, we focus on leveraging the capabilities of large language models to generate relevant examples, thereby enhancing the effectiveness of few-shot NER.
Under review
Bilingual Named Entity Recognition for Finance
Creating a finance-NER dataset composed of two languages, Thai and English. This project aims to study the knowledge transfer from high-resource to low-resource languages in the financial domain.
Ongoing project
• 2024: ARR-EMNLP
Part of the development team for Thai NER datasets and models, including a nested NER model for fine-grained classification and a bilingual (TH-EN) NER model for the financial domain.
This program aims to develop knowledge in Data Science and Artificial Intelligence (AI) for middle and high school students interested in practical applications. My friend and I are responsible for five students working on various projects, such as question generation, fake news detection, and creating a dataset and system to support a plant tissue laboratory.
We won 2nd place in the RMUT group IoT competition in Thailand. We used an ESP20 to read sensor data and send it via MQTT to a Raspberry Pi server, which visualized the data on a web interface. I programmed the visualization and configured the ESP20, while my friend handled the hardware connections.
This program selects national representatives for the International Design Contest RoBoCon (IDC RoBoCon). All teams are required to design a robot to solve a provided problem. It promotes equality with mixed teams, equal resources, and collaboration, including training sessions at all levels. My friend and I achieved the following: