cv | Khai Tran

Basics

Name	Khai P. Tran
Label	Ph.D. Candidate
Email	tranphan.khai@gmail.com
Url	https://khaitran22.github.io/
Summary	Ph.D. candidate in Australia, researching Information Extraction in Natural Language Processing (NLP).

Education

2022 - Current

Australia
Doctor of Philosophy

The University of Queensland

Information Extraction from Large-scale Low-quality Data
2020 - 2021

Australia
Master of Information Technology

The University of Queensland

Information Technology
2015 - 2019

Vietnam
Bachelor of Business Administration

University of Economics Ho Chi Minh City (UEH)

Business Administration

Work

2022 - Now
Teaching Assistant

The University of Queensland, Australia

Teaching several courses related to Machine Learning and Software Engineering:
- COMP4702
- DATA7703
- CSSE7023
2019.04 - 2020.01
Junior .NET Developer

Hoozing Limited Liability Company

Maintained and developed features for Hoozing Integrated Platform & Systems and Hoozing Website

Awards

2022

Best Poster Presentation

The 34th Australasian Joint Conference on Artificial Intelligence

Awarded for the best poster presenting an accepted paper at the conference.
2022

UQ Earmarked Scholarship

The University of Queensland, Australia

Funded by the Australian Government to support an excellent and innovative research project that addresses a significant problem or gap in knowledge and represents value for money.
2021 - 2022

Dean’s Commendation for Academic Excellence

Faculty of Engineering, Architecture and Information Technology, The University of Queensland.

Awarded to students who have excelled academically and who have shown a strong commitment to their program of study.

Publications

2025.01.24

VaeDiff-DocRE: End-to-end Data Augmentation Framework for Document-level Relation Extraction

The 31st International Conference on Computational Linguistics

Document-level Relation Extraction (DocRE) aims to identify relationships between entity pairs within a document. However, most existing methods assume a uniform label distribution, resulting in suboptimal performance on real-world, imbalanced datasets. To tackle this challenge, we propose a novel data augmentation approach using generative models to enhance data from the embedding space. Our method leverages the Variational Autoencoder (VAE) architecture to capture all relation-wise distributions formed by entity pair representations and augment data for underrepresented relations. To better capture the multi-label nature of DocRE, we parameterize the VAE’s latent space with a Diffusion Model. Additionally, we introduce a hierarchical training framework to integrate the proposed VAE-based augmentation module into DocRE systems. Experiments on two benchmark datasets demonstrate that our method outperforms state-of-the-art models, effectively addressing the long-tail distribution problem in DocRE.
2024.07.16

CDER: Collaborative Evidence Retrieval for Document-Level Relation Extraction

The 16th Asian Conference on Intelligent Information and Database Systems

Document-level Relation Extraction (DocRE) involves identifying relations between entities across multiple sentences in a document. Evidence sentences, which are crucial for precisely identifying the relationship between an entity pair, focus the model on essential text segments and improve DocRE performance. However, existing evidence retrieval systems often overlook the collaborative nature among semantically similar entity pairs in the same document, hindering the effectiveness of the evidence retrieval task. To address this, we propose a novel evidence retrieval framework, namely CDER. CDER employs an attentional graph-based architecture to capture collaborative patterns and incorporates a dynamic sub-structure for additional robustness in evidence retrieval. Experimental results on the benchmark DocRE dataset show that CDER not only excels in the evidence retrieval task but also enhances the overall performance of existing DocRE systems.
2022.03.19

Improving traffic load prediction with multi-modality: a case study of Brisbane

The 34th Australasian Joint Conference on Artificial Intelligence

Fast and accurate traffic load prediction is a pivotal component of the Intelligent Transport System. It reduces the time spent by commuters and protects the environment from vehicle emissions. During the COVID-19 pandemic, people have preferred private transportation, so predicting the traffic load has become more critical. In recent years, researchers have developed several traffic load prediction models and applied them successfully to data from the US, China or Europe. However, none of these models has been applied to traffic data in Australia. Considering that Australia has different political, geographical, and climate conditions from other countries, these models may not be suitable for predicting the traffic load in Australia. In this paper, we investigate this problem and propose a multi-modal method that is capable of using Australia-specific data to assist traffic load prediction. Specifically, we use daily social media data together with traffic data to predict the traffic load. We illustrate a protocol to pre-process raw traffic and social media data and then propose a multi-modal model, namely DM2T, which accurately makes time-series predictions by using both time-series data and other media data. We validate the effectiveness of our proposed method with a case study on Brisbane city. The results show that with the help of Australia-specific social media data, our proposed method can make more accurate traffic load predictions for Brisbane than conventional methods.

Skills

	Languages
	Python
	Java
	JavaScript
	HTML
	CSS

	Data Science & Machine Learning
	PyTorch
	Hugging Face Libraries
	Deep Graph Library
	TensorFlow

Languages

	Vietnamese
	Native speaker

	English
	Professional working proficiency
