Enhanced Multimodal Video Retrieval System: Integrating Query Expansion and Cross-modal Temporal Event Retrieval.
In Proceedings of the 14th International Symposium on Information and Communication Technology (SoICT 2025). HCMUT EE Machine Learning & IoT Lab.
We propose a robust video retrieval system that supports cross-modal temporal event querying, introduces an adaptive KDE-GMM thresholding algorithm for keyframe extraction, and improves search performance through LLM-based query expansion. The system achieved strong results in the Ho Chi Minh AI Challenge 2025.