The Computational Audio & Video Perception (CAVP) research group at Universiti Teknikal Malaysia Melaka (UTeM) operates under the Centre for Telecommunication Research & Innovation (CeTRI).
The group is dedicated to advancing intelligent systems that perceive, interpret, and process audio and visual information in ways modeled on human sensory capabilities. Led by experts in signal processing and machine learning, CAVP focuses on audio signal enhancement, speech and sound recognition, object detection, video analysis, and multimodal perception systems.
By integrating deep learning, computer vision, and acoustic modeling, CAVP aims to develop technologies for surveillance, smart interaction systems, healthcare, and assistive applications. The group emphasizes fusing audio and visual data to improve decision-making in real-time environments, particularly where a system relying on a single modality may fail.
CAVP also collaborates with industry partners and academic institutions, striving to bridge theoretical advances and real-world deployment. The group frequently takes part in research innovation showcases and contributes to national and international conferences and journals. Its work continues to push the boundaries of how machines perceive and interpret sound and vision in a human-like way, fostering technological solutions that benefit sectors including security, accessibility, and human-machine interaction.