CAVP
Computational Audio & Vision Perception

The Computational Audio & Vision Perception (CAVP) research group at Universiti Teknikal Malaysia Melaka (UTeM) operates under the Centre for Telecommunication Research & Innovation (CeTRI).

This group is dedicated to advancing intelligent systems that can perceive, interpret, and process audio and visual information, closely mimicking human sensory capabilities. Led by experts in signal processing and machine learning, CAVP focuses on key areas such as audio signal enhancement, speech and sound recognition, object detection, video analysis, and multimodal perception systems.

Through the integration of deep learning, computer vision, and acoustic modeling, CAVP aims to develop technologies that support applications in surveillance, smart interaction systems, healthcare, and assistive technologies. The group emphasizes the fusion of audio and visual data to improve decision-making in real-time environments, particularly where single-modality systems may fail.

CAVP also collaborates with industry partners and academic institutions to bridge theoretical advances and real-world deployment. The group regularly takes part in research and innovation showcases and contributes to national and international conferences and journals. Its work advances how machines perceive sound and vision in human-like ways, fostering technological solutions for sectors including security, accessibility, and human-machine interaction.

Computational Audio & Vision Perception (CAVP)

Vision & Mission

Our CAVP vision is

  •  to be a leading research group in intelligent audio-visual perception systems that advance human-centric technologies for industrial, healthcare, and societal applications. 

Our CAVP mission is to

  • develop state-of-the-art computational models in audio and video signal processing for real-time applications. 
  • foster interdisciplinary and industrial collaborations that bridge academic research and practical deployment. 
  • train future talent in AI-based perception systems through research, innovation, and community-based projects. 
  • contribute actively to the scientific community through high-impact publications, open-source tools, and international engagement. 

Niche Area

Emerging Technology: 

  1. Audio Systems 
  2. Vision algorithm perception 
  3. Computer auditory scene
  4. Perceptual computing  
  5. Digital imaging 

Sustainable Development Goals

CAVP’s research aligns with several United Nations Sustainable Development Goals.

10-Year Roadmap