Speaker: Associate Professor Jiang Feibo (江沸菠)
Venue: Room 338, Zhonghe Building (中和楼)
Time: Thursday, March 27 (this week), 3:00 PM
Title: Large Model-Empowered Multimodal Semantic Communication
Abstract: Multimodal signals, including text, audio, image, and video, can be integrated into semantic communication systems to deliver a low-latency, high-quality immersive experience at the semantic level. However, multimodal semantic communication faces several challenges, such as data heterogeneity, semantic ambiguity, and signal distortion during transmission.
In recent years, large models, particularly large language models (LLMs), vision-language models (VLMs), and large multimodal models (LMMs), have offered promising ways to address these challenges. We conduct a systematic study of large models applied to semantic communication, covering a cross-modal semantic communication system based on VLMs, a multimodal semantic communication system empowered by LMMs, a multi-agent system leveraging LLMs, and a VLM-based multimodal, multi-user, and multi-task semantic communication system.
Additionally, we explore knowledge base design schemes based on large models and propose a foundational large model for the communication domain, enhanced with retrieval-augmented generation (RAG) and knowledge graphs. These methods further improve the performance of semantic communication, suppress semantic noise, and provide valuable insights for the advancement of semantic communication technology.