Enabling Communication in Multi-Agent VLAs

Presenter: Dorian Benhamou Goldfajn

Faculty Sponsor: Shlomo Zilberstein

School: UMass Amherst

Research Area: Computer Science

Session: Poster Session 3, 1:15 PM - 2:00 PM, 163, C24

ABSTRACT

Vision–language–action models (VLAs) have emerged as a promising approach for enabling general-purpose robots by jointly learning visual perception, natural-language understanding, and motor control within a single framework. However, training-data requirements increase exponentially as the action space grows, exposing a fundamental bottleneck in their ability to execute high-dimensional actions. Additionally, many real-world domains, including search-and-rescue and disaster response, inherently require multiple robots to coordinate actions and share information in real time. Despite recent progress, existing VLA architectures lack explicit mechanisms for inter-agent communication and coordination.


This limitation motivates VLAs that can communicate in multi-agent scenarios, distributing action complexity across agents and extending overall system capability. This proposal aims to develop novel VLA architectures that support explicit communication among multiple agents, enabling real-time coordination and information sharing.


Natural language provides an intuitive starting point for inter-agent communication, leveraging the language-model backbone within VLA architectures. However, long-form language is inefficient for high-throughput, low-latency robotic coordination. More effective alternatives include constrained vocabularies, structured symbolic messages, and compact latent representations that encode agent intent or planned actions, preserving semantic structure while enabling scalable real-time coordination. The proposed communication pipeline takes one of two forms: annotating offline trajectories with messages drawn from a compact, predefined vocabulary, or learning communication protocols from environment interaction through multi-agent reinforcement learning. The resulting models will be evaluated on a suite of scalable multi-agent robotics tasks developed in IsaacLab, designed to test coordination efficiency, generalization, and robustness across increasing task complexity.
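
As a concrete illustration of the constrained-vocabulary option, the minimal PyTorch sketch below shows one way a discrete message channel could sit alongside an agent's action selection. It is a toy, not the proposed architecture: the CommAgent module, the eight-token vocabulary, the mean-pooled message "inbox", and all layer sizes are illustrative assumptions.

import torch
import torch.nn as nn

class CommAgent(nn.Module):
    # Toy agent: encodes an observation, emits one token from a small
    # message vocabulary, then picks an action conditioned on teammates'
    # messages. All dimensions are illustrative placeholders.
    def __init__(self, obs_dim=16, hidden=64, vocab_size=8, n_actions=6):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.msg_head = nn.Linear(hidden, vocab_size)      # what to "say"
        self.msg_embed = nn.Embedding(vocab_size, hidden)  # how to "listen"
        self.action_head = nn.Linear(2 * hidden, n_actions)

    def speak(self, obs):
        h = self.encoder(obs)
        token = torch.distributions.Categorical(
            logits=self.msg_head(h)).sample()              # discrete message
        return h, token

    def act(self, h, teammate_tokens):
        # Pool incoming message embeddings, then choose an action.
        inbox = self.msg_embed(teammate_tokens).mean(dim=0, keepdim=True)
        logits = self.action_head(torch.cat([h, inbox], dim=-1))
        return torch.distributions.Categorical(logits=logits).sample()

# One communication round between two agents on dummy observations.
a, b = CommAgent(), CommAgent()
h_a, tok_a = a.speak(torch.randn(1, 16))
h_b, tok_b = b.speak(torch.randn(1, 16))
act_a = a.act(h_a, tok_b)  # agent A conditions on B's message
act_b = b.act(h_b, tok_a)

Under the reinforcement-learning variant of the pipeline, both heads would be optimized jointly from task reward, so tokens acquire meaning only insofar as they improve coordination; under the annotation variant, the message head would instead be supervised by the predefined vocabulary labels on offline trajectories.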

