Vision-Language Models Mistake Head Orientation for Gaze Direction
Vision-Language Models Mistake Head Orientation for Gaze Direction: Nonverbal Conversation Cues
arXiv:2506.05412v4. Abstract: Where someone looks is a nonverbal communication cue that children and adults readily use. How well can Vision-Language Models (VLMs) infer gaze targets? To construct evaluation stimuli, we captured 1,360 real-world photos of scenes in which a person gazes at one of several objects on a table.