Predicting Visual Saliency in Augmented Reality: The SARD Dataset and VQSal‑AR Model

This article introduces the SARD dataset of background and AR images, describes a large‑scale eye‑tracking study with 60 participants, and presents the VQSal‑AR vector‑quantization model that outperforms baseline methods in predicting visual saliency for augmented reality scenes.

Background

ACM Multimedia is a premier international conference in the multimedia field, bringing together academia and industry to showcase innovative research. With the rapid growth of multimedia technologies, augmented reality (AR) has emerged as a promising next-generation mobile platform. Understanding how AR overlays influence human visual attention is essential for delivering high-quality user experiences, yet research on this interaction remains limited.

Dataset Construction (SARD)

The authors created the Saliency in AR Dataset (SARD), which includes 450 background (BG) images, 450 AR images, and 1,350 composite images generated by overlaying BG and AR images at three mixing levels. A large‑scale eye‑tracking experiment was conducted with 60 participants to collect fixation data on these composite scenes.
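Each composite can be understood as an alpha blend of the two layers. The following is a minimal sketch of that compositing step, assuming the mixing levels correspond to blend weights; the specific weights, file names, and paths here are hypothetical, not taken from the paper:

```python
import numpy as np
from PIL import Image

def composite(bg_path: str, ar_path: str, alpha: float) -> Image.Image:
    """Overlay an AR image on a background image with blend weight `alpha`.

    alpha = 0.0 shows only the background; alpha = 1.0 shows only the AR layer.
    """
    bg = np.asarray(Image.open(bg_path).convert("RGB"), dtype=np.float32)
    ar = np.asarray(Image.open(ar_path).convert("RGB"), dtype=np.float32)
    # Resize the AR layer to match the background if the shapes differ.
    if ar.shape != bg.shape:
        ar = np.asarray(
            Image.fromarray(ar.astype(np.uint8)).resize((bg.shape[1], bg.shape[0])),
            dtype=np.float32,
        )
    mixed = (1.0 - alpha) * bg + alpha * ar
    return Image.fromarray(mixed.astype(np.uint8))

# Hypothetical mixing levels; SARD's actual blend weights may differ.
for level in (0.25, 0.50, 0.75):
    composite("bg_001.png", "ar_001.png", level).save(f"mix_{level:.2f}.png")
```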

Proposed VQSal‑AR Method

To predict saliency in AR environments, the paper proposes a vector‑quantization based saliency model named VQSal‑AR. The approach extracts visual features from both the BG and AR layers, applies vector quantization to encode the features, and produces a saliency map for the combined view.
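The article does not detail the quantization step itself, but the core operation in vector-quantized models is mapping each feature vector to its nearest entry in a learned codebook. Below is a minimal PyTorch sketch of that lookup in the style of VQ-VAE; it illustrates the general technique, not the authors' actual implementation, and the codebook size and feature dimension are placeholder values:

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Nearest-neighbor codebook lookup, as used in VQ-VAE-style models."""

    def __init__(self, num_codes: int = 512, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim, H, W) feature map -> flatten to (batch*H*W, dim).
        b, d, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, d)
        # Squared Euclidean distance from each feature to every codeword.
        dist = (
            flat.pow(2).sum(1, keepdim=True)
            - 2 * flat @ self.codebook.weight.t()
            + self.codebook.weight.pow(2).sum(1)
        )
        idx = dist.argmin(dim=1)  # index of the nearest codeword
        quantized = self.codebook(idx).reshape(b, h, w, d).permute(0, 3, 1, 2)
        # Straight-through estimator so gradients flow back to the encoder.
        return z + (quantized - z).detach()
```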

Figure: VQSal‑AR model framework

Benchmarks and Evaluation

For fair comparison, three baseline methods are established. All methods, including VQSal‑AR, are evaluated on the SARD dataset using standard saliency metrics. Experimental results demonstrate that VQSal‑AR consistently outperforms the baselines on both the general saliency prediction task and the AR‑specific saliency prediction task.
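Two of the standard saliency metrics, the linear correlation coefficient (CC) and KL divergence, can be computed directly from a predicted map and a ground-truth fixation density map. A minimal sketch, independent of the paper's evaluation code:

```python
import numpy as np

def cc(pred: np.ndarray, gt: np.ndarray) -> float:
    """Pearson linear correlation coefficient between two saliency maps."""
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    g = (gt - gt.mean()) / (gt.std() + 1e-8)
    return float((p * g).mean())

def kld(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """KL divergence of the ground-truth distribution from the prediction."""
    p = pred / (pred.sum() + eps)
    g = gt / (gt.sum() + eps)
    return float((g * np.log(g / (p + eps) + eps)).sum())
```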

Release

The dataset, benchmark implementations, and the trained VQSal‑AR model will be publicly released to facilitate future research in AR saliency prediction.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: computer vision, augmented reality, dataset, eye tracking, visual saliency, VQSal-AR
Written by Youku Technology

Discover top-tier entertainment technology here.