Overview

Following the successful 2023 edition, we are organising the second Perception Test Challenge to benchmark multimodal perception models on the Perception Test (blog, GitHub), a diagnostic benchmark created by Google DeepMind that comprehensively probes the abilities of multimodal models across:

  • three modalities: video, audio, and text
  • four skill areas: Memory, Abstraction, Physics, Semantics
  • four types of reasoning: Descriptive, Explanatory, Predictive, Counterfactual
  • six computational tasks: multiple-choice video-QA, grounded video-QA, object tracking, point tracking, action localisation, sound localisation

You can try the Perception Test yourself here.

Check the Perception Test GitHub repo for details about the data and annotation formats, baselines, and metrics.

Check the Computer Perception workshop at ECCV 2022 for recorded talks and slides introducing the Perception Test benchmark.

Check the First Perception Test Challenge for details of the previous edition.

Perception Test overview slides from the 2024 workshop are available here.

Contact: viorica at google.com, perception-test at google.com