<HashMap><database>biostudies-literature</database><scores/><additional><submitter>Bomatter P</submitter><funding>NEI NIH HHS</funding><pagination>255-264</pagination><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-EPMC9432425</full_dataset_link><repository>biostudies-literature</repository><omics_type>Unknown</omics_type><volume>2021</volume><pubmed_abstract>Context is of fundamental importance to both human and machine vision; e.g., an object in the air is more likely to be an airplane than a pig. The rich notion of context incorporates several aspects including physics rules, statistical co-occurrences, and relative object sizes, among others. While previous work has focused on crowd-sourced out-of-context photographs from the web to study scene context, controlling the nature and extent of contextual violations has been a daunting task. Here we introduce a diverse, synthetic Out-of-Context Dataset (OCD) with fine-grained control over scene context. By leveraging a 3D simulation engine, we systematically control the gravity, object co-occurrences and relative sizes across 36 object categories in a virtual household environment. We conducted a series of experiments to gain insights into the impact of contextual cues on both human and machine vision using OCD. We conducted psychophysics experiments to establish a human benchmark for out-of-context recognition, and then compared it with state-of-the-art computer vision models to quantify the gap between the two. We propose a context-aware recognition transformer model, fusing object and contextual information via multi-head attention. Our model captures useful information for contextual reasoning, enabling human-level performance and better robustness in out-of-context conditions compared to baseline models across OCD and other out-of-context datasets. 
All source code and data are publicly available at https://github.com/kreimanlab/WhenPigsFlyContext.</pubmed_abstract><journal>... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision</journal><pubmed_title>When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes.</pubmed_title><pmcid>PMC9432425</pmcid><funding_grant_id>R01 EY026025</funding_grant_id><funding_grant_id>R21 EY019710</funding_grant_id><pubmed_authors>Bomatter P</pubmed_authors><pubmed_authors>Madan S</pubmed_authors><pubmed_authors>Kreiman G</pubmed_authors><pubmed_authors>Tseng C</pubmed_authors><pubmed_authors>Zhang M</pubmed_authors><pubmed_authors>Karev D</pubmed_authors></additional><is_claimable>false</is_claimable><name>When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes.</name><description>Context is of fundamental importance to both human and machine vision; e.g., an object in the air is more likely to be an airplane than a pig. The rich notion of context incorporates several aspects including physics rules, statistical co-occurrences, and relative object sizes, among others. While previous work has focused on crowd-sourced out-of-context photographs from the web to study scene context, controlling the nature and extent of contextual violations has been a daunting task. Here we introduce a diverse, synthetic Out-of-Context Dataset (OCD) with fine-grained control over scene context. By leveraging a 3D simulation engine, we systematically control the gravity, object co-occurrences and relative sizes across 36 object categories in a virtual household environment. We conducted a series of experiments to gain insights into the impact of contextual cues on both human and machine vision using OCD. 
We conducted psychophysics experiments to establish a human benchmark for out-of-context recognition, and then compared it with state-of-the-art computer vision models to quantify the gap between the two. We propose a context-aware recognition transformer model, fusing object and contextual information via multi-head attention. Our model captures useful information for contextual reasoning, enabling human-level performance and better robustness in out-of-context conditions compared to baseline models across OCD and other out-of-context datasets. All source code and data are publicly available at https://github.com/kreimanlab/WhenPigsFlyContext.</description><dates><release>2021-01-01T00:00:00Z</release><publication>2021 Oct</publication><modification>2025-04-05T12:10:49.944Z</modification><creation>2025-04-05T12:10:49.944Z</creation></dates><accession>S-EPMC9432425</accession><cross_references><pubmed>36051852</pubmed><doi>10.1109/iccv48922.2021.00032</doi></cross_references></HashMap>