1
Visual Relationship Detection with Language Priors
2
Embracing Error to Enable Rapid Crowdsourcing
3
Learning Common Sense through Visual Abstraction
4
Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval
5
Word sense disambiguation: a survey
6
Building a Large-scale Multimodal Knowledge Base System for Answering Visual Queries
7
Attention alters predictive processing
8
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
9
VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases
10
Learning semantic relationships for better action retrieval in images
11
Describing Common Human Visual Actions in Images
12
Image retrieval using scene graphs
13
Discovering states and transformations in image collections
14
Mind's eye: A recurrent visual representation for image caption generation
15
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
16
Learning to Answer Questions from Image Using Convolutional Neural Network
17
Visual Madlibs: Fill in the blank Image Generation and Question Answering
18
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question
19
Exploring Models and Data for Image Question Answering
20
Image Question Answering: A Visual Semantic Embedding Model and a New Dataset
21
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images
22
VQA: Visual Question Answering
24
We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers
25
Microsoft COCO Captions: Data Collection and Evaluation Server
26
Scripts, plans, goals, and understanding: an inquiry into human knowledge structures By Roger C. Schank and Robert P. Abelson (review)
28
RMSProp and equilibrated adaptive learning rates for non-convex optimization
29
Phrase-based Image Captioning
30
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
31
Deep visual-semantic alignments for generating image descriptions
32
CIDEr: Consensus-based image description evaluation
33
From captions to visual concepts and back
34
Show and tell: A neural image caption generator
35
Long-term recurrent convolutional networks for visual recognition and description
36
Explain Images with Multimodal Recurrent Neural Networks
37
A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input
38
A Unified Model for Word Sense Representation and Disambiguation
39
Going deeper with convolutions
40
Reasoning about Object Affordances in a Knowledge Base Representation
41
Zero-Shot Learning via Visual Abstraction
42
Very Deep Convolutional Networks for Large-Scale Image Recognition
43
ImageNet Large Scale Visual Recognition Challenge
44
Relation Classification via Convolutional Deep Neural Network
45
Nonparametric Part Transfer for Fine-Grained Recognition
46
Incorporating Scene Context and Object Layout into Appearance Modeling
47
Multimodal Neural Language Models
48
The Stanford CoreNLP Natural Language Processing Toolkit
49
Semantic Parsing for Text to 3D Scene Generation
50
Meteor Universal: Language Specific Translation Evaluation for Any Target Language
51
Microsoft COCO: Common Objects in Context
52
The SUN Attribute Database: Beyond Categories for Deeper Scene Understanding
53
From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
54
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
55
NEIL: Extracting Visual Knowledge from Web Data
56
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
57
Bringing Semantics into Focus Using Visual Abstraction
58
Understanding Indoor Scenes Using 3D Geometric Phrases
59
Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics
60
Efficient Estimation of Word Representations in Vector Space
61
ImageNet classification with deep convolutional neural networks
62
Indoor Segmentation and Support Inference from RGBD Images
63
Semantic Compositionality through Recursive Matrix-Vector Spaces
64
Elementary: Large-Scale Knowledge-Base Construction via Machine Learning and Statistical Inference
65
Recognizing proxemics in personal photos
66
Pedestrian Detection: An Evaluation of the State of the Art
67
Weakly Supervised Learning of Interactions between Humans and Objects
68
Im2Text: Describing Images Using 1 Million Captioned Photographs
69
The Caltech-UCSD Birds-200-2011 Dataset
70
Recognition using visual phrases
71
Every Picture Tells a Story: Generating Sentences from Images
72
Improving the Fisher Kernel for Large-Scale Image Classification
73
Building Watson: An Overview of the DeepQA Project
74
Modeling mutual context of object and human pose in human-object interaction activities
75
SUN database: Large-scale scene recognition from abbey to zoo
76
Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition
77
Learning to detect unseen object classes by between-class attribute transfer
78
Describing objects by their attributes
79
ImageNet: A large-scale hierarchical image database
80
Recognizing linked events: Searching the space of feasible explanations
81
StatSnowball: a statistical approach to extracting entity relationships
82
Word sense disambiguation: A survey
83
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
84
Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers
85
Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments
86
Support vector machines
87
Recognition by association via learning per-exemplar distances
88
LabelMe: A Database and Web-Based Tool for Image Annotation
89
Learning Visual Attributes
90
Introduction to a Large-Scale General Purpose Ground Truth Database: Methodology, Annotation Tool and Benchmarks
91
Tree Kernel-Based Relation Extraction with Context-Sensitive Structured Parse Tree Information
92
Caltech-256 Object Category Dataset
93
A Shortest Path Dependency Kernel for Relation Extraction
94
Exploring Various Knowledge in Relation Extraction
95
A Statistical Approach to Texture Classification from Single Images
96
Dependency Tree Kernels for Relation Extraction
97
The Senseval-3 English lexical sample task
98
Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories
99
A Template-Based Approach Toward Acquisition of Logical Sentences
100
Bleu: a Method for Automatic Evaluation of Machine Translation
101
Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
102
The Berkeley FrameNet Project
103
Using Corpus Statistics and WordNet Relations for Sense Identification
104
Long Short-Term Memory
105
WordNet: A Lexical Database for English
106
The second naive physics manifesto
107
Culture and Human Development: A New Look
108
Qualitative Process Theory
109
Scripts, plans, goals and understanding: an inquiry into human knowledge structures
110
Learning (ICML-14), pages 595–603
111
Malinowski, M., Rohrbach, M., and Fritz, M. (2015)
112
A visual Turing test for computer vision systems
113
arXiv preprint arXiv:1504.00325
115
biguation. In EMNLP, pages 1025–1035
116
Wah, C., Branson, S., Welinder, P., Per-ona,
117
Ferrucci, D., Brown, E., Chu-Carroll, J., Fan
118
Toward Never Ending Language Learning
120
NLTK: The Natural Language Toolkit
121
Verbnet: a broad-coverage, comprehensive verb lexicon
122
Varma, M. and Zisserman,
123
A Statistical Approach to Texture Classification from Single Images
124
A Comparison of Document Clustering Techniques
125
A solvable connectionist model of immediate recall of ordered lists
126
The Naive Physics Manifesto
127
International Journal of Computer Vision manuscript No. (will be inserted by the editor) The PASCAL Visual Object Classes (VOC) Challenge
128
Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition
130
of the Association for Computational Linguistics