Research in natural language understanding and textual inference has advanced considerably in recent years, resulting in powerful models that are able to read and understand texts, even outperforming humans in some cases. However, it remains challenging to answer questions that go beyond the texts themselves, requiring the use of additional commonsense knowledge. Previous work has explored using both explicit representations of background knowledge (e.g., ConceptNet or NELL), and latent representations that capture some aspects of commonsense (e.g., OpenAI GPT). These and any other methods for representing and using commonsense in NLP are of interest to this workshop.
The COIN workshop aims to bring together researchers interested in modeling commonsense knowledge, developing computational models thereof, and applying commonsense inference methods in NLP tasks. We are interested in all types of commonsense knowledge representation, and we explicitly encourage work that makes use of knowledge bases, as well as approaches developed to mine or learn commonsense from other sources. The workshop is also open to evaluation proposals that explore new ways of evaluating methods of commonsense inference, going beyond established natural language processing tasks.
The workshop will also include two shared tasks on commonsense machine reading comprehension in English, one based on everyday scenarios and one based on news events. See Shared Tasks for more details.
If you are participating, or interested in participating, in COIN, we welcome you to join the COIN mailing list on Google Groups: follow the link and click "Join Group".
Despite considerable advances in deep learning, AI remains narrow and brittle. One fundamental limitation is its lack of commonsense intelligence: reasoning about everyday situations and events, which, in turn, requires knowledge about how the physical and social world works. In this talk, I will share some of our recent efforts to crack commonsense intelligence.
First, I will introduce ATOMIC, the atlas of everyday commonsense knowledge and reasoning, organized as a graph of 877k if-then rules (e.g., “if X pays Y a compliment, then Y will likely return the compliment”). Next, I will introduce COMET, our deep neural networks that learn from and generalize beyond the ATOMIC commonsense graph. Finally, I will present RAINBOW, a collection of seven benchmarks covering a wide spectrum of commonsense intelligence, from natural language inference to abductive reasoning to visual commonsense reasoning. I will conclude the talk by discussing major open research questions, including the importance of algorithmic solutions that reduce incidental biases in data, which can otherwise lead to overestimation of true AI capabilities.
9:00 | Opening
9:10 | Invited talk: Commonsense Intelligence---Cracking the Longstanding Challenge in AI [PDF] | Yejin Choi
10:10 | Understanding Commonsense Inference Aptitude of Deep Contextual Representations | Jeff Da and Jungo Kasai
10:30 | Coffee break
11:00 | A Hybrid Neural Network Model for Commonsense Reasoning | Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao
11:20 | Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering | Kaixin Ma, Jonathan Francis, Quanyang Lu, Eric Nyberg, Alessandro Oltramari
11:40 | When Choosing Plausible Alternatives, Clever Hans can be Clever | Pride Kavumba, Naoya Inoue, Benjamin Heinzerling, Keshav Singh, Paul Reisert, Kentaro Inui
12:00 | Commonsense about Human Senses: Labeled Data Collection Processes | Ndapa Nakashole
12:20 | Lunch break
14:00 | Invited talk: Learning to Reason: from Question Answering to Problem Solving [PDF] | Michael Witbrock
15:00 | Extracting Common Inference Patterns from Semi-Structured Explanations | Sebastian Thiem and Peter Jansen
15:20 | Poster session & coffee break
Posters:
Commonsense Inference in Natural Language Processing (COIN) - Shared Task Report | Simon Ostermann, Sheng Zhang, Michael Roth, Peter Clark
KARNA at COIN Shared Task 1: Bidirectional Encoder Representations from Transformers with relational knowledge for machine comprehension with common sense | Yash Jain and Chinmay Singh
IIT-KGP at COIN 2019: Using pre-trained Language Models for modeling Machine Comprehension | Prakhar Sharma and Sumegh Roychowdhury
Jeff Da at COIN - Shared Task | Jeff Da
Pingan Smart Health and SJTU at COIN - Shared Task: utilizing Pre-trained Language Models and Common-sense Knowledge in Machine Reading Tasks | Xiepeng Li, Zhexi Zhang, Wei Zhu, Zheng Li, Yuan Ni, Peng Gao, Junchi Yan, Guotong Xie
BLCU-NLP at COIN-Shared Task1: Stagewise Fine-tuning BERT for Commonsense Inference in Everyday Narrations | Chunhua Liu and Dong Yu
16:20 | Commonsense inference in human-robot communication | Aliaksandr Huminski, Yan Bin Ng, Kenneth Kwok, Francis Bond
16:40 | Diversity-aware Event Prediction based on a Conditional Variational Autoencoder with Reconstruction | Hirokazu Kiyomaru, Kazumasa Omura, Yugo Murawaki, Daisuke Kawahara, Sadao Kurohashi
17:00 | Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text | Ian Porada, Kaheer Suleman, Jackie Chi Kit Cheung
17:15 | How Pre-trained Word Representations Capture Commonsense Physical Comparisons | Pranav Goel, Shi Feng, Jordan Boyd-Graber
This workshop includes two shared tasks on English reading comprehension using commonsense knowledge. The first task is a multiple choice reading comprehension task on everyday narrations. The second task is a cloze task on news texts.
In contrast to other machine comprehension tasks and workshops, our focus will be on the inferences over commonsense knowledge about events and participants that are required for text understanding. Participants are encouraged to use any external resources that could improve their systems. Below we give a list of external resources that we expect to be helpful for the tasks.
Submissions to either shared task will be added to the development data leaderboard. The test data for both tasks will not be made public; instead, you will submit your models so that we can run them on the test data. During the evaluation phase (the first three weeks of June), your submissions will count towards the final ranking on the test data. The final leaderboard will be made public only after the evaluation phase ends.
The development set leaderboard will be updated approximately once a week with all current submissions.
If you want to participate or have any questions, please join the Google Group for participants. We will post updates there and answer questions about the shared tasks and the workshop.