Call for Papers: Unshared Task at ACL 2016 Workshop on Argument Mining
Argument mining (also known as “argumentation mining”) is a recent
challenge in corpus-based discourse processing that applies a certain
argumentation theory to model and automatically analyze the data at hand.
Interest in this topic has rapidly increased over the past few years, as
demonstrated by a number of events (Workshops on Argumentation Mining at ACL
2014, NAACL 2015, and ACL 2016; Dagstuhl Seminar on Debating Technologies in
2015; and Frontiers and Connections between Argumentation Theory and Natural
Language Processing in 2014, to name a few). Given the wide the range of
different perspectives and approaches within the community, it is now very
important to consolidate the view and shape the further development of argument
Shared tasks have have become a major driver in boosting research in many NLP
fields. The availability of shared annotated data and clear evaluation
criteria, and the presence of several competing systems, allow for fair and
exact comparison and overall progress tracking, which in turn fosters future
research. However, argument mining, as an evolving research field, suffers not
only from the lack of large data sets but also from the absence of a unified
perspective on the tasks to be fulfilled as well as their evaluation. Given the
variety of argumentation models, various argumentative genres and registers,
granularity (i.e., micro-argumentation and macro-argumentation), dimensions of
argument (logos, pathos, ethos), and the overall social context of persuasion,
the goal of defining a widely accepted task requires a substantial agreement,
supported by empirical evidence.
The concept of a so-called “unshared task” is an alternative to shared
tasks. In an unshared task, neither quantitative performance measures nor a
clearly defined problem to be solved is given. Instead, participants are given
only raw unannotated data and an open-ended prompt. The goals are to explore
possible tasks, provide a rigorous definition, propose an annotation
methodology, and define evaluation metrics, among others. This type of activity
has been successful in several areas, such as PoliInformatics (a broad group of
computer scientists, political scientists, economists, and communications
scholars) (Smith et al., 2014) and clinical psychology and NLP (Coppersmith et
al., 2015). Some venues also run shared and unshared tasks in parallel
(Zampieri et al., 2015). The nature of an unshared task is mainly exploratory
and can eventually lead to a deeper understanding of the matter and to laying
down foundations for a future standard shared task.
Given a corpus of various argumentative raw texts, the aim of this unshared
task is to tackle several fundamental research questions in argument mining. In
short, the participants should propose a task with a corresponding annotation
model (scheme), conduct an annotation experiment, and critically evaluate the
overall strengths and weaknesses of the task. In particular, the participants
should focus on:
– Proposing a task to be tackled, supported by a strong motivation for it and
accompanied by a rigorous definition and thorough explanation. We encourage
(but do not require) going beyond existing tasks in argumentation mining
research (such as argument component identification, argument structure
parsing, and stance detection). Some novel perspectives might involve capturing
more of the pragmatic layer (such as the activity, purpose, or roles), user
interactions, attribution, or patterns of debating. On the other hand, tasks
involving detecting claims and premises have already been applied across
registers and domains, so they might also serve as a basis for further
– Properly describing the annotation scheme and its grounding in existing
research in argumentation or communication studies (if applicable). This might
involve applying an existing model of argument/argumentation (Toulmin’s,
Walton’s, Dung’s, Pragma-Dialectic) and its critical assessment by empirical
evidence, or a newly created annotation scheme that fits the task at hand.
– Assessing the strengths and weaknesses of the annotation scheme with respect
to its dependence on domain/register and ease or difficulty to apply while
annotating the data. We suggest reporting inter-annotator agreement.
Experimenting with various annotation set-ups, such as expert-, layman-, or
crowdsourcing-based ones, is also encouraged, where applicable. The level of
required expertise is an important factor for any annotation project as well as
the requirements for supporting annotation tools.
– As we provide raw data for four different registers (see below), we encourage
to propose tasks that can be applied across registers, if possible.
Domain-restricted tasks and models are one of the current bottlenecks in
Papers should be submitted as standard workshop short papers (see
http://argmining2016.arg.tech/index.php/home/call-for-papers/ ). They will be
reviewed by the program committee and non-anonymous reviews will be made
available at the workshop website. Papers should mention “unshared task” in
their titles to distinguish themselves from other workshop submissions. We plan
to organize a special poster session for unshared task papers at the workshop,
followed by an open discussion. The acceptance of submissions will be based on
reviewers’ assessment with focus on novelty, thoroughness, and the overall
Where applicable, we encourage submission of the annotated data along with the
paper as supplementary material, so that the community has a chance to explore
the entire dataset rather than the few samples most likely shown in the papers.
We provide four variants of the task across various registers.
– Variant A: Debate portals – Samples from two-sided debate portal
– Variant B: Debate transcripts – Two speeches (opening and closing) for both
the proponent and the opponent from Intelligence squared debates (the entire
debate has more participants, but the entire transcript would be extremely long
for the purposes of the unshared task)
– Variant C: Opinionated news articles – Editorial articles from Room for
debate from N.Y.Times
– Variant D: Discussions under opinionated articles — Discussions from Room
for debate (The IDs of the debates correspond to IDs of the articles from
Each variant is split into three parts:
– Development set: We encourage participants to perform the exploratory
analysis, task definition, annotation experiments, etc. on this set.
– Test set: This small set might serve as a benchmark for testing the
annotation model (or even a computational model, for participants who go that
far) and reporting agreement measures (if applicable).
– Crowdsourcing set: a bit larger set suitable for participants who plan on
Note that these various splits are only a recommendation and not obligatory;
participants are absolutely free to use the entire dataset if their task so
The data is available at GitHub:
It can be easily forked, checked out, or downloaded as a ZIP file.
Same as workshop paper submissions, see
Ivan Habernal, UKP Lab, TU Darmstadt
Iryna Gurevych, UKP Lab, TU Darmstadt
Coppersmith, G., Dredze, M., Harman, C., Hollingshead, K., & Mitchell, M.
(2015). CLPsych 2015 Shared Task: Depression and PTSD on Twitter. In
Proceedings of the 2nd Workshop on Computational Linguistics and Clinical
Psychology: From Linguistic Signal to Clinical Reality (pp. 31-39). Denver,
Colorado: Association for Computational Linguistics.
Smith, N. A., Cardie, C., Washington, A., & Wilkerson, J. (2014). Overview of
the 2014 NLP Unshared Task in PoliInformatics. In Proceedings of the ACL 2014
Workshop on Language Technologies and Computational Social Science (pp. 5-7).
Baltimore, MD, USA: Association for Computational Linguistics.
Zampieri, M., Tan, L., Ljubešić, N., Tiedemann, J., & Nakov, P. (2015).
Overview of the DSL Shared Task 2015. In Proceedings of the Joint Workshop on
Language Technology for Closely Related Languages, Varieties and Dialects (pp.
1-9). Hissar, Bulgaria: Association for Computational Linguistics.