
[Q&A] Roundup of EMNLP 2022 Papers on Rumor/Fake News Detection and Fact-Checking (with Links)

艾尔quit posted on 2022-11-02 18:47:49
 
The EMNLP 2022 organizing committee recently released the list of accepted papers. The conference accepted 829 main-conference papers (715 long, 114 short) and 552 Findings papers (456 long, 96 short). Since the total number of submissions has not been disclosed, the acceptance rate cannot be computed yet. Among the accepted papers, 12.4% of the long papers and 11.9% of the short papers came through the ACL Rolling Review mechanism. EMNLP 2022 will be held December 7-11, 2022 in Abu Dhabi, the capital of the United Arab Emirates. See the official acceptance list for details.

This post collects the accepted papers on fact-checking, rumor detection, and fake news detection, 14 in total, with preprint versions linked wherever they could be found. Overall, this year's accepted papers show strong diversity, with trailblazing work outnumbering follow-up work.
Main Conference Long Papers


1. [Stance-based, multicultural misinformation] Stanceosaurus: Classifying Stance Towards Multicultural Misinformation
Authors: Jonathan Qiaoyi Zheng, Ashutosh Baheti, Tarek Naous, Wei Xu and Alan Ritter (Georgia Institute of Technology)
Track: Resources and Evaluation
Abstract: We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi and Arabic annotated with stance towards 250 misinformation claims. As far as we are aware, it is the largest corpus annotated with stance towards misinformation claims. The claims in Stanceosaurus originate from 15 fact-checking sources that cover diverse geographical regions and cultures. Unlike existing stance datasets, we introduce a more fine-grained 5-class labeling strategy with additional subcategories to distinguish implicit stance. Pre-trained transformer-based stance classifiers that are fine-tuned on our corpus show good generalization on unseen claims and regional claims from countries outside the training data. Cross-lingual experiments demonstrate Stanceosaurus' capability of training multilingual models, achieving 53.1 F1 on Hindi and 50.4 F1 on Arabic without any target-language fine-tuning. Finally, we show how a domain adaptation method can be used to improve performance on Stanceosaurus using additional RumourEval-2019 data. We will make Stanceosaurus publicly available to the research community upon publication and hope it will encourage further work on misinformation identification across languages and cultures.
Link:
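To make the setup concrete, here is a minimal sketch (my own illustration, not the authors' code) of fine-tuning a claim-conditioned stance classifier with Hugging Face transformers. The multilingual checkpoint and the 5-class label names are assumptions.

```python
# Minimal sketch (not the authors' code): fine-tuning a transformer to
# classify a tweet's stance towards a misinformation claim. The
# checkpoint and 5-class label names below are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["support", "deny", "query", "comment", "unrelated"]  # assumed

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(LABELS)
)

# Claim and tweet are encoded as a sentence pair so the stance
# prediction is conditioned on the claim text.
batch = tok("5G towers spread COVID-19",
            "There is zero evidence for this.",
            truncation=True, return_tensors="pt")
batch["labels"] = torch.tensor([LABELS.index("deny")])

model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch).loss  # cross-entropy over the 5 classes
loss.backward()
optimizer.step()
```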
2. [Question generation for fact-checking] Varifocal Question Generation for Fact-checking
Authors: Nedjma Djouhra Ousidhoum, Zhangdie Yuan and Andreas Vlachos (University of Cambridge)
Track: NLP Applications
Abstract: Fact-checking requires retrieving evidence related to a claim under investigation. The task can be formulated as question generation based on a claim, followed by question answering. However, recent question generation approaches assume that the answer is known and typically contained in a passage given as input, whereas such passages are what is being sought when verifying a claim. In this paper, we present Varifocal, a method that generates questions based on different focal points within a given claim, i.e. different spans of the claim and its metadata, such as its source and date. Our method outperforms previous work on a fact-checking question generation dataset on a wide range of automatic evaluation metrics. These results are corroborated by our manual evaluation, which indicates that our method generates more relevant and informative questions. We further demonstrate the potential of focal points in generating sets of clarification questions for product descriptions.
Link:
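As a rough picture of the focal-point idea, here is a hedged sketch: one question is generated per focal point by prefixing that focus to the claim. The conditioning format is my assumption, and an untuned t5-small merely stands in for a generator fine-tuned on fact-checking question-generation data.

```python
# Sketch under assumptions (not Varifocal's implementation): generate
# one question per focal point. An untuned t5-small stands in for a
# model fine-tuned on question-generation data, so outputs are rough.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

claim = "The city budget doubled between 2018 and 2020."
# Focal points: claim spans plus metadata such as source and date.
focal_points = ["the city budget", "doubled", "between 2018 and 2020",
                "source: local newspaper", "date: 2021-03-01"]

for focus in focal_points:
    prompt = f"generate question: focus: {focus} claim: {claim}"
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=32, num_beams=4)
    print(focus, "->", tok.decode(out[0], skip_special_tokens=True))
```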
3. [Information change when scientific findings are reported in the news] Modeling Information Change in Science Communication with Semantically Matched Paraphrases
Authors: Dustin Wright, Jiaxin Pei, David Jurgens and Isabelle Augenstein (University of Copenhagen, University of Michigan, Ann Arbor)
Track: Computational Social Science and Cultural Analytics
Abstract: Whether the media faithfully communicate scientific information has long been a core issue to the science community. Automatically identifying paraphrased scientific findings could enable large-scale tracking and analysis of information changes in the science communication process, but this requires systems to understand the similarity between scientific information across multiple domains. To this end, we present the SCIENTIFIC PARAPHRASE AND INFORMATION CHANGE DATASET (SPICED), the first paraphrase dataset of scientific findings annotated for degree of information change. SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers. We demonstrate that SPICED poses a challenging task and that models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims. Finally, we show that models trained on SPICED can reveal large-scale trends in the degrees to which people and organizations faithfully communicate new scientific findings. Data, code, and pre-trained models are available at http://www.copenlu.com/publication/2022_emnlp_wright/.
Link:
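A quick sketch of the matching problem SPICED targets: scoring how closely a news sentence tracks the original finding. I use off-the-shelf sentence-embedding similarity as a stand-in; SPICED itself annotates a graded information-change score that a trained regression head would predict instead.

```python
# Sketch (my stand-in, not the SPICED training code): score how closely
# a news sentence matches the original scientific finding with an
# off-the-shelf sentence-embedding model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

finding = "The drug reduced symptom duration by about one day in adults."
news = "Scientists say new pill cures the illness."

emb = model.encode([finding, news], convert_to_tensor=True)
score = util.cos_sim(emb[0], emb[1]).item()
print(f"similarity = {score:.3f}")  # low score hints at information change
```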
4. [Identifying claim spans on Twitter] Empowering the Fact-checkers! Automatic Identification of Claim Spans on Twitter
Authors: Megha Sundriyal, Atharva Kulkarni, Vaibhav Pulastya, Md. Shad Akhtar and Tanmoy Chakraborty (IIIT Delhi, IIT Delhi)
Track: Computational Social Science and Cultural Analytics
Abstract: The widespread diffusion of medical and political claims in the wake of COVID-19 has led to a voluminous rise in misinformation and fake news. The current vogue is to employ manual fact-checkers to efficiently classify and verify such data to combat this avalanche of claim-ridden misinformation. However, the rate of information dissemination is such that it vastly outpaces the fact-checkers' strength. Therefore, to aid manual fact-checkers in eliminating the superfluous content, it becomes imperative to automatically identify and extract the snippets of claim-worthy (mis)information present in a post. In this work, we introduce the novel task of Claim Span Identification (CSI). We propose CURT, a large-scale Twitter corpus with token-level claim spans on more than 7.5k tweets. Furthermore, along with the standard token classification baselines, we benchmark our dataset with DABERTa, an adapter-based variation of RoBERTa. The experimental results attest that DABERTa outperforms the baseline systems across several evaluation metrics, improving by about 1.5 points. We also report detailed error analysis to validate the model's performance along with the ablation studies. Lastly, we release our comprehensive span annotation guidelines for public use.
Link:
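For reference, the standard token-classification baseline the paper benchmarks against looks roughly like this. This is a sketch, not DABERTa; the checkpoint and BIO label scheme are assumptions, and the classification head here is untrained.

```python
# Sketch of the standard baseline (not CURT/DABERTa): claim span
# identification as BIO token classification. Checkpoint and label
# scheme are assumptions; the head below is untrained.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

LABELS = ["O", "B-CLAIM", "I-CLAIM"]
tok = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS)
)

words = "masks cause oxygen deficiency say doctors".split()
enc = tok(words, is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits          # (1, seq_len, num_labels)
pred = logits.argmax(-1)[0].tolist()

# Map subword predictions back to words; fine-tuning on span-annotated
# tweets is what makes these labels meaningful.
for word_id, p in zip(enc.word_ids(), pred):
    if word_id is not None:
        print(words[word_id], LABELS[p])
```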
5. [Fact-checking complex claims] Generating Literal and Implied Subquestions to Fact-check Complex Claims
Authors: Jifan Chen, Aniruddh Sriram, Eunsol Choi and Greg Durrett (The University of Texas at Austin)
Track: Semantics: Lexical, Sentence level, Textual Inference and Other areas
Abstract: Verifying political claims is a challenging task, as politicians can use various tactics to subtly misrepresent the facts for their agenda. Existing automatic fact-checking systems fall short here, and their predictions like "half-true" are not very useful in isolation, since it is unclear which parts of a claim are true and which are not. In this work, we focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim. We present CLAIMDECOMP, a dataset of decompositions for over 1000 claims. Given a claim and its verification paragraph written by fact-checkers, our trained annotators write subquestions covering both explicit propositions of the original claim and its implicit facets, such as asking about additional political context that changes our view of the claim's veracity. We study whether state-of-the-art models can generate such subquestions, showing that these models generate reasonable questions to ask, but predicting the comprehensive set of subquestions from the original claim without evidence remains challenging. We further show that these subquestions can help identify relevant evidence to fact-check the full claim and derive the veracity through their answers, suggesting that they can be useful pieces of a fact-checking pipeline.
Link:
6. [Logic-guided, multi-hop fact verification] Natural Logic-guided Autoregressive Multi-hop Document Retrieval for Fact Verification
Authors: Rami Aly and Andreas Vlachos (University of Cambridge)
Track: NLP Applications
Abstract: A key component of fact verification is evidence retrieval, often from multiple documents. Recent approaches use dense representations and condition the retrieval of each document on the previously retrieved ones. The latter step is performed over all the documents in the collection, requiring storing their dense representations in an index, thus incurring a high memory footprint. An alternative paradigm is retrieve-and-rerank, where documents are retrieved using methods such as BM25, their sentences are reranked, and further documents are retrieved conditioned on these sentences, reducing the memory requirements. However, such approaches can be brittle as they rely on heuristics and assume hyperlinks between documents. We propose a novel retrieve-and-rerank method for multi-hop retrieval that consists of a retriever that jointly scores documents in the knowledge source and sentences from previously retrieved documents using an autoregressive formulation and is guided by a proof system based on natural logic that dynamically terminates the retrieval process if the evidence is deemed sufficient. This method exceeds or is on par with the current state-of-the-art on FEVER, HoVer and FEVEROUS-S, while using 5 to 10 times less memory than competing systems. Evaluation on an adversarial dataset indicates improved stability of our approach compared to commonly deployed threshold-based methods. Finally, the proof system helps humans predict model decisions correctly more often than using the evidence alone.
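The retrieve-and-rerank loop is easy to see in miniature. The sketch below is a drastic simplification under my own assumptions: BM25 does the per-hop retrieval, the next query is conditioned on evidence kept so far, and a placeholder sufficiency check stands in for the paper's natural-logic proof system.

```python
# Drastically simplified sketch (my assumptions, not the paper's model):
# a multi-hop retrieve loop where each hop's query is conditioned on
# evidence kept so far.
from rank_bm25 import BM25Okapi

docs = [
    "Alice founded Acme Corp in 1999.",
    "Acme Corp is headquartered in Zurich.",
    "Zurich is the largest city in Switzerland.",
]
bm25 = BM25Okapi([d.lower().split() for d in docs])

def sufficient(evidence):
    # Placeholder: the paper decides sufficiency with a natural-logic
    # proof over the claim and the retrieved evidence.
    return len(evidence) >= 2

claim = "Alice founded a company based in Switzerland."
query, evidence, seen = claim, [], set()
for hop in range(3):  # bounded number of hops
    scores = bm25.get_scores(query.lower().split())
    ranked = sorted(range(len(docs)), key=lambda i: -scores[i])
    best = next(i for i in ranked if i not in seen)
    seen.add(best)
    evidence.append(docs[best])
    if sufficient(evidence):
        break
    query = claim + " " + " ".join(evidence)  # condition the next hop

print(evidence)
```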
7. [Table-based fact verification] PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training
Authors: Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao and Xiaoyong Du (Renmin University of China, MBZUAI)
Track: Unsupervised and Weakly-Supervised Methods in NLP
Abstract: Fact verification has attracted a lot of attention recently, e.g., in journalism, marketing, and policymaking, as misinformation and disinformation can sway one's opinion and affect one's actions. While fact-checking is a hard task in general, in many cases, false statements can be easily debunked based on analytics over tables with reliable information. Hence, table-based fact verification has recently emerged as an important and growing research area. Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, this paper introduces PASTA for table-based fact verification via pre-training with synthesized sentence-table cloze questions. In particular, we design six types of common sentence-table cloze tasks, including Filter, Aggregation, Superlative, Comparative, Ordinal, and Unique, based on which we synthesize a large corpus consisting of 1.2 million sentence-table pairs from WikiTables. PASTA uses a recent pre-trained LM, DeBERTaV3, and further pre-trains it on our corpus. Our experimental results show that PASTA achieves new state-of-the-art (SOTA) performance on two table-based fact verification datasets, TabFact and SEM-TAB-FACTS. In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms previous SOTA by 4.7% (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small test set is narrowed to just 1.5% (90.6% vs. 92.1%).
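The cloze-synthesis idea can be illustrated in a few lines. Below is my own toy version of one of the six task types (Superlative); PASTA's actual templates and WikiTables pipeline are more elaborate.

```python
# Toy illustration (not PASTA's generation pipeline): synthesizing one
# "Superlative" sentence-table cloze pair of the kind the paper
# pre-trains on. Column names and the template are my own.
table = {
    "country": ["France", "Japan", "Brazil"],
    "population_millions": [68, 125, 214],
}

def superlative_cloze(table, key_col, num_col):
    # The answer is the key-column value of the row with the maximum
    # numeric value; the sentence masks it for the LM to recover.
    idx = max(range(len(table[num_col])), key=lambda i: table[num_col][i])
    sentence = f"The {key_col} with the highest {num_col} is [MASK]."
    return sentence, table[key_col][idx]

sent, ans = superlative_cloze(table, "country", "population_millions")
print(sent, "->", ans)  # ... is [MASK]. -> Brazil
```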
8. [Real-world fact verification] Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation
Authors: Max Glockner, Yufang Hou and Iryna Gurevych (TU Darmstadt, Germany; IBM Research Europe)
Track: Theme Track
Abstract: Misinformation emerges in times of uncertainty when credible information is limited. This is challenging for NLP-based fact-checking as it relies on counter-evidence, which may not yet be available. Despite increasing interest in automatic fact-checking, it is still unclear if automated approaches can realistically refute harmful real-world misinformation. Here, we contrast and compare NLP fact-checking with how professional fact-checkers combat misinformation in the absence of counter-evidence. In our analysis, we show that, by design, existing NLP task definitions for fact-checking cannot refute misinformation as professional fact-checkers do for the majority of claims. We then define two requirements that the evidence in datasets must fulfill for realistic fact-checking: It must be (1) sufficient to refute the claim and (2) not leaked from existing fact-checking articles. We survey existing fact-checking datasets and find that all of them fail to satisfy both criteria. Finally, we perform experiments to demonstrate that models trained on a large-scale fact-checking dataset rely on leaked evidence, which makes them unsuitable in real-world scenarios. Taken together, we show that current NLP fact-checking cannot realistically combat real-world misinformation because it depends on unrealistic assumptions about counter-evidence in the data.
Link:
9. [News media credibility profiling] GREENER: Graph Neural Networks for News Media Profiling
Authors: Panayot Panayotov, Utsav Shukla, Husrev Taha Sencar, Mohamed Nabeel and Preslav Nakov (MBZUAI)
Track: NLP Applications
Abstract: We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and "fake news" detection, but it addresses the issue at a coarser granularity compared to looking at an individual article or an individual claim. This is useful as it allows profiling entire media outlets in advance. Unlike previous work, which has focused primarily on text (e.g., on the text of the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here our main focus is on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e., the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-of-the-art results reported on two datasets. When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, prediction accuracy is found to improve by 2.5-27 macro-F1 points for the two tasks.
Link:
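The audience-overlap graph at the core of GREENER can be sketched simply. This is my illustration of the idea; the paper's actual edge construction and the GNN that runs over the graph are omitted.

```python
# Sketch (my illustration): building an inter-media graph from audience
# overlap with Jaccard similarity over follower sets. The GNN that runs
# over this graph is omitted.
audiences = {
    "outlet_a": {"u1", "u2", "u3", "u4"},
    "outlet_b": {"u3", "u4", "u5"},
    "outlet_c": {"u9"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

edges = []
outlets = list(audiences)
for i, m in enumerate(outlets):
    for n in outlets[i + 1:]:
        w = jaccard(audiences[m], audiences[n])
        if w > 0.1:  # threshold is an arbitrary choice for this sketch
            edges.append((m, n, w))

print(edges)  # [('outlet_a', 'outlet_b', 0.4)]
```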
Findings Long Papers

1. [Information manipulation detection, with wartime Russian media as a case study] Challenges and Opportunities in Information Manipulation Detection: An Examination of Wartime Russian Media
Authors: Chan Young Park, Julia Mendelsohn, Anjalie Field and Yulia Tsvetkov (Carnegie Mellon University, University of Michigan, Stanford University, University of Washington)
Track: Theme Track
Abstract: NLP research on public opinion manipulation campaigns has primarily focused on detecting overt strategies such as fake news and disinformation. However, information manipulation in the ongoing Russia-Ukraine war exemplifies how governments and media also employ more nuanced strategies. We release a new dataset, VoynaSlov, containing 38M+ posts from Russian media outlets on Twitter and VKontakte, as well as public activity and responses, immediately preceding and during the 2022 Russia-Ukraine war. We apply standard and recently-developed NLP models on VoynaSlov to examine agenda setting, framing, and priming, several strategies underlying information manipulation, and reveal variation across media outlet control, social media platform, and time. Our examination of these media effects and extensive discussion of current approaches' limitations encourage further development of NLP models for understanding information manipulation in emerging crises, as well as other real-world and interdisciplinary tasks.
Link:
2. [Question-answering based fact verification] QaDialMoE: Question-answering Dialogue based Fact Verification with Mixture of Experts
Authors: Longzheng Wang, Peng Zhang, Xiaoyu Sean Lu, Lei Zhang, Chaoyang Yan and Chuang Zhang (Nanjing University of Science and Technology)
Track: Semantics: Lexical, Sentence level, Textual Inference and Other areas
Abstract: Fact verification is an essential tool to mitigate the spread of false information online, which has gained widespread attention recently. However, fact verification in question-answering dialogues is still underexplored. In this paper, we propose a neural network based approach called question-answering dialogue based fact verification with mixture of experts (QaDialMoE). It exploits questions and evidence effectively in the verification process and can significantly improve the performance of fact verification. Specifically, we exploit the mixture of experts to focus on various interactions among responses, questions and evidence. A manager with an attention guidance module is implemented to guide the training of experts and assign a reasonable attention score to each expert. A prompt module is developed to generate synthetic questions that make our approach more generalizable. Finally, we evaluate QaDialMoE and conduct a comparative study on three benchmark datasets. The experimental results demonstrate that our QaDialMoE outperforms previous approaches by a large margin and achieves new state-of-the-art results on all benchmarks, including accuracy improvements to 84.26% on HEALTHVER, 78.7% on the FAVIQ A dev set, 86.1% on the FAVIQ R dev set (86.0% on its test set), and 89.5% on COLLOQUIAL. To the best of our knowledge, this is the first work to investigate question-answering dialogue based fact verification, and it achieves new state-of-the-art results on various benchmark datasets.
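For intuition about the expert/manager split, here is a generic mixture-of-experts layer in PyTorch. This is a sketch only; QaDialMoE's attention-guidance and prompt modules are not modeled here.

```python
# Generic mixture-of-experts layer (a sketch, not QaDialMoE): a gating
# "manager" assigns attention scores to experts and the output is the
# score-weighted sum of expert outputs.
import torch
import torch.nn as nn

class SimpleMoE(nn.Module):
    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
             for _ in range(n_experts)]
        )
        self.manager = nn.Linear(dim, n_experts)  # gating network

    def forward(self, x):
        # x: (batch, dim) fused response/question/evidence representation
        gates = torch.softmax(self.manager(x), dim=-1)        # (batch, E)
        outs = torch.stack([e(x) for e in self.experts], 1)   # (batch, E, dim)
        return (gates.unsqueeze(-1) * outs).sum(dim=1)        # (batch, dim)

moe = SimpleMoE(dim=16)
print(moe(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```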
3. [Open-domain, scientific claim verification] SciFact-Open: Towards open-domain scientific claim verification
Authors: David Wadden, Kyle Lo, Bailey E. Kuehl, Arman Cohan, Iz Beltagy, Lucy Lu Wang and Hannaneh Hajishirzi (University of Washington, Allen Institute for AI)
Track: NLP Applications
Abstract: While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature. Moving to this open-domain evaluation setting, however, poses unique challenges; in particular, it is infeasible to exhaustively annotate all evidence documents. In this work, we present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems on a corpus of 500K research abstracts. Drawing upon pooling techniques from information retrieval, we collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models. We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1. In addition, analysis of the evidence in SciFact-Open reveals interesting phenomena likely to appear when claim verification systems are deployed in practice, e.g., cases where the evidence supports only a special case of the claim. Our dataset is available at https://github.com/dwadden/scifact-open.
Link:
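Pooling, borrowed from IR evaluation, is simple to state in code. A minimal sketch under my assumptions: union the top-k evidence predictions of each system per claim and send that pool to annotators.

```python
# Minimal sketch of IR-style pooling as used to build SciFact-Open:
# union each system's top-k predicted evidence documents per claim.
# System names and doc ids are illustrative.
def pool_predictions(system_rankings, k=10):
    pool = set()
    for ranking in system_rankings.values():  # ranked doc ids per system
        pool.update(ranking[:k])
    return pool

rankings = {
    "sys1": ["d3", "d7", "d1"],
    "sys2": ["d7", "d9", "d2"],
}
print(pool_predictions(rankings, k=2))  # {'d3', 'd7', 'd9'}
```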
4. [Document-level detection of previously fact-checked claims] Assisting the Human Fact-Checkers: Detecting All Previously Fact-Checked Claims in a Document
Authors: Shaden Shaar, Nikola Georgiev, Firoj Alam, Giovanni Da San Martino, Aisha Mohamed and Preslav Nakov (Qatar Computing Research Institute; University of Padua, Italy; MBZUAI)
Track: NLP Applications
Abstract: Given the recent proliferation of false claims online, there has been a lot of manual fact-checking effort. As this is very time-consuming, human fact-checkers can benefit from tools that can support them and make them more efficient. Here, we focus on building a system that could provide such support. Given an input document, it aims to detect all sentences that contain a claim that can be verified by some previously fact-checked claims (from a given database). The output is a re-ranked list of the document sentences, so that those that can be verified are ranked as high as possible, together with corresponding evidence. Unlike previous work, which has looked into claim retrieval, here we take a document-level perspective. We create a new manually annotated dataset for the task, and we propose suitable evaluation measures. We further experiment with a learning-to-rank approach, achieving sizable performance gains over several strong baselines. Our analysis demonstrates the importance of modeling text similarity and stance, while also taking into account the veracity of the retrieved previously fact-checked claims. We believe that this research would be of interest to fact-checkers, journalists, media, and regulatory authorities.
Link:
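A bare-bones version of the task setup (not the paper's learning-to-rank model, which also uses stance and veracity signals): rank the document's sentences by their best match against the fact-check database, surfacing verifiable sentences together with the matching claim.

```python
# Sketch of the task setup (my simplification, not the paper's system):
# rank document sentences by their best match in a fact-check database.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
db = ["Vaccines do not cause autism.",
      "The Earth's climate is warming due to human activity."]
doc_sentences = ["The senator said vaccines are behind the rise in autism.",
                 "He then moved on to the budget."]

sims = util.cos_sim(model.encode(doc_sentences, convert_to_tensor=True),
                    model.encode(db, convert_to_tensor=True))
best, idx = sims.max(dim=1)  # best database match per sentence

for rank in best.argsort(descending=True):
    print(f"{best[rank].item():.2f}", doc_sentences[rank],
          "<->", db[idx[rank]])
```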
5. [Logical fallacy detection] Logical Fallacy Detection
Authors: Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng LYU, Mrinmaya Sachan, Rada Mihalcea and Bernhard Schoelkopf (Max Planck Institute, Germany; ETH Zurich; BITS Pilani, India; IIT Kharagpur, India; Saarland University, Germany; University of Hong Kong)
Track: NLP Applications
Abstract: Reasoning is central to human intelligence. However, fallacious arguments are common, and some exacerbate problems such as spreading misinformation about climate change. In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate change claims (LogicClimate). Detecting logical fallacies is a hard problem as the model must understand the underlying logical structure of the argument. We find that existing pretrained large language models perform poorly on this task. In contrast, we show that a simple structure-aware classifier outperforms the best language model by 5.46% F1 on Logic and 4.51% on LogicClimate. We encourage future work to explore this task since (a) it can serve as a new reasoning challenge for language models, and (b) it can have potential applications in tackling the spread of misinformation. Our dataset and code are available at https://github.com/causalNLP/logical-fallacy.
Link:
