Data
2026^
2025^
- Touché25-Retrieval-Augmented-Debate-Claims
Annotated simulated conversations for 100 argumentative claims. doi events publications
- Touché25-Image-Retrieval-and-Generation-for-Arguments
About 32,000 images crawled for controversial topics. doi events publications
2024^
- Touché24-Image-Retrieval-and-Generation-for-Arguments
About 9,000 images crawled for controversial topics. doi events publications
- Touché24-ValueEval
About 3000 texts in 9 languages annotated for 19 human values (Schwartz' system) and value attainment as part of the ValuesML project. doi events publications
- Webis-Follow-Up-Questions-24
About 19,000 simulated continuations of conversational search sessions with 3883 human judgments. doi publications
2023^
- Touché23-Image-Retrieval-for-Arguments
About 56,000 images crawled for controversial topics. doi events publications
- Webis-Topic-Ontologies
About 9,000,000 argument units from 45 corpora with topic labels. doi publications
- Touché23-ValueEval
About 8900 arguments annotated for 54 human value aspects (based on Schwartz values), extending the Webis-ArgValues-22. doi events publications
- Webis-Generated-Game-Art-23
Report and 110 images generated by an expert when designing a game. doi publications
- Webis-Nudged-Questions-23
About 8600 questions in response to 30 spoken information snippets in 18 variants with different nudging. doi publications
2022^
- Webis-ArgValues-22
About 5300 arguments annotated for 54 human value aspects (based on Schwartz values). doi publications
- Touché22-Image-Retrieval-for-Arguments
About 24,000 images crawled for controversial topics. doi events publications
2021^
- Webis-Exhibition-Questions-21
About 850 English and German questions on a virtual exhibition from 63 participants. doi publications
- Webis-ArgImages-21
About 1000 images judged for relevance to 20 controversial topics. doi publications
- Webis-Conversational-Query-Reformulations-21
About 2700 conversational queries and reformulations from 4 search domains. doi publications
- Webis-SCSmeta-21
Meta-information annotations for the 1044 dialog turns in the Spoken Conversational Search dataset. doi publications
- Webis-WebSeg-20-Algorithm-Segmentations
About 250,000 algorithmic segmentations for the web pages in the Webis-WebSeg-20 (itself based on the Webis-Web-Archive-17). doi publications
2020^
- Webis-EditorialSum-20
About 1300 curated extractive summaries for 266 editorials from 3 news portals. doi publications
- Webis-WebSeg-20
About 42,000 human segmentations of the web pages in the Webis-Web-Archive-17. doi publications
- Webis-Voice-based-and-Conversational-Argument-Search-20
Responses from about 500 participants on their expectations for conversational argument search. doi publications
2019^
- Webis-Web-Errors-19
Annotations for the 10,000 web pages of the Webis-Web-Archive-17: mostly advertisement, cut off, still loading, pornographic, pop-ups, CAPTCHAs, error messages. doi publications
2018^
- PAN-SemEval-Hyperpartisan-News-Detection-19
About 750,000 news articles, half of them from hyperpartisan news publishers, and 645 articles with manual hyperpartisanship annotations. doi events publications
- BuzzFeed-Webis Fake News Corpus 16
About 1600 news articles of 9 publishers from September 2016 fact-checked and categorized by media bias. doi publications
2017^
- Webis-Web-Archive-17
About 10,000 web page archives from mid-2017 including screenshots and annotations of web archive quality. doi publications
- Webis-Mnemonics-17
About 1000 human-selected sentences for password generation and memorization. doi publications
- Webis-Simple-Sentences-17
About 470,000,000 sentences from the ClueWeb12 crawl, randomly sampled to mirror the sentence complexity of password mnemonics. doi publications
2016^
- Webis-Editorials-16
About 14,000 argumentative units from 300 editorials annotated for argumentative role. doi publications