Martin Potthast

Bauhaus-Universität Weimar
Digital Bauhaus Lab
Bauhausstraße 9a · Room 206
99423 Weimar, Germany

Email: martin.potthast[at]uni-weimar.de
Phone: +49 (0)3643 - 58 3567

Short Curriculum Vitae

Martin Potthast studied computer science at the University of Paderborn. After completing his diploma thesis in 2006, he joined the working group Web Technology and Information Systems at the Bauhaus-Universität Weimar. Martin graduated as Dr. rer. nat. in December 2011 and has since worked as a postdoc at the Digital Bauhaus Lab. His research interests include information retrieval, machine learning, and web technology.

Research

My primary research activities are the following:

  • digital text forensics,
  • writing assistance technologies,
  • Big Data, and
  • science reproducibility.

In this context, I have made significant algorithmic contributions to research on

  • plagiarism detection,
  • Wikipedia vandalism detection, and
  • evaluation as a service technologies.

I have conducted award-winning research on technologies required for the aforementioned tasks, such as information retrieval models, document fingerprinting, multidimensional scaling, near-duplicate detection, inverted indexing, cross-language information retrieval, corpus linguistics, information retrieval evaluation, authorship attribution, clustering, opinion mining, and social software misuse detection.

I take a leading role in a number of projects where much of the aforementioned research is applied within large-scale web services:

  • Netspeak, a writing assistance tool,
  • Picapica, a text reuse search engine,
  • TIRA, an evaluation-as-a-service platform,
  • ChatNoir, a web search engine for static web crawls, and
  • AItools, a programming library for information retrieval.

Professional Activities

I am one of the initiators and organizers of the PAN workshop and competition series on digital text forensics. Since 2009, PAN has hosted successful shared tasks on plagiarism detection, Wikipedia vandalism detection, author identification, and author profiling. 

I regularly serve on the program committees of highly ranked conferences and journals, such as SIGIR, ACL, TKDE, and TOIS.

At Bauhaus-Universität, I've served on the Board for Science and Projects, responsible for the local funding of creative and innovative research, and on the search committee for the junior professorship Mobile Medien (mobile media). I have repeatedly served as teaching assistant for the lectures Databases, Web Technology (foundations), and Web Technology (advanced) at this working group.

At the Digital Bauhaus Lab, I oversee the large-scale computing infrastructure for massively parallel Big Data processing and high-performance computing. In this role, I have coordinated the planning and acquisition of this infrastructure, worth about 1 million Euros.

I have successfully acquired funding worth 100,000 Euros from the German Federal Ministry of Economic Affairs and Energy for a startup project, which I also supervised.

Publications

Martin Potthast, Francisco Rangel, Michael Tschuggnall, Efstathios Stamatatos, Paolo Rosso, and Benno Stein. Overview of PAN'17: Author Identification, Author Profiling, and Author Obfuscation. In Gareth J. F. Jones et al, editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction. 8th International Conference of the CLEF Initiative (CLEF 17), Berlin Heidelberg New York, September 2017. Springer. [paper] [bib]
Michael Völske, Martin Potthast, Shahbaz Syed, and Benno Stein. TL;DR: Mining Reddit to Learn Automatic Summarization. In Proceedings of the EMNLP 2017 Workshop on New Frontiers in Summarization (to appear), September 2017. [paper] [bib]
Michael Tschuggnall, Efstathios Stamatatos, Ben Verhoeven, Walter Daelemans, Günther Specht, Benno Stein, and Martin Potthast. Overview of the Author Identification Task at PAN-2017: Style Breach Detection and Author Clustering. In Linda Cappellato, Nicola Ferro, Lorraine Goeuriot, and Thomas Mandl, editors, Working Notes Papers of the CLEF 2017 Evaluation Labs volume 1866 of CEUR Workshop Proceedings, September 2017. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Matthias Hagen, Martin Potthast, and Benno Stein. Overview of the Author Obfuscation Task at PAN 2017: Safety Evaluation Revisited. In Linda Cappellato, Nicola Ferro, Lorraine Goeuriot, and Thomas Mandl, editors, Working Notes Papers of the CLEF 2017 Evaluation Labs volume 1866 of CEUR Workshop Proceedings, September 2017. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Henning Wachsmuth, Martin Potthast, Khalid Al-Khatib, Yamen Ajjour, Jana Puschmann, Jiani Qu, Jonas Dorsch, Viorel Morari, Janek Bevendorff, and Benno Stein. Building an Argument Search Engine for the Web. In Proceedings of the Fourth Workshop on Argument Mining (ArgMining 17), pages 49–59, September 2017. [paper] [bib] [demo] [slides]
Francisco Manuel Rangel Pardo, Paolo Rosso, Martin Potthast, and Benno Stein. Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter. In Linda Cappellato, Nicola Ferro, Lorraine Goeuriot, and Thomas Mandl, editors, Working Notes Papers of the CLEF 2017 Evaluation Labs volume 1866 of CEUR Workshop Proceedings, September 2017. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Daniel Zeman, Martin Popel, Milan Straka, Jan Hajic, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis Tyers, Elena Badmaeva, Memduh Gokirmak, Anna Nedoluzhko, Silvie Cinkova, Jan Hajic jr., Jaroslava Hlavacova, Václava Kettnerová, Zdenka Uresova, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonca, Tatiana Lando, Rattima Nitisaroj, and Josie Li. CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 1-19, August 2017. Association for Computational Linguistics. [doi] [paper] [bib]
Matthias Hagen, Martin Potthast, Marcel Gohsen, Anja Rathgeber, and Benno Stein. A Large-Scale Query Spelling Correction Corpus. In Noriko Kando et al, editors, 40th International ACM Conference on Research and Development in Information Retrieval (SIGIR 17), pages 1261-1264, August 2017. ACM. ISBN 978-1-4503-5022-8. [doi] [paper] [corpus] [bib] [poster]
Johannes Kiesel, Martin Potthast, Matthias Hagen, and Benno Stein. Spatio-temporal Analysis of Reverted Wikipedia Edits. In Proceedings of the Eleventh International AAAI Conference on Web and Social Media (ICWSM 17), May 2017. [publisher] [paper] [bib] [code]
Martin Potthast, Christian Forler, Eik List, and Stefan Lucks. Passphone: Outsourcing Phone-Based Web Authentication While Protecting User Privacy. Cryptology ePrint Archive, Report 2017/158, 2017. [publisher] [paper] [bib]
Stefan Heindorf, Martin Potthast, Hannah Bast, Björn Buchhold, and Elmar Haussmann. WSDM Cup 2017: Vandalism Detection and Triple Scoring. In Proceedings of the 10th ACM International Conference on Web Search and Data Mining (WSDM 17), pages 827-828, New York, February 2017. ACM. ISBN 978-1-4503-4675-7. [doi] [paper] [bib]
Martin Potthast, Christian Forler, Eik List, and Stefan Lucks. Passphone: Outsourcing Phone-Based Web Authentication While Protecting User Privacy. In Billy Bob Brumley and Juha Röning, editors, Proceedings of the 21st Nordic Conference on Secure IT Systems (NordSec 16), pages 235-255, November 2016. Springer. ISBN 978-3-319-47560-8. [doi] [paper] [bib] [slides]
Stefan Heindorf, Martin Potthast, Benno Stein, and Gregor Engels. Vandalism Detection in Wikidata. In Snehasis Mukhopadhyay et al, editors, Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM 16), pages 327-336, October 2016. ACM. ISBN 978-1-4503-4073-1. [doi] [paper] [bib] [slides]
Efstathios Stamatatos, Michael Tschuggnall, Ben Verhoeven, Walter Daelemans, Günther Specht, Benno Stein, and Martin Potthast. Clustering by Authorship Within and Across Documents. In Working Notes Papers of the CLEF 2016 Evaluation Labs volume 1609 of CEUR Workshop Proceedings, September 2016. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib] [slides]
Francisco Manuel Rangel Pardo, Paolo Rosso, Ben Verhoeven, Walter Daelemans, Martin Potthast, and Benno Stein. Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations. In Working Notes Papers of the CLEF 2016 Evaluation Labs volume 1609 of CEUR Workshop Proceedings, September 2016. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib] [slides]
Martin Potthast, Matthias Hagen, and Benno Stein. Author Obfuscation: Attacking the State of the Art in Authorship Verification. In Working Notes Papers of the CLEF 2016 Evaluation Labs volume 1609 of CEUR Workshop Proceedings, September 2016. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib] [slides]
Paolo Rosso, Francisco Rangel, Martin Potthast, Efstathios Stamatatos, Michael Tschuggnall, and Benno Stein. Overview of PAN'16—New Challenges for Authorship Analysis: Cross-genre Profiling, Clustering, Diarization, and Obfuscation. In Norbert Fuhr et al, editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction. 7th International Conference of the CLEF Initiative (CLEF 16), Berlin Heidelberg New York, September 2016. Springer. ISBN 978-3-319-44564-9. [doi] [paper] [bib] [slides]
Patrick Riehmann, Martin Potthast, Henning Gruendl, Johannes Kiesel, Dean Jürges, Giuliano Castiglia, Bagrat Ter-Akopyan, and Bernd Froehlich. Visualizing Article Similarities in Wikipedia. In Tobias Isenberg and Filip Sadlo, editors, Proceedings of the 18th Eurographics Conference on Visualization (EuroVis 16) - Posters Volume, pages 69-71, June 2016. The Eurographics Association. ISBN 978-3-03868-015-4. [doi] [paper] [bib] [poster]
Ingo Frommholz, Haider M. al-Khateeb, Martin Potthast, Zinnar Ghasem, Mitul Shukla, and Emma Short. On Textual Analysis and Machine Learning for Cyberstalking Detection. Datenbank-Spektrum, 16 (2) : 127–135, June 2016. [doi] [article] [bib]
Matthias Hagen, Martin Potthast, Michael Völske, Jakob Gomoll, and Benno Stein. How Writers Search: Analyzing the Search and Writing Logs of Non-fictional Essays. In Diane Kelly et al, editors, Proceedings of the 1st ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR 16), pages 193-202, March 2016. ACM. [doi] [paper] [bib] [slides]
Martin Potthast, Sarah Braun, Tolga Buz, Fabian Duffhauss, Florian Friedrich, Jörg Marvin Gülzow, Jakob Köhler, Winfried Lötzsch, Fabian Müller, Maike Elisa Müller, Robert Paßmann, Bernhard Reinke, Lucas Rettenmeier, Thomas Rometsch, Timo Sommer, Michael Träger, Sebastian Wilhelm, Benno Stein, Efstathios Stamatatos, and Matthias Hagen. Who Wrote the Web? Revisiting Influential Author Identification Research Applicable to Information Retrieval. In Nicola Ferro et al, editors, Advances in Information Retrieval. 38th European Conference on IR Research (ECIR 16) volume 9626 of Lecture Notes in Computer Science, pages 393-407, Berlin Heidelberg New York, March 2016. Springer. [doi] [paper] [bib] [slides]
Martin Potthast, Sebastian Köpsel, Benno Stein, and Matthias Hagen. Clickbait Detection. In Nicola Ferro et al, editors, Advances in Information Retrieval. 38th European Conference on IR Research (ECIR 16) volume 9626 of Lecture Notes in Computer Science, pages 810-817, Berlin Heidelberg New York, March 2016. Springer. [doi] [paper] [bib] [slides] [poster]
Allan Hanbury, Henning Müller, Krisztian Balog, Torben Brodt, Gordon V. Cormack, Ivan Eggel, Tim Gollub, Frank Hopfgartner, Jayashree Kalpathy-Cramer, Noriko Kando, Anastasia Krithara, Jimmy Lin, Simon Mercer, and Martin Potthast. Evaluation-as-a-Service: Overview and Outlook. ArXiv e-prints, December 2015. [publisher] [article] [bib]
Efstathios Stamatatos, Martin Potthast, Francisco Rangel, Paolo Rosso, and Benno Stein. Overview of the PAN/CLEF 2015 Evaluation Lab. In Josiane Mothe et al, editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction. 6th International Conference of the CLEF Initiative (CLEF 15), pages 518-538, Berlin Heidelberg New York, September 2015. Springer. ISBN 978-3-319-24026-8. [doi] [paper] [bib]
Matthias Hagen, Martin Potthast, and Benno Stein. Source Retrieval for Plagiarism Detection from Large Web Corpora: Recent Approaches. In Working Notes Papers of the CLEF 2015 Evaluation Labs, CEUR Workshop Proceedings, September 2015. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Martin Potthast, Steve Göring, Paolo Rosso, and Benno Stein. Towards Data Submissions for Shared Tasks: First Experiences for the Task of Text Alignment. In Working Notes Papers of the CLEF 2015 Evaluation Labs, CEUR Workshop Proceedings, September 2015. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Francisco Rangel, Fabio Celli, Paolo Rosso, Martin Potthast, Benno Stein, and Walter Daelemans. Overview of the 3rd Author Profiling Task at PAN 2015. In Working Notes Papers of the CLEF 2015 Evaluation Labs, CEUR Workshop Proceedings, September 2015. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Efstathios Stamatatos, Walter Daelemans, Ben Verhoeven, Patrick Juola, Aurelio López López, Martin Potthast, and Benno Stein. Overview of the Author Identification Task at PAN 2015. In Working Notes Papers of the CLEF 2015 Evaluation Labs, CEUR Workshop Proceedings, September 2015. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Stefan Heindorf, Martin Potthast, Benno Stein, and Gregor Engels. Towards Vandalism Detection in Knowledge Bases: Corpus Construction and Analysis. In Ricardo Baeza-Yates, Mounia Lalmas, Alistair Moffat, and Berthier Ribeiro-Neto, editors, 38th International ACM Conference on Research and Development in Information Retrieval (SIGIR 15), pages 831-834, August 2015. ACM. ISBN 978-1-4503-3621-5. [doi] [paper] [corpus] [bib] [poster]
Patrick Riehmann, Martin Potthast, Benno Stein, and Bernd Froehlich. Visual Assessment of Alleged Plagiarism Cases. Computer Graphics Forum, 34 (3) : 1-10, July 2015. [doi] [article] [bib] [video]
Frank Hopfgartner, Allan Hanbury, Henning Müller, Noriko Kando, Simon Mercer, Jayashree Kalpathy-Cramer, Martin Potthast, Tim Gollub, Anastasia Krithara, Jimmy Lin, Krisztian Balog, and Ivan Eggel. Report on the Evaluation-as-a-Service (EaaS) Expert Workshop. SIGIR Forum, 49 (1) : 57-65, June 2015. [doi] [article] [bib]
Matthias Hagen, Martin Potthast, Michel Büchner, and Benno Stein. Webis: An Ensemble for Twitter Sentiment Detection. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 15), pages 582-589, June 2015. Association for Computational Linguistics. [publisher] [paper] [bib] [code]
Matthias Hagen, Martin Potthast, Michel Büchner, and Benno Stein. Twitter Sentiment Detection via Ensemble Classification Using Averaged Confidence Scores. In Advances in Information Retrieval. 37th European Conference on IR Research (ECIR 15) volume 9022 of Lecture Notes in Computer Science, pages 513-525, Berlin Heidelberg New York, March 2015. Springer. [doi] [paper] [bib] [code] [slides]
Efstathios Stamatatos, Walter Daelemans, Ben Verhoeven, Martin Potthast, Benno Stein, Patrick Juola, Miguel A. Sanchez-Perez, and Alberto Barrón-Cedeño. Overview of the Author Identification Task at PAN 2014. In Linda Cappellato, Nicola Ferro, Martin Halvey, and Wessel Kraaij, editors, Working Notes Papers of the CLEF 2014 Evaluation Labs, CEUR Workshop Proceedings, September 2014. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Martin Potthast, Matthias Hagen, Anna Beyer, Matthias Busse, Martin Tippmann, Paolo Rosso, and Benno Stein. Overview of the 6th International Competition on Plagiarism Detection. In Linda Cappellato, Nicola Ferro, Martin Halvey, and Wessel Kraaij, editors, Working Notes Papers of the CLEF 2014 Evaluation Labs, CEUR Workshop Proceedings, September 2014. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib] [slides]
Martin Potthast, Tim Gollub, Francisco Rangel, Paolo Rosso, Efstathios Stamatatos, and Benno Stein. Improving the Reproducibility of PAN's Shared Tasks: Plagiarism Detection, Author Identification, and Author Profiling. In Evangelos Kanoulas et al, editors, Information Access Evaluation meets Multilinguality, Multimodality, and Visualization. 5th International Conference of the CLEF Initiative (CLEF 14), pages 268-299, Berlin Heidelberg New York, September 2014. Springer. ISBN 978-3-319-11381-4. [doi] [paper] [bib] [slides]
Francisco Rangel, Paolo Rosso, Irina Chugur, Martin Potthast, Martin Trenkmann, Benno Stein, Ben Verhoeven, and Walter Daelemans. Overview of the Author Profiling Task at PAN 2014. In Linda Cappellato, Nicola Ferro, Martin Halvey, and Wessel Kraaij, editors, Working Notes Papers of the CLEF 2014 Evaluation Labs, CEUR Workshop Proceedings, September 2014. CLEF and CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Martin Potthast, Matthias Hagen, Anna Beyer, and Benno Stein. Improving Cloze Test Performance of Language Learners Using Web N-Grams. In Junichi Tsujii and Jan Hajic, editors, 25th International Conference on Computational Linguistics (COLING 14), pages 962-973, August 2014. Association for Computational Linguistics. [paper] [bib]
Tim Gollub, Martin Potthast, Anna Beyer, Matthias Busse, Francisco Rangel, Paolo Rosso, Efstathios Stamatatos, and Benno Stein. Recent Trends in Digital Text Forensics and its Evaluation. In Pamela Forner et al, editors, Information Access Evaluation meets Multilinguality, Multimodality, and Visualization. 4th International Conference of the CLEF Initiative (CLEF 13), pages 282-302, Berlin Heidelberg New York, September 2013. Springer. ISBN 978-3-642-40801-4. ISSN 0302-9743. [doi] [paper] [bib] [slides]
Martin Potthast, Tim Gollub, Matthias Hagen, Martin Tippmann, Johannes Kiesel, Paolo Rosso, Efstathios Stamatatos, and Benno Stein. Overview of the 5th International Competition on Plagiarism Detection. In Pamela Forner, Roberto Navigli, and Dan Tufis, editors, Working Notes Papers of the CLEF 2013 Evaluation Labs, September 2013. ISBN 978-88-904810-3-1. ISSN 2038-4963. [publisher] [paper] [bib] [slides]
Martin Potthast, Matthias Hagen, Michael Völske, and Benno Stein. Exploratory Search Missions for TREC Topics. In Max L. Wilson et al, editors, 3rd European Workshop on Human-Computer Interaction and Information Retrieval (EuroHCIR 2013), pages 11-14, August 2013. CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib]
Martin Potthast, Matthias Hagen, Michael Völske, and Benno Stein. Crowdsourcing Interaction Logs to Understand Text Reuse from the Web. In Pascale Fung and Massimo Poesio, editors, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 13), pages 1212-1221, August 2013. Association for Computational Linguistics. [publisher] [paper] [bib] [slides]
Steven Burrows, Martin Potthast, and Benno Stein. Paraphrase Acquisition via Crowdsourcing and Machine Learning. Transactions on Intelligent Systems and Technology (ACM TIST), 4 (3) : 43:1-43:21, June 2013. [doi] [article] [bib]
Martin Potthast. Technologien zur Wiederverwendung von Texten aus dem Web. In Steffen Hölldobler et al, editors, Ausgezeichnete Informatikdissertationen 2011 volume D-12 LNI of Lecture Notes in Informatics, pages 141-150, December 2012. Gesellschaft für Informatik. ISBN 978-3-88579-416-5. [publisher] [paper] [bib] [slides]
Matthias Hagen, Martin Potthast, Matthias Busse, Jakob Gomoll, Jannis Harder, and Benno Stein. Webis at the TREC 2012 Sessions Track. In Ellen M. Voorhees and Lori P. Buckland, editors, 21st International Text Retrieval Conference (TREC 12), NIST Special Publication, November 2012. National Institute of Standards and Technology (NIST). [publisher] [paper] [bib] [slides]
Matthias Hagen, Martin Potthast, Anna Beyer, and Benno Stein. Towards Optimum Query Segmentation: In Doubt Without. In Xuewen Chen, Guy Lebanon, Haixun Wang, and Mohammed J. Zaki, editors, 21st ACM International Conference on Information and Knowledge Management (CIKM 12), pages 1015-1024, October 2012. ACM. ISBN 978-1-4503-1156-4. [doi] [paper] [bib] [slides]
Martin Potthast, Tim Gollub, Matthias Hagen, Jan Graßegger, Johannes Kiesel, Maximilian Michel, Arnd Oberländer, Martin Tippmann, Alberto Barrón-Cedeño, Parth Gupta, Paolo Rosso, and Benno Stein. Overview of the 4th International Competition on Plagiarism Detection. In Pamela Forner, Jussi Karlgren, and Christa Womser-Hacker, editors, Working Notes Papers of the CLEF 2012 Evaluation Labs, September 2012. ISBN 978-88-904810-3-1. ISSN 2038-4963. [publisher] [paper] [bib] [slides]
Martin Potthast, Benno Stein, Fabian Loose, and Steffen Becker. Information Retrieval in the Commentsphere. Transactions on Intelligent Systems and Technology (ACM TIST), 3 (4) : 68:1-68:21, September 2012. [doi] [article] [bib]
Patrick Riehmann, Henning Gruendl, Martin Potthast, Martin Trenkmann, Benno Stein, and Bernd Froehlich. WORDGRAPH: Keyword-in-Context Visualization for NETSPEAK's Wildcard Search. IEEE Transactions on Visualization and Computer Graphics, 18 (9) : 1411-1423, September 2012. [doi] [article] [bib]
Martin Potthast, Matthias Hagen, Benno Stein, Jan Graßegger, Maximilian Michel, Martin Tippmann, and Clement Welsch. ChatNoir: A Search Engine for the ClueWeb09 Corpus. In Bill Hersh, Jamie Callan, Yoelle Maarek, and Mark Sanderson, editors, 35th International ACM Conference on Research and Development in Information Retrieval (SIGIR 12), page 1004, August 2012. ACM. ISBN 978-1-4503-1472-5. [doi] [paper] [bib]
Martin Potthast. Technologies for Reusing Text from the Web. Dissertation, Bauhaus-Universität Weimar, December 2011. [publisher] [paper] [bib] [video] [slides]
Martin Potthast, Andreas Eiselt, Alberto Barrón-Cedeño, Benno Stein, and Paolo Rosso. Overview of the 3rd International Competition on Plagiarism Detection. In Vivien Petras, Pamela Forner, and Paul D. Clough, editors, Working Notes Papers of the CLEF 2011 Evaluation Labs, September 2011. ISBN 978-88-904810-1-7. ISSN 2038-4963. [publisher] [paper] [bib] [slides]
Martin Potthast and Teresa Holfeld. Overview of the 2nd International Competition on Wikipedia Vandalism Detection. In Vivien Petras, Pamela Forner, and Paul D. Clough, editors, Notebook Papers of CLEF 11 Labs and Workshops, September 2011. ISBN 978-88-904810-1-7. ISSN 2038-4963. [publisher] [paper] [bib]
Benno Stein, Martin Potthast, Alberto Barrón-Cedeño, Paolo Rosso, Efstathios Stamatatos, and Moshe Koppel. 4th International Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 10). SIGIR Forum, 45 (1) : 45-48, June 2011. [doi] [article] [bib]
Martin Potthast, Alberto Barrón-Cedeño, Benno Stein, and Paolo Rosso. Cross-Language Plagiarism Detection. Language Resources and Evaluation (LRE), 45 (1) : 45-62, March 2011. [doi] [article] [bib]
Patrick Riehmann, Henning Gruendl, Bernd Froehlich, Martin Potthast, Martin Trenkmann, and Benno Stein. The Netspeak WordGraph: Visualizing Keywords in Context. In Giuseppe Di Battista, Jean-Daniel Fekete, and Huamin Qu, editors, 4th IEEE Pacific Visualization Symposium (PacificVis 11), pages 123-130, March 2011. IEEE. [doi] [paper] [bib]
Matthias Hagen, Martin Potthast, Benno Stein, and Christof Bräutigam. Query Segmentation Revisited. In Sadagopan Srinivasan et al, editors, 20th International Conference on World Wide Web (WWW 11), pages 97-106, March 2011. ACM. [doi] [paper] [bib] [slides]
Martin Potthast and Steffen Becker. Opinion Summarization of Web Comments. In Cathal Gurrin et al, editors, Advances in Information Retrieval. 32nd European Conference on Information Retrieval (ECIR 10) volume 5993 of Lecture Notes in Computer Science, pages 668-669, Heidelberg, 2010. Springer. ISBN 978-3-642-12274-3. [doi] [paper] [bib] [poster]
Martin Potthast, Benno Stein, and Teresa Holfeld. PAN Wikipedia Vandalism Corpus PAN-WVC-10. http://www.uni-weimar.de/medien/webis/corpora, 2010. [corpus] [bib]
Martin Potthast, Benno Stein, and Teresa Holfeld. Overview of the 1st International Competition on Wikipedia Vandalism Detection. In Martin Braschler, Donna Harman, and Emanuele Pianta, editors, Working Notes Papers of the CLEF 2010 Evaluation Labs, September 2010. ISBN 978-88-904810-2-4. ISSN 2038-4963. [publisher] [paper] [bib] [slides]
Martin Potthast, Alberto Barrón-Cedeño, Andreas Eiselt, Benno Stein, and Paolo Rosso. Overview of the 2nd International Competition on Plagiarism Detection. In Martin Braschler, Donna Harman, and Emanuele Pianta, editors, Working Notes Papers of the CLEF 2010 Evaluation Labs, September 2010. ISBN 978-88-904810-2-4. ISSN 2038-4963. [publisher] [paper] [bib] [slides]
Martin Potthast, Benno Stein, Alberto Barrón-Cedeño, and Paolo Rosso. An Evaluation Framework for Plagiarism Detection. In Chu-Ren Huang and Dan Jurafsky, editors, 23rd International Conference on Computational Linguistics (COLING 10), pages 997-1005, Stroudsburg, Pennsylvania, August 2010. Association for Computational Linguistics. [paper] [bib] [poster]
Martin Potthast. Crowdsourcing a Wikipedia Vandalism Corpus. In Fabio Crestani et al, editors, 33rd International ACM Conference on Research and Development in Information Retrieval (SIGIR 10), pages 789-790, July 2010. ACM. ISBN 978-1-4503-0153-4. [doi] [paper] [bib] [poster]
Martin Potthast, Martin Trenkmann, and Benno Stein. Using Web N-Grams to Help Second-Language Speakers. In SIGIR 10 Web N-Gram Workshop, page 49, July 2010. [publisher] [paper] [bib] [slides]
Matthias Hagen, Martin Potthast, Benno Stein, and Christof Bräutigam. The Power of Naïve Query Segmentation. In Fabio Crestani et al, editors, 33rd International ACM Conference on Research and Development in Information Retrieval (SIGIR 10), pages 797-798, July 2010. ACM. ISBN 978-1-4503-0153-4. [doi] [paper] [bib] [poster]
Antonio Reyes, Martin Potthast, Paolo Rosso, and Benno Stein. Evaluating Humor Features on Web Comments. In Nicoletta Calzolari et al, editors, 7th Conference on International Language Resources and Evaluation (LREC 10), May 2010. European Language Resources Association (ELRA). ISBN 2-9517408-6-7. [paper] [bib] [poster]
Alberto Barrón-Cedeño, Martin Potthast, Paolo Rosso, Benno Stein, and Andreas Eiselt. Corpus and Evaluation Measures for Automatic Plagiarism Detection. In Nicoletta Calzolari et al, editors, 7th Conference on International Language Resources and Evaluation (LREC 10), May 2010. European Language Resources Association (ELRA). ISBN 2-9517408-6-7. [paper] [bib] [slides]
Martin Potthast, Benno Stein, and Steffen Becker. Towards Comment-based Cross-Media Retrieval. In Michael Rappa, Paul Jones, Juliana Freire, and Soumen Chakrabarti, editors, 19th International Conference on World Wide Web (WWW 10), pages 1169-1170, April 2010. ACM. ISBN 978-1-60558-799-8. [doi] [paper] [bib] [poster]
Martin Potthast, Martin Trenkmann, and Benno Stein. Netspeak: Assisting Writers in Choosing Words. In Cathal Gurrin et al, editors, Advances in Information Retrieval. 32nd European Conference on Information Retrieval (ECIR 10) volume 5993 of Lecture Notes in Computer Science, page 672, Berlin Heidelberg New York, March 2010. Springer. ISBN 978-3-642-12274-3. [doi] [paper] [bib]
Benno Stein, Martin Potthast, and Martin Trenkmann. Retrieving Customary Web Language to Assist Writers. In Cathal Gurrin et al, editors, Advances in Information Retrieval. 32nd European Conference on Information Retrieval (ECIR 10) volume 5993 of Lecture Notes in Computer Science, pages 631-635, Berlin Heidelberg New York, March 2010. Springer. ISBN 978-3-642-12274-3. [doi] [paper] [bib] [poster]
Maik Anderka, Benno Stein, and Martin Potthast. Cross-language High Similarity Search: Why no Sub-linear Time Bound can be Expected. In Cathal Gurrin et al, editors, Advances in Information Retrieval. 32nd European Conference on Information Retrieval (ECIR 10) volume 5993 of Lecture Notes in Computer Science, pages 640-644, Berlin Heidelberg New York, March 2010. Springer. ISBN 978-3-642-12274-3. [doi] [paper] [bib] [poster]
Martin Potthast, Benno Stein, Andreas Eiselt, Alberto Barrón-Cedeño, and Paolo Rosso. Overview of the 1st International Competition on Plagiarism Detection. In Benno Stein et al, editors, SEPLN 09 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09), pages 1-9, September 2009. CEUR-WS.org. ISSN 1613-0073. [publisher] [paper] [bib] [slides]
Martin Potthast. Measuring the Descriptiveness of Web Comments. In Mark Sanderson et al, editors, 32nd International ACM Conference on Research and Development in Information Retrieval (SIGIR 09), pages 724-725, July 2009. ACM. ISBN 978-1-60558-483-6. [doi] [paper] [bib] [poster]
Martin Potthast, Benno Stein, and Maik Anderka. A Wikipedia-Based Multilingual Retrieval Model. In Craig Macdonald et al, editors, Advances in Information Retrieval. 30th European Conference on IR Research (ECIR 08) volume 4956 of Lecture Notes in Computer Science, pages 522-530, Berlin Heidelberg New York, 2008. Springer. ISBN 978-3-540-78645-0. ISSN 0302-9743. [doi] [paper] [bib] [wikipedia] [slides] [poster]
Martin Potthast, Benno Stein, and Robert Gerling. Automatic Vandalism Detection in Wikipedia. In Craig Macdonald et al, editors, Advances in Information Retrieval. 30th European Conference on IR Research (ECIR 08) volume 4956 of Lecture Notes in Computer Science, pages 663-668, Berlin Heidelberg New York, 2008. Springer. ISBN 978-3-540-78645-0. ISSN 0302-9743. [doi] [paper] [bib] [poster]
Martin Potthast and Benno Stein. New Issues in Near-duplicate Detection. In Christine Preisach, Hans Burkhardt, Lars Schmidt-Thieme, and Reinhold Decker, editors, Data Analysis, Machine Learning and Applications. Selected papers from the 31st Annual Conference of the German Classification Society (GFKL 07), Studies in Classification, Data Analysis, and Knowledge Organization, pages 601-609, Berlin Heidelberg New York, 2008. Springer. ISBN 978-3-540-78239-1. [doi] [paper] [bib] [slides]
Fabian Loose, Steffen Becker, Martin Potthast, and Benno Stein. Retrieval-Technologien für die Plagiaterkennung in Programmen. In Joachim Baumeister and Martin Atzmüller, editors, Workshop Special Interest Group Information Retrieval (FGIR 08), Technical Report 448, pages 5-12, October 2008. University of Würzburg, Germany. [paper] [bib] [slides]
Benno Stein and Martin Potthast. Putting Successor Variety Stemming to Work. In Reinhold Decker and Hans J. Lenz, editors, Advances in Data Analysis. Selected papers from the 30th Annual Conference of the German Classification Society (GFKL 06), Studies in Classification, Data Analysis, and Knowledge Organization, pages 367-374, Berlin Heidelberg New York, 2007. Springer. ISBN 978-3-540-70980-0. ISSN 1431-8814. [doi] [paper] [bib] [slides]
Benno Stein and Martin Potthast. Construction of Compact Retrieval Models. In Sándor Dominich and Ferenc Kiss, editors, Studies in Theory of Information Retrieval. 1st International Conference on the Theory of Information Retrieval (ICTIR 07), pages 85-93, Budapest, October 2007. Foundation for Information Society. ISBN 978-963-06-3237-9. ISSN 1587-2386. [paper] [bib] [slides]
Benno Stein, Sven Meyer zu Eißen, and Martin Potthast. Strategies for Retrieving Plagiarized Documents. In Charles Clarke et al, editors, 30th International ACM Conference on Research and Development in Information Retrieval (SIGIR 07), pages 825-826, New York, July 2007. ACM. ISBN 978-1-59593-597-7. [doi] [paper] [bib] [poster]
Martin Potthast. Wikipedia in the Pocket - Indexing Technology for Near-duplicate Detection and High Similarity Search. In Charles Clarke et al, editors, 30th International ACM Conference on Research and Development in Information Retrieval (SIGIR 07), page 909, July 2007. ACM. ISBN 978-1-59593-597-7. [paper] [bib]
Benno Stein and Martin Potthast. Applying Hash-based Indexing in Text-Based Information Retrieval. In Marie-Francine Moens, Tinne Tuytelaars, and Arjen P. de Vries, editors, 7th Dutch-Belgian Information Retrieval Workshop (DIR 07), pages 29-35, Leuven, Belgium, March 2007. Faculty of Engineering, Universiteit Leuven. ISBN 978-90-5682-771-7. [paper] [bib] [slides]
Benno Stein and Martin Potthast. Hashing-basierte Indizierung: Anwendungsszenarien, Theorie und Methoden. In Norbert Fuhr, Sebastian Goeser, and Thomas Mandl, editors, Workshop Special Interest Group Information Retrieval (FGIR 06), Hildesheimer Informatikberichte, pages 159-166, October 2006. University of Hildesheim, Germany. ISSN 0941-3014. [publisher] [paper] [bib] [slides]
Benno Stein, Sven Meyer zu Eißen, and Martin Potthast. Syntax versus Semantics: Analysis of Enriched Vector Space Models. In Benno Stein and Odej Kao, editors, 3rd International Workshop on Text-Based Information Retrieval (TIR 06) at ECAI, pages 47-52, August 2006. University of Trento, Italy. ISSN 1613-0073. [paper] [bib] [slides]
Sven Meyer zu Eißen, Benno Stein, and Martin Potthast. The Suffix Tree Document Model Revisited. In Klaus Tochtermann and Hermann Maurer, editors, 5th International Conference on Knowledge Management (I-KNOW 05), Journal of Universal Computer Science, pages 596-603, Graz, Austria, July 2005. Know-Center. ISSN 0948-695x. [paper] [bib] [slides]
Martin Potthast, Alexander Beneke, Tobias Jain Denecke, and Jan Gilg. DVD Technikstandards und Troubleshooting. Sybex, 2002. [bib]