    Web Technology & Information Systems / Prof. Dr. Benno Stein

    AItools

    Synopsis

    AItools is a dedicated tool suite for text-based information retrieval tasks. It is currently developed and maintained at the Web Technology and Information Systems Group, Bauhaus University Weimar. The suite comprises basic and advanced algorithms, data structures, and design patterns for modeling complex real-world retrieval processes. The following figure illustrates typical steps of an AItools processing pipe.
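
    Independent of the figure, the idea of a processing pipe can be sketched in a few lines of Java. This is a minimal sketch under stated assumptions: the class `PipeSketch`, its methods `tokenize`, `removeStopwords`, `termFrequencies`, and `process`, and the tiny stopword list are hypothetical illustrations and are not taken from the AItools API. The sketch chains three steps (word lexing, stopword filtering, term frequency counting) into one function.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of a text processing pipe; not the AItools API. */
class PipeSketch {

    // Tiny illustrative stopword list (assumption, not an AItools resource).
    static final Set<String> STOPWORDS = Set.of("the", "a", "of", "and", "is");

    // Step 1: split raw text into lowercase word tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Step 2: drop all tokens that appear in the stopword list.
    static List<String> removeStopwords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (!STOPWORDS.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Step 3: count term frequencies, keeping first-occurrence order.
    static Map<String, Integer> termFrequencies(List<String> tokens) {
        Map<String, Integer> tf = new LinkedHashMap<>();
        for (String t : tokens) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    // The pipe: each step consumes the output of the previous one.
    static Map<String, Integer> process(String text) {
        Function<String, List<String>> lexer = PipeSketch::tokenize;
        Function<String, Map<String, Integer>> pipe =
            lexer.andThen(PipeSketch::removeStopwords)
                 .andThen(PipeSketch::termFrequencies);
        return pipe.apply(text);
    }

    public static void main(String[] args) {
        System.out.println(process("The retrieval of the documents and the retrieval model"));
    }
}
```

    Composing the steps as functions keeps each one independently testable and makes it easy to swap a step, for example replacing the stopword filter with a stemmer, without touching the rest of the pipe.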

    Solving information retrieval problems requires advanced skills from various areas of computer science. The main objective of the AItools project is to simplify the development of new information retrieval technology and to shorten its time to market. Key issues from a software engineering viewpoint are the provision of proven solutions, a unified interface to algorithms, rapid prototyping, and experiment automation.
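
    To illustrate what a unified interface to algorithms and experiment automation can look like, here is a minimal Java sketch. All names in it (`RetrievalModel`, `TfModel`, `TfIdfModel`, `ExperimentSketch`, `bestDocumentPerModel`) are hypothetical and chosen for illustration only; they are assumptions and do not reflect the actual AItools classes. The point is that two different weighting schemes implement the same interface, so one experiment loop can run both on the same input.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical experiment runner; not part of the AItools API. */
class ExperimentSketch {

    // Runs every model on the same corpus and query term and records,
    // per model name, the index of the highest-scoring document.
    static Map<String, Integer> bestDocumentPerModel(
            List<RetrievalModel> models, String term, List<List<String>> corpus) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (RetrievalModel model : models) {
            int best = 0;
            for (int i = 1; i < corpus.size(); i++) {
                double candidate = model.score(term, corpus.get(i), corpus);
                double current = model.score(term, corpus.get(best), corpus);
                if (candidate > current) {
                    best = i;
                }
            }
            result.put(model.name(), best);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
            List.of("cluster", "labeling", "cluster"),
            List.of("retrieval", "model"),
            List.of("cluster", "evaluation", "index", "quality"));
        List<RetrievalModel> models = List.of(new TfModel(), new TfIdfModel());
        System.out.println(bestDocumentPerModel(models, "cluster", corpus));
    }
}

/** Hypothetical unified interface shared by all models in this sketch. */
interface RetrievalModel {
    String name();

    // Score of one tokenized document for a single query term.
    double score(String term, List<String> doc, List<List<String>> corpus);
}

/** Relative term frequency: occurrences of the term divided by document length. */
class TfModel implements RetrievalModel {
    public String name() { return "tf"; }

    public double score(String term, List<String> doc, List<List<String>> corpus) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long count = doc.stream().filter(term::equals).count();
        return (double) count / doc.size();
    }
}

/** Term frequency weighted by the log inverse document frequency. */
class TfIdfModel implements RetrievalModel {
    private final TfModel tf = new TfModel();

    public String name() { return "tfidf"; }

    public double score(String term, List<String> doc, List<List<String>> corpus) {
        long df = corpus.stream().filter(d -> d.contains(term)).count();
        if (df == 0) {
            return 0.0;
        }
        double idf = Math.log((double) corpus.size() / df);
        return tf.score(term, doc, corpus) * idf;
    }
}
```

    Because the experiment loop depends only on the interface, adding a further model to the comparison means adding one class and one list entry, which is the kind of rapid prototyping the paragraph above describes.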

    Project Outline

    Each gray cell in the table below corresponds to a software component, whereas the colored cells name classes of a component. The components are organized, from left to right, with respect to the following five functional areas (the colors indicate these areas):

    • Acquisition [aq]
    • Information Extraction [ie]
    • Information Retrieval [ir]
    • Data Mining [dm]
    • Information Visualization [iv]

    The cells in the table are hyperlinked: clicking on a colored cell, a gray cell, or an area frame will open the Java documentation for the respective class, component, or area. The drop-down box "Filter..." hides and unhides classes with respect to the chosen filter criterion.

    aitools

    Acquisition [aq]

    • Corpus Converter: Europarl, German Newsgroups, RCV1, Santini, TREC, Wikipedia
    • Data Structure: Graph, kdTree, pTrie, Suffix Tree, Typed Vector
    • Document Store: File System, HDFS, Lucene
    • IO: File Utils, Open Office, PDF, PS
    • Parser: HTML, Wiki Text
    • Serialization: AI Document, UIMA XCAS
    • Web Crawler
    • Web Download: HTTP, Vfs, Wget
    • Web Search: Google Engine, LiveSearch Engine, Meta Search Engine, Yahoo Engine

    Information Extraction [ie]

    • Chunking: Hashed Breakpoint, Window
    • Compound Splitting: Decomposition Strategies
    • Keyword Extraction: BC Extractor, Content Extractor, Cooccurence Extractor, Kea Extractor, RSP Extractor
    • Language Detection: By Stopwords
    • Lexer: Max Entropy Sentence Lexer, Paragraph Extractor, Word Lexer
    • Text Preprocessing: Porter Stemmer, POS Tagging, Smart Spell, Snowball Stemmer, Stopword Filter, Suffix Tree Stemmer, Word Class Dictionary

    Information Retrieval [ir]

    • Authorship Verification: Koppel
    • Indexing: Lucene Wrapper, Perfect Minimal Index, Terrier Wrapper
    • Retrieval Model: Divergence from Randomness, ESA Model, LDA Model, LSH Model, LSI Model, Okapi Model, pLSI Model, Ponte Model, Smart Model, Suffix Tree Model, Tf Model, TfIdf Model
    • Summarization

    Data Mining [dm]

    • Cluster Algorithm: DbScan, Group Average Link, k-Means, MajorClust, RWC, Single Link, SuffixTree Clustering
    • Cluster Labeling: Centroid Topic Identification, Cooccurence Topics, Frequent Predictive Words, Propescul Labeling, RSP Topics, Weighted Centroid Covering
    • Evaluation: Best Gradient Finder, Davies Bouldin Index, Expected Density, F-Measure, Generalized Dunn Index, Lambda, Variance Cluster Quality
    • Statistics: Gini Coefficient, Kendalls Tau, Spearmans Rho

    Information Visualization [iv]

    • Graph Drawing: Distortion Visualization, JTree Visualization, Reingold Tilford Graph, Walker Linear Graph
    • MDS: Chalmers 1996, Chalmers 2003, Jourdan 2004, Jourdan 2004 Multiscale, Spring Model, Stein 2006

    AI Document API

    People

    • Benno Stein
    • Martin Potthast
    • Maik Anderka
    • Nedim Lipka
    • Tim Gollub
    • Sven Meyer zu Eissen

    © Fakultät Medien 08.06.2009