Web pages partitioned into classes, with hyperlink data. The dataset has been used for text categorization and for learning to extract symbolic knowledge from the World Wide Web.
Research on Localization and Mapping, Partially Observable Markov Decision Processes, Computer Vision and Image Processing, Robot Architectures and Programming Languages, and Learning Algorithms.
A library of C code useful for writing statistical text analysis, language modeling, and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (ar...