Probabilistic Retrieval
Covers Probabilistic Information Retrieval, modeling relevance as a probability, query expansion, and automatic thesaurus generation.

Text-Based Information Retrieval
Covers the basic concepts of text-based information retrieval and how documents are indexed and retrieved based on user queries.

Information retrieval: vector space
Covers the basics of information retrieval using vector space models and practical exercises on relevance feedback and posting list scanning.

Latent Semantic Indexing
Covers Latent Semantic Indexing, a method to improve information retrieval by mapping documents and queries into a lower-dimensional concept space.

Data Wrangling with Hadoop
Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.