DATA-641 Applied Natural Language Processing (3)


This course covers fundamental methods for analyzing textual datasets, focusing on applying classical natural language processing (NLP) methods and libraries in Python to interesting corpora. Students gain familiarity with introductory and intermediate Python concepts that support text processing and analysis. Topics include regular expressions, dictionary methods, an introduction to linguistic structure (e.g., parts of speech), bag-of-words methods, and word/document embedding methods. Applications include sentiment analysis, predictive analytics, information retrieval via clustering and outlier detection methods, and language change detection. Crosslist: DATA-441. Prerequisite/Concurrent: STAT-615. Note: familiarity with the Python programming language is required.
