TRIADS Training Series: Introduction to Text Analysis in Python

This four-session course introduces participants to analyzing textual data using Python. We will begin by learning how to perform simple operations on text and convert text into data. This will cover topics such as working with strings, text preprocessing, and NLP tasks (e.g., stemming, tokenizing), as well as representing text as data (e.g., bag-of-words, word embeddings). The course will then introduce methods for measuring concepts using textual data and provide an overview of rule-based, supervised learning, and unsupervised learning approaches. Specifically, we will cover dictionary-based methods, Naive Bayes for text classification, and Latent Dirichlet Allocation (LDA) for topic modeling.
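
To give a flavor of what "converting text into data" means, here is a minimal sketch of tokenizing and a bag-of-words representation using only the Python standard library. The function names (`tokenize`, `bag_of_words`) and the sample sentences are illustrative assumptions, not course materials, and the libraries used in the sessions may differ.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(text):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokenize(text))

# Hypothetical example documents, chosen only for illustration.
docs = ["The cat sat on the mat.", "The dog chased the cat."]
counts = [bag_of_words(d) for d in docs]

for doc, bow in zip(docs, counts):
    print(doc, dict(bow))
```

Each document becomes a table of word counts, which is the kind of numeric representation that classification and topic modeling methods take as input.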

This course is intended for graduate students, faculty, and staff from any field at WashU who are interested in learning about quantitative text analysis and would like to become familiar with the main libraries and functions used to work with textual data in Python. Participants are expected to have basic proficiency in Python (completing the Introduction to Python training series should be sufficient).

This class will be fully in-person, and participants will use their own laptops.

Time: Mondays and Wednesdays, 12:30 – 2 p.m.

Location: Danforth University Center (DUC), Room 233

Instructor: Ishita Gopal

Max enrollment: 20 participants. If you enroll and then decide not to attend, please let us know ASAP so we can offer the space to another participant.

RSVP