Demonstrates how to extract massive amounts of raw Reddit comment data (and metadata) to build an easily indexable SQLite database. We use this database to build and analyze a map of the 10,000 most popular subreddit communities.
January 2021
Python code that scrapes user posts and comments from a specified Reddit community (subreddit), performs preprocessing, and then uses the corpus of text to train a transformer chatbot. This can be used to create chatbots that take on the "personality" of different communities within Reddit.
August 2020
End-to-end machine learning project analyzing political presence on Reddit. Comment data is scraped from two diametrically opposed political communities (liberal vs. conservative). Exploratory data analysis is performed to look for insights, and a binary classification model is trained to test whether a single comment from either community is predictive of political association.
December 2020
Large collection of physics simulations, many of which are based on homework assignments from my senior-year computational physics course.
March 2021
PyTorch implementation of an image segmentation pipeline from scratch. Demonstrates the performance of various state-of-the-art models on the Cityscapes dataset (street scenes), as well as on a manually collected and annotated drone dataset.
May 2021
Interactive dashboard describing the evolution of the Coronavirus pandemic. Documentation in progress.
May 2020