by Michelle L. Gill, Ph.D.
Under Construction, 2016/09/26
Note: this repo is being cleaned up and is currently missing a few notebooks. The missing notebooks will be added within the week, and this note will be removed once the clean-up is complete.
This is my fourth project for the Summer 2016 Metis Data Science Bootcamp. It incorporates unsupervised machine learning and natural language processing. Expert wine reviews were scraped and analyzed with K-means clustering, latent semantic analysis (LSA), and latent Dirichlet allocation (LDA). Sentiment analysis was also performed to test whether review sentiment was higher in vintage years.
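As a rough illustration of the K-means step mentioned above, here is a minimal sketch that uses only the Python standard library. The toy reviews, the function names (`tfidf_vectors`, `sq_dist`, `kmeans`), the smoothed TF-IDF weighting, and the farthest-point initialization are all assumptions made for this example. They are not taken from the project's notebooks, which may use different tools and parameters.

```python
import math
from collections import Counter

# Hypothetical toy reviews for illustration only (not the scraped data).
REVIEWS = [
    "ripe cherry and plum with soft tannins",
    "dark cherry plum and firm tannins",
    "crisp citrus and green apple with bright acidity",
    "zesty citrus lemon and apple with fresh acidity",
]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for whitespace-tokenized docs."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of reviews containing each term.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # Smoothed IDF (an assumed weighting): rarer terms get larger weights.
        vec = [counts[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
               for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=10):
    """Plain K-means (Lloyd's algorithm); returns one cluster label per vector."""
    # Assumed deterministic initialization: start from the first vector, then
    # repeatedly add the vector farthest from the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


vocab, vectors = tfidf_vectors(REVIEWS)
labels = kmeans(vectors, k=2)
for label, review in zip(labels, REVIEWS):
    print(label, review)
```

On these four toy reviews, the two red-fruit notes land in one cluster and the two citrus notes in the other. With real review text, stop-word removal and a larger vocabulary would matter far more than they do here.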
A blog post on themodernscientist will be available the week of 2016/09/26. This text will be updated once the post is published.
- environment.yml: list of the conda Python libraries used during analysis
- figures: images used in the presentation
- notebooks: Jupyter notebooks used for analysis
- presentation: a PDF version of the final presentation
- visualization: D3 visualization of the LDA clusters; a movie of the visualization is also available
- Expert wine reviews from Wine Enthusiast
- User reviews and tasting notes from various other websites