This project uses exploratory data analysis (EDA) to examine the variables, structure, patterns, oddities, and underlying relationships among the factors that affect red wine quality.
Working on this project taught me a great deal about EDA, i.e., how to use plots to understand the distribution of a variable and to check for patterns and relationships with other variables. I also learned to create a logical flow when building up from single-variable analysis to multivariate analysis.
wineQualityReds.csv – This dataset is publicly available for research in the UCI Machine Learning Repository.
Red_Wine_Quality.rmd – Main RMD project file containing the analysis.
Red_Wine_Quality.html – HTML file knitted from the project file.
Red_Wine_Quality.R – R code extract (with documentation).
References.txt – List of references.
This project was developed using RStudio version 1.0.153 with R version 3.4.2.
The required packages are ggplot2, gridExtra, GGally, ggthemes, dplyr, knitr, and memisc.
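
If any of these packages are missing, something like the following should set them up before knitting the project file. This is only a minimal sketch and not part of the original analysis: the variable names `required` and `to_install` are made up for illustration, and it assumes a CRAN mirror is already configured in your R session.

```r
# Packages named in this README
required <- c("ggplot2", "gridExtra", "GGally", "ggthemes",
              "dplyr", "knitr", "memisc")

# Install only the packages that are not already present
to_install <- setdiff(required, rownames(installed.packages()))
if (length(to_install) > 0) {
  install.packages(to_install)
}

# Attach every package for the current session
invisible(lapply(required, library, character.only = TRUE))
```

Installing only the missing packages avoids reinstalling, and possibly upgrading, versions that are already on the machine, which may matter when reproducing results from an older R release.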