What Makes A Good Wine?

In this project, a data set of red wine quality is explored based on its physicochemical properties. The objective is to find physicochemical properties that distinguish good quality wines from lower quality ones. An attempt to build a linear model of wine quality is also shown.

Dataset Description

This tidy dataset contains 1,599 red wines with 11 variables on the chemical properties of the wine. A further variable records the quality of each wine, based on ratings from at least 3 wine experts. The preparation of the dataset has been described in this link.

First, the structure of the dataset is explored using the str and summary functions.

## 'data.frame':    1599 obs. of  13 variables:
##  $ X                   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ fixed.acidity       : num  7.4 7.8 7.8 11.2 7.4 7.4 7.9 7.3 7.8 7.5 ...
##  $ volatile.acidity    : num  0.7 0.88 0.76 0.28 0.7 0.66 0.6 0.65 0.58 0.5 ...
##  $ citric.acid         : num  0 0 0.04 0.56 0 0 0.06 0 0.02 0.36 ...
##  $ residual.sugar      : num  1.9 2.6 2.3 1.9 1.9 1.8 1.6 1.2 2 6.1 ...
##  $ chlorides           : num  0.076 0.098 0.092 0.075 0.076 0.075 0.069 0.065 0.073 0.071 ...
##  $ free.sulfur.dioxide : num  11 25 15 17 11 13 15 15 9 17 ...
##  $ total.sulfur.dioxide: num  34 67 54 60 34 40 59 21 18 102 ...
##  $ density             : num  0.998 0.997 0.997 0.998 0.998 ...
##  $ pH                  : num  3.51 3.2 3.26 3.16 3.51 3.51 3.3 3.39 3.36 3.35 ...
##  $ sulphates           : num  0.56 0.68 0.65 0.58 0.56 0.56 0.46 0.47 0.57 0.8 ...
##  $ alcohol             : num  9.4 9.8 9.8 9.8 9.4 9.4 9.4 10 9.5 10.5 ...
##  $ quality             : int  5 5 5 6 5 5 5 7 7 5 ...
##        X          fixed.acidity   volatile.acidity  citric.acid   
##  Min.   :   1.0   Min.   : 4.60   Min.   :0.1200   Min.   :0.000  
##  1st Qu.: 400.5   1st Qu.: 7.10   1st Qu.:0.3900   1st Qu.:0.090  
##  Median : 800.0   Median : 7.90   Median :0.5200   Median :0.260  
##  Mean   : 800.0   Mean   : 8.32   Mean   :0.5278   Mean   :0.271  
##  3rd Qu.:1199.5   3rd Qu.: 9.20   3rd Qu.:0.6400   3rd Qu.:0.420  
##  Max.   :1599.0   Max.   :15.90   Max.   :1.5800   Max.   :1.000  
##  residual.sugar     chlorides       free.sulfur.dioxide
##  Min.   : 0.900   Min.   :0.01200   Min.   : 1.00      
##  1st Qu.: 1.900   1st Qu.:0.07000   1st Qu.: 7.00      
##  Median : 2.200   Median :0.07900   Median :14.00      
##  Mean   : 2.539   Mean   :0.08747   Mean   :15.87      
##  3rd Qu.: 2.600   3rd Qu.:0.09000   3rd Qu.:21.00      
##  Max.   :15.500   Max.   :0.61100   Max.   :72.00      
##  total.sulfur.dioxide    density             pH          sulphates     
##  Min.   :  6.00       Min.   :0.9901   Min.   :2.740   Min.   :0.3300  
##  1st Qu.: 22.00       1st Qu.:0.9956   1st Qu.:3.210   1st Qu.:0.5500  
##  Median : 38.00       Median :0.9968   Median :3.310   Median :0.6200  
##  Mean   : 46.47       Mean   :0.9967   Mean   :3.311   Mean   :0.6581  
##  3rd Qu.: 62.00       3rd Qu.:0.9978   3rd Qu.:3.400   3rd Qu.:0.7300  
##  Max.   :289.00       Max.   :1.0037   Max.   :4.010   Max.   :2.0000  
##     alcohol         quality     
##  Min.   : 8.40   Min.   :3.000  
##  1st Qu.: 9.50   1st Qu.:5.000  
##  Median :10.20   Median :6.000  
##  Mean   :10.42   Mean   :5.636  
##  3rd Qu.:11.10   3rd Qu.:6.000  
##  Max.   :14.90   Max.   :8.000

The following observations are made/confirmed:

  1. There are 1599 samples of Red Wine properties and quality values.

  2. No wine achieves either a terrible (0) or perfect (10) quality score.

  3. Citric Acid had a minimum of 0.0. No other property values were precisely 0.

  4. The Residual Sugar maximum is about 18 times farther from the 3rd quartile than the 3rd quartile is from the 1st. This suggests that the data is heavily skewed or contains outliers.

  5. The ‘quality’ attribute is originally stored as an integer; I have converted it into an ordered factor, which is more representative of the variable itself.

  6. There are two attributes related to ‘acidity’ of wine i.e. ‘fixed.acidity’ and ‘volatile.acidity’. Hence, a combined acidity variable is added using data$total.acidity <- data$fixed.acidity + data$volatile.acidity.
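
Observations 4 to 6 can be sketched in R as follows. This is a minimal sketch rather than the original code: the toy data frame holds only the first ten rows printed by str above, and the level set 0:10 for the ordered factor is an assumption inferred from the quality table shown later in the report.

```r
# Toy subset: first ten rows as printed by str() above (not the full dataset)
data <- data.frame(
  fixed.acidity    = c(7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7.5),
  volatile.acidity = c(0.7, 0.88, 0.76, 0.28, 0.7, 0.66, 0.6, 0.65, 0.58, 0.5),
  quality          = c(5L, 5L, 5L, 6L, 5L, 5L, 5L, 7L, 7L, 5L)
)

# Observation 4: (max - Q3) / (Q3 - Q1) for residual sugar,
# using the quartiles reported by summary() above
sugar_ratio <- (15.5 - 2.6) / (2.6 - 1.9)

# Observation 5: treat quality as an ordered factor (levels 0..10 assumed)
data$quality <- factor(data$quality, levels = 0:10, ordered = TRUE)

# Observation 6: combined acidity variable
data$total.acidity <- data$fixed.acidity + data$volatile.acidity
```

Keeping all eleven levels in the factor means that unused scores still appear, with a count of zero, when the variable is tabulated.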

Univariate Plots Section

To lead the univariate analysis, I’ve chosen to build a grid of histograms. These histograms represent the distributions of each variable in the dataset.

There are some really interesting variations in the distributions here. Looking closer at a few of the more interesting ones might prove quite valuable. Working from top-left to right, selected plots are analysed.

Acidity

Fixed acidity is determined by acids that do not evaporate easily, such as tartaric acid. It contributes to many other attributes, including taste, pH, color, and stability to oxidation, i.e., it prevents the wine from tasting flat. On the other hand, volatile acidity is responsible for the sour taste in wine. A very high value can lead to a sour-tasting wine, while a low value can make the wine seem heavy. (References: 1, 2.)

## [1] "Summary statistics of Fixed Acidity"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    4.60    7.10    7.90    8.32    9.20   15.90
## [1] "Summary statistics of Volatile Acidity"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1200  0.3900  0.5200  0.5278  0.6400  1.5800
## [1] "Summary statistics of Total Acidity"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   5.120   7.680   8.445   8.847   9.740  16.285

Of the wines we have in our dataset, we can see that most have a fixed acidity of 7.5. The median fixed acidity is 7.9, and the mean is 8.32. There is a slight skew in the data because a few wines possess a very high fixed acidity. The median volatile acidity is 0.52 g/dm^3, and the mean is 0.5278 g/dm^3. It will be interesting to note which quality of wine is correlated to what level of acidity in the bivariate section.

Citric Acid

Citric acid is part of the fixed acid content of most wines. As a non-volatile acid, it adds many of the same characteristics as tartaric acid does. Again, I would guess that most good wines have a balanced amount of citric acid.

## [1] "Summary statistics of Citric Acid"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.090   0.260   0.271   0.420   1.000
## [1] "Number of Zero Values"
## 
## FALSE  TRUE 
##  1467   132

There is a very high count of zeros in citric acid. To check whether these are genuine zeros or merely ‘not available’ values, a quick check using the table function shows that there are 132 observations with a value of zero and no NA values in the reported citric acid concentration. The citric acid concentration may have been too low to measure and was therefore reported as zero.

In terms of content, the wines have a median citric acid level of 0.26 g/dm^3 and a mean level of 0.271 g/dm^3.
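
The zero-versus-NA check can be sketched as below. This is an illustration only: the vector holds just the first ten citric acid values printed by str earlier, so the counts differ from the 132 zeros found in the full dataset.

```r
# First ten citric.acid values as printed by str() earlier (toy subset)
citric.acid <- c(0, 0, 0.04, 0.56, 0, 0, 0.06, 0, 0.02, 0.36)

# Count exact zeros versus non-zeros
zero_counts <- table(citric.acid == 0)

# Count missing values separately, since a zero is not the same as NA
na_count <- sum(is.na(citric.acid))
```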

Sulfur-Dioxide & Sulphates

Free sulfur dioxide is the free form of SO2, which exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine. Sulphates are a wine additive that can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant, keeping the wine fresh overall.

The distributions of all three values are positively skewed with a long tail. The log-transformation results in a more normal-looking distribution for ‘total sulfur dioxide’ and ‘sulphates’.
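
The effect of the log-transformation on skew can be illustrated with a small sketch. The skewness helper below is a hypothetical function written for this example (a simple moment-based estimate), and the vector contains only the first ten total sulfur dioxide values printed by str earlier, so it is a toy illustration rather than a reproduction of the histograms.

```r
# First ten total.sulfur.dioxide values as printed by str() earlier (toy subset)
total.sulfur.dioxide <- c(34, 67, 54, 60, 34, 40, 59, 21, 18, 102)

# Hypothetical helper: moment-based sample skewness
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Base-10 log transform, then compare skew before and after
log_tsd  <- log10(total.sulfur.dioxide)
skew_raw <- skewness(total.sulfur.dioxide)
skew_log <- skewness(log_tsd)
```

A skewness closer to zero after the transform indicates a more symmetric distribution, which is what the transformed histograms are meant to show.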

Alcohol

Alcohol is what adds that special something that turns rotten grape juice into a drink many people love. Hence, by intuitive understanding, it should be crucial in determining the wine quality.

## [1] "Summary statistics for alcohol %age."
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    8.40    9.50   10.20   10.42   11.10   14.90

The mean alcohol content for our wines is 10.42%, and the median is 10.2%.

Quality

## [1] "Summary statistics - Wine Quality."
##   0   1   2   3   4   5   6   7   8   9  10 
##   0   0   0  10  53 681 638 199  18   0   0

Overall wine quality, rated on a scale from 0 to 10, has a roughly normal shape and very few exceptionally high or low-quality ratings.

It can be seen that the minimum rating is 3 and 8 is the maximum for quality. Hence, a variable called ‘rating’ is created based on variable quality.

  • 7 to 8 are rated A.

  • 5 to 6 are rated B.

  • 3 to 4 are rated C.

##    C    B    A 
##   63 1319  217

The ‘rating’ distribution is dominated by ‘B’ wines, as already seen in the quality distribution. This is likely to cause overplotting. Therefore, only the ‘C’ and ‘A’ wines are compared to find distinctive properties that separate the two. The comparison is made using summary statistics.

## [1] "Summary statistics of Wine with Rating 'A'"
##        X          fixed.acidity    volatile.acidity  citric.acid    
##  Min.   :   8.0   Min.   : 4.900   Min.   :0.1200   Min.   :0.0000  
##  1st Qu.: 482.0   1st Qu.: 7.400   1st Qu.:0.3000   1st Qu.:0.3000  
##  Median : 939.0   Median : 8.700   Median :0.3700   Median :0.4000  
##  Mean   : 831.7   Mean   : 8.847   Mean   :0.4055   Mean   :0.3765  
##  3rd Qu.:1089.0   3rd Qu.:10.100   3rd Qu.:0.4900   3rd Qu.:0.4900  
##  Max.   :1585.0   Max.   :15.600   Max.   :0.9150   Max.   :0.7600  
##                                                                     
##  residual.sugar    chlorides       free.sulfur.dioxide
##  Min.   :1.200   Min.   :0.01200   Min.   : 3.00      
##  1st Qu.:2.000   1st Qu.:0.06200   1st Qu.: 6.00      
##  Median :2.300   Median :0.07300   Median :11.00      
##  Mean   :2.709   Mean   :0.07591   Mean   :13.98      
##  3rd Qu.:2.700   3rd Qu.:0.08500   3rd Qu.:18.00      
##  Max.   :8.900   Max.   :0.35800   Max.   :54.00      
##                                                       
##  total.sulfur.dioxide    density             pH          sulphates     
##  Min.   :  7.00       Min.   :0.9906   Min.   :2.880   Min.   :0.3900  
##  1st Qu.: 17.00       1st Qu.:0.9947   1st Qu.:3.200   1st Qu.:0.6500  
##  Median : 27.00       Median :0.9957   Median :3.270   Median :0.7400  
##  Mean   : 34.89       Mean   :0.9960   Mean   :3.289   Mean   :0.7435  
##  3rd Qu.: 43.00       3rd Qu.:0.9973   3rd Qu.:3.380   3rd Qu.:0.8200  
##  Max.   :289.00       Max.   :1.0032   Max.   :3.780   Max.   :1.3600  
##                                                                        
##     alcohol         quality    total.acidity    rating 
##  Min.   : 9.20   7      :199   Min.   : 5.320   C:  0  
##  1st Qu.:10.80   8      : 18   1st Qu.: 7.780   B:  0  
##  Median :11.60   0      :  0   Median : 9.040   A:217  
##  Mean   :11.52   1      :  0   Mean   : 9.253          
##  3rd Qu.:12.20   2      :  0   3rd Qu.:10.490          
##  Max.   :14.00   3      :  0   Max.   :16.285          
##                  (Other):  0
## [1] "Summary statistics of Wine with Rating 'C'"
##        X          fixed.acidity    volatile.acidity  citric.acid    
##  Min.   :  19.0   Min.   : 4.600   Min.   :0.2300   Min.   :0.0000  
##  1st Qu.: 435.0   1st Qu.: 6.800   1st Qu.:0.5650   1st Qu.:0.0200  
##  Median : 834.0   Median : 7.500   Median :0.6800   Median :0.0800  
##  Mean   : 837.7   Mean   : 7.871   Mean   :0.7242   Mean   :0.1737  
##  3rd Qu.:1285.5   3rd Qu.: 8.400   3rd Qu.:0.8825   3rd Qu.:0.2700  
##  Max.   :1522.0   Max.   :12.500   Max.   :1.5800   Max.   :1.0000  
##                                                                     
##  residual.sugar     chlorides       free.sulfur.dioxide
##  Min.   : 1.200   Min.   :0.04500   Min.   : 3.00      
##  1st Qu.: 1.900   1st Qu.:0.06850   1st Qu.: 5.00      
##  Median : 2.100   Median :0.08000   Median : 9.00      
##  Mean   : 2.685   Mean   :0.09573   Mean   :12.06      
##  3rd Qu.: 2.950   3rd Qu.:0.09450   3rd Qu.:15.50      
##  Max.   :12.900   Max.   :0.61000   Max.   :41.00      
##                                                        
##  total.sulfur.dioxide    density             pH          sulphates     
##  Min.   :  7.00       Min.   :0.9934   Min.   :2.740   Min.   :0.3300  
##  1st Qu.: 13.50       1st Qu.:0.9957   1st Qu.:3.300   1st Qu.:0.4950  
##  Median : 26.00       Median :0.9966   Median :3.380   Median :0.5600  
##  Mean   : 34.44       Mean   :0.9967   Mean   :3.384   Mean   :0.5922  
##  3rd Qu.: 48.00       3rd Qu.:0.9977   3rd Qu.:3.500   3rd Qu.:0.6000  
##  Max.   :119.00       Max.   :1.0010   Max.   :3.900   Max.   :2.0000  
##                                                                        
##     alcohol         quality   total.acidity    rating
##  Min.   : 8.40   4      :53   Min.   : 5.120   C:63  
##  1st Qu.: 9.60   3      :10   1st Qu.: 7.525   B: 0  
##  Median :10.00   0      : 0   Median : 8.280   A: 0  
##  Mean   :10.22   1      : 0   Mean   : 8.596         
##  3rd Qu.:11.00   2      : 0   3rd Qu.: 9.162         
##  Max.   :13.10   5      : 0   Max.   :12.960         
##                  (Other): 0

On comparing the means of different attributes for ‘A-rated’ and ‘C-rated’ wines (A → C), the following percentage changes are noted.

  1. fixed.acidity: mean reduced by 11%.

  2. volatile.acidity: mean increased by 79%.

  3. citric.acid: mean reduced by 54% (equivalently, the ‘A’ mean is 117% higher than the ‘C’ mean).

  4. sulphates: mean reduced by 20.3%.

  5. alcohol: mean reduced by 11.3% (equivalently, the ‘A’ mean is 12.7% higher than the ‘C’ mean).

  6. residual.sugar showed very little variation (under 1%), while the chlorides mean increased by about 26%.

These changes are, however, only suitable for estimation of important quality impacting variables and setting a way for further analysis. No conclusion can be drawn from it.
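
The percentage changes can be recomputed directly from the group means printed in the two summaries above. The sketch below copies those means by hand; the sign convention (change relative to the ‘A’ mean, going from A to C) is the one stated in the text.

```r
# Group means copied from the summary output for rating 'A' and rating 'C'
mean_A <- c(fixed.acidity = 8.847, volatile.acidity = 0.4055,
            citric.acid = 0.3765, residual.sugar = 2.709,
            chlorides = 0.07591, sulphates = 0.7435, alcohol = 11.52)
mean_C <- c(fixed.acidity = 7.871, volatile.acidity = 0.7242,
            citric.acid = 0.1737, residual.sugar = 2.685,
            chlorides = 0.09573, sulphates = 0.5922, alcohol = 10.22)

# Percentage change going from A to C, relative to the A mean;
# negative values mean the C-rated mean is lower
pct_change <- round((mean_C - mean_A) / mean_A * 100, 1)
```

Because the base of the percentage matters, swapping the direction (C to A) gives different magnitudes for the same pair of means.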

Univariate Analysis - Summary

Overview

The red wine dataset features 1599 separate observations, each for a different red wine sample. As presented, each wine sample is provided as a single row in the dataset. Due to the nature of how some measurements are gathered, some values given represent components of a measurement total.

For example, data$fixed.acidity and data$volatile.acidity are obtained via separate measurement techniques and must be summed to indicate the total acidity present in a wine sample. For these cases, I supplemented the given data by computing the total and storing it in the data frame in a total.* variable (data$total.acidity).

Features of Interest

An interesting measurement here is the wine quality. It is the subjective measurement of how attractive the wine might be to a consumer. The goal here is to try to correlate non-subjective wine properties with its quality.

I am curious about a few trends in particular. Sulphates vs. Quality: low-sulphate wine has a reputation for not causing hangovers. Acidity vs. Quality: given that acidity impacts many factors such as pH, taste, and color, it is compelling to see whether it affects quality. Alcohol vs. Quality: simply an interesting measurement.

At first, the lack of an age metric was surprising since it is commonly a factor in quick assumptions of wine quality. However, since the actual effect of wine age is on the wine’s measurable chemical properties, its inclusion here might not be necessary.

Distributions

Many measurements that were clustered close to zero had a positive skew (you cannot have negative percentages or amounts). Others such as pH and total.acidity and quality had normal looking distributions.

The distributions studied in this section were primarily used to identify the trends in variables present in the dataset. This helps in setting up a track for moving towards bivariate and multivariate analysis.

Bivariate Plots Section

Observations from the correlation matrix.

  • Total Acidity is highly correlated with fixed acidity.

  • pH appears to be correlated with acidity, citric acid, chlorides, and residual sugars.

  • No single property appears to correlate strongly with quality.
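
A correlation matrix of this kind can be computed with cor, as in the sketch below. The use of Pearson correlation (the default of cor) is an assumption, since the report does not state the method, and the data frame contains only the first ten rows printed by str earlier, so the coefficients are illustrative and will differ from those of the full dataset.

```r
# Toy subset: first ten rows as printed by str() earlier (not the full dataset)
data <- data.frame(
  fixed.acidity    = c(7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7.5),
  volatile.acidity = c(0.7, 0.88, 0.76, 0.28, 0.7, 0.66, 0.6, 0.65, 0.58, 0.5),
  pH               = c(3.51, 3.2, 3.26, 3.16, 3.51, 3.51, 3.3, 3.39, 3.36, 3.35),
  alcohol          = c(9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10, 9.5, 10.5),
  quality          = c(5L, 5L, 5L, 6L, 5L, 5L, 5L, 7L, 7L, 5L)
)
data$total.acidity <- data$fixed.acidity + data$volatile.acidity

# Pairwise Pearson correlations (cor() default), rounded for readability
cor_matrix <- round(cor(data), 2)
```

Note that quality is kept numeric here; an ordered factor would first need to be converted back to numbers before being passed to cor.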

Further, in this section, metrics of interest are evaluated to check their significance on the wine quality. Moreover, bivariate relationships between other variables are also studied.

Acidity vs. Rating & Quality

The boxplots depicting quality also depict the distribution of the various wines, and we can again see that wines of quality 5 and 6 have the largest share. The blue dot is the mean, and the middle line shows the median.

The box plots show how acidity decreases as the quality of wine improves. However, the difference is not very noticeable. Since most wines tend to maintain a similar acidity level, and given that volatile acidity is responsible for the sour taste in wine, a density plot of volatile acidity is plotted to investigate the data.

Red wines of quality 7 and 8 have their peaks for volatile.acidity well below the 0.4 mark. Wine with quality 3 has its peak farthest to the right (towards more volatile acidity). This shows that the better quality wines are less sour and in general have less acidity.

Alcohol vs. Quality

The plot of residual sugar against alcohol content suggests that there is no clear relation between the two, which is surprising as alcohol is a byproduct of the yeast feeding off sugar during the fermentation process. That relationship could not be established here.

Alcohol and quality appear to be somewhat correlated. Lower quality wines tend to have lower alcohol content. This can be further studied using boxplots.

The boxplots show an indication that higher quality wines have higher alcohol content. This trend is shown by all the quality grades from 3 to 8 except quality grade 5.

Does this mean that by adding more alcohol, we’d get better wine?

The above line plot indicates a nearly linear increase up to 13% alcohol concentration, followed by a steep downward trend. The graph has to be smoothed to remove variance and noise.

Sulphates vs. Quality

Good wines have higher sulphates values than bad wines, though the difference is not that wide.

There is a slight trend implying a relationship between sulphates and wine quality, mainly if extreme sulphate values are ignored: disregarding measurements where sulphates > 1.0 is the same as disregarding the positive tail of the distribution and keeping just the normal-looking portion. However, the relationship is still mathematically weak.

Bivariate Analysis - Summary

There is no apparent and mathematically strong correlation between any wine property and the given quality. Alcohol content is a strong contender, but even so, the correlation was not particularly strong.

Most properties have roughly normal distributions, with some skew in one tail. Scatterplot relationships between these properties often showed a slight trend within the bulk of property values. However, as soon as we leave the expected range, the trends reverse. For example, Alcohol Content or Sulphate vs. Quality. The trend is not a definitive one, but it is seen in different variables.

Possibly, obtaining an outlier property (say sulphate content) is particularly challenging to do in the wine making process. Alternatively, there is a chance that the wines that exhibit outlier properties are deliberately of a non-standard variety. In that case, it could be that wine judges have a harder time agreeing on a quality rating.

Multivariate Plots Section

This section includes visualizations that take bivariate analysis a step further, i.e., understand the earlier patterns better or to strengthen the arguments that were presented in the previous section.

Alcohol, Volatile Acid & Wine Rating

Earlier inspections suggested that volatile acidity and alcohol had relatively high correlations with quality, negative and positive respectively. Alcohol seems to vary more than volatile acidity across quality levels, and nearly every Rating A wine has a volatile acidity below 0.6.

Understanding the Significance of Acidity

Nearly every wine has volatile acidity less than 0.8. As discussed earlier, nearly all A rating wines have volatile.acidity of less than 0.6. For wines with rating B, the volatile acidity is between 0.4 and 0.8. Some C rating wines have a volatile acidity value of more than 0.8.

Most A rating wines have citric acid value of 0.25 to 0.75 while the B rating wines have citric acid value below 0.50.

Understanding the Significance of Sulphates

It is incredible to see that nearly all wines lie below 1.0 sulphates level. Due to overplotting, wines with rating B have been removed. It can be seen rating A wines mostly have sulphate values between 0.5 and 1 and the best rated wines have sulphate values between 0.6 and 1. Alcohol has the same values as seen before.

Density & Sugar

Higher quality wines appear to have a slight correlation with higher acidity across all densities. Moreover, there are abnormally high and low quality wines coincident with higher-than-usual sugar content.

Multivariate Analysis - Summary

Based on the investigation, it can be said that higher citric.acid and lower volatile.acidity contribute towards better wines. Also, better wines tend to have higher alcohol content.

There were surprising results with the sulphates and alcohol graphs. Sulphates had a better correlation with quality than citric acid, but the distribution was still not that distinct between the different quality wines. Further, nearly all wines had a sulphate content of less than 1, irrespective of the alcohol content; sulphate is a byproduct of fermentation just like alcohol.

Based on the analysis presented, it can be noted that because wine rating is a subjective measure, statistical correlation values are not a very suitable metric for finding important factors. This was realized halfway through the study. The graphs aptly depict that there is a suitable range, and that it is some combination of chemical factors that contributes to the flavour of wine.

Final Plots and Summary

Plot One

Description One

The plot is from the univariate section, which introduced the idea of this analysis. As in the analysis, there are plenty of visualizations which only plot data-points from A and C rated wines. A first comparison of only the ‘C’ and ‘A’ wines helped find distinctive properties that separate these two.

It also suggests that it is likely that the critics can be highly subjective as they do not rate any wine with a measure of 1, 2 or 9, 10. With most wines being mediocre, the wines that had the less popular rating must’ve caught the attention of the wine experts, hence, the idea was derived to compare these two rating classes.

Plot Two

Description Two

These are plots taken from bivariate analysis section discussing the effect of alcohol percentage on quality.

The first visualization was especially appealing to me because of the way that you can almost see the distribution shift from left to right as wine ratings increase. Again, just showing a general tendency instead of a substantial significance in judging wine quality.

The above boxplots show a steady rise in the level of alcohol. An interesting trend of decreasing quality above 13% alcohol gave way to further analysis, which shows that a general correlation measure might not be suitable for the study.

The plot that follows set the basis on which I carried out the complete analysis. Rather than emphasizing mathematical correlation measures, the inferences drawn were based on investigating the visualizations. This felt suitable due to the subjectivity in the measure of wine quality.

Plot Three

Description Three

These plots served to find distinguishing boundaries for the given attributes, i.e., sulphates, citric.acid, alcohol, and volatile.acidity. The conclusions drawn from these plots are that sulphates should be high but less than 1, with an alcohol concentration around 12-13%, along with low (< 0.6) volatile acidity. It can be viewed nearly as a depiction of a classification methodology without application of any machine learning algorithm. Moreover, these plots strengthened the arguments laid out in the earlier analysis of the data.


Reflection

In this project, I was able to examine the relationships between physicochemical properties and identify the key variables that determine red wine quality, which are alcohol content, volatile acidity, and sulphate levels.

The dataset is quite interesting, though limited in large-scale implications. I believe if this dataset held only one additional variable it would be vastly more useful to the layman. If price were supplied along with this data one could target the best wines within price categories, and what aspects correlated to a high performing wine in any price bracket.

Overall, I was initially surprised by the seemingly dispersed nature of the wine data. Nothing was immediately identifiable as an inherent quality of good wines. However, upon reflection, this is a sensible finding. Wine making is still something of a science and an art, and if there were one single property or process that continually yielded high quality wines, the field wouldn’t be what it is.

According to the study, it can be concluded that the best kind of wines are the ones with an alcohol concentration of about 13%, with low volatile acidity & high sulphates level (with an upper cap of 1.0 g/dm^3).

Future Work & Limitations

With my amateurish knowledge of wine-tasting, I tried my best to relate it to how I would rate a bottle of wine at dinner. However, in the future, I would like to do some research into the winemaking process. Some winemakers might actively aim for certain property values or combinations, and finding those combinations (of 3 or more properties) might be the key to truly predicting wine quality. This investigation was not able to find a robust generalized model that would consistently be able to predict wine quality with any degree of certainty.

If I were to continue further into this specific dataset, I would aim to train a classifier to correctly predict the wine category, in order to better grasp the nuances of what makes a good wine.

Additionally, having the wine type would be helpful for further analysis. Sommeliers might prefer certain types of wines to have different properties and behaviors. For example, a Port (a sweet dessert wine) surely is rated differently from a dark and robust Cabernet Sauvignon, which is rated differently from a bright and fruity Syrah. Without knowing the type of wine, it is entirely possible that we are almost literally comparing apples to oranges and can’t find a correlation.