Results

A CART analysis was performed to explore the relationships between the predictor and response variables.  The resulting tree was not considered a good model for several reasons (R-1).  The individual and cumulative variance explained at each node of the tree is low (R-2), so the branches do not represent the data well.  The first node of the tree is elevation because it explained the most variance, but SCA explained nearly as much (R-3).  Elevation accounts for 18.7% of the variance while SCA accounts for 12.2%, meaning that either variable could serve as the first node.  This occurs because many of the variables in the data set are correlated, and CART does not take this correlation into account.  Another indication that the CART model was a poor fit came from the range of values captured by each leaf.  The EC range for the first leaf on the left was 8 to 48, which, based on knowledge of the study, is too wide.  The main theme of the CART analysis is that elevation and SCA are important variables related to the EC of the soil.
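The competition between correlated variables for the first node can be illustrated with a minimal sketch. It is not the procedure used in the study: the function names and the toy values below are hypothetical, and the code only shows how the fraction of variance explained by the best single split is computed for each candidate variable.

```python
from statistics import fmean

def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, fraction of variance in y explained) for the
    best binary split of the response y on one predictor x."""
    total = sum_sq(y)
    pairs = sorted(zip(x, y))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold exists between identical values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        explained = (total - sum_sq(left) - sum_sq(right)) / total
        if explained > best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, explained)
    return best

# Hypothetical toy values, NOT the study's measurements.
elevation = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sca = [1.2, 1.9, 3.5, 3.8, 5.1, 6.3, 6.8, 8.4]  # correlated with elevation
ec = [10.0, 12.0, 11.0, 14.0, 30.0, 28.0, 33.0, 31.0]

results = {
    "elevation": best_split(elevation, ec),
    "sca": best_split(sca, ec),
}
for name, (threshold, explained) in results.items():
    print(f"{name}: split at {threshold:.2f}, variance explained {explained:.3f}")
```

Because the two toy predictors order the samples identically, both splits explain the same share of variance, so a tree could start with either one. This is the same ambiguity described above for elevation and SCA.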

R-1.  Regression tree created from CART analysis pruned to show only nodes with variance explained >0.03.

Picture

R-2. Variance explained by each node on the tree (R-1) given by CART analysis.

Picture

R-3. Variance explained by alternate variables at different nodes from CART analysis.

Picture
RandomForest was used to predict the EC of the field, with the x and y coordinates as inputs.  The importance information from the method was consistent with the CART interpretation: elevation and SCA were the most important parameters (R-4).  Importance is the number of times each variable was used as a primary node, so the variable with the highest importance value is the main predictor variable.  Predicted EC values were generated using RandomForest and then compared with the measured values (R-5, R-6).

R-4. Importance information from RandomForest analysis for predicting electrical conductivity values.

Picture
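The importance measure described above, the number of times each variable is used as a primary node, can be sketched as follows. This is a simplified illustration under stated assumptions, not the RandomForest implementation used in the study: the function names, the `mtry` and `n_trees` parameters, the variable `aspect`, and all values are hypothetical, and only the first split of each bootstrapped tree is examined.

```python
import random
from collections import Counter
from statistics import fmean

def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def split_gain(x, y):
    """Largest reduction in sum of squares from one binary split of y on x."""
    pairs = sorted(zip(x, y))
    total = sum_sq(y)
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold exists between identical values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        best = max(best, total - sum_sq(left) - sum_sq(right))
    return best

def root_split_counts(predictors, response, n_trees=200, mtry=2, seed=0):
    """Count how often each predictor gives the best first split across
    bootstrap samples, with mtry candidate variables drawn per tree."""
    rng = random.Random(seed)
    n = len(response)
    counts = Counter({name: 0 for name in predictors})
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        y = [response[r] for r in rows]
        candidates = rng.sample(list(predictors), mtry)
        gains = {
            name: split_gain([predictors[name][r] for r in rows], y)
            for name in candidates
        }
        counts[max(gains, key=gains.get)] += 1
    return counts

# Hypothetical toy values, NOT the study's measurements.
predictors = {
    "elevation": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "sca": [12.5, 11.0, 10.2, 9.1, 8.3, 6.9, 6.1, 5.2, 3.8, 3.1, 2.2, 0.9],
    "aspect": [3, 9, 1, 7, 5, 11, 2, 8, 4, 10, 6, 12],  # unrelated to EC
}
ec = [10, 12, 11, 14, 13, 15, 30, 28, 33, 31, 29, 34]

counts = root_split_counts(predictors, ec)
for name, count in counts.most_common():
    print(name, count)
```

In this toy example the two correlated variables share most of the first splits, while the unrelated variable is rarely chosen.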

R-5.  Measured electrical conductivity values for the field site near Lacombe, Alberta.

Picture

R-6. Predicted electrical conductivity values from RandomForest model.

Picture
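One way to put a number on the comparison between measured and predicted EC is sketched below. The text does not state which comparison was made, so the choice of root-mean-square error and the coefficient of determination is an assumption, and the values are hypothetical placeholders rather than the Lacombe data.

```python
from math import sqrt
from statistics import fmean

def rmse(measured, predicted):
    """Root-mean-square error between paired measured and predicted values."""
    return sqrt(fmean([(m - p) ** 2 for m, p in zip(measured, predicted)]))

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_measured = fmean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1 - ss_res / ss_tot

# Hypothetical EC values, NOT the study's measurements.
measured_ec = [10.0, 20.0, 30.0, 40.0]
predicted_ec = [12.0, 18.0, 33.0, 39.0]

print(f"RMSE = {rmse(measured_ec, predicted_ec):.3f}")
print(f"R^2  = {r_squared(measured_ec, predicted_ec):.3f}")
```

A low RMSE relative to the spread of the measured values, together with an R² close to 1, would indicate that the predicted map reproduces the measured one well.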
RandomForest was also used to predict the satellite imagery zone.  The satellite imagery zones for the field were provided with the dataset.  Satellite imagery, which is based on the color of the field at maturity, is an alternative to using EC in precision agriculture.  The predicted satellite imagery zones from the model were compared with the measured zones (R-7, R-8).  The importance information shows a slightly different pattern than that described for the CART and EC RandomForest predictions (R-9).  Elevation still has the highest importance, but the next two variables are gradient and then SDA.

R-7. Measured satellite imagery zones for the field site near Lacombe, Alberta.

Picture

R-8. Predicted satellite imagery zones from RandomForest model.

Picture
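Because the zones are categories rather than continuous values, agreement between the measured and predicted maps can be summarized with a confusion table and an overall accuracy. The sketch below is illustrative only: the text does not specify how the comparison was carried out, and the zone labels are hypothetical placeholders.

```python
from collections import Counter

def confusion_counts(measured, predicted):
    """Count occurrences of each (measured zone, predicted zone) pair."""
    return Counter(zip(measured, predicted))

def accuracy(measured, predicted):
    """Fraction of locations where the predicted zone equals the measured zone."""
    matches = sum(1 for m, p in zip(measured, predicted) if m == p)
    return matches / len(measured)

# Hypothetical zone labels, NOT the study's data.
measured_zones = [1, 1, 2, 2, 3, 3, 3, 2]
predicted_zones = [1, 2, 2, 2, 3, 3, 2, 2]

table = confusion_counts(measured_zones, predicted_zones)
for (m, p), count in sorted(table.items()):
    print(f"measured zone {m}, predicted zone {p}: {count}")
print(f"overall accuracy = {accuracy(measured_zones, predicted_zones):.2f}")
```

The off-diagonal pairs in the table show which zones the model tends to confuse, which an overall accuracy value alone does not reveal.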

R-9. Importance information from RandomForest analysis for predicting satellite imagery zones.

Picture