I am new to doing NMDS in R, and I have a question about how the vectors that characterize the fit of environmental data are calculated.
Context: I am exploring how the concentrations of metals in sediment affect benthic community composition. I am using scores() to extract coordinate data from envfit(), and overlaying the resulting vectors on NMDS figures.
The envfit() help text says: "The environmental variables are the dependent variables that are explained by the ordination scores, and each dependent variable is analysed separately." Based on this, my understanding was that the coordinates for each variable should be the same regardless of how many environmental variables I overlay on the NMDS plot.
This appears to be the case when I include a few variables. However, when I start adding 10 or more environmental variables, the results can change dramatically. I don't believe this is a coding error: as you can see, the code I'm running is identical except for the addition of one variable.
The code I'm using is, essentially:
library(vegan)  # envfit(), scores()
library(dplyr)  # %>%, filter(), select()

data.frame(
  scores(envfit(bci.mds_dens, na.rm = TRUE,
                dat_env %>%
                  filter(loc_id != "TL20-BM-07") %>%
                  select(`Water Depth (ft)`,
                         `Arsenic (mg/kg)`)),  # Selecting metals
         display = "vectors"))
where dat_env contains each environmental variable as a column. The result of this code is:
                      NMDS1      NMDS2
Water Depth (ft)  0.1196980  0.1212918
Arsenic (mg/kg)  -0.1417012 -0.3099184
If the variables are fitted independently, I would not expect these results to change when I add more variables. This holds up to a certain point. For example, if I select Arsenic (mg/kg):Zinc (mg/kg), I get the same results for water depth and arsenic.
data.frame(
  scores(envfit(bci.mds_dens, na.rm = TRUE,
                dat_env %>%
                  filter(loc_id != "TL20-BM-07") %>%
                  select(`Water Depth (ft)`,
                         `Arsenic (mg/kg)`:`Zinc (mg/kg)`)),  # Selecting additional metals
         display = "vectors")
)
                       NMDS1       NMDS2
Water Depth (ft)  0.11969801  0.12129178
Arsenic (mg/kg)  -0.14170121 -0.30991835
Cadmium (mg/kg)  -0.04827247  0.42203081
Chromium (mg/kg) -0.20945984  0.09785620
Copper (mg/kg)   -0.28567961  0.04701332
Lead (mg/kg)     -0.07496529 -0.20329097
Mercury (mg/kg)  -0.14035867 -0.20792955
Nickel (mg/kg)   -0.12195289  0.09952376
Zinc (mg/kg)     -0.19545612  0.02121037
However, if I include one additional environmental variable (selecting Arsenic (mg/kg):Aluminum (mg/kg)), all of the results change.
data.frame(
  scores(envfit(bci.mds_dens, na.rm = TRUE,
                dat_env %>%
                  filter(loc_id != "TL20-BM-07") %>%
                  select(`Water Depth (ft)`,
                         `Arsenic (mg/kg)`:`Aluminum (mg/kg)`)),  # Selecting additional metals
         display = "vectors")
)
                       NMDS1     NMDS2
Water Depth (ft)  0.13765597 0.3345912
Arsenic (mg/kg)   0.28208807 0.3232619
Cadmium (mg/kg)   0.23706765 0.3226476
Chromium (mg/kg)  0.21876222 0.2343784
Copper (mg/kg)    0.59141618 0.6611765
Lead (mg/kg)      0.32338315 0.2917671
Mercury (mg/kg)   0.25736750 0.3537455
Nickel (mg/kg)   -0.18547038 0.1815348
Zinc (mg/kg)      0.42798824 0.5076813
Aluminum (mg/kg)  0.08773589 0.2549420
I don't believe this is a coding error, as all I'm doing is selecting additional data. I'm hoping for some insight into how envfit() works so that I can figure out what's going on here.
I haven't provided reproducible data for confidentiality reasons. I can try to recreate this problem using dummy data if people think it would be particularly helpful, but my question is more about the function itself.
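For what it's worth, here is a minimal sketch of what a dummy-data version might look like. It assumes vegan's built-in dune and dune.env data as stand-ins for my community and sediment data, and the column named extra (with one missing value) is hypothetical. Whether a missing value in the added column is what differs in my real data is an assumption I have not checked.

```r
library(vegan)

data(dune)      # community matrix shipped with vegan (stand-in for my data)
data(dune.env)  # matching environmental table

set.seed(1)
mds <- metaMDS(dune, trace = FALSE)

# Two numeric stand-in variables. `extra` is hypothetical and contains
# a single NA, mimicking an added column with a missing measurement.
env <- data.frame(
  A1    = dune.env$A1,
  extra = c(NA, seq_len(nrow(dune.env) - 1))
)

# Same call pattern as in my code: na.rm = TRUE, then extract vector scores.
# permutations = 0 skips the permutation test, which the scores do not need.
fit_one <- envfit(mds, env["A1"], na.rm = TRUE, permutations = 0)
fit_two <- envfit(mds, env[c("A1", "extra")], na.rm = TRUE, permutations = 0)

scores(fit_one, display = "vectors")               # A1 fitted on all sites
scores(fit_two, display = "vectors")["A1", ]       # A1 after adding `extra`

# How many sites have complete data once `extra` is included
sum(complete.cases(env))
```

If the two A1 rows differ, the change would come from which sites enter the fit rather than from the variables being fitted jointly: per the envfit() help, na.rm = TRUE removes the whole row when any of the supplied variables has a missing value.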
Thank you for any help!