Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
272 views
in Technique[技术] by (71.8m points)

r - Spread with data.frame/tibble with duplicate identifiers

The documentation for tidyr suggests that gather and spread are transitive, but the following example with the "iris" data shows they are not, but it is not clear why. Any clarification would be greatly appreciated

iris.df = as.data.frame(iris)
long.iris.df = iris.df %>% gather(key = feature.measure, value = size, -Species)
w.iris.df = long.iris.df %>% spread(key = feature.measure, value = size, -Species)

I expected the data frame "w.iris.df" to be the same as "iris.df" but received the following error instead:

"Error: Duplicate identifiers for rows (1, 2, 3, 4, 5, 6, 7, 8, 9..."

My general question is how to reverse an application of "gather" on this sort of dataset.

question from:https://stackoverflow.com/questions/25960394/spread-with-data-frame-tibble-with-duplicate-identifiers

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Hadley's intervention was unsurprisingly perfect... but I ended up mucking with the syntax a bit after that... so for what it's worth, I post the fully operational code (sorry my syntax is a bit different than above):

library(tidyr)
library(dplyr)

wide <- 
  iris %>%
  mutate(row = row_number()) %>%
  gather(vars, val, -Species, -row) %>%
  spread(vars, val)

head(wide)
#   Species row Petal.Length Petal.Width Sepal.Length Sepal.Width
# 1  setosa   1          1.4         0.2          5.1         3.5
# 2  setosa   2          1.4         0.2          4.9         3.0
# 3  setosa   3          1.3         0.2          4.7         3.2
# 4  setosa   4          1.5         0.2          4.6         3.1
# 5  setosa   5          1.4         0.2          5.0         3.6
# 6  setosa   6          1.7         0.4          5.4         3.9

head(iris)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

They are the same.... just need to reorder if u feel like it...

wide <- wide[,c(3, 4, 5, 6, 1)]  ## Reorder and then remove "row" column

and done.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...