Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - How to summarize percent cluster assignment by groups?

Consider this data frame:

set.seed(123)
# Loc has 100 values and is recycled to 200 rows, so each Loc ends up with 10 IDs
data <- data.frame(Loc = rep(letters[1:20], each = 5),
                   ID = 1:200,
                   cluster = sample(4, 200, replace = TRUE))

Loc is a grouping variable for the IDs, and each ID was assigned to a cluster based on some attribute.

I want to create a data.frame that shows what percentage of the IDs in each Loc was assigned to each of the 4 clusters:

Loc   1     2     3     4    
a     ...  ...   ...   ...    
b     ...  ...   ...   ...     
c     ...  ...   ...   ...     
...

So the numbers above would be expressed in percentages. I also want to add a column that shows the original number of observations in each Loc, so the final data frame would look like this:

Loc   1     2     3     4    total
a     ...  ...   ...   ...    ...
b     ...  ...   ...   ...    ...
c     ...  ...   ...   ...    ...
...

What is the best way to go about producing this?

Question from: https://stackoverflow.com/questions/65905171/how-to-summarize-percent-cluster-assignment-by-groups


1 Answer


We can use the tidyverse (the data frame is called data, as in the question):

library(tidyverse)

data %>%
  count(Loc, cluster) %>%
  group_by(Loc) %>%
  # proportion of each cluster within Loc (multiply by 100 for percentages)
  mutate(n = n / sum(n)) %>%
  ungroup() %>%
  # long to wide: one column per cluster
  pivot_wider(names_from = cluster, values_from = n) %>%
  # add the number of observations per Loc
  inner_join(count(data, Loc, name = "Total"), by = "Loc")

 #   Loc     `1`   `2`   `3`   `4` Total
 # 1 a       0.2   0.2   0.4   0.2    10
 # 2 b       0.2   0.4   0.3   0.1    10
 # 3 c       0.2   0.4   0.1   0.3    10
 # ...
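
For comparison, here is a minimal base R sketch of the same idea. It is not part of the original answer. It rebuilds the example data from the question and uses table() and prop.table() to get row-wise percentages. The names counts, pct, pct_df, and result are chosen here for illustration only.

```r
# Example data from the question
set.seed(123)
data <- data.frame(Loc = rep(letters[1:20], each = 5),
                   ID = 1:200,
                   cluster = sample(4, 200, replace = TRUE))

# Counts of each cluster within each Loc (rows = Loc, columns = cluster)
counts <- table(data$Loc, data$cluster)

# Row-wise proportions, scaled to percentages
pct <- prop.table(counts, margin = 1) * 100

# Convert the 2-D table to a data frame with one column per cluster
pct_df <- as.data.frame.matrix(pct)

# Put Loc in its own column and append the number of observations per Loc
result <- data.frame(Loc = rownames(pct_df),
                     pct_df,
                     total = as.vector(rowSums(counts)),
                     check.names = FALSE,   # keep column names "1".."4" as-is
                     row.names = NULL)

head(result)
```

One difference from the tidyverse version is that table() fills Loc/cluster combinations that never occur with 0, whereas pivot_wider() leaves them as NA unless values_fill is set.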
