Consider this data frame:
set.seed(123)
data <- data.frame(Loc = rep(letters[1:20], each = 5),
                   ID = 1:200,
                   cluster = sample(4, 200, replace = TRUE))
Loc is a grouping variable for the IDs, and each ID was assigned to a cluster based on some attribute.
I want to create a data.frame that shows what percentage of the IDs in each Loc was assigned to each of the 4 clusters:
Loc 1 2 3 4
a ... ... ... ...
b ... ... ... ...
c ... ... ... ...
...
The numbers above would be expressed as percentages. I also want to add a column that shows the original number of observations in each Loc, so the final data frame would look like this:
Loc 1 2 3 4 total
a ... ... ... ... ...
b ... ... ... ... ...
c ... ... ... ... ...
...
What is the best way to go about producing this?
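One possible way to get this layout is sketched below in base R only. This is a sketch, not necessarily the best approach: it rebuilds the example data so that it runs on its own, and the names `counts`, `pct`, and `result` are illustrative and not part of the original question.

```r
set.seed(123)
data <- data.frame(Loc = rep(letters[1:20], each = 5),
                   ID = 1:200,
                   cluster = sample(4, 200, replace = TRUE))

# Count IDs per Loc (rows) and cluster (columns).
# factor(..., levels = 1:4) keeps all four columns even if a cluster is empty.
counts <- table(data$Loc, factor(data$cluster, levels = 1:4))

# Row-wise proportions, scaled to percentages (each row sums to 100).
pct <- prop.table(counts, margin = 1) * 100

# as.data.frame.matrix() keeps the wide Loc-by-cluster layout;
# check.names = FALSE keeps the column names "1".."4" instead of "X1".."X4".
result <- data.frame(Loc = rownames(pct),
                     as.data.frame.matrix(pct),
                     total = as.vector(rowSums(counts)),
                     check.names = FALSE)
rownames(result) <- NULL

head(result)
```

Note that in the example data `Loc` has 100 values and is recycled over the 200 IDs, so every `total` is 10 rather than 5.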
Question from: https://stackoverflow.com/questions/65905171/how-to-summarize-percent-cluster-assignment-by-groups