Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - All combinations picking one by group

Given are a group indicator variable and some values within groups:

group = rep(c(1,2), each = 3)
val   = letters[1:6]

cbind(group, val)

     group val
[1,] "1"   "a"
[2,] "1"   "b"
[3,] "1"   "c"
[4,] "2"   "d"
[5,] "2"   "e"
[6,] "2"   "f"

I am looking for a matrix giving me all unique combinations that result from combining one element from each group with one element from each other group. That is, only one element per group is allowed to be "active" in each combination.

The desired output is a matrix where each column represents one of the possible combinations. The first four columns of the result matrix may look like this:

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    1
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    1    1    1    0
[5,]    0    0    0    1
[6,]    0    0    0    0

where the rows correspond to the rows of the input matrix above. The first column tells you that a is active in group 1 and d is active in group 2. The second column tells you that b is active in group 1 and d is active in group 2. The third column tells you that c is active in group 1 and d is active in group 2, and so on. Hence, the sum of each column will always be equal to the number of groups, because exactly one element per group is active.

I'm a bit puzzled as to how to obtain the desired output matrix in an organized fashion. I've been thinking of enumerating all possible combinations and restricting to feasible ones (where the sum of the resulting vector elements within groups is exactly equal to one for all groups), but this may cause memory problems for large data sets and I am unsure whether there is a more elegant and efficient approach.
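To put rough numbers on the memory concern, here is a small sketch for the example data. The names `n_candidates` and `n_feasible` are illustrative only. It compares the number of 0/1 vectors a brute-force enumeration would have to generate with the number of feasible combinations, which is the product of the group sizes.

```r
group <- rep(c(1, 2), each = 3)

# Brute force: every 0/1 vector of length 6 is a candidate
n_candidates <- 2^length(group)      # 64

# Feasible: exactly one pick per group, so multiply the group sizes
group_sizes <- as.vector(table(group))
n_feasible <- prod(group_sizes)      # 3 * 3 = 9
```

The gap widens quickly: the candidate count doubles with every additional row, while the feasible count grows only with the product of the group sizes.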

Edit: The solution should generalize to an arbitrary number of groups and an arbitrary number of elements (>1) within groups.


1 Answer


I think this should scale as needed:

group = rep(c(1,2), each = 3)
val   = letters[1:6]
input = data.frame(group, val)

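# One column per group; each row of combos is one combination of values
combos = do.call(expand.grid, split(input$val, input$group))

# Rows follow the input rows, columns follow the combinations
combo_matrix = matrix(0, nrow = nrow(input), ncol = nrow(combos))
for(i in seq_along(combos)) {
  # For group i, mark the input row of the chosen value in every combination
  combo_matrix[cbind(match(combos[[i]], input$val), seq_len(ncol(combo_matrix)))] = 1
}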

combo_matrix
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,]    1    0    0    1    0    0    1    0    0
# [2,]    0    1    0    0    1    0    0    1    0
# [3,]    0    0    1    0    0    1    0    0    1
# [4,]    1    1    1    0    0    0    0    0    0
# [5,]    0    0    0    1    1    1    0    0    0
# [6,]    0    0    0    0    0    0    1    1    1

It does assume that the values in val are not repeated in the input data frame, because match() returns only the first matching row for each value.
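If repeated values are a possibility, one way around the uniqueness assumption is to enumerate row indices rather than values. The following is a minimal sketch of that idea, not part of the original answer; the function name `one_per_group_matrix` is hypothetical.

```r
# Hypothetical helper: builds the 0/1 matrix from the group vector alone,
# so duplicated entries in val do not matter.
one_per_group_matrix <- function(group) {
  # Row indices of the input, split by group
  rows_by_group <- split(seq_along(group), group)
  # Each row of combos holds one row index per group
  combos <- expand.grid(rows_by_group)
  out <- matrix(0, nrow = length(group), ncol = nrow(combos))
  for (g in seq_along(combos)) {
    # Mark the row chosen for group g in every combination (column)
    out[cbind(combos[[g]], seq_len(nrow(combos)))] <- 1
  }
  out
}

group <- rep(c(1, 2), each = 3)
m <- one_per_group_matrix(group)
dim(m)               # 6 9
all(colSums(m) == 2) # TRUE: one active row per group in every column
```

For the example input this should give the same matrix as above, since expand.grid varies the first group fastest in both versions. The number of columns is still the product of the group sizes, so the result grows quickly as groups are added.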
