Cfrgtkky

Question

I have a matrix:

m <- matrix(c(

  1,    1,    1,    0,    0,    0,

  0,    0,    0,    0,    0,    0,

  3,    0,    0,    0,    0,    0,

  3,    0,    0,    0,    0,    2,

  3,    0,    0,    0,    0,    0,

  3,    0,    0,    0,    2,    2),

  ncol = 6, byrow = TRUE)



     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    2 # <- island 3, value 2

[5,]    3    0    0    0    0    0

[6,]    3    0    0    0    2    2 # <- island  4, also value 2

In this matrix, there are four 'islands', i.e. non-zero values separated by zeros:

(1) an island composed of three 1's, (2) four 3's, (3) one 2, and (4) two 2's.

Thus, two islands are composed of the value 2. I want to identify such 'duplicate' islands and change the values of one of the 'islands' (either will do) to the next available number (4 in this case):

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    2

[5,]    3    0    0    0    0    0

[6,]    3    0    0    0    4    4

Not clear (I see 3 'islands', -- why is the next value 4?, ...). Please explain it more and give an expected output as well. Also make sure your examples are reproducible — Nov 20 '18 at 10:47
There are 4 islands: one made up of 1's, one made up of 3's and 2 made up of 2's. 4 is a number that hasn't already been taken and is one more than the maximum number already used. — Nov 20 '18 at 10:49
As I said in the original question, change either of the islands labelled '2' to '4' but not both. — Nov 20 '18 at 10:52

score 2 · Accepted Answer · 2018-11-20 14:51:24Z

Fun question! Let's take a more involved case

(M <- matrix(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 3, 0, 2, 

               0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 2, 0, 2), 6, 6))

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    1

# [2,]    0    0    0    0    0    0

# [3,]    3    0    3    3    0    0

# [4,]    3    0    0    0    0    2

# [5,]    3    0    2    0    0    0

# [6,]    3    0    0    0    2    2

Here's a graph-based solution.

library(igraph)

# Indices of nonzero matrix elements

idx <- which(M != 0, arr.ind = TRUE)

# Adjacency matrix for matrix entries

# Two entries are adjacent if their column or row number differs by one

# Also, due to idx, an implicit condition is also that the two entries are the same

adj <- 1 * (as.matrix(dist(idx, method = "manhattan")) == 1)

# Creating loops as to take into account singleton islands

diag(adj) <- 1

# A corresponding graphs

g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# Connected components of this graph

cmps <- clusters(g)

# Going over unique values of M

for(i in 1:max(M)) {

  # Islands of value i

  un <- unique(cmps$membership[M[idx] == i])

  # More than one island?

  if(length(un) > 1)

    # If so, let's go over islands 2, 3, ...

    for(cmp in un[-1])

      # ... and replace corresponding matrix entries by max(M) + 1

      M[idx[cmps$membership == cmp, , drop = FALSE]] <- max(M) + 1

}



M

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    4

# [2,]    0    0    0    0    0    0

# [3,]    3    0    7    7    0    0

# [4,]    3    0    0    0    0    6

# [5,]    3    0    2    0    0    0

# [6,]    3    0    0    0    5    5

Also note that using adj alone we could find all the islands if we could find its permutation leading to a block-diagonal matrix with the maximal number of blocks. Then each block would correspond to an island. However, I couldn't find an R implementation of a relevant procedure.

I was trying to use a depth first search, this is perfect.

– user3651829
Nov 20 '18 at 13:22 — Nov 20 '18 at 13:22

score 1 · Accepted Answer · 2018-11-20 19:47:43Z

'Islands' of non-zero values can be identified by raster::clump*. Then use data.table convenience functions to identify which values should be updated.

library(raster)

library(data.table)



# get index of non-zero values. re-order to match the clump order

ix <- which(m != 0, arr.ind = TRUE)

ix <- ix[order(ix[ , "row"]), ]



# get clumps

cl <- clump(raster(m))

cl_ix <- cl@data@values



# put stuff in a data.table and order by x

d <- data.table(ix, x = m[ix], cl_ix = cl_ix[!is.na(cl_ix)])

setorder(d, x, cl_ix)



# for each x, create a counter of runs of clump index

d[ , g := rleid(cl_ix), by = x]



# for 'duplicated' runs...

# ...add to x based on runs of x and clump index runs

d[g > 1, x := max(d$x) + rleid(x, g)]



# update matrix

m2 <- m

m2[as.matrix(d[ , .(row, col)])] <- d$x



m

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    1

# [2,]    0    0    0    0    0    0

# [3,]    3    0    3    3    0    0

# [4,]    3    0    0    0    0    2

# [5,]    3    0    2    0    0    0

# [6,]    3    0    0    0    2    2



m2

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    4

# [2,]    0    0    0    0    0    0

# [3,]    3    0    7    7    0    0

# [4,]    3    0    0    0    0    2

# [5,]    3    0    5    0    0    0

# [6,]    3    0    0    0    6    6

*Note that the clump function requires that the igraph package is available.

score 0 · Accepted Answer · 2018-11-20 11:20:58Z

It was harder than I thought becase of the "not both" condition, I achieved the result with a while loop for now, we'll se if it can be improved:

(basically we move by row and check if the island is found, if so we end our research)

# some useful variables

i=1 # row counter

counter=0 # check if island is found

max_m <- max(m) #finds the max value in the matrix, to fill



while(counter == 0) {



  if (any(m[i, ] == 2)) { # check if we find the island in the row, otherwise skip

    row <- m[i, ]

    row[row == 2] <- max_m + 1 # here we change the value

    m[i, ] <- row

    counter <- counter + 1

  }



  i = i + 1 # we move up one row

  #cat("row number: ", i, "n") # sanity check to see if it was an infinite loop

}

m

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    0

# [2,]    0    0    0    0    0    0

# [3,]    3    0    0    0    0    0

# [4,]    3    0    0    0    0    4

# [5,]    3    0    0    0    0    0

# [6,]    3    0    0    0    2    2

This is far from perfect, because we move by rows, so if the first island is across a column we would change only the first value.

Example of unexpected result:

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    0

# [2,]    0    0    0    0    0    0

# [3,]    3    0    0    0    0    0

# [4,]    3    0    0    0    0    4

# [5,]    3    0    0    0    0    2 # problem here

# [6,]    3    0    0    0    0    0

Data used:

m <- matrix(c(rep(1, 3),

              rep(0, 9),

              3, 

              rep(0, 5),

              3,

              rep(0, 4),

              2,

              3,

              rep(0, 5),

              3,

              rep(0,3),

              rep(2, 2)),ncol=6,nrow=6, byrow = T)

score 0 · Accepted Answer · 2018-11-20 13:20:19Z

This can easily be achieved with package TraMineR.

islander <- function(mat) {

  require(TraMineR)

  rows.mat.seq <- seqdef(mat)  # seeks all sequences in rows 

  cols.mat.seq <- seqdef(t(mat))  # tranposed version

  rows <- seqpm(rows.mat.seq, 22)$MIndex  # seeks for sub sequence 2-2 in rows

  cols <- seqpm(cols.mat.seq, 22)$MIndex  # seeks for sub sequence 2-2 in columns

  if (length(cols) == 0) {  # the row case

    mat[rows, which(mat[rows, ] == 2)] <- 4

    return(mat)

  } else {  # the column case

    mat[which(mat[, cols] == 2), cols] <- 4

    return(mat)

  }

}

Yielding

> islander(row.mat)

...

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    2

[5,]    3    0    0    0    0    0

[6,]    3    0    0    0    4    4



> islander(col.mat)

...

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    0

[5,]    3    0    0    0    0    4

[6,]    3    0    0    2    0    4

Note: If your island is longer, you need to adept the code, e.g. for length of island is 3 do seqpm(., 222). It is certainly possible to implement the consideration of all cases into the function.

Data

row.mat <- structure(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 

                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 0, 2), .Dim = c(6L, 

                                                                                      6L))

col.mat <- structure(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 

                    0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2), .Dim = c(6L, 

                                                                                       6L))



> row.mat

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    2

[5,]    3    0    0    0    0    0

[6,]    3    0    0    0    2    2

> col.mat

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    0

[5,]    3    0    0    0    0    2

[6,]    3    0    0    2    0    2

score 2 · Accepted Answer · 2018-11-20 14:51:24Z

Fun question! Let's take a more involved case

(M <- matrix(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 3, 0, 2, 

               0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 2, 0, 2), 6, 6))

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    1

# [2,]    0    0    0    0    0    0

# [3,]    3    0    3    3    0    0

# [4,]    3    0    0    0    0    2

# [5,]    3    0    2    0    0    0

# [6,]    3    0    0    0    2    2

Here's a graph-based solution.

library(igraph)

# Indices of nonzero matrix elements

idx <- which(M != 0, arr.ind = TRUE)

# Adjacency matrix for matrix entries

# Two entries are adjacent if their column or row number differs by one

# Also, due to idx, an implicit condition is also that the two entries are the same

adj <- 1 * (as.matrix(dist(idx, method = "manhattan")) == 1)

# Creating loops as to take into account singleton islands

diag(adj) <- 1

# A corresponding graphs

g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# Connected components of this graph

cmps <- clusters(g)

# Going over unique values of M

for(i in 1:max(M)) {

  # Islands of value i

  un <- unique(cmps$membership[M[idx] == i])

  # More than one island?

  if(length(un) > 1)

    # If so, let's go over islands 2, 3, ...

    for(cmp in un[-1])

      # ... and replace corresponding matrix entries by max(M) + 1

      M[idx[cmps$membership == cmp, , drop = FALSE]] <- max(M) + 1

}



M

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    4

# [2,]    0    0    0    0    0    0

# [3,]    3    0    7    7    0    0

# [4,]    3    0    0    0    0    6

# [5,]    3    0    2    0    0    0

# [6,]    3    0    0    0    5    5

Also note that using adj alone we could find all the islands if we could find its permutation leading to a block-diagonal matrix with the maximal number of blocks. Then each block would correspond to an island. However, I couldn't find an R implementation of a relevant procedure.

I was trying to use a depth first search, this is perfect.

– user3651829
Nov 20 '18 at 13:22 — Nov 20 '18 at 13:22

score 1 · Accepted Answer · 2018-11-20 19:47:43Z

'Islands' of non-zero values can be identified by raster::clump*. Then use data.table convenience functions to identify which values should be updated.

library(raster)

library(data.table)



# get index of non-zero values. re-order to match the clump order

ix <- which(m != 0, arr.ind = TRUE)

ix <- ix[order(ix[ , "row"]), ]



# get clumps

cl <- clump(raster(m))

cl_ix <- cl@data@values



# put stuff in a data.table and order by x

d <- data.table(ix, x = m[ix], cl_ix = cl_ix[!is.na(cl_ix)])

setorder(d, x, cl_ix)



# for each x, create a counter of runs of clump index

d[ , g := rleid(cl_ix), by = x]



# for 'duplicated' runs...

# ...add to x based on runs of x and clump index runs

d[g > 1, x := max(d$x) + rleid(x, g)]



# update matrix

m2 <- m

m2[as.matrix(d[ , .(row, col)])] <- d$x



m

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    1

# [2,]    0    0    0    0    0    0

# [3,]    3    0    3    3    0    0

# [4,]    3    0    0    0    0    2

# [5,]    3    0    2    0    0    0

# [6,]    3    0    0    0    2    2



m2

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    4

# [2,]    0    0    0    0    0    0

# [3,]    3    0    7    7    0    0

# [4,]    3    0    0    0    0    2

# [5,]    3    0    5    0    0    0

# [6,]    3    0    0    0    6    6

*Note that the clump function requires that the igraph package is available.

score 0 · Accepted Answer · 2018-11-20 11:20:58Z

It was harder than I thought becase of the "not both" condition, I achieved the result with a while loop for now, we'll se if it can be improved:

(basically we move by row and check if the island is found, if so we end our research)

# some useful variables

i=1 # row counter

counter=0 # check if island is found

max_m <- max(m) #finds the max value in the matrix, to fill



while(counter == 0) {



  if (any(m[i, ] == 2)) { # check if we find the island in the row, otherwise skip

    row <- m[i, ]

    row[row == 2] <- max_m + 1 # here we change the value

    m[i, ] <- row

    counter <- counter + 1

  }



  i = i + 1 # we move up one row

  #cat("row number: ", i, "n") # sanity check to see if it was an infinite loop

}

m

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    0

# [2,]    0    0    0    0    0    0

# [3,]    3    0    0    0    0    0

# [4,]    3    0    0    0    0    4

# [5,]    3    0    0    0    0    0

# [6,]    3    0    0    0    2    2

This is far from perfect, because we move by rows, so if the first island is across a column we would change only the first value.

Example of unexpected result:

#      [,1] [,2] [,3] [,4] [,5] [,6]

# [1,]    1    1    1    0    0    0

# [2,]    0    0    0    0    0    0

# [3,]    3    0    0    0    0    0

# [4,]    3    0    0    0    0    4

# [5,]    3    0    0    0    0    2 # problem here

# [6,]    3    0    0    0    0    0

Data used:

m <- matrix(c(rep(1, 3),

              rep(0, 9),

              3, 

              rep(0, 5),

              3,

              rep(0, 4),

              2,

              3,

              rep(0, 5),

              3,

              rep(0,3),

              rep(2, 2)),ncol=6,nrow=6, byrow = T)

score 0 · Accepted Answer · 2018-11-20 13:20:19Z

This can easily be achieved with package TraMineR.

islander <- function(mat) {

  require(TraMineR)

  rows.mat.seq <- seqdef(mat)  # seeks all sequences in rows 

  cols.mat.seq <- seqdef(t(mat))  # tranposed version

  rows <- seqpm(rows.mat.seq, 22)$MIndex  # seeks for sub sequence 2-2 in rows

  cols <- seqpm(cols.mat.seq, 22)$MIndex  # seeks for sub sequence 2-2 in columns

  if (length(cols) == 0) {  # the row case

    mat[rows, which(mat[rows, ] == 2)] <- 4

    return(mat)

  } else {  # the column case

    mat[which(mat[, cols] == 2), cols] <- 4

    return(mat)

  }

}

Yielding

> islander(row.mat)

...

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    2

[5,]    3    0    0    0    0    0

[6,]    3    0    0    0    4    4



> islander(col.mat)

...

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    0

[5,]    3    0    0    0    0    4

[6,]    3    0    0    2    0    4

Note: If your island is longer, you need to adept the code, e.g. for length of island is 3 do seqpm(., 222). It is certainly possible to implement the consideration of all cases into the function.

Data

row.mat <- structure(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 

                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 0, 2), .Dim = c(6L, 

                                                                                      6L))

col.mat <- structure(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 

                    0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2), .Dim = c(6L, 

                                                                                       6L))



> row.mat

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    2

[5,]    3    0    0    0    0    0

[6,]    3    0    0    0    2    2

> col.mat

     [,1] [,2] [,3] [,4] [,5] [,6]

[1,]    1    1    1    0    0    0

[2,]    0    0    0    0    0    0

[3,]    3    0    0    0    0    0

[4,]    3    0    0    0    0    0

[5,]    3    0    0    0    0    2

[6,]    3    0    0    2    0    2

搜尋此網誌

Cfrgtkky

Identify duplicate islands in a matrix and change their values

4 Answers
4

Your Answer

Post as a guest

4 Answers
4

4 Answers
4

Post as a guest

Popular posts from this blog

mysqli_query(): Empty query in /home/lucindabrummitt/public_html/blog/wp-includes/wp-db.php on line 1924

Multiple Hosts connection in Hyperledger Fabric

How to send String Array data to Server using php in android

Identify duplicate islands in a matrix and change their values

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

4 Answers 4

4 Answers 4

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

mysqli_query(): Empty query in /home/lucindabrummitt/public_html/blog/wp-includes/wp-db.php on line 1924

Multiple Hosts connection in Hyperledger Fabric

How to send String Array data to Server using php in android

4 Answers
4

4 Answers
4

4 Answers
4