Efficient determination of the k most vital edges for the minimum spanning tree problem

We study in this paper the problem of finding in a graph a subset of k edges whose deletion causes the largest increase in the weight of a minimum spanning tree. We propose for this problem an explicit enumeration algorithm whose complexity, when compared to the current best algorithm, is better for general k but very slightly worse for fixed k. More interestingly, unlike the previous algorithms, our algorithm can easily be adapted into an implicit exploration algorithm based on a branch and bound scheme. We also propose a mixed integer programming formulation for this problem. Computational results show a clear superiority of the implicit enumeration algorithm over both the explicit enumeration algorithm and the mixed integer program.


Introduction
In many applications involving the use of communication or transportation networks, we often need to identify critical infrastructures. By critical infrastructure we mean a set of links whose damage causes the largest perturbation within the network. Modeling this network by a weighted graph, identifying critical infrastructures amounts to finding a subset of edges whose removal from the graph causes the largest increase in the cost. In the literature this problem is referred to as the k most vital edges problem. In this paper, we are interested in determining a subset of edges of the graph whose deletion causes the largest increase in the weight of a minimum spanning tree (MST). This problem is referred to as k Most Vital Edges MST.
The problem of finding the k most vital edges of a graph has been studied for various problems including shortest path [1,7,11] and maximum flow [18,14,19]. For the minimum spanning tree problem defined on a graph G with n vertices and m edges, Frederickson et al. [4] showed that, for general k, k Most Vital Edges MST is NP-hard and proposed an O(log k)-approximation algorithm. For a fixed k the problem is obviously polynomial. The case k = 1 has been largely studied in the literature [5,6,16]. Hsu et al. [5] gave two algorithms running in $O(m \log m)$ and $O(n^2)$ time. Iwano and Katoh [6] proposed an algorithm in $O(m\,\alpha(m, n))$ using Tarjan's result [17], where α is the inverse Ackermann function. Pettie [12] improved the results of Tarjan [17] and Dixon et al. [3], and therefore the current best deterministic algorithm for solving the case k = 1 runs in $O(m \log \alpha(m, n))$. Several exact algorithms based on an explicit enumeration of possible solutions have been proposed [8,9,15]. The best one [8] runs in time $O(n^k \alpha((k + 1)(n - 1), n))$ and was achieved by reducing G to a sparse graph. Using Pettie's result [12], the running time of the latter algorithm becomes $O(n^k \log \alpha((k + 1)(n - 1), n))$.
In this paper we propose a new efficient algorithm, also based on an explicit enumeration of all possible solutions, for k Most Vital Edges MST. Its complexity $O(n^k \log \alpha(2(n - 1), n))$ for fixed k is theoretically very slightly worse than the complexity of the algorithm proposed by Liang [8] using Pettie's result [12]. However, given that α(m, n) is always less than 4 in practice, the complexities of these two algorithms can be deemed equivalent. Moreover, the complexity of our algorithm is better than that of Liang's algorithm for general k. More interestingly, unlike any other algorithm, our algorithm has two specific useful features. First, it can also determine an optimal solution for i Most Vital Edges MST, for each 1 ≤ i ≤ k, with the same time complexity. Second, it can be easily adapted to establish an implicit enumeration algorithm based on a branch and bound procedure. We also present in this paper a mixed integer programming formulation to solve k Most Vital Edges MST. We implement and test all these proposed algorithms using, for the implicit enumeration algorithm, different branching and evaluation strategies. The results show that the implicit enumeration algorithm is much faster than both the explicit enumeration algorithm and the resolution of the mixed integer program, and that its lower memory usage lets it handle instances of significantly larger size. Moreover, we propose an ε-approximate algorithm.
The rest of the paper is organized as follows. In Section 2 we introduce notations and some results related to our problem. In Section 3 we present a new explicit enumeration algorithm that solves k Most Vital Edges MST. In Section 4 we propose another exact algorithm based on an implicit enumeration scheme. In Section 5, we present a mixed integer programming formulation for k Most Vital Edges MST. Computational results are presented in Section 6. In Section 7, we present an ε-approximate algorithm and compare it with the exact one. Conclusions are provided in Section 8.

Basic concepts and preliminary results
Let G = (V, E) be a weighted undirected connected graph with |V| = n and |E| = m, where w(e) ≥ 0 is the integer weight of each edge e ∈ E. We denote by G − E′ the graph obtained from G by removing the subset of edges E′ ⊆ E. k Most Vital Edges MST consists of finding a subset of edges S* ⊆ E with |S*| = k that maximizes the weight of a MST in the graph G − S*. We assume that G is at least (k + 1)-edge-connected, since otherwise any selection of k edges including the edges of a minimum unweighted cut is a trivial solution. Therefore, we assume k ≤ λ(G) − 1, where λ(G) is the edge-connectivity of G. Also, without loss of generality, we suppose in the following that all weights are different (by introducing, if necessary, an arbitrary total order on edges with the same weight). This assumption implies the uniqueness of minimum spanning trees or forests. For a graph that is not necessarily connected, a minimum spanning forest (MSF) is the union of the minimum spanning trees of its connected components. In this paper a tree or a forest is considered as a graph but also, for convenience, as a subset of edges. For a set of edges F, w(F) represents the sum of the weights of the edges in F.
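To make the problem statement concrete, the following minimal sketch solves k Most Vital Edges MST by trying every k-subset of edges and recomputing a MST with Kruskal's algorithm. It is only a naive baseline for tiny graphs, not one of the algorithms of this paper; the names `mst_weight`, `k_most_vital_brute_force` and the example graph `K4` are illustrative assumptions.

```python
from itertools import combinations


def mst_weight(n, edges, removed=frozenset()):
    """Kruskal's algorithm on vertices 0..n-1 with edges given as (u, v, w).

    Edges whose index is in `removed` are ignored. Returns the MST weight,
    or None if the remaining graph is disconnected.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, used = 0, 0
    # Sorting by (weight, index) imposes a total order on equal-weight edges.
    for idx in sorted(range(len(edges)), key=lambda i: (edges[i][2], i)):
        if idx in removed:
            continue
        u, v, w = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
            used += 1
    return total if used == n - 1 else None


def k_most_vital_brute_force(n, edges, k):
    """Return (best MST weight, edge indices removed) over all k-subsets."""
    best_weight, best_subset = None, None
    for subset in combinations(range(len(edges)), k):
        w = mst_weight(n, edges, frozenset(subset))
        # Subsets that disconnect the graph are skipped (assumed excluded
        # by the (k + 1)-edge-connectivity hypothesis).
        if w is not None and (best_weight is None or w > best_weight):
            best_weight, best_subset = w, subset
    return best_weight, best_subset


# Hypothetical example: the complete graph on 4 vertices, distinct weights.
K4 = [(0, 1, 1), (0, 2, 2), (0, 3, 3), (1, 2, 4), (1, 3, 5), (2, 3, 6)]
print(k_most_vital_brute_force(4, K4, 2))
```

The number of subsets grows as the binomial coefficient of m and k, which is why sparsification and enumeration restricted to tree edges matter.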
We denote by $T_0$ the MST of G. Note that an optimal solution of k Most Vital Edges MST must contain at least one edge of $T_0$. For i ≥ 1, let $T_i$ be the MSF of the graph $G_i = G - \bigcup_{j=0}^{i-1} T_j$. We use in the following the graph $U_G^k = (V, \bigcup_{j=0}^{k} T_j)$, which has the following interesting property.
Lemma 1 (Liang and Shen [9]). For any S ⊆ E with |S| ≤ k, any edge of the MST of the graph G − S belongs to $U_G^k$.
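The successive forests $T_0, \ldots, T_k$ can be obtained by repeatedly extracting a minimum spanning forest from the edges not yet used. The sketch below does this with plain Kruskal runs; the function names and the example graph are assumptions made for illustration, and no attempt is made to match the running time of the MST algorithms cited in the paper.

```python
def minimum_spanning_forest(n, edges, available):
    """Kruskal restricted to the edge indices in `available`.

    Returns the set of edge indices of the minimum spanning forest.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = set()
    for idx in sorted(available, key=lambda i: (edges[i][2], i)):
        u, v, _ = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.add(idx)
    return forest


def successive_forests(n, edges, k):
    """Return [T_0, ..., T_k]; T_i is the MSF of G minus T_0, ..., T_{i-1}."""
    available = set(range(len(edges)))
    forests = []
    for _ in range(k + 1):
        forest = minimum_spanning_forest(n, edges, available)
        forests.append(forest)
        available -= forest  # edges already used are not reused
    return forests


# Hypothetical example: complete graph on 4 vertices, edges as (u, v, w).
K4 = [(0, 1, 1), (0, 2, 2), (0, 3, 3), (1, 2, 4), (1, 3, 5), (2, 3, 6)]
forests = successive_forests(4, K4, 1)
sparse_edges = set().union(*forests)  # edge set of the sparse union graph
print(forests, sparse_edges)
```

Each forest has at most n − 1 edges, so the union holds at most (k + 1)(n − 1) edges regardless of m.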
By Lemma 1, solving k Most Vital Edges MST on G reduces to solving the same problem on the sparser graph $U_G^k$, whose number of edges is at most (k + 1)(n − 1), since it is the union of k + 1 forests.
Considering T a MST of a graph, the replacement edge r(e) for an edge e ∈ T is defined as the edge e′ ≠ e of minimum weight which connects the two disconnected components of T \ {e}. The sensitivity of a minimum spanning tree T, i.e. the allowable variation for each edge weight so that T remains a minimum spanning tree, can be computed in $O(m \log \alpha(m, n))$ [12]. In particular, for edges in T, this algorithm provides replacement edges. As a consequence, we get the result stated in Lemma 2.
Proof of Lemma 2: Let T* be a minimum spanning tree in a given graph. We calculate the replacement edges r(e) for all edges e ∈ T*. The most vital edge is the edge e* such that $w(r(e^*)) - w(e^*) = \max_{e \in T^*} (w(r(e)) - w(e))$. □
Actually, replacement edges belong to a specific subset of edges, as shown by the following result.
Lemma 3. For each edge e ∈ $T_i$, we have r(e) ∈ $T_{i+1}$ for i = 0, ..., k − 1.
Proof: Given a graph G, Liang [8] shows that for each edge e ∈ $T_0$, r(e) ∈ $T_1$. Applying this to the graph $G_i$, for which $T_i$ is the MSF, we get the result. □

An explicit enumeration algorithm for finding the k most vital edges
We propose an algorithm that constructs a tree search of depth k − 1 in a breadth-first mode. At the i-th level of this tree search, i = 0, ..., k − 1, a node s is characterized by:
• mv(s): a subset of i edges, corresponding to a tentative partial selection of the k most vital edges.
• U(s): the graph $(V, \bigcup_{j=0}^{k-i} T_j(s))$.
• mst(s): a subset of edges forbidden to deletion. These edges, belonging to $T_0(s)$, will necessarily belong to any MST associated with any descendant of s. Depending on the position of s in the tree search, the cardinality of mst(s) varies from 0 to n − 2.
Denote by $N_i$, for i = 0, ..., k − 1, the set of nodes of the tree search at the i-th level. We describe in the following the exact algorithm.
We first construct the graph $U_G^k$. Let a be the root of the search tree with mv(a) = mst(a) = ∅, $U(a) = U_G^k$, $w(T_0(a)) = w(T_0)$, and $N_0 = \{a\}$. For a level i, 0 ≤ i ≤ k − 2, we compute for each node s ∈ $N_i$ and each edge e ∈ $T_0(s)$ the replacement edge r(e) in $T_1(s)$. Node s gives rise to $|T_0(s) \setminus \mathrm{mst}(s)|$ children in $N_{i+1}$. Each such child d, corresponding to an edge $e_j$ in $T_0(s) \setminus \mathrm{mst}(s) = \{e_1, \ldots, e_{n-1-|\mathrm{mst}(s)|}\}$, is characterized by mv(d), mst(d) and U(d), where U(d) is updated from U(s) as follows (using Lemma 3):
• $T_0(d) = T_0(s) \cup \{r(e_j)\} \setminus \{e_j\}$ and hence $w(T_0(d)) = w(T_0(s)) - w(e_j) + w(r(e_j))$.
At level k − 1, for each node s ∈ $N_{k-1}$ and for all edges e ∈ $T_0(s) \setminus \mathrm{mst}(s)$, we find r(e) in $T_1(s)$ and we determine a node s* that attains the maximum of $w(T_0(s)) - w(e) + w(r(e))$ over all such nodes and edges. Algorithm 1 describes this procedure. Its correctness and complexity are given in Theorem 1.
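The enumeration described above can be sketched as follows, under simplifying assumptions that are not part of the paper's Algorithm 1: each node recomputes its MST from scratch with Kruskal instead of updating it through replacement edges, the search runs k deletion levels and evaluates the leaves directly, and the update rule for mst(d) (protect the sibling edges branched on earlier) is an assumed convention. All function names and the example graph are illustrative.

```python
def kruskal(n, edges, removed):
    """MST of the graph without the edge indices in `removed`.

    Returns (weight, tree) with tree edge indices in increasing weight
    order, or None if the remaining graph is disconnected.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    weight, tree = 0, []
    for idx in sorted(range(len(edges)), key=lambda i: (edges[i][2], i)):
        if idx in removed:
            continue
        u, v, w = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            weight += w
            tree.append(idx)
    return (weight, tree) if len(tree) == n - 1 else None


def explicit_enumeration(n, edges, k):
    """Breadth-first tree search over nodes (mv, mst); returns (weight, mv)."""
    level = [(frozenset(), frozenset())]  # root: nothing deleted or protected
    for _ in range(k):
        next_level = []
        for mv, mst in level:
            result = kruskal(n, edges, mv)
            if result is None:
                continue  # assumed not to occur when G is (k+1)-edge-connected
            protected = set(mst)
            for e in result[1]:
                if e in mst:
                    continue  # protected edges are never deleted
                next_level.append((mv | {e}, frozenset(protected)))
                protected.add(e)  # later siblings must keep this edge
        level = next_level
    best = None
    for mv, _ in level:
        result = kruskal(n, edges, mv)
        if result is not None and (best is None or result[0] > best[0]):
            best = (result[0], tuple(sorted(mv)))
    return best


# Hypothetical example: complete graph on 4 vertices, edges as (u, v, w).
K4 = [(0, 1, 1), (0, 2, 2), (0, 3, 3), (1, 2, 4), (1, 3, 5), (2, 3, 6)]
print(explicit_enumeration(4, K4, 2))
```

Protecting earlier sibling edges keeps each candidate subset from being generated twice, which is what bounds the size of the tree search.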
Theorem 1. Algorithm 1 computes an optimal solution for an instance of k Most Vital Edges MST with n vertices and m edges in … k − S′. Let r be a node of the tree search such that mv(r) ⊆ S′ and, for any child d of r, mv(d) ⊄ S′. Clearly, r exists and corresponds at worst to the root a when S′ ∩ $T_0$ = ∅. Since, by definition, r is such that no edge of $T_0(r)$ belongs to S′, we have $w' = w(T_0(r))$. Moreover, since $w(T_0(r)) \le w^*$, we have w′ ≤ w*.
We now compute the complexity of Algorithm 1. The construction of $U_G^k$ requires $O(k\,m\,\alpha(m, n))$ time, using k times the best current algorithms for MST [2,13]. Denote by $t_u$ the time for constructing $U_G^k$, by $t_{edge-rep}$ the time for finding the replacement edges for all edges of a minimum spanning tree, and by $t_{gen}$ the time for generating any node s of the tree search (that is, determining mv(s), mst(s) and … At level k, we compute the k most vital edges. Thus, the total time of Algorithm 1 is given by …
For each node s ∈ $N_i$, the subset mv(s) consists of ℓ tree edges of $T_0(a)$ and (i − ℓ) edges belonging to the union set of the (i − ℓ) replacement edges of these ℓ edges, 1 ≤ ℓ ≤ i (the p replacement edges of an edge e ∈ $T_0(a)$ are the p edges of minimum weight which connect the two disconnected components of $T_0(a) \setminus \{e\}$). This implies that … is the number of combinations with repetition of p elements chosen from a set of n elements.
For a node s … Since the replacement edges of a MST in a graph with n vertices and m edges can be computed in $O(m \log \alpha(m, n))$ [12], … Note that the time needed to generate all the nodes of the tree search is dominated by the total time to find, for all nodes s of the tree search, the replacement edges r(e) in $T_1(s)$ for all edges e ∈ $T_0(s)$. □
Remark 1. For each node s of the tree search, we could use, instead of the graph U(s), the graph … where G′′(s) is the graph obtained from G by contracting the edges of mst(s) and removing the edges of mv(s). Unfortunately, given a child d of a node s of the tree search, efficiently updating U(d) from U(s) is not as straightforward as for Ũ. However, even if updating U could be performed more efficiently than Ũ, we would get the same complexity, since the time for generating all nodes of the tree search is dominated by the total time for finding the replacement edges for all nodes in the tree search.
Discussion. For fixed k, by using the result of Dixon et al. [3], Liang [8] proposes an algorithm to solve k Most Vital Edges MST in $O(n^k \alpha((k + 1)(n - 1), n))$ time. Using Pettie's result [12], Liang's algorithm can be implemented in $O(t_u + n^k \log \alpha((k + 1)(n - 1), n))$ time, where $t_u$ is the time for constructing $U_G^k$. Our algorithm has a complexity that is theoretically slightly worse than that of Liang. Nevertheless, since α(m, n) is always less than or equal to 4 in practice, the complexities of these two algorithms can be considered equivalent. Moreover, the advantage of our algorithm is that it determines, with the same time complexity, an optimal solution for i Most Vital Edges MST, for 1 ≤ i ≤ k. Indeed, at each level i, we can find among the nodes of $N_i$ the node with the largest weight of a MST.
For general k, our bound is clearly better than that of Liang. Indeed, in Liang's algorithm, after the determination of $U_G^k$, Liang divides the problem into two cases: … where S* represents a subset of k most vital edges. In (i), for every possible combination of i edges among the n − 1 edges of $T_0$, 1 ≤ i < k, the author constructs a specific graph G with a number of nodes and edges depending only on k, and determines the k − i remaining edges in G. In (ii), from every possible choice of (k − 1) edges among the n − 1 edges of $T_0$, the author constructs a MST T′ in the graph obtained by deleting these (k − 1) edges and finds the k-th edge to be removed by using the replacement edges of T′. Therefore, (i) and (ii) are performed respectively in $\sum_{i=1}^{k-1} \binom{n-1}{i} (t_G + t_{k-i})$ and $\binom{n-1}{k-1} t_{last}$ time, where $t_G$, $t_{k-i}$ and $t_{last}$ are respectively the time to construct G, the time to determine the k − i remaining edges to be removed from G, and the time to find the k-th edge to be removed from T′ ∩ $T_0$. Note that Liang, who considers only the case where k is fixed, does not need to make the term involving $t_{k-i}$ explicit. However, for general k, even when the complexity of his algorithm is expressed as above, one can observe that it is larger than the complexity of our proposed algorithm, which remains in $O(t_u + n^k \log \alpha(2(n - 1), n))$ time.
The other exact algorithms proposed in the literature [9,15] have a worse complexity than our algorithm, both for fixed and general k.

An implicit enumeration algorithm for finding the k most vital edges
An interesting feature of our explicit enumeration algorithm is that, unlike the algorithms previously proposed, it can easily be adapted to design an implicit algorithm based on a branch and bound scheme. To do this, we use for each node s an upper bound UB(s) based on successive replacements of edges. We also use lower bounds LB(s) constructed by extending the forest corresponding to s to a particular minimum spanning tree.
In order to obtain the best possible bounds, we construct U(s) for each node s, instead of using Ũ(s). For each child d of s, U(d) is determined by constructing …

Lower bounds
For a fixed node s of the tree search, k − |mv(s)| edges remain to be deleted from U (s).We present different ways of determining these remaining edges giving rise to three possible lower bounds.
1. $LB_{greedy}(s)$: Given $T_0(s)$, we compute $r(e_j)$ for all $e_j \in T_0(s)$. We delete the edge $e_j^*$ which realizes $\max_{e_j \in T_0(s) \setminus \mathrm{mst}(s)} (w(r(e_j)) - w(e_j))$ and replace it by $r(e_j^*)$. We update U(s) and repeat the process until we have removed k − |mv(s)| edges. The value of this bound is the weight of the last MST obtained.
2. $LB_{first}(s)$: We remove the k − |mv(s)| edges of $T_0(s) \setminus \mathrm{mst}(s)$ having the smallest weight, and we construct a MST from the remaining edges in $T_0(s)$. The weight of the MST obtained is the value of this bound.
3. $LB_{best}(s)$: Given $T_0(s)$, we compute $r(e_j)$ for all $e_j \in T_0(s)$. We remove the k − |mv(s)| edges in $T_0(s) \setminus \mathrm{mst}(s)$ for which the difference between the weight of the replacement edge and the weight of the edge is the largest, and we construct a MST from the remaining edges in $T_0(s)$. The value of this bound is the weight of the MST obtained.
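The three lower bounds can be sketched at the root of the tree search, where mv and mst are both empty. This is an illustrative simplification rather than the paper's implementation: the difference w(r(e)) − w(e) is obtained naively by recomputing a MST without e, the graph is assumed to stay connected after every deletion, and all names are hypothetical.

```python
def kruskal(n, edges, removed):
    """Return (weight, tree) of the MST without `removed`, or None if the
    remaining graph is disconnected; tree is sorted by increasing weight."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    weight, tree = 0, []
    for idx in sorted(range(len(edges)), key=lambda i: (edges[i][2], i)):
        if idx in removed:
            continue
        u, v, w = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            weight += w
            tree.append(idx)
    return (weight, tree) if len(tree) == n - 1 else None


def gain(n, edges, removed, e, base):
    """w(r(e)) - w(e): increase of the MST weight when e is also removed.
    Assumes the graph stays connected after the deletion."""
    return kruskal(n, edges, removed | {e})[0] - base


def lb_first(n, edges, k):
    """Delete the k lightest edges of T_0, then recompute a MST."""
    _, tree = kruskal(n, edges, frozenset())
    return kruskal(n, edges, frozenset(tree[:k]))[0]


def lb_best(n, edges, k):
    """Delete the k edges of T_0 with the largest w(r(e)) - w(e)."""
    base, tree = kruskal(n, edges, frozenset())
    ranked = sorted(tree, key=lambda e: gain(n, edges, frozenset(), e, base),
                    reverse=True)
    return kruskal(n, edges, frozenset(ranked[:k]))[0]


def lb_greedy(n, edges, k):
    """Delete, k times, the current MST edge with the largest gain."""
    removed = frozenset()
    for _ in range(k):
        base, tree = kruskal(n, edges, removed)
        e_star = max(tree, key=lambda e: gain(n, edges, removed, e, base))
        removed = removed | {e_star}
    return kruskal(n, edges, removed)[0]


# Hypothetical example: complete graph on 4 vertices, edges as (u, v, w).
K4 = [(0, 1, 1), (0, 2, 2), (0, 3, 3), (1, 2, 4), (1, 3, 5), (2, 3, 6)]
print(lb_first(4, K4, 2), lb_best(4, K4, 2), lb_greedy(4, K4, 2))
```

Each bound is the MST weight after deleting some feasible set of k edges, so it can never exceed the optimal value; the bounds differ only in how cheaply that set is chosen.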
In order to test these bounds, we computed, for instances with different values of n and k, these three lower bounds at the root a of the tree search. The instances are generated as explained in Section 6. Due to space limitations, we give in Table 1 results for two types of instances. We note that there is no dominance between these three bounds. We also note that $LB_{first}$ is the fastest in terms of running time but gives poor values. $LB_{greedy}$, which gives the best values in most cases, takes much more time than the other bounds. $LB_{best}$, which gives values similar to those of $LB_{greedy}$, takes only about twice as much time as $LB_{first}$ and about 40 to 100 times less time than $LB_{greedy}$.
Table 1. Values of the lower and upper bounds at the root of the tree search

Upper bound
Let s be a given node of the tree search. To compute UB(s), we select the edge in $T_1(s)$ of largest weight and we replace the edge deleted from $T_j(s)$ by the edge with largest weight belonging to $T_{j+1}(s)$, for j = 1, ..., k − |mv(s)| − 1. We repeat this process k − |mv(s)| − 1 times.
Let F be the set of the k − |mv(s)| edges selected from $T_1(s)$ in this process. Then, we must determine the k − |mv(s)| edges to remove. To obtain an upper bound for all feasible solutions obtained from s, we delete the k − |mv(s)| edges of smallest weight among the edges of $F \cup T_0(s) \setminus \mathrm{mst}(s)$. Denote by $E_{min}$ the subset of these removed edges. Therefore, $UB(s) = w(T_0(s)) + w(F) - w(E_{min})$.
We computed, for instances with different values of n and k, this upper bound at the root a of the tree search (see Table 1). The main observation is that UB(a) is rather close to the optimal value for small values of k and deteriorates as k increases.

Branching strategy
Let a be the root of the tree search. The branching strategy is the same as for the explicit enumeration algorithm. We start with a feasible solution value corresponding to $\max\{LB_{greedy}(a), LB_{first}(a), LB_{best}(a)\}$. We tested two different best-first search strategies. The first one is the standard strategy (Branching: best upper bound), where the node with the largest upper bound is selected first. No lower bound is computed and the fathoming test is performed only when we update the current best feasible solution value, which can occur only at level k − 1 of the tree search. In the second strategy (Branching: best lower bound), the node with the largest lower bound is selected first. Lower and upper bounds are computed at every node. Since $LB_{best}$ gives values close to the best ones and takes less time, we use this bound for computing a lower bound. Here, the fathoming test is performed at each node by comparing each lower bound value with the current best feasible solution value.

A mixed integer programming formulation for finding the k most vital edges
Consider the graph with edge set $E_u$, and let the digraph with arc set $A_u$ be obtained by replacing each edge (i, j) in $E_u$ by two arcs (i, j) and (j, i) in $A_u$, and let $w_{ij} = w(e)$ for each edge e ∈ $E_u$. In [10], Magnanti and Wolsey present a formulation of the minimum spanning tree problem, called the directed multicommodity flow model. Using this model, we propose a formulation for k Most Vital Edges MST. In this formulation, we consider node 1 as the root of a MST and every node ℓ ≠ 1 defines a commodity. Denote by $f_{ij}^{\ell}$ the flow of commodity ℓ passing through (i, j). Variable $z_{ij}$ is equal to 1 if edge (i, j) is deleted and 0 otherwise. In order to discard this edge from any MST, we assign it the weight $w_{ij} + M_{ij}$, where $M_{ij}$ is a large enough constant, e.g. …
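As a reading aid, a max-min program of the kind described above can be sketched from the textbook form of the directed multicommodity flow model. This is an assumed reconstruction, not necessarily the authors' exact formulation: the variables $x_{ij}$ (edge selected) and $y_{ij}$ (arc selected) are introduced here for illustration, and only $f_{ij}^{\ell}$, $z_{ij}$, $w_{ij}$, $M_{ij}$, $E_u$ and $A_u$ come from the text.

```latex
\begin{align*}
\max_{z}\ \min_{x,\,y,\,f}\quad
  & \sum_{(i,j)\in E_u} (w_{ij} + M_{ij} z_{ij})\, x_{ij} \\
\text{s.t.}\quad
  & \sum_{j:(j,i)\in A_u} f^{\ell}_{ji} - \sum_{j:(i,j)\in A_u} f^{\ell}_{ij} =
    \begin{cases} -1 & \text{if } i = 1 \\ \phantom{-}1 & \text{if } i = \ell \\ \phantom{-}0 & \text{otherwise} \end{cases}
    && \forall i \in V,\ \forall \ell \neq 1 \\
  & f^{\ell}_{ij} \le y_{ij}
    && \forall (i,j)\in A_u,\ \forall \ell \neq 1 \\
  & \sum_{(i,j)\in A_u} y_{ij} = n - 1 \\
  & y_{ij} + y_{ji} = x_{ij}
    && \forall (i,j)\in E_u \\
  & f^{\ell}_{ij} \ge 0,\quad y_{ij} \ge 0,\quad x_{ij} \ge 0 \\
  & \sum_{(i,j)\in E_u} z_{ij} = k,\qquad z_{ij}\in\{0,1\}
    && \forall (i,j)\in E_u
\end{align*}
```

In this sketch the inner minimization is a linear program once z is fixed, which is what makes it possible to replace it by its dual and obtain a single-level program.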
Using the dual of the inner program, we obtain the following mixed integer programming formulation for k Most Vital Edges MST.

Computational results
All experiments presented here were performed on a 3.4 GHz computer with 3 GB of RAM. All proposed algorithms are implemented in C. All instances are complete graphs defined on n vertices. Weights w(e) for all e ∈ E are generated randomly, uniformly distributed in [1, 100]. For each value of n and k presented in this study, 10 different instances were generated and tested. The results are reported in Table 2, where each given value is the average over 10 instances. For the implicit enumeration algorithm, treated and generated nodes represent respectively nodes for which we have computed mv, mst, and U, and nodes satisfying the condition of not fathoming (UB > bestvalue). Column ♯opt corresponds to the number of instances solved optimally.
We first compare the explicit and implicit enumeration algorithms. The results show that implicit enumeration algorithms are much faster than the explicit enumeration algorithm and can handle instances of considerably larger size. Observe that, for the explicit enumeration algorithm, the tree search size is identical for any instance of the same (n, k) type. As a consequence, either all or none of the instances of the same (n, k) type can be solved. Moreover, for the same reason, computation times show a low variance for all instances of the same (n, k) type. Regarding the implicit enumeration algorithm, the "Branching: best upper bound" strategy yields slightly better running times than the "Branching: best lower bound" strategy. However, the "Branching: best upper bound" strategy, for which fathoming tests are performed less frequently, generates more nodes. Thus, owing to the limited memory capacity, the "Branching: best lower bound" strategy can handle instances of larger size.
We now compare the results obtained by the mixed integer program with those of the implicit enumeration algorithm. For this, we implemented the mixed integer program using the solver CPLEX 12.1 and ran it on the same generated instances. We limited the running time to 1 hour for the instances with 20, 25, 30 and 50 vertices, and to 2 hours for the other instances. The results are also reported in Table 2, where
• Time, given in seconds, is the average running time on the 10 instances. For any instance which is not solved optimally within the time limit, the running time is set to this limit;
• Generated nodes represents the average number of nodes created in the tree search, over the instances giving a feasible solution;
• Gap, expressed as a percentage, represents the average of the ratios (UB − BS)/UB computed on all instances returning at least one feasible solution, where UB is the final best upper bound and BS is the best solution value found;
• Opt/Feas represents the number of instances solved optimally / for which at least one feasible solution was found within the time limit.
We note that the mixed integer program reaches the optimal value for very small instances only. Actually, for n < 100, we only obtain in most cases feasible solutions with rather large gaps, which indicates that optimality is far from being reached. Finally, for instances with n ≥ 100, no feasible solutions are returned within the time limit. Moreover, for n = 300 and 400, the execution of the program exceeds the memory after a few seconds (297.437 and 0.56 seconds on average, respectively).
From all these remarks, we can conclude that our proposed implicit enumeration algorithm gives better results than both the explicit enumeration algorithm and the resolution of the mixed integer program, in terms of both running time and memory usage.

ε-approximate algorithm
The proposed algorithm is based on the previous implicit algorithm. The aim being to obtain an ε-approximate solution of the optimum, the condition to generate a node s in the tree search is now (1 − ε)UB(s) > bestvalue. Indeed, the value v returned by the approximate algorithm must verify (opt − v)/opt ≤ ε, where opt denotes the optimal value. The algorithm is implemented in C and tested on the same instances generated in Section 6, for ε = 0.01, 0.05 and 0.1. Thus, we compare the ε-approximate algorithm with the implicit algorithm. The results are summarized in Table 3. The meaning of treated and generated nodes is the same as in Section 6, and each given value in the table represents the average over the 10 generated instances for each value of n and k.
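The reason why the modified generation test yields an ε-approximate value can be spelled out in a short derivation. It is a sketch under the assumptions that UB(s) is a valid upper bound on every feasible solution below s, that bestvalue never decreases during the search, and that opt > 0; the symbols opt and v denote the optimal value and the returned value.

```latex
\begin{align*}
(1-\varepsilon)\,UB(s) \le \mathit{bestvalue} \le v
  &\qquad \text{(node $s$ is not generated, and $\mathit{bestvalue}$ only increases afterwards)} \\
\mathit{opt} \le UB(s)
  &\qquad \text{(if an optimal solution lies below the discarded node $s$)} \\
\Longrightarrow\quad (1-\varepsilon)\,\mathit{opt} \le v
  &\qquad \Longleftrightarrow\qquad \frac{\mathit{opt} - v}{\mathit{opt}} \le \varepsilon
\end{align*}
```

If no discarded node contains an optimal solution, the search reaches one and v = opt, so the bound holds in both cases.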
We note that the running times of the ε-approximate algorithm are significantly lower than those of the implicit enumeration algorithm. Running times do not exceed 21 seconds for ε = 0.1, 180 seconds for ε = 0.05 and 1 215 seconds for ε = 0.01. We also note that for large instances with n = 300 and 400 nodes, the ε-approximate algorithm solves the problem for ε = 0.05 and 0.1 at the root in a time less than 1 second, and for ε = 0.1 in a time less than 90 seconds, while the implicit enumeration algorithm requires 1 793.460 and 7 265.850 seconds respectively. Moreover, the approximate solutions are a posteriori within ε′ of the optimum, with ε′ ≤ 0.0006 for ε = 0.01, ε′ ≤ 0.0047 for ε = 0.05 and ε′ ≤ 0.00922 for ε = 0.1.
All these remarks show that the proposed lower bounds and upper bound are of very good quality and that the running time of the implicit enumeration algorithm is mostly the time needed to verify the optimality of the solution. Indeed, this optimal solution is either found in a few seconds or determined at the root of the tree search, corresponding then to the maximum value of the three lower bounds associated with the root.

Conclusions
The algorithms proposed in this paper can be easily adapted to solve some variants of the k Most Vital Edges MST problem. In a first variant, a removal cost is associated with each edge. The problem consists of finding a subset of edges, with total cost bounded by a budget limit, whose deletion causes the largest increase in the weight of a minimum spanning tree. In a second variant, we have to determine a minimum number of edges to be removed such that the weight of a minimum spanning tree in the resulting graph is at least a fixed value.

Lemma 2. 1 Most Vital Edges MST defined on a graph with n vertices and m edges is solvable in $O(m \log \alpha(m, n))$ time.
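The argument behind Lemma 2 (compute the replacement edge of every tree edge and keep the edge with the largest weight increase) can be illustrated as follows. This sketch finds replacement edges naively, by splitting the tree and scanning all crossing edges, so it does not achieve the running time stated in the lemma, which relies on the sensitivity analysis algorithm of [12]. Function names and the example graph are assumptions.

```python
def kruskal_tree(n, edges):
    """Return the list of MST edge indices (graph assumed connected)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for idx in sorted(range(len(edges)), key=lambda i: (edges[i][2], i)):
        u, v, _ = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append(idx)
    return tree


def replacement_edge(n, edges, tree, e):
    """Cheapest edge other than e reconnecting the two parts of tree minus e.

    Returns None if e is a bridge of the graph.
    """
    adjacency = {v: [] for v in range(n)}
    for idx in tree:
        if idx != e:
            u, v, _ = edges[idx]
            adjacency[u].append(v)
            adjacency[v].append(u)
    # Depth-first search from one endpoint of e labels its component.
    side, stack = {edges[e][0]}, [edges[e][0]]
    while stack:
        x = stack.pop()
        for y in adjacency[x]:
            if y not in side:
                side.add(y)
                stack.append(y)
    crossing = [i for i, (u, v, _) in enumerate(edges)
                if i != e and (u in side) != (v in side)]
    return min(crossing, key=lambda i: (edges[i][2], i)) if crossing else None


def most_vital_edge(n, edges):
    """Return (weight increase, tree edge e, replacement edge r(e))."""
    tree = kruskal_tree(n, edges)
    best = None
    for e in tree:
        r = replacement_edge(n, edges, tree, e)
        if r is None:
            continue  # bridges are skipped in this sketch
        increase = edges[r][2] - edges[e][2]
        if best is None or increase > best[0]:
            best = (increase, e, r)
    return best


# Hypothetical example: complete graph on 4 vertices, edges as (u, v, w).
K4 = [(0, 1, 1), (0, 2, 2), (0, 3, 3), (1, 2, 4), (1, 3, 5), (2, 3, 6)]
print(most_vital_edge(4, K4))
```

On this example every replacement edge found belongs to the second forest $T_1$ = {(1, 2), (1, 3)}, in line with Lemma 3.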
by deleting the replacement edge $e_{rep}$ of the edge deleted from $T_{j-1}(s)$ and replacing it by its replacement edge $r(e_{rep}) \in T_{j+1}(s)$. If, for a level i and an edge $e_{rep}$, the replacement edge $r(e_{rep})$ does not exist, then $T_j(d) = T_j(s) \setminus \{e_{rep}\}$ and $T_\ell(d) = T_\ell(s)$ for ℓ = j + 1, ..., k − |mv(d)|. If, for a level i, $T_i(s) = \emptyset$, then $T_\ell(d) = \emptyset$ for ℓ = i, ..., k − |mv(d)|.

Table 2. Comparison of explicit enumeration, implicit enumeration and MIP-based algorithms

Table 3. Results of the ε-approximate algorithm