Description

ACYCLIC PARTITIONING OF LARGE DIRECTED ACYCLIC GRAPHS JULIEN HERRMANN, M. YUSUF ÖZKAYA, BORA UÇAR, KAMER KAYA, AND ÜMIT V. ÇATALYÜREK Abstract. We

Categories

Published

All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.

Share

Transcript

ACYCLIC PARTITIONING OF LARGE DIRECTED ACYCLIC GRAPHS JULIEN HERRMANN, M. YUSUF ÖZKAYA, BORA UÇAR, KAMER KAYA, AND ÜMIT V. ÇATALYÜREK Abstract. We investigate the problem of partitioning the vertices of a directed acyclic graph into a given number of parts. The objective function is to minimize the number or the total weight of the edges having end points in different parts, which is also known as edge cut. The standard load balancing constraint of having an equitable partition of the vertices among the parts should be met. Furthermore, the partition is required to be acyclic, i.e., the inter-part edges between the vertices from different parts should preserve an acyclic dependency structure among the parts. In this work, we adopt the multilevel approach with coarsening, initial partitioning, and refinement phases for acyclic partitioning of directed acyclic graphs. We focus on two-way partitioning (sometimes called bisection), as this scheme can be used in a recursive way for multi-way partitioning. To ensure the acyclicity of the partition at all times, we propose novel and efficient coarsening and refinement heuristics. The quality of the computed acyclic partitions is assessed by computing the edge cut. We also propose effective ways to use the standard undirected graph partitioning methods in our multilevel scheme. We perform a large set of experiments on a dataset consisting of (i) graphs coming from an application and (ii) some others corresponding to matrices from a public collection. We report improvements, on average, around 59% compared to the current state of the art. Key words. directed graph, acyclic partitioning, multilevel partitioning AMS subject classifications. 05C70, 05C85, 68R10, 68W Introduction. The standard graph partitioning (GP) problem asks for a partition of the vertices of an undirected graph into a number of parts. The objective and the constraint of this well-known problem are to minimize the number of edges having vertices in two different parts and to equitably partition the vertices among the parts. The GP problem is NP-complete [11, ND14]. We investigate a variant of this problem, called acyclic partitioning, for directed acyclic graphs. In this variant, we have one more constraint: the partition should be acyclic. In other words, for a suitable numbering of the parts, all edges should be directed from a vertex in a part p to another vertex in a part q where p q. The directed acyclic graph partitioning (DAGP) problem arises in many applications. The stated variant of the DAGP problem arises in exposing parallelism in automatic differentiation [5, Ch.9], and particularly in the computation of the Newton step for solving nonlinear systems [3, 4]. The DAGP problem with some additional constraints is used to reason about the parallel data movement complexity and to dynamically analyze the data locality potential [8, 9]. Other important applications of the DAGP problem include (i) fusing loops for improving temporal locality, and enabling streaming and array contractions in runtime systems [15], such as Bohrium [16]; (ii) analysis of cache efficient execution of streaming applications on uniprocessors [1]; (iii) a number of circuit design applications in which the signal directions impose acyclic partitioning requirement [6, 23]. Let us consider a toy example shown in Figure 1.1(a). A partition of the vertices A preliminary version appeared in CCGRID 17 [12]. School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia , CNRS and LIP (UMR5668 CNRS-ENS Lyon-INRIA-UCBL), 46, allée d Italie, ENS Lyon, 69364, France, Sabancı University, Istanbul, Turkey, 1 2 HERRMANN et al. (a) A toy graph (b) A partition ignoring the directions; it is cyclic. (c) An acyclic partitioning. Fig. 1.1: a) A toy example with six tasks and six dependencies, b) a non-acyclic partitioning when edges are oriented, c) an acyclic partitioning of the same directed graph of this graph is shown in Figure 1.1(b) with a dashed curve. Since there is a cut edge from s to u and another from u to t, the partition is cyclic, and is not acceptable. An acyclic partition is shown in Figure 1.1(c), where all the cut edges are from one part to the other. We adopt the multilevel partitioning approach [?,?] with the coarsening, initial partitioning, and refinement phases for acyclic partitioning of DAGs. We propose heuristics for these three phases (Subsections 4.1, 4.2 and 4.3, respectively) which guarantee acyclicity of the partitions at all phases and maintains a DAG at every level. We strived to have fast heuristics at the core. With these characterizations, the coarsening phase requires new algorithmic/theoretical reasoning, while the initial partitioning and refinement heuristics are direct adaptations of the standard methods used in undirected graph partitioning, with some differences worth mentioning. We discuss only the bisection case, as we were able to improve the direct k-way algorithms we proposed before [12] by using the bisection heuristics recursively we give a brief comparison in Subsection 5.4. The acyclicity constraint on the partitions precludes the use of the state of the art undirected graph partitioning tools. This has been recognized before, and those tools were put aside [12, 18]. While this is sensible, one can still try to make use of the existing undirected graph partitioning tools [?, 13, 20, 22], as they have been very well engineered. Let us assume that we have partitioned a DAG with an undirected graph partitioning tool into two parts by ignoring the directions. It is easy to detect if the partition is cyclic since all the edges need to go from part one to part two. Furthermore, we can easily fix it as follows. Let v be a vertex in the second part; we can move all u vertices for which there is path from v to u into the second part. This procedure breaks any cycle containing v and hence, the partition becomes acyclic. However, the edge cut may increase, and the partitions can be unbalanced. To solve the balance problem and reduce the cut, we can apply a restricted version of the move-based refinement algorithms in the literature. After this step, this final partition meets the acyclicity and balance conditions. Depending on the structure of the input graph, it could also be a good initial partition for reducing the edge cut. Indeed, one of our most effective schemes uses an undirected graph partitioning algorithm to create a (potentially cyclic) partition, fixes the cycles in the partition, and refines the resulting acyclic partition with a novel heuristic to obtain an initial partition. We then integrate this partition within the proposed coarsening approaches to refine it at different granularities. We elaborate on this scheme in Subsection 4.4. The rest of the paper is organized as follows: Section 2 introduces the notation ACYCLIC PARTITIONING OF DAGS and background on directed acyclic graph partitioning and Section 3 briefly surveys the existing literature. We propose multilevel partitioning heuristics for acyclic partitioning of directed acyclic graphs in Section 4. Section 5 presents the experimental results, and Section 6 concludes the paper. 2. Preliminaries and notation. A directed graph G = (V, E) contains a set of vertices V and a set of directed edges E of the form e = (u, v), where e is directed from u to v. A path is a sequence of edges (u 1, v 1 ) (u 2, v 2 ),... with v i = u i+1. A path ((u 1, v 1 ) (u 2, v 2 ) (u 3, v 3 ) (u l, v l )) is of length l, where it connects a sequence of l + 1 vertices (u 1, v 1 = u 2,..., v l 1 = u l, v l ). A path is called simple if the connected vertices are distinct. Let u v denote a simple path that starts from u and ends at v. Among all the u v paths, one with the smallest length l is called a shortest path. A path ((u 1, v 1 ) (u 2, v 2 ) (u l, v l )) forms a (simple) cycle if all v i for 1 i l are distinct and u 1 = v l. A directed acyclic graph, DAG in short, is a directed graph with no cycles. The path u v represents a dependency of v to u. We say that the edge (u, v) is redundant if there exists another u v path in the graph. That is when we remove a redundant (u, v) edge, u remains to be connected to v, and hence, the dependency information is preserved. We use Pred[v] = {u (u, v) E} to represent the (immediate) predecessors of a vertex v, and Succ[v] = {u (v, u) E} to represent the (immediate) successors of v. We call the neighbors of a vertex v, its immediate predecessors and immediate successors: Neigh[u] = Pred[v] Succ[v]. For a vertex u, the set of vertices v such that u v are called the descendants of u. Similarly, the set of vertices v such that v u are called the ancestors of the vertex u. Every vertex u has a weight denoted by w u and every edge (u, v) E has a cost denoted by c u,v. A k-way partitioning of a graph G = (V, E) divides V into k disjoint subsets {V 1,..., V k }. The weight of a part V i for 1 i k is equal to u V i w u, denoted as w(v i ), which is the total vertex weight in V i. Given a partition, an edge is called a cut edge if its endpoints are in different parts. The edge cut of a partition is defined as the sum of the costs of the cut edges. Usually, a constraint on the part weights accompanies the problem. We are interested in acyclic partitions, which are defined below. Definition 2.1 (Acyclic k-way partition). A partition {V 1,..., V k } of G = (V, E) is called an acyclic k-way partition if two paths u v and v u do not co-exist for u, u V i, v, v V j, and 1 i j k. There is a related definition in the literature [9], which is called convex partition. A partition is convex if for any pair of vertices u and v in the same part, all vertices in any path from u v are also in the same part. Hence, if a partition is acyclic it is also convex. On the other hand, convexity does not imply acyclicity. Figure 2.1 shows that the definitions of an acyclic partition and a convex partition are not equivalent. For the toy graph in Figure 2.1(a), there are three possible balanced partitions shown in Figure 2.1(b), Figure 2.1(c), and Figure 2.1(d). They are all convex, but only that in Figure 2.1(d) is acyclic. Deciding on the existence of a k-way acyclic partition respecting an upper bound on part weights and another upper bound on the cost of cut edges is NP-complete [11]. The formal problem treated in this paper is defined as follows. Definition 2.2 (DAG partitioning problem). Given a DAG G = (V, E) an imbalance parameter ε, find an acyclic k-way partition P = {V 1,..., V k } of V such that 4 HERRMANN et al. a b a b a b a b c d c d c d c d (a) A toy graph (b) Cyclic and convex (c) Cyclic and convex (d) Acyclic and convex Fig. 2.1: A toy graph (left), two cyclic and convex partitions (middle two), and an acyclic and convex partition (right) the balance constraints (2.1) w(v i ) (1 + ε) are satisfied and the edge cut is minimized. v V w v k In the related partitioning tools, a common value of ε is Related work. Fauzia et al. [9] propose a heuristic for the acyclic partitioning problem to optimize data locality when analyzing DAGs. To create partitions, the heuristic categorizes a vertex as ready to be assigned to a partition when all of the vertices it depends on have already been assigned. Vertices are assigned to the current partition set until the maximum number of vertices that would be active during the computation of the partition set reaches a specified cache size. This implies that partition sizes can be larger than the size of the cache. This differs from our problem as we limit the size of each partition to the cache size. Kernighan [14] proposes an algorithm to find a minimum edge-cut partition of the vertices of a graph into subsets of size greater than a lower bound and inferior to an upper bound. The partition needs to use a fixed vertex sequence that cannot be changed. Indeed, Kernighan s algorithm takes a topological order of the vertices of the graph as an input and partitions the vertices such that every vertex in a subset are adjacent in the given topological order. This procedure is optimal for a given, fixed topological order and has a run time proportional to the number of edges in the graph, if the part weights are taken as constant. We used a modified version of this algorithm as a heuristic in the earlier version of our work [12]. Cong et al. [6] describe two approaches for obtaining acyclic partitions of directed Boolean networks, modeling circuits. The first one is a single-level Fiduccia- Mattheyses (FM)-based approach. In this approach, Cong et al. generate an initial acyclic partition by splitting the list of the vertices (in a topological order) from left to right into k parts such that the weight of each part does not violate the bound. The quality of the results is then improved with a k-way variant of the FM heuristic [10] taking the acyclicity constraint into account. Our previous work [12] employs a similar refinement heuristic. The second approach of Cong et al. is a two-level heuristic; the initial graph is first clustered with a special decomposition, and then it is partitioned using the first heuristic. In a recent paper [18], Moreira et al. focus on an imaging and computer vision application on embedded systems and discuss acyclic partitioning heuristics. They ACYCLIC PARTITIONING OF DAGS propose a single level approach in which an initial partitioning is obtained using a topological order and then refined using four local search heuristics while respecting the balance constraint and maintaining the acyclicity of the partition. Three heuristics pick a vertex and move it to an eligible part when the move respects the constraints and improves the cut. They differ in the set of eligible parts for each vertex (from a very restrictive to a more general one allowing arbitrary target parts so long as acyclicity is maintained). The fourth one tentatively realizes the moves that hurt the cut in order to escape from the local minima. This fourth one delivers better results than the others. In a follow-up paper, Moreira et al. [17] discuss a multilevel graph partitioner and an evolutionary algorithm based on this multilevel scheme. Their multilevel scheme starts with a given acyclic partition. Then, the coarsening phase contracts edges that are in the same part until there is no edge to contract. Here matching-based heuristics from undirected graph partitioning tools are used without taking the directions of the edges into account. Therefore, the coarsening phase can create cycles in the graph; however the induced partitions are never cyclic. Then, an initial partition is obtained, which is refined during the uncoarsening phase with moved-based heuristics. In order to guarantee acyclic partitions, the vertices that lie in cycles are not moved. In a systematic evaluation of the proposed methods, Moreira et al. note that there are many local minima and suggest using relaxed constraints in the multilevel setting. The proposed methods have high run time, as the evolutionary method of Moreira et al. is not concerned with this issue. Improvements with respect to the earlier work [18] are reported. Previously, we had developed a multilevel partitioner [12]. In this paper, we propose methods to use an undirected graph partitioner to guide the multilevel partitioner. We focus on partitioning the graph in two parts since we can handle the general case with a recursive bisection scheme. We also propose new coarsening, initial partitioning, and refinement methods specifically designed for the 2-partitioning problem. Our multilevel scheme maintains acyclic partitions and graphs through all the levels. Other related work on acyclic partitioning of directed graphs include an exact, branch-and-bound algorithm by Nossack and Pesch [19] which works on the integer programming formulation of the acyclic partitioning problem. This solution is, of course, too costly to be used in practice. Wong et al. [23] present a modification of the decomposition of Cong et al. [6] for clustering, and use this in a two-level scheme. 4. Directed multilevel graph partitioning. We propose a new multilevel tool for obtaining acyclic partitions of directed acyclic graphs. Multilevel schemes [?,?] form the de-facto standard for solving graph and hypergraph partitioning problems efficiently, and used by almost all current state-of-the-art partitioning tools [2,?, 13, 20, 22]. Similar to other multilevel schemes, our tool has three phases; the coarsening phase, which reduces the number of vertices by clustering them; the initial partitioning phase, which finds a partition of the coarsened graph; and the uncoarsening phase, in which the initial partition is projected to the finer graphs and refined along the way, until a solution for the original graph is obtained Coarsening. In this phase, we obtain smaller DAGs by coalescing the vertices, level by level. This phase continues until the number of vertices becomes smaller than a specified bound or the reduction on the number of vertices from one level to the next one is lower than a threshold. At each level l, we start with a finer acyclic graph G l, compute a valid clustering C l ensuring the acyclicity, and obtain a coarser acyclic graph G l+1. While our previous work [12] discussed matching based algorithms for 6 HERRMANN et al coarsening, we present agglomerative clustering based variants here. The new variants supersede the matching based ones. Unlike the standard undirected graph case, in DAG partitioning, not all vertices can be safely combined. Consider a DAG with three vertices a, b, c and three edges (a, b), (b, c), (a, c). Here, the vertices a and c cannot be combined, since that would create a cycle. We say that a set of vertices is contractible (all its vertices are matchable), if unifying them does not create a cycle. We now present a general theory about finding clusters without forming cycles, after giving some definitions. Definition 4.1 (Clustering). A clustering of a DAG is a set of (disjoint) subsets of vertices without common vertices. Definition 4.2 (Coarse graph). Given a DAG G and a clustering C of G, we let G C denote the coarse graph created by contracting all sets of vertices of C. The coarse graph is a quotient graph of G if the clustering C is extended to a partition with singleton vertex sets. Definition 4.3 (Feasible clustering). A feasible clustering C of a DAG G is a clustering such that G C is acyclic. Theorem 4.1. Let G = (V, E) be a DAG. For u, v V and (u, v) E, the coarse graph G {(u,v)} is acyclic if and only if there is no path from u to v in G avoiding the edge (u, v). Proof. Let G = (V, E ) = G {(u,v)} be the coarse graph, and w be the merged, coarser vertex of G corresponding to {u, v}. If there is a path from u to v in G avoiding the edge (u, v), then all the edges of this path are also in G, and the corresponding path in G goes from w to w, creating a cycle. Assume that there is a cycle in the coarse graph G. This cycle has to pass through w; otherwise, it must be in G which is impossible by the definition of G. Thus, there is a cycle from w to w in the coarse graph G. Let a V be the first vertex visited by this cycle after w and b V be the last one, just before completing the cycle. Let p be an a b path in G such that (w, a) p (b, w) is the said w w cycle in G. Note that a can be equal to b and in this case p =. By the definition of the coarse graph G, a, b V and all edges in the path p are in E\{(u, v)}. Since we have a cycle in G, the following two items must hold: (i) either (u, a) E or (v, a) E, or both; and (ii) either (b, u) E or (b, v) E, or both. We now investigate these nine cases. Here we investigate only four of them, as the both cases will be implied by the others. (u, a) E and (b, u) E is impossible because otherwise, (u, a) p (b, u) would be a u u cycle in the original graph G. (v, a) E and (b, v) E is

Search

Similar documents

Related Search

Directed acyclic graphsTransportation planning of large-scale eventsTheory of Large DeformationAssessment of large data setsOptimisation of large modular remote systemFactorization of large numbersLaw of large numbersDesign and Management of Large Systems for InConservation of Large MammalsControl and Optimization of Large-Scale Netwo

We Need Your Support

Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks

SAVE OUR EARTH

We need your sign to support Project to invent "SMART AND CONTROLLABLE REFLECTIVE BALLOONS" to cover the Sun and Save Our Earth.

More details...Sign Now!

We are very appreciated for your Prompt Action!

x