1========================= 2Dependence Graphs in LLVM 3========================= 4 5.. contents:: 6 :local: 7 8Introduction 9============ 10Dependence graphs are useful tools in compilers for analyzing relationships 11between various program elements to help guide optimizations. The ideas 12behind these graphs are described in papers [1]_ and [2]_. 13 14The implementation of these ideas in LLVM may be slightly different than 15what is mentioned in the papers. These differences are documented in 16the `implementation details <implementation-details_>`_. 17 18.. _DataDependenceGraph: 19 20Data Dependence Graph 21===================== 22In its simplest form the Data Dependence Graph (or DDG) represents data 23dependencies between individual instructions. Each node in such a graph 24represents a single instruction and is referred to as an "atomic" node. 25It is also possible to combine some atomic nodes that have a simple 26def-use dependency between them into larger nodes that contain multiple- 27instructions. 28 29As described in [1]_ the DDG uses graph abstraction to group nodes 30that are part of a strongly connected component of the graph 31into special nodes called pi-blocks. pi-blocks represent cycles of data 32dependency that prevent reordering transformations. Since any strongly 33connected component of the graph is a maximal subgraph of all the nodes 34that form a cycle, pi-blocks are at most one level deep. In other words, 35no pi-blocks are nested inside another pi-block, resulting in a 36hierarchical representation that is at most one level deep. 37 38 39For example, consider the following: 40 41.. code-block:: c++ 42 43 for (int i = 1; i < n; i++) { 44 b[i] = c[i] + b[i-1]; 45 } 46 47This code contains a statement that has a loop carried dependence on 48itself creating a cycle in the DDG. The figure bellow illustrates 49how the cycle of dependency is carried through multiple def-use relations 50and a memory access dependency. 51 52.. image:: cycle.png 53 54The DDG corresponding to this example would have a pi-block that contains 55all the nodes participating in the cycle, as shown bellow: 56 57.. image:: cycle_pi.png 58 59Program Dependence Graph 60======================== 61 62The Program Dependence Graph (or PDG) has a similar structure as the 63DDG, but it is capable of representing both data dependencies and 64control-flow dependencies between program elements such as 65instructions, groups of instructions, basic blocks or groups of 66basic blocks. 67 68High-Level Design 69================= 70 71The DDG and the PDG are both directed graphs and they extend the 72``DirectedGraph`` class. Each implementation extends its corresponding 73node and edge types resulting in the inheritance relationship depicted 74in the UML diagram bellow: 75 76.. image:: uml_nodes_and_edges.png 77 78Graph Construction 79------------------ 80 81The graph build algorithm considers dependencies between elements of 82a given set of instructions or basic blocks. Any dependencies coming 83into or going out of instructions that do not belong to that range 84are ignored. The steps in the build algorithm for the DDG are very 85similar to the steps in the build algorithm for the PDG. As such, 86one of the design goals is to reuse the build algorithm code to 87allow creation of both DDG and PDG representations while allowing 88the two implementations to define their own distinct and independent 89node and edge types. This is achieved by using the well-known builder 90design pattern to isolate the construction of the dependence graph 91from its concrete representation. 92 93The following UML diagram depicts the overall structure of the design 94pattern as it applies to the dependence graph implementation. 95 96.. image:: uml_builder_pattern.png 97 98Notice that the common code for building the two types of graphs are 99provided in the ``DependenceGraphBuilder`` class, while the ``DDGBuilder`` 100and ``PDGBuilder`` control some aspects of how the graph is constructed 101by the way of overriding virtual methods defined in ``DependenceGraphBuilder``. 102 103Note also that the steps and the names used in this diagram are for 104illustrative purposes and may be different from those in the actual 105implementation. 106 107Design Trade-offs 108----------------- 109 110Advantages: 111^^^^^^^^^^^ 112 - Builder allows graph construction code to be reused for DDG and PDG. 113 - Builder allows us to create DDG and PDG as separate graphs. 114 - DDG nodes and edges are completely disjoint from PDG nodes and edges allowing them to change easily and independently. 115 116Disadvantages: 117^^^^^^^^^^^^^^ 118 - Builder may be perceived as over-engineering at first. 119 - There are some similarities between DDG nodes and edges compared to PDG nodes and edges, but there is little reuse of the class definitions. 120 121 - This is tolerable given that the node and edge types are fairly simple and there is little code reuse opportunity anyway. 122 123 124.. _implementation-details: 125 126Implementation Details 127====================== 128 129The current implementation of DDG differs slightly from the dependence 130graph described in [1]_ in the following ways: 131 132 1. The graph nodes in the paper represent three main program components, namely *assignment statements*, *for loop headers* and *while loop headers*. In this implementation, DDG nodes naturally represent LLVM IR instructions. An assignment statement in this implementation typically involves a node representing the ``store`` instruction along with a number of individual nodes computing the right-hand-side of the assignment that connect to the ``store`` node via a def-use edge. The loop header instructions are not represented as special nodes in this implementation because they have limited uses and can be easily identified, for example, through ``LoopAnalysis``. 133 2. The paper describes five types of dependency edges between nodes namely *loop dependency*, *flow-*, *anti-*, *output-*, and *input-* dependencies. In this implementation *memory* edges represent the *flow-*, *anti-*, *output-*, and *input-* dependencies. However, *loop dependencies* are not made explicit, because they mainly represent association between a loop structure and the program elements inside the loop and this association is fairly obvious in LLVM IR itself. 134 3. The paper describes two types of pi-blocks; *recurrences* whose bodies are SCCs and *IN* nodes whose bodies are not part of any SCC. In this implementation, pi-blocks are only created for *recurrences*. *IN* nodes remain as simple DDG nodes in the graph. 135 136 137References 138---------- 139.. [1] "D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS." 140.. [2] "J. FERRANTE (IBM), K. J. OTTENSTEIN (Michigan Technological University) and JOE D. WARREN (Rice University), 1987. The Program Dependence Graph and Its Use in Optimization." 141