1 2MSMProp2, a simplified but functionally equivalent version of MSMProp1 3~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4 5Julian Seward, OpenWorks Ltd, 19 August 2008 6~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7 8Note that this file does NOT describe the state machine used in the 9svn://svn.valgrind.org/branches/YARD version of Helgrind. That state 10machine is different again from any previously described machine. 11 12See the file README_YARD.txt for more details on YARD. 13 14 ---------------------- 15 16In early 2008 Konstantin Serebryany proposed "MSMProp1", a memory 17state machine for data race detection. It is described at 18http://code.google.com/p/data-race-test/wiki/MSMProp1 19 20Implementation experiences show MSMProp1 is useful, but difficult to 21implement efficiently. In particular keeping the memory usage under 22control is complex and difficult. 23 24This note points out a key simplification of MSMProp1, which makes it 25easier to implement without changing the functionality. 26 27 28The idea 29~~~~~~~~ 30 31The core of the idea pertains to the "Condition" entry for MSMProp1 32state machine rules E5 and E6(r). These are, respectively: 33 34 HB(SS, currS) and its negation 35 ! HB(SS, currS). 36 37Here, SS is a set of segments, and currS is a single segment. Each 38segment contains a vector timestamp. The expression "HB(SS, currS)" 39is intended to denote 40 41 for each segment S in SS . happens_before(S,currS) 42 43where happens_before(S,T) means that S's vector timestamp is ordered 44before-or-equal to T's vector timestamp. 45 46In words, the expression 47 48 for each segment S in SS . happens_before(S,currS) 49 50is equivalent to saying that currS has a timestamp which is 51greater-than-equal to the timestamps of all the segments in SS. 52 53The key observation is that this is equivalent to 54 55 happens_before( JOIN(SS), currS ) 56 57where JOIN is the lattice-theoretic "max" or "least upper bound" 58operation on vector clocks. Given the definition of HB, 59happens_before and (binary) JOIN, this is easy to prove. 60 61 62The consequences 63~~~~~~~~~~~~~~~~ 64 65With that observation in place, it is a short step to observe that 66storing segment sets in MSMProp1 is unnecessary. Instead of 67storing a segment set in each shadow value, just store and 68update a single vector timestamp. The following two equivalences 69hold: 70 71 MSMProp1 MSMProp2 72 73 adding a segment S join-ing S's vector timestamp 74 to the segment-set to the current vector timestamp 75 76 HB(SS,currS) happens_before( 77 currS's timestamp, 78 current vector timestamp ) 79 80Once it is no longer necessary to represent segment sets, it then 81also becomes unnecessary to represent segments. This constitutes 82a significant simplication to the implementation. 83 84 85The resulting state machine, MSMProp2 86~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 87 88MSMProp2 is isomorphic to MSMProp1, with the following changes: 89 90States are New, Read(VTS,LS), Write(VTS,LS) 91 92where LS is a lockset (as before) and VTS is a vector timestamp. 93 94For a thread T with current lockset 'currLS' and current VTS 'currVTS' 95making a memory access, the new rules are 96 97Name Old-State Op Guard New-State Race-If 98 99E1 New rd True Read(currVTS,currLS) False 100 101E2 New wr True Write(currVTS,currLS) False 102 103E3 Read(oldVTS,oldLS) rd True Read(newVTS,newLS) False 104 105E4 Read(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0 106 && !hb(oldVTS,currVTS) 107 108E5 Write(oldVTS,oldLS) rd hb(oldVTS, Read(currVTS,currLS) False 109 currVTS) 110 111E6r Write(oldVTS,oldLS) rd !hb(oldVTS, Write(newVTS,newLS) #newLS == 0 112 currVTS) && !hb(oldVTS,currVTS) 113 114E6w Write(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0 115 && !hb(oldVTS,currVTS) 116 117 where newVTS = join2(oldVTS,currVTS) 118 119 newLS = if hb(oldVTS,currVTS) 120 then currLS 121 else intersect(oldLS,currLS) 122 123 hb(vts1, vts2) = vts1 happens before or is equal to vts2 124 125 126Interpretation of the states 127~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 128 129I always found the state names in MSMProp1 confusing. Both MSMProp1 130and MSMProp2 are easier to understand if the states Read and Write are 131renamed, like this: 132 133 old name new name 134 135 Read WriteConstraint 136 Write AllConstraint 137 138The effect of a state Read(VTS,LS) is to constrain all later-observed 139writes so that either (1) the writing thread holds at least one lock 140in common with LS, or (2) those writes must happen-after VTS. If 141neither of those two conditions hold, a race is reported. 142 143Hence a Read state places a constraint on writes. 144 145The effect of a state Write(VTS,LS) is similar, but it applies to all 146later-observed accesses: either (1) the accessing thread holds at 147least one lock in common with LS, or (2) those accesses must 148happen-after VTS. If neither of those two conditions hold, a race is 149reported. 150 151Hence a Write state places a constraint on all accesses. 152 153If we ignore the LS component of these states, the intuitive 154interpretation of the VTS component is that it states the earliest 155vector-time that the next write / access may safely happen. 156 157