Lines Matching refs:store
23 <li style="margin:0"><a href="#ss_ll">Store/store and load/load</a></li>
24 <li style="margin:0"><a href="#ls_sl">Load/store and store/load</a></li>
204 <p>To get into a situation where we see B=5 before we see the store to A, either
216 stores, it does not guarantee that a store followed by a load will be observed
246 SMP, the store to A and the load from B in thread 1 can be “observed” in a
295 continue executing instructions past the one that did the store, possibly
313 either way it won’t see the store performed by core 1. (“A” could be in core
320 performance penalty on every store operation. Relaxing the rules for the
328 store #1 to <strong>finish</strong> being published before it can start on store
347 is meant by “observing” a load or store. Suppose core 1 executes “A = 1”. The
348 store is <em>initiated</em> when the CPU executes the instruction. At some
349 point later, possibly through cache coherence activity, the store is
351 <em>complete</em> until the store arrives in main memory, but the memory
433 <li>store followed by another store</li>
435 <li>load followed by store</li>
436 <li>store followed by load</li>
439 <h4 id="ss_ll">Store/store and load/load</h4>
457 <p>Thread 1 needs to ensure that the store to A happens before the store to B.
458 This is a “store/store” situation. Similarly, thread 2 needs to ensure that the
465 lines, with minimal cache coherency. If the store to A stays local but the
466 store to B is published, core 2 will see B=1 but won’t see the update to A. On
482 <em>store/store barrier</em><br />
490 <p>The store/store barrier guarantees that <strong>all observers</strong> will
496 <p>Since the store/store barrier guarantees that thread 2 observes the stores in
502 <p>The store/store barrier could work by flushing all
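<p>Putting the matches above together: the store/store and load/load barriers pair up in the classic message-passing pattern. Here is a sketch in C11; the variable names A and B come from the example above, and the release/acquire fences are conservative stand-ins for pure store/store and load/load barriers (C11 has no weaker ones):</p>
<pre>
#include &lt;stdatomic.h&gt;
#include &lt;stdio.h&gt;

atomic_int A, B;   /* shared locations from the example above */

void thread1(void) {
    atomic_store_explicit(&amp;A, 41, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);           /* store/store barrier */
    atomic_store_explicit(&amp;B, 1, memory_order_relaxed); /* signal: A is ready */
}

void thread2(void) {
    while (atomic_load_explicit(&amp;B, memory_order_relaxed) != 1)
        ;                                                /* wait for the flag */
    atomic_thread_fence(memory_order_acquire);           /* load/load barrier */
    printf("A = %d\n", atomic_load_explicit(&amp;A, memory_order_relaxed)); /* 41 */
}
</pre>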
516 <h4 id="ls_sl">Load/store and store/load</h4>
519 store/load barrier. Here’s an example where a load/store barrier is
535 <p>Thread 2 could observe thread 1’s store of B=1 before it observes thread 1’s
536 load from A, and as a result store A=41 before thread 1 has a chance to read A.
537 Inserting a load/store barrier in each thread solves the problem:</p>
546 <em>load/store barrier</em><br />
549 <em>load/store barrier</em><br />
556 <p>A store to local cache may be observed before a load from main memory,
559 while that’s in progress, execution continues. The store to B happens in local
566 thread 2 store to A before thread 1’s read if thread 1 guarantees the load/store
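<p>A sketch of the load/store situation in C11. There is no pure load/store fence in C11, so the acq_rel fence below is a conservative stand-in for the barrier the text describes:</p>
<pre>
#include &lt;stdatomic.h&gt;

atomic_int A, B;

void thread1(void) {
    int reg = atomic_load_explicit(&amp;A, memory_order_relaxed);
    /* load/store barrier: the load from A above must be observed
       before the store to B below */
    atomic_thread_fence(memory_order_acq_rel);
    atomic_store_explicit(&amp;B, 1, memory_order_relaxed);
    (void)reg;
}

void thread2(void) {
    int reg = atomic_load_explicit(&amp;B, memory_order_relaxed);
    atomic_thread_fence(memory_order_acq_rel);   /* load/store barrier */
    if (reg == 1)
        atomic_store_explicit(&amp;A, 41, memory_order_relaxed); /* can no longer
            be observed before thread 1's load of A */
}
</pre>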
574 <p>As mentioned earlier, store/load barriers are the only kind required on x86
585 <li>Alpha provides “rmb” (load/load), “wmb” (store/store), and “mb” (full).
590 <li>ARMv7 has “dmb st” (store/store) and “dmb sy” (full).</li>
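<p>For a sense of how the ARMv7 instructions named above surface in C, here is a sketch using GCC-style inline assembly; the <code>"memory"</code> clobber also stops the compiler from reordering across the barrier:</p>
<pre>
/* sketch: ARMv7 barriers via GCC inline assembly */
#define STORE_STORE_BARRIER() __asm__ __volatile__("dmb st" ::: "memory")
#define FULL_BARRIER()        __asm__ __volatile__("dmb sy" ::: "memory")
</pre>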
616 <em>store/store barrier</em><br />
646 <em>store/store barrier</em><br />
664 load or store. It can let you avoid the need for an explicit barrier in certain
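<p>The point about avoiding an explicit barrier: an acquiring load or releasing store folds the ordering into the memory operation itself. A minimal sketch, with <code>payload</code> and <code>flag</code> as assumed names:</p>
<pre>
#include &lt;stdatomic.h&gt;
#include &lt;stdio.h&gt;

int payload;          /* ordinary data, published via the flag below */
atomic_int flag;

void publisher(void) {
    payload = 41;
    /* releasing store: earlier loads/stores can't be reordered below it,
       so no separate barrier instruction is needed */
    atomic_store_explicit(&amp;flag, 1, memory_order_release);
}

void consumer(void) {
    /* acquiring load: later loads/stores can't be reordered above it */
    if (atomic_load_explicit(&amp;flag, memory_order_acquire) == 1)
        printf("payload = %d\n", payload);   /* guaranteed to see 41 */
}
</pre>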
712 store in thread 1 causes something to happen in thread 2 which causes something
714 that order. (Inserting a load/store barrier in thread 2 fixes this.)</p>
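<p>A sketch of that three-thread causality chain in C11, with the load/store barrier the text calls for placed in thread 2 (acq_rel is a conservative stand-in; names A, B, C assumed):</p>
<pre>
#include &lt;stdatomic.h&gt;
#include &lt;assert.h&gt;

atomic_int A, B, C;

void thread1(void) {
    atomic_store_explicit(&amp;A, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);       /* store/store */
    atomic_store_explicit(&amp;B, 1, memory_order_relaxed);
}

void thread2(void) {
    while (atomic_load_explicit(&amp;B, memory_order_relaxed) != 1)
        ;
    /* without this barrier, thread 2's store to C could be
       observed before its load from B */
    atomic_thread_fence(memory_order_acq_rel);
    atomic_store_explicit(&amp;C, 1, memory_order_relaxed);
}

void thread3(void) {
    while (atomic_load_explicit(&amp;C, memory_order_relaxed) != 1)
        ;
    atomic_thread_fence(memory_order_acquire);       /* load/load */
    assert(atomic_load_explicit(&amp;A, memory_order_relaxed) == 1);
}
</pre>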
755 load 0 from A, increment it to 1, and store it back, leaving a final result of
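<p>A sketch of that lost-update race and the usual fix, an atomic read-modify-write (C11 rendering):</p>
<pre>
#include &lt;stdatomic.h&gt;

int counter;                /* two threads running counter++ can both load 0,
                               add 1, and store 1 back: one increment is lost */
atomic_int atomic_counter;

void safe_increment(void) {
    /* an atomic read-modify-write can't be split, so no update is lost */
    atomic_fetch_add_explicit(&amp;atomic_counter, 1, memory_order_relaxed);
}
</pre>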
789 location is doubleword-aligned and special load/store instructions are used.
797 store.</p>
811 conditional store instruction is used to try to write the data back. If the
812 reservation is still in place, the store succeeds; if not, the store will fail.
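<p>A sketch of how load-linked/store-conditional is typically used from C: a compare-and-swap retry loop. On ARM the compiler generally expands this into ldrex/strex; the conditional store fails, and the loop retries, if another core wrote the location after the load set the reservation:</p>
<pre>
#include &lt;stdatomic.h&gt;

void atomic_increment(atomic_int *a) {
    int old = atomic_load_explicit(a, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               a, &amp;old, old + 1,
               memory_order_relaxed, memory_order_relaxed)) {
        /* 'old' now holds the value that beat us; try again */
    }
}
</pre>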
924 other threads will stay out until they observe the store of 0. If it takes a
936 When releasing the spinlock, we issue the barrier and then the atomic store.
956 the store of zero to the lock word is observed after any loads or stores in the
957 critical section above it. In other words, we need a load/store and store/store
959 SMP -- only store/load barriers are required. The implementation of
961 barrier followed by a simple store. No CPU barrier is required.</p>
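<p>A sketch of that spinlock in C11, following the recipe in the matches above: an atomic swap to acquire, and a barrier followed by a simple store to release (the 0 = free, 1 = held encoding is assumed):</p>
<pre>
#include &lt;stdatomic.h&gt;

atomic_int lock_word;   /* 0 = free, 1 = held */

void spin_lock(void) {
    /* spin until we swap 0 -> 1; acquire ordering keeps the critical
       section from floating above the lock */
    while (atomic_exchange_explicit(&amp;lock_word, 1, memory_order_acquire) != 0)
        ;   /* busy-wait until we observe the store of 0 */
}

void spin_unlock(void) {
    /* store/store + load/store barrier, then a simple store of zero */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&amp;lock_word, 0, memory_order_relaxed);
}
</pre>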
1032 move a store “downward” across another store’s release barrier.</li>
1033 <li>A load followed by a store can’t be reordered, because neither instruction
1035 <li>A store followed by a load <strong>can</strong> be reordered, because each
1039 <p>Hence, you only need store/load barriers on x86 SMP.</p>
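<p>The classic store-buffering litmus test makes the x86 rule concrete. Without the fences below, <code>r1 == 0 &amp;&amp; r2 == 0</code> is a legal outcome on x86, because each CPU's load can pass its own earlier store; a seq_cst fence (which compilers emit as <code>mfence</code> on x86) forbids it. A C11 sketch:</p>
<pre>
#include &lt;stdatomic.h&gt;

atomic_int X, Y;
int r1, r2;

void thread1(void) {
    atomic_store_explicit(&amp;X, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* store/load barrier */
    r1 = atomic_load_explicit(&amp;Y, memory_order_relaxed);
}

void thread2(void) {
    atomic_store_explicit(&amp;Y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* store/load barrier */
    r2 = atomic_load_explicit(&amp;X, memory_order_relaxed);
}
</pre>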
1118 <p>Without a memory barrier, the store to <code>gGlobalThing</code> could be observed before
1164 <p>We need to replace the store with:</p>
1232 releasing store. This means that compilers and code optimizers are free to
1256 usual ways, for example the compiler could move a non-volatile load or store “above” a
1257 volatile store, but couldn’t move it “below”. Volatile accesses may not be
1403 <p>Now the problem should be obvious: the store to <code>helper</code> is
1408 <p>You could try to ensure that the store to <code>helper</code> happens after
1452 it can’t make any assumptions about <code>data2</code>, because that store was
1453 performed after the volatile store.</p>
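<p>A sketch of double-checked locking done correctly, rendered in C11 rather than the article's Java (the <code>Helper</code> struct and field values here are assumptions): the release store publishes the object only after its fields are written, and the acquire load pairs with it:</p>
<pre>
#include &lt;stdatomic.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;pthread.h&gt;

typedef struct { int data; } Helper;

static _Atomic(Helper *) helper;    /* shared singleton slot */
static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;

Helper *get_helper(void) {
    /* acquire pairs with the release store below: if we see a non-NULL
       pointer, we also see the fields written before it was published */
    Helper *h = atomic_load_explicit(&amp;helper, memory_order_acquire);
    if (h == NULL) {
        pthread_mutex_lock(&amp;mu);
        h = atomic_load_explicit(&amp;helper, memory_order_relaxed);
        if (h == NULL) {
            h = malloc(sizeof *h);
            h-&gt;data = 42;           /* construct first... */
            atomic_store_explicit(&amp;helper, h, memory_order_release);
        }                           /* ...then publish */
        pthread_mutex_unlock(&amp;mu);
    }
    return h;
}
</pre>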
1774 <p>As we saw in an earlier section, we need to insert a store/load barrier
1781 <th>volatile store</th>
1785 <em>load/load + load/store barrier</em></code></td>
1786 <td><code><em>store/store barrier</em><br />
1788 <em>store/load barrier</em></code></td>
1792 <p>The volatile load is just an acquiring load. The volatile store is similar
1793 to a releasing store, but we’ve omitted load/store from the pre-store barrier,
1794 and added a store/load barrier afterward.</p>
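<p>A sketch of that recipe in C11 fences. C11 has no store/store-only fence, so the release fence before the store is a conservative stand-in (it also orders load/store, which the recipe above omits); the seq_cst fence afterward supplies the store/load barrier:</p>
<pre>
#include &lt;stdatomic.h&gt;

atomic_int shared;   /* stands in for a volatile field (name assumed) */

void volatile_store(int value) {
    atomic_thread_fence(memory_order_release);   /* store/store barrier */
    atomic_store_explicit(&amp;shared, value, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* store/load barrier */
}

int volatile_load(void) {
    int value = atomic_load_explicit(&amp;shared, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);   /* load/load + load/store */
    return value;
}
</pre>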
1798 issue the store/load barrier before the volatile load instead and get the same
1800 with the store.</p>
1803 atomic operation and skip the explicit store/load barrier. On x86, for example,