
When a Cache Block Has Been Modified Since Being Read From Main Memory

Cache

Table of contents

  1. Common designs
  2. Cache operations
  3. Cache write policies
  4. Virtual or physical addr
  5. Cache coherency

1. Common designs

  • Fully associative: a block can be anywhere in the cache
  • Direct mapped: a block can be in only one line in the cache
  • Set-associative: a block can be in a few (2 to 8) places in the cache
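
The three designs differ only in how many sets the cache is divided into. The sketch below shows how an address could be split into tag, set index and line offset; the function name `split_address` and the sizes (64B lines, 512 lines in total) are assumptions chosen for illustration, not values from these notes.

```python
def split_address(addr, line_size=64, num_lines=512, ways=1):
    """Split an address into (tag, set_index, offset).

    ways=1 models a direct-mapped cache; ways=num_lines models a fully
    associative cache (a single set, so the index is always 0).
    All sizes here are illustrative assumptions.
    """
    num_sets = num_lines // ways
    offset = addr % line_size          # byte within the line
    block_number = addr // line_size   # which memory block the byte is in
    set_index = block_number % num_sets
    tag = block_number // num_sets     # stored in the cache to identify the block
    return tag, set_index, offset


addr = 0x12345
print(split_address(addr, ways=1))     # direct mapped: 512 sets
print(split_address(addr, ways=8))     # 8-way set-associative: 64 sets
print(split_address(addr, ways=512))   # fully associative: 1 set
```

With more ways there are fewer sets, so fewer address bits select the set and more bits end up in the tag.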

2. Cache operations

Fig.1 - A cache system.
  • initial: all blocks are Invalid (V=0, cold start);
  • read miss: the target block is Invalid, or no tag matches; the processor retrieves the entire 64B line to put into the cache, termed a cacheline fill;
  • read hit: a matching tag is found; the data is fetched directly from the cache;
  • write miss: when the processor wants to write an operand to memory, it first checks whether the block is already in the cache. If the write refers to a memory location that is not currently in the cache, the processor performs a cacheline fill (write allocation) and then changes the value of the operand in the cache without writing directly to memory;
  • write hit: the location is held in the cache; write to it directly (no eviction);
  • write back: when allocating for a read/write miss, a line needs to be evicted to make room for the newly fetched block; if the existing cache line is dirty, perform a write-back.

As a summary:

  • V=1 means the line has valid data, and D=1 (dirty) means the bytes are newer than main memory.
  • when allocating a line, set V=1, D=0 (clean) and fill in the tag and data;
  • when writing a line in response to a write hit, set D=1;
  • when evicting a line: if D=0 (memory data is not stale), just set V=0; if D=1 (memory data is stale), write back the data and then set D=0 and V=0.
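
The V/D bookkeeping above can be sketched as a tiny direct-mapped, write-back, write-allocate cache. This is a minimal model under assumptions: the class names, the 4-line capacity and the event log are hypothetical, and no data is stored, only the state bits and tags.

```python
LINE_SIZE = 64   # bytes per line, as in the text
NUM_LINES = 4    # tiny cache so evictions are easy to trigger (assumed)


class Line:
    def __init__(self):
        self.valid = False   # V bit
        self.dirty = False   # D bit
        self.tag = None


class WriteBackCache:
    def __init__(self):
        # Cold start: every line is invalid.
        self.lines = [Line() for _ in range(NUM_LINES)]
        self.log = []        # memory traffic, in order

    def _lookup(self, addr):
        block = addr // LINE_SIZE
        return self.lines[block % NUM_LINES], block // NUM_LINES

    def _allocate(self, line, tag):
        # Evict the current occupant; write it back only if it is dirty.
        if line.valid and line.dirty:
            self.log.append("write-back")
        line.valid, line.dirty, line.tag = True, False, tag   # V=1, D=0
        self.log.append("fill")

    def access(self, addr, is_write):
        """Return True on a hit, False on a miss."""
        line, tag = self._lookup(addr)
        hit = line.valid and line.tag == tag
        if not hit:
            self._allocate(line, tag)   # read miss or write miss (write-allocate)
        if is_write:
            line.dirty = True           # D=1; memory is not touched
        return hit


cache = WriteBackCache()
cache.access(0, is_write=False)     # cold miss: fill
cache.access(0, is_write=True)      # write hit: line becomes dirty
cache.access(256, is_write=False)   # maps to the same line: dirty eviction
print(cache.log)
```

The third access maps to the same line as address 0, so the dirty line is written back before the new block is filled.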

3. Cache write policies

if data is already in the cache ...

  • No-write: writes invalidate the cache and go directly to memory
  • Write-through: writes go to main memory and cache
  • Write-back: CPU writes only to cache; the cache writes to main memory when the dirty block is later evicted.

if data is not in the cache

  • Write-allocate: allocate a cache line (put it in the cache) for new data (and perhaps write-through)
  • No-write-allocate: write it straight to memory without allocation

write-through vs. write-back

A cache with a write-through policy (and write-allocate) reads an entire block (cacheline) from memory on a cache miss and writes only the updated item to memory for a store. Evictions do not need to write to memory.

A cache with a write-back policy (and write-allocate) reads an entire block (cacheline) from memory on a cache miss, and may need to write back a dirty cacheline first. Any write to memory must be the entire cacheline, since with only a single dirty bit there is no way to distinguish which word was dirty. Evictions of a dirty cacheline cause a write to memory.

Write-through is slower but cleaner (memory is always consistent); write-back is faster but more complicated when multiple cores share memory, requiring a cache coherency protocol.
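
To make the traffic difference concrete, here is a rough model of the bytes written to memory when one cached line receives several stores and is then evicted. The function name `memory_write_bytes` and the 8-byte store size are assumptions for illustration; real hardware may merge or buffer writes.

```python
def memory_write_bytes(stores_to_line, policy, word_size=8, line_size=64):
    """Bytes written to memory for `stores_to_line` store hits to one
    cached line, followed by evicting that line (simplified model)."""
    if policy == "write-through":
        # Every store is forwarded to memory; the eviction writes nothing.
        return stores_to_line * word_size
    if policy == "write-back":
        # Stores stay in the cache; a dirty eviction writes the whole line.
        return line_size if stores_to_line > 0 else 0
    raise ValueError(f"unknown policy: {policy}")


for n in (1, 8, 100):
    wt = memory_write_bytes(n, "write-through")
    wb = memory_write_bytes(n, "write-back")
    print(f"{n:3d} stores: write-through={wt}B, write-back={wb}B")
```

Under these assumed sizes the two policies break even at 8 stores per line; beyond that, write-back sends less data to memory.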

4. Virtual or physical addr

TLBs are small (maybe 64 entries), fully-associative caches for page table entries.

physical enshroud vs. virtual cache

If we translate before we go to the cache, we have a "physical cache", which works on physical addresses.
Critical path = TLB access time + cache access time

Alternatively, we could translate after the cache (only for cache misses), giving a "virtual cache". A virtual cache is dangerous: we must flush the cache on a context switch to avoid "aliasing".

virtually indexed physically tagged

Page offset bits are not translated and thus can be presented to the cache immediately. Accordingly, cache and TLB accesses can begin simultaneously, and the tag comparison is made after both accesses are completed.

Fig.2 - Virtual index, physical tag.
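
One consequence of indexing with untranslated bits: if the set index and line offset are to come entirely from the page offset, the bytes covered by one way (cache size divided by associativity) cannot exceed the page size. The check below is a sketch under that assumption; the function name `vipt_index_fits` and the example sizes are illustrative, not taken from these notes.

```python
def vipt_index_fits(cache_size, ways, page_size=4096):
    """True if the set index plus line offset fit inside the page offset.

    Index and offset bits together address one way of the cache, so the
    condition is simply way_size <= page_size (all sizes in bytes).
    """
    way_size = cache_size // ways
    return way_size <= page_size


print(vipt_index_fits(32 * 1024, ways=8))   # 4 KiB per way, 4 KiB pages
print(vipt_index_fits(64 * 1024, ways=4))   # 16 KiB per way, 4 KiB pages
```

When the check fails, some index bits would have to come from the translated part of the address, and the lookup could no longer start before the TLB finishes without extra handling.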

5. Cache coherency

In a shared-memory multiprocessor system, an operand can have multiple copies in main memory and in caches. Cache coherence ensures that changes in the values of shared operands are propagated throughout the system in a timely fashion.

Coherence rules:

  • writes eventually become visible to all processors;
  • writes to the same location are serialized.
Fig.3 - Cache of CMP.

The most basic protocol is MSI.
MSI → MESI:

  • MSI observation: doing read-modify-write sequences on private data is common, and hence the traffic can be reduced for writes of blocks held in only one cache
  • MESI solution: adds an E state (exclusive, clean); writes on such lines happen silently (no need to tell other caches to invalidate the line) and move the line to M (exclusive, dirty)

MSI → MOSI:

  • MSI observation: on M→S transitions, the line must be written back
  • MOSI solution: adds an O state (owner), indicating that the current core owns this block and will service requests from other cores for the block.

MSI → MOESI:

  • achieves benefits of both MESI and MOSI

MESI adds an "Exclusive" state to reduce the traffic caused by writes of blocks that are in only one cache (a silent write in MESI). Further, MOSI adds an "Owned" state to reduce the traffic caused by write-backs of blocks that are read by other caches.

MESI protocol

Fig.4 - State transitions in MESI protocol.

Every cache line is marked with one of the four following states (coded in two bits):

  • Modified: the line is present only in the current local cache, and is dirty (memory copy is stale). A write-back must be performed in the future, before permitting any other read of the (no longer valid) memory.
    The write-back changes the line to the Shared state.
  • Exclusive: the line is present only in the current local cache, but is clean (matches main memory); it may be changed to Shared at any time, in response to a read request. It may also be changed to the Modified state when writing to it.
  • Shared: the line may also be stored in other caches and is clean (matches main memory). The line may be discarded (changed to the Invalid state) at any time.
  • Invalid: the cache line is invalid (unused).

Transitions (assume local is on core0 and remote is on core1):

  • I →: E (core0_R, no other cache holds the line) / S (core0_R, otherwise) / M (core0_W);
  • E →: M (core0_W) / S (core1_R) / I (core1_W);
  • M →: S (core1_R) / I (core1_W);
  • S →: M (core0_W) / I (core1_W).

A cache may satisfy a read from any state except Invalid. An Invalid line must be fetched (to the Shared or Exclusive states) to satisfy a read.

A write may only be performed if the cache line is in the Modified or Exclusive state. If it is in the Shared state, all other cached copies must be invalidated first. This is typically done by a broadcast operation known as Request For Ownership (RFO).

A cache may discard a non-Modified line (i.e., Shared or Exclusive) at any time, changing to the Invalid state. A Modified line must be written back first.

A cache that holds a line in the Modified state must snoop (intercept) all attempted reads (from all of the other caches in the system) of the corresponding main memory location and insert the data it holds. This is typically done by forcing the read to back off (i.e., retry later), then writing the data to main memory and changing the cache line to the Shared state.

A cache that holds a line in the Shared state must listen for invalidate or RFO broadcasts from other caches, and discard the line (by moving it into Invalid) on a match.

A cache that holds a line in Exclusive state must also snoop all read transactions from all other caches, and move the line into Shared state on a match.
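
The rules above can be condensed into a small state function for one line as seen by core0's cache. This is a simplified sketch: the function name `next_state` and the `others_have_copy` flag are hypothetical, and bus details such as the write-back on leaving Modified or the RFO before a write are only noted in comments.

```python
def next_state(state, event, others_have_copy=False):
    """Next MESI state of a line in core0's cache.

    event is one of "core0_R", "core0_W" (local accesses) or
    "core1_R", "core1_W" (snooped remote accesses).
    """
    if state not in ("M", "E", "S", "I"):
        raise ValueError(f"unknown state: {state}")
    if event == "core0_R":
        if state == "I":
            # Fetch the line: Exclusive if nobody else holds it, else Shared.
            return "S" if others_have_copy else "E"
        return state            # read hit in M, E or S
    if event == "core0_W":
        # From S or I this first requires an RFO to invalidate other copies.
        return "M"
    if event == "core1_R":
        # A Modified line is written back before it becomes Shared.
        return "I" if state == "I" else "S"
    if event == "core1_W":
        # A Modified line is written back first, then invalidated.
        return "I"
    raise ValueError(f"unknown event: {event}")


state = "I"
trace = [state]
for event in ("core0_R", "core0_W", "core1_R", "core1_W"):
    state = next_state(state, event)
    trace.append(state)
print(" -> ".join(trace))
```

The trace walks one line through a local read, a silent local write, a remote read and a remote write.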

Snooping cache

Snooping is widely used in bus-based multiprocessors. The cache controller constantly watches the bus.

  • Write invalidate: when a processor writes to local cache C, all copies of it in other processors are invalidated. These processors have to read a valid copy either from memory (M), or from the processor that modified the variable.
  • Write broadcast: instead of invalidating, the processor can broadcast the updated value to other processors sharing the copy. This acts as write-through for shared data, and write-back for private data.
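
The difference between the two snooping policies can be shown with a toy model in which each core holds at most one copy of a single variable. The function `snoop_write` and the dictionary layout are assumptions made for this sketch; `None` stands for an invalid (or absent) copy.

```python
def snoop_write(copies, writer, value, policy):
    """Return the per-core copies after `writer` stores `value`.

    copies maps core name -> cached value, with None meaning invalid.
    policy is "invalidate" or "broadcast".
    """
    if policy not in ("invalidate", "broadcast"):
        raise ValueError(f"unknown policy: {policy}")
    result = dict(copies)
    result[writer] = value
    for core in result:
        if core == writer or result[core] is None:
            continue            # nothing to do for the writer or invalid copies
        # Sharers either lose their copy or receive the new value.
        result[core] = None if policy == "invalidate" else value
    return result


copies = {"core0": 1, "core1": 1, "core2": None}
print(snoop_write(copies, "core0", 2, "invalidate"))
print(snoop_write(copies, "core0", 2, "broadcast"))
```

With invalidation, core1 must re-fetch the value on its next read; with broadcast, core1 already holds the new value, at the cost of bus traffic on every shared write.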



Source: https://people.cs.pitt.edu/~xianeizhang/notes/cache.html