Cache synchronization ensures that all cached copies of data reflect the latest state of the data stored in the “source of truth,” which is usually a database. Write policies dictate how data updates are propagated between the cache and that source.
Cache Synchronization Techniques, aka Write Policies
1. Write-Through Caching
Steps:
- Data Update Request: The application initiates a request to modify data.
- Synchronous Writes: The data is written to both the cache and the underlying data source (database) as part of the same operation.
- Operation Complete: The write is acknowledged only after both the cache and the data source have been successfully updated.
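The steps above can be sketched in a few lines of Python. A plain dict stands in for the database, and the class name is illustrative rather than any real library's API:

```python
# Write-through sketch: every write goes to the source AND the cache
# before the call returns, so the two can never diverge.

class WriteThroughCache:
    def __init__(self, db):
        self.db = db          # the "source of truth" (a plain dict here)
        self.store = {}       # the cache itself

    def write(self, key, value):
        # Write to the data source and the cache together; the call
        # returns only after both have been updated.
        self.db[key] = value
        self.store[key] = value

    def read(self, key):
        # Reads are served from the cache, which is always up to date
        # for any key written through this class.
        if key not in self.store and key in self.db:
            self.store[key] = self.db[key]
        return self.store.get(key)
```

After `write("user:1", "Ada")`, both `db["user:1"]` and `read("user:1")` return the same value, which is the strong-consistency property described above.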
 
Pros:
- Strong Consistency: The cache is always in sync with the data source.
- Simple to Implement: No need for complex invalidation or eventual-consistency mechanisms.

Cons:
- Performance Overhead: Every write incurs the cost of both a cache update and a database write, adding latency to write operations.

When to Use:
- Your system needs strong data consistency and can tolerate slower writes.
- You want simple caching logic.
 
 
2. Write-Back Caching
Steps:
- Data Update Request: The application initiates a request to modify data.
- Cache Write: The data is first written to the cache.
- Acknowledgement: The application receives an immediate acknowledgement, allowing it to proceed.
- Asynchronous Write-Back: In the background, the updated data is written back to the underlying data source, often using a write-back queue for efficiency.
 
Note: A “dirty bit” is often used to mark data that has been changed but not yet written back to the storage.
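The steps and the dirty bit can be sketched as follows. A dict stands in for the database, and `flush()` plays the role of the background write-back worker; the names are illustrative only:

```python
from collections import deque

class WriteBackCache:
    def __init__(self, db):
        self.db = db
        self.store = {}
        self.dirty = set()     # keys whose "dirty bit" is set
        self.queue = deque()   # write-back queue

    def write(self, key, value):
        # Cache write + immediate acknowledgement; the source is
        # updated later, so it temporarily lags behind the cache.
        self.store[key] = value
        if key not in self.dirty:
            self.dirty.add(key)
            self.queue.append(key)

    def flush(self):
        # Normally run asynchronously by a background worker.
        while self.queue:
            key = self.queue.popleft()
            self.db[key] = self.store[key]
            self.dirty.discard(key)
```

Between `write()` and `flush()` the database does not yet contain the update, which is exactly the eventual-consistency window (and the data-loss risk) discussed below.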
Pros:
- Excellent Write Performance: Writes are absorbed by the fast cache, deferring slower source updates.
- Reduced Load on Data Source: Fewer database writes can improve overall system throughput.

Cons:
- Eventual Consistency: There might be a delay before the source reflects changes made in the cache.
- Risk of Data Loss: If the cache fails before a write-back completes, those updates are lost.
- Complexity: Requires mechanisms to manage write-back queues and handle potential cache coherence issues in distributed setups.

When to Use:
- Write-heavy workloads where immediate consistency is not strictly required.
- When offloading writes from the database or primary data source is necessary to improve performance.
 
 
3. Cache Aside Pattern
Steps:
- Cache Read Attempt: The application attempts to read data from the cache.
- Cache Miss: If the data is found (a cache hit), it is returned immediately; otherwise (a cache miss), continue with the remaining steps.
- Fetch from Source: The application retrieves the data from the original data source (database).
- Update Cache: The fetched data is written into the cache for future access.
- Validation Check (Optional): Before returning the data, a quick check can be performed against the data source to ensure it hasn’t changed since retrieval.
- Return Data: The data (either from the cache or the updated source) is returned to the application.
 
Pros:
- Balances Performance and Consistency: Avoids unnecessary source interactions on cache hits while offering a mechanism to maintain acceptable freshness.
- Flexibility: The optional validation check before returning data provides some control over staleness tolerance.

Cons:
- Increased Complexity: Requires implementing fetch, write, and validation logic in the application.
- Overhead on Cache Misses: Every miss requires a fetch from the source, which is slower than serving from the cache.

When to Use:
- Occasional slight staleness is tolerable in exchange for performance benefits on cache hits.
- Systems where granular control over data staleness is desired for different types of data.
 
 
4. Cache Invalidation
Steps:
- Data Change: The underlying data source is updated (e.g., database record modified).
- Invalidation Trigger: The data source or an associated system sends an invalidation message (e.g., broadcast notification, webhook, update to a timestamp value).
- Caches React: Caches holding copies of the modified data mark those copies as invalid.
- Force Refetch: On the next read request for the invalidated data, caches fetch the latest version from the original data source.
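The steps above can be sketched with a source that pushes invalidation messages to subscribed caches. The class names and the in-process subscriber list are illustrative stand-ins for whatever messaging mechanism (broadcast, webhook, pub/sub) a real system would use:

```python
# Cache invalidation sketch: on every update, the source notifies all
# subscribed caches, which drop their stale copies and refetch lazily.

class Source:
    def __init__(self):
        self.rows = {}
        self.subscribers = []

    def update(self, key, value):
        self.rows[key] = value
        for cache in self.subscribers:   # invalidation trigger
            cache.invalidate(key)

class InvalidatingCache:
    def __init__(self, source):
        self.source = source
        self.store = {}
        source.subscribers.append(self)

    def invalidate(self, key):
        self.store.pop(key, None)        # drop the now-stale copy

    def read(self, key):
        if key not in self.store:        # forced refetch after invalidation
            self.store[key] = self.source.rows.get(key)
        return self.store[key]
```

Note that invalidation only marks data stale; the refetch is deferred until the next read, so a cache never serves an outdated copy but also does no work for keys nobody asks for again.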
 
Pros:
- Enforces Consistency: Actively ensures caches don’t serve stale data after the source updates.

Cons:
- Overhead: Invalidation messages or checks can add network traffic or processing overhead.
- Complexity: In distributed systems, reliably propagating invalidation events can be complex.

When to Use:
- Systems with strict consistency requirements where data must accurately reflect changes in the source in a timely manner.
- Often used in conjunction with other techniques (write-back or cache-aside) to trigger cache updates upon data changes.