In computing, a cache is a high-speed data storage layer that stores a subset of data, typically transient in nature, so that future requests for that data are served faster than from its primary storage location. Caching makes it possible to efficiently reuse previously retrieved or computed data.

How does caching work?

Data in a cache is usually stored in fast-access hardware such as RAM and may also be used together with a software component. The main purpose of a cache is to speed up data retrieval by removing the need to access the slower underlying storage layer.

Trading capacity for speed, a cache typically stores only a subset of the data, and only temporarily, unlike databases, where data is usually complete and durable.
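As a minimal illustration of reusing previously computed data, the sketch below (Python, standard library only; the function name and key are invented for illustration) memoizes an expensive operation in an in-process cache, so the second request is served from memory:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)  # keep up to 1024 results in fast local memory
def expensive_lookup(key: str) -> str:
    time.sleep(0.5)        # stand-in for a slow disk or network read
    return f"value-for-{key}"

start = time.perf_counter()
expensive_lookup("user:42")          # cache miss: pays the full cost
first = time.perf_counter() - start

start = time.perf_counter()
expensive_lookup("user:42")          # cache hit: answered from memory
second = time.perf_counter() - start

print(f"miss: {first:.3f}s, hit: {second:.6f}s")
```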

Caching overview

RAM and in-memory services. Because RAM and in-memory services support high request rates, or IOPS (input/output operations per second), caching improves data retrieval performance and reduces cost at scale. Achieving comparable scale with traditional databases and disk-based hardware requires additional resources, which drives up cost while still not reaching the low latency that an in-memory cache provides.

Applications. Caches are used across many technology layers, including operating systems, networking layers such as content delivery networks (CDNs) and DNS, web applications, and databases. Caching can significantly reduce latency and increase IOPS for many read-heavy workloads, such as Q&A portals, games, media sharing services, and social networks. Cached information can include the results of database queries, computationally intensive calculations, API requests and responses, and web artifacts such as HTML, JavaScript, and image files. Compute-intensive workloads that manipulate large data sets, such as recommendation engines and high-performance computing simulations, also benefit from an in-memory data layer acting as a cache. In these applications, very large data sets must be accessed in real time across clusters of machines that can span hundreds of nodes. The speed of the underlying disk-based hardware is a significant bottleneck for such applications.
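For example, the result of a database query can be cached so repeated reads skip the database entirely. The following self-contained sketch uses Python's built-in sqlite3 module, with a plain dict standing in for the cache layer (the table, data, and key format are invented for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO products VALUES (1, 'widget')")

query_cache = {}  # stands in for an in-memory cache such as Redis/Memcached

def get_product_name(product_id):
    cache_key = f"product:{product_id}:name"
    if cache_key in query_cache:               # cache hit: no DB access
        return query_cache[cache_key]
    row = db.execute(
        "SELECT name FROM products WHERE id = ?", (product_id,)
    ).fetchone()                                # cache miss: query the DB
    query_cache[cache_key] = row[0]
    return row[0]

print(get_product_name(1))  # miss: reads from SQLite
print(get_product_name(1))  # hit: served from the cache
```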

Design patterns. In a distributed computing environment, a dedicated caching layer allows systems and applications to run independently of the cache, each with its own life cycle, without risk of affecting the cache. The cache serves as a central layer with its own life cycle and architectural topology that diverse systems can access. This is especially relevant when application nodes can be dynamically scaled in and out. If the cache resides on the same node as the applications or systems that use it, scaling can harm the integrity of the cache. In addition, a local cache benefits only the local application consuming its data. In a distributed caching environment, the data spans multiple cache servers and is stored in a central location for the benefit of all consumers.
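In practice, a dedicated cache layer often means a network service such as Redis that every application node talks to. The hedged sketch below assumes the third-party redis-py package is installed and a Redis server is reachable on localhost (both assumptions, not stated above); the key scheme is illustrative:

```python
from typing import Optional

import redis  # third-party package: pip install redis

# All application nodes connect to the same external cache layer,
# so scaling app servers in or out does not discard cached data.
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_session(session_id: str) -> Optional[str]:
    # Any node in the fleet sees the same value for this key.
    return cache.get(f"session:{session_id}")

def put_session(session_id: str, payload: str) -> None:
    cache.set(f"session:{session_id}", payload)
```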

Recommendations for caching. When implementing a cache layer, consider the validity of cached data. An effective cache achieves a high hit rate, meaning the requested data is usually present in the cache; a cache miss occurs when it is not. Mechanisms such as TTL (time to live) can be applied to expire stale data. Also consider whether the caching environment requires high availability; if so, in-memory services such as Redis can provide it. In some cases, an in-memory layer can serve as a standalone data storage layer rather than as a cache in front of primary storage. To decide whether that is appropriate, define the RTO (recovery time objective: how long the system takes to recover from a failure) and RPO (recovery point objective: the last point or transaction that can be restored) for the data in the in-memory service. The characteristics and design strategies of different in-memory services can be matched against most RTO and RPO requirements.
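To make TTL-based expiry and hit rate concrete, here is a small self-contained toy sketch of a TTL cache that also tracks its hit rate; it is illustrative, not a production implementation (with redis-py, the same expiry comes from setting an expiration, e.g. `cache.set(key, value, ex=300)`):

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}                         # key -> (expires_at, value)
        self.hits = self.misses = 0

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None:
            expires_at, value = entry
            if time.monotonic() < expires_at:   # entry is still fresh
                self.hits += 1
                return value
            del self.store[key]                 # expired: evict stale data
        self.misses += 1
        return None

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

cache = TTLCache(ttl_seconds=0.1)
cache.set("k", "v")
cache.get("k")        # hit
time.sleep(0.2)
cache.get("k")        # miss: the entry has expired
print(f"hit rate: {cache.hit_rate():.0%}")
```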

Level: Client side
Example of use: Speeding up retrieval of content from websites (browsers or mobile devices)
Technology: Cache management with HTTP headers (browsers)
Solutions: Browser-specific

Level: DNS
Example of use: Resolving domain names to IP addresses
Technology: DNS servers
Solutions: Amazon Route 53

Level: Internet
Example of use: Speeding up retrieval of content from web and application servers; web session management (server side)
Technology: Cache management with HTTP headers (CDNs, reverse proxies, web accelerators, key-value stores)
Solutions: Amazon CloudFront, ElastiCache for Redis, ElastiCache for Memcached, partner solutions

Level: Application
Example of use: Improving application performance and data access speed
Technology: Key-value stores, local caches
Solutions: Application frameworks, ElastiCache for Redis, ElastiCache for Memcached, partner solutions

Level: Database
Example of use: Reducing latency of database query processing
Technology: Database buffers, key-value stores
Solutions: ElastiCache for Redis, ElastiCache for Memcached
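The client and Internet levels above manage caching with HTTP headers. As a minimal standard-library sketch (the handler, response body, and port are arbitrary), a server can opt a response into browser and CDN caching like this:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class CacheableHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello, cached world"
        self.send_response(200)
        # Tell browsers and CDNs they may reuse this response for 5 minutes.
        self.send_header("Cache-Control", "public, max-age=300")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), CacheableHandler).serve_forever()
```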

The benefits of caching

Improved application performance

Because RAM is many times faster than disks (both magnetic and SSD), reading data from an in-memory cache takes well under a millisecond. This dramatically speeds up data access and improves overall application performance.

Reducing the cost of the database

A single cache instance can serve hundreds of thousands of IOPS (input/output operations per second) and can potentially replace several database instances, lowering overall cost. This is especially significant when the underlying database charges by throughput; in such cases the savings can reach tens of percent.

Reducing the load on the server side

By redirecting much of the read load from the server-side database to the in-memory layer, caching reduces the overall load on the database and also reduces the likelihood of slowdowns or crashes during peak periods.
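The read-offloading effect can be quantified by counting how many requests actually reach the database. The sketch below (standard library only, with invented names) replays a skewed read workload through a cache-in-front-of-DB read path and reports the share of reads the cache absorbed:

```python
import random

db_reads = 0
cache = {}

def read_from_db(key):
    global db_reads
    db_reads += 1                 # each call here is load on the database
    return f"row-{key}"

def cached_read(key):
    if key not in cache:          # miss: fall through to the database
        cache[key] = read_from_db(key)
    return cache[key]

random.seed(0)
requests = 10_000
for _ in range(requests):
    # Skewed access pattern: a few keys dominate, as in real read-heavy apps.
    cached_read(int(random.paretovariate(1.5)) % 100)

print(f"{requests} reads, {db_reads} reached the DB "
      f"({100 * (1 - db_reads / requests):.1f}% absorbed by the cache)")
```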

Predictable performance

A common challenge for modern applications is coping with periods of peak load: social apps during major sporting events or on election days, online stores on sale days, and similar workloads all face it. Increased database load raises the latency of query responses, making overall application performance unpredictable. The high throughput of an in-memory cache effectively solves this problem.

Eliminating database hot spots

In many applications, a small subset of the data, such as a celebrity profile or a popular product, is requested far more often than the rest. This can create "hot spots" in the database and force over-provisioning of database resources sized to the throughput demands of the most frequently requested data. Keeping frequently used keys in an in-memory cache avoids that over-provisioning while keeping access to the hottest data fast and reliable.

Increased read throughput (IOPS)

In addition to lower latency, in-memory storage systems offer much higher request rates (IOPS) than comparable disk-based databases. A single instance used as a distributed side cache can serve hundreds of thousands of requests per second.
