What is a Cache?
Caching tremendously improves application performance and is inexpensive to implement at scale. Let us see how caching works and why it is so widely used.
Full-Stack’s previous post covered DNS and how it works. One of the components we came across there was the “DNS lookup in cache memory”. Let us dive deep into what cache memory is and how it helps in computing.
Caching is the process of storing data or files in a temporary memory space (a cache), which can be accessed later to speed up computation.
In this week’s post, we’ll cover caching, its types, and how it effectively improves computing speed.
What is a Cache?
A cache is a high-speed data storage layer that is used to store a subset of data for efficient reuse. This helps the system avoid accessing the data from the primary location every time.
Caching is primarily used to reuse previously retrieved or computed data.
Trade-off: high speed is attained at the cost of capacity; a cache is fast precisely because it holds far less than the primary store.
Data stored in a database is complete, durable, and contains the entire dataset.
Cache storage is a transient data store that holds only a subset of that data: the part that is accessed frequently.
One of my professors explained “Caching” with a demonstration. It has stayed in my mind to date.
He made one of the students stand up and asked, out of the blue, “What is the capital of Australia?”
The student took 5-10 seconds to think and answered “Sydney…Oh no no! Melbourne”.
Then the professor said “Okay”, and continued with his class. After 5 minutes, he again asked the same student, “Now, tell me the capital of Australia”.
The student instantly replied “Melbourne!”.
The professor then explained, “When I asked you the first time, you had to access your main memory to find the answer. But, when I asked you again, you just used the same answer that you previously retrieved.
This is Caching!
You will not keep this information in your memory for a long time. But you can access it immediately till it is stored there.”
By the way, the correct answer is “Canberra!”.
How does caching work?
Let us take a browser or an application as an example. When a user visits a new website, the browser fetches the website data and static files from the server.
It stores this information in the browser’s cache. Whenever the user visits the website again, the browser serves it instantly from the cache.
Cache memory offers extremely low latency.
1) User requests for data
The user requests particular data: website content, application assets, or records from a database. The system first checks whether the requested data is available in the cache.
2) Data served immediately
If the requested resource is found in the cache (a cache hit), it is served to the user instantly. If not (a cache miss), the system skips this step and proceeds to step 3.
3) Access data from main memory
Now the system fetches the data from the main memory. The main purpose of a cache is to reduce how often this step happens, which is why frequently fetched data is retained in the cache.
4) Data is stored in the cache and served to the user
The fetched data is stored in the cache and then served to the user. If the user requests the same data again, steps 3 and 4 can be skipped, significantly reducing latency.
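To make these four steps concrete, here is a minimal sketch of this look-aside flow in Python. The `fetch_from_main_memory` helper is hypothetical, standing in for whatever slow primary store the system uses:

```python
import time

cache: dict[str, str] = {}  # our high-speed storage layer

def fetch_from_main_memory(key: str) -> str:
    """Hypothetical stand-in for the slow primary store (database, origin server...)."""
    time.sleep(0.1)  # simulate the expensive lookup we want to avoid
    return f"value-for-{key}"

def get(key: str) -> str:
    if key in cache:                         # step 2: cache hit, serve instantly
        return cache[key]
    value = fetch_from_main_memory(key)      # step 3: fall back to main memory
    cache[key] = value                       # step 4: store in cache for next time
    return value

get("homepage")  # slow: misses the cache, hits main memory
get("homepage")  # fast: served straight from the cache
```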
TTL (Time To Live)
Time To Live (TTL) is a term that comes up constantly in caching. It is the amount of time an item is allowed to stay in the cache before it expires, and there are different expiry and eviction strategies for different use cases.
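As a rough illustration, here is a minimal sketch of TTL-based expiry in Python, using a plain dictionary that maps each key to its expiry time and value (all names are made up for the example):

```python
import time

ttl_cache: dict[str, tuple[float, str]] = {}  # key -> (expires_at, value)

def set_with_ttl(key: str, value: str, ttl_seconds: float) -> None:
    ttl_cache[key] = (time.time() + ttl_seconds, value)

def get_if_fresh(key: str) -> str | None:
    entry = ttl_cache.get(key)
    if entry is None:
        return None                 # never cached
    expires_at, value = entry
    if time.time() >= expires_at:   # the entry has outlived its TTL
        del ttl_cache[key]
        return None
    return value
```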
Types of caching
There are four major types of caching in web development.
1) Web caching
Web caching involves both browser caching and proxy/gateway caching.
Browser caching helps users quickly navigate pages they have recently visited. It relies on headers such as Cache-Control and ETag, which instruct the user’s browser to cache certain files for a certain period.
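For instance, a server can set these headers on its responses. The sketch below uses Flask purely for illustration; the route, max-age, and ETag value are assumptions, not a prescription:

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/logo.png")
def logo():
    resp = make_response(b"...image bytes...")
    # Tell the browser it may reuse this file for up to one hour.
    resp.headers["Cache-Control"] = "public, max-age=3600"
    # Version tag the browser can use to revalidate cheaply later.
    resp.headers["ETag"] = '"v1"'
    return resp
```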
Proxy/gateway caching is shared across a large set of users, so it suits data that does not change frequently. A good example is cached DNS data used to resolve IP addresses for a domain name.
2) Database caching
For any database-driven application, DB caching is a great technique for reducing direct queries to the database. With most DB solutions, the results of frequently used queries are cached to reduce request latency.
When data is changed or added in the database, it is good practice to invalidate the corresponding cache entries to avoid serving stale data.
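Here is a minimal sketch of that pattern using Python’s built-in sqlite3 and a plain dictionary as the cache; the `users` table and helper names are hypothetical:

```python
import sqlite3

query_cache: dict[str, list] = {}

def get_user(conn: sqlite3.Connection, user_id: int) -> list:
    key = f"user:{user_id}"
    if key in query_cache:
        return query_cache[key]  # serve the cached query result
    rows = conn.execute(
        "SELECT * FROM users WHERE id = ?", (user_id,)
    ).fetchall()
    query_cache[key] = rows      # remember it for next time
    return rows

def rename_user(conn: sqlite3.Connection, user_id: int, name: str) -> None:
    conn.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
    conn.commit()
    query_cache.pop(f"user:{user_id}", None)  # clear the stale entry
```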
3) Application caching
Application caching uses server-level techniques to cache fully rendered (raw) HTML. This drastically reduces page load times.
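In-process HTML caching can be sketched in Python with functools.lru_cache; the render function below is a made-up placeholder (real deployments often cache at the web server or proxy instead), but the idea is the same:

```python
from functools import lru_cache

@lru_cache(maxsize=128)  # keep up to 128 rendered pages in memory
def render_page(path: str) -> str:
    # Imagine an expensive template render and DB lookups here.
    return f"<html><body>Rendered {path}</body></html>"

render_page("/home")  # rendered once
render_page("/home")  # served from the in-process cache
```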
4) Distributed caching
Distributed caching is the most commonly used type of caching for high-volume applications like YouTube and Google.
A distributed cache is an in-memory data store shared by multiple servers, typically built on a cluster of cheaper machines whose only job is to serve memory.
Once the cluster of machines is set up:
- New machines can be added to the cluster without disrupting application availability.
- Multiple servers can pull data out of the same cache.
Examples of distributed caching systems include Memcached, Redis, and EhCache.
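As a small taste, here is a minimal sketch using the redis-py client; it assumes a Redis server is reachable at localhost:6379, and the key and TTL are arbitrary examples:

```python
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)

# Store a value with a 300-second TTL; any server in the
# cluster that talks to this Redis instance can read it.
r.setex("session:42", 300, "user-data")

print(r.get("session:42"))  # b'user-data' until the TTL expires
```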
We will discuss more about Memcached and Redis, the most widely used caching systems, in future posts.
I hope you now have an understanding of what caching is and how it reduces system latency.
Until next time!