That Conference 2018 – An Extended Explanation of Caching

That Conference 2018, Kalahari Resort, Lake Delton, WI
An Extended Explanation of Caching – Tom Cudd

Day 3, 8 Aug 2018 2:30 PM

Disclaimer: This post contains my own thoughts and notes based on attending That Conference 2018 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of the speakers.

Executive Summary

A number of places where you can do caching
Various tools for various places
Really all about 1) populating, 2) invalidating

Caching examples

Network tab in browser, pages from memory or disk cache
Content from CDN
Special applications – e.g. Varnish
WordPress plugin – specify cache settings for blog

Caching is Like Regex

If you’re not careful, you’re going to have new problems

Application Architecture

“Christmas tree” showing server, cache servers, CDN, load balancers, etc.

Problem

Caching solves one problem–performance
3-sec rule: user leaves if something takes longer than 3 secs
- 40% of users leave

Another reason

Buy some time
- e.g. before application crashes
Cost-benefit analysis–caching costs vs downtime

Measure First

User metrics
Load times
Site not crashing every 5 minutes
S.M.A.R.T. Goals – specific measurable achievable relevant time-bound

How to Not Suck at Caching

Caching is additive
- Can’t just throw caching onto server that’s already overloaded
Don’t over-engineer
- Use the simplest solution that satisfies the requirements
Measure, change, test, measure

Caching Doesn’t Help

Oversaturation
- Network hardware
- Thread death–“virus”, creating multiple threads for each request
Thundering herd
- If you get large number of initial requests, a large number of them go all the way back to server, since it takes a little time to cache the data
Lack of information
Bad decisions
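
One common mitigation for the thundering herd described above is to let only one request per key go back to the origin while the others wait for its result. The talk did not prescribe a specific fix; what follows is a minimal sketch, assuming a thread-based server. The names CoalescingCache and slow_origin are hypothetical.

```python
import threading
import time


class CoalescingCache:
    """Cache that lets only one caller per key run the slow loader."""

    def __init__(self, loader):
        self._loader = loader            # hypothetical slow origin call
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()   # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            lock = self._key_locks.setdefault(key, threading.Lock())
        # Only one thread per key runs the section below at a time.
        with lock:
            with self._guard:
                if key in self._values:  # filled while we were waiting
                    return self._values[key]
            value = self._loader(key)
            with self._guard:
                self._values[key] = value
            return value


origin_calls = []


def slow_origin(key):
    origin_calls.append(key)
    time.sleep(0.05)  # simulate an expensive page render
    return f"page for {key}"


cache = CoalescingCache(slow_origin)
threads = [threading.Thread(target=cache.get, args=("/home",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(origin_calls))  # 1: twenty requests, one trip to the origin
```

The waiting requests still take as long as the first load, but the origin only does the work once.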

End Users

Browser Caching

Setting response headers–server tells browser to cache a page
Requires a hit to operating system
Unique naming
- Rename an asset to force download of a long-cached object
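
A typical way to get unique names is to put a hash of the file contents into the filename, so the name changes only when the content does. This is a minimal sketch of that idea; the function name fingerprinted_name and the 8-character digest length are assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(path: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = fingerprinted_name("css/site.css", b"body { color: black; }")
new = fingerprinted_name("css/site.css", b"body { color: navy; }")
print(old)
print(new)
print(old != new)  # True: changed content yields a new URL
```

Because the browser has never seen the new name, it downloads the file even if the old one was cached for a year.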

Set with Web Server

E.g. set Cache-Control for certain pages
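
In practice this is usually a few lines of web server configuration. To show the decision itself, here is a sketch in Python of choosing a Cache-Control value per path. The extension list, the /account/ prefix, and the max-age values are assumptions for illustration, not settings from the talk.

```python
from pathlib import PurePosixPath

# Illustrative policy only; tune the lifetimes for your own site.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".woff2"}


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for a request path."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        # Long-lived: pairs well with uniquely named assets.
        return "public, max-age=31536000, immutable"
    if path.startswith("/account/"):
        # Per-user pages should not be stored by shared caches or browsers.
        return "no-store"
    # Everything else: cache briefly so edits show up within minutes.
    return "public, max-age=300"


for path in ("/css/site.css", "/account/orders", "/blog/caching"):
    print(path, "->", cache_control_for(path))
```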

External/Edge Services

CDN – Content Delivery Network
- Point your URI to 3rd party server
- They then pull files, as needed, from your server
Most big sites on the web run on CDN
Improve page load time
- Serve request from server that has shortest travel time to client
High availability
- Cached data could still be present even when your origin server hiccups
Can maybe buy CDN services at lower cost than scaling out actual servers

CDN Features

Setting TTLs, other configs based on extension
Region, language routing
Mobile detection
WAF/Security
- Applications can do double duty–firewall + CDN
- DDOS protection

CDN Providers

Cloudflare
Incapsula
Cloud providers
Akamai
Fastly

Akamai

Match file extensions
Honor cache control of origin
Can have configs for some stuff on your server

Measure and Test

Your Systems

Don’t over-engineer–get strange results
Spin up different instance if you need different purpose

Varnish

Separate servers
Both memory and disk paging

Web/App Layer

Baked vs fried
- Building on demand vs preloading cache
- Cache generated on demand when request hits (fried)
- Preload cache ahead of time (baked)
Disk vs memory
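
The baked vs fried distinction can be sketched with a small in-memory cache: warm() bakes entries ahead of time, and get() fries an entry on the first request that misses. PageCache and render are hypothetical names, and the render function stands in for whatever expensive page generation the application does.

```python
def render(page: str) -> str:
    """Stand-in for an expensive page render."""
    return f"<html>{page}</html>"


class PageCache:
    def __init__(self, render_fn):
        self._render = render_fn
        self._pages = {}
        self.renders = 0  # counts how often we had to do the real work

    def warm(self, pages):
        """Baked: render and store pages before any request arrives."""
        for page in pages:
            self._store(page)

    def get(self, page):
        """Fried: render on demand the first time a page is requested."""
        if page not in self._pages:
            self._store(page)
        return self._pages[page]

    def _store(self, page):
        self.renders += 1
        self._pages[page] = self._render(page)


cache = PageCache(render)
cache.warm(["home", "about"])   # baked at deploy time
cache.get("home")               # served from the baked entry
cache.get("contact")            # fried on first request
cache.get("contact")            # now cached
print(cache.renders)            # 3
```

Baking costs time up front and may render pages nobody asks for; frying makes the first visitor to each page pay the rendering cost.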

Products

Adobe Experience Manager
Sitecore
- Prefetch, data, item, html caches
Drupal
- Performance caching (SQL queries stored)

Output Caching

IIS – generate on request

App/Data Layer

Reducing database calls
Reducing API calls

Ehcache

Java based applications
Memory or disk

Memcached

In memory, key/value store
Reduce database load
Not synchronized
Scalable separately
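
To show how a key/value cache reduces database load, here is a sketch of the usual check-the-cache-first pattern. TTLCache is an in-process stand-in written for this example, not the Memcached client API, and query_user is a hypothetical database call.

```python
import time


class TTLCache:
    """In-process stand-in for a key/value cache such as Memcached."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (self._clock() + ttl_seconds, value)


db_queries = 0


def query_user(user_id):
    """Hypothetical database call; counts how often it runs."""
    global db_queries
    db_queries += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:                      # miss: go to the database
        user = query_user(user_id)
        cache.set(key, user, ttl_seconds=60)
    return user


cache = TTLCache()
get_user(cache, 7)
get_user(cache, 7)
print(db_queries)  # 1: the second call was served from the cache
```

The 60-second TTL is an arbitrary choice here; it bounds how stale a cached row can get if nothing invalidates it sooner.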

Let’s Not Fight

Not worth arguing about which is better
Memcached, Redis, Mongo
All are good

Database

Redis
Firebase
Stored procedures – reduce network bandwidth by calling short sproc rather than sending long query

Challenges

Invalidating stale data

Caching Procedures

Only two things you’ll really need to do with caching
- Populating
- Invalidating

Debugging, need to remind people to clear their cache

[Sean] But should users have to worry about this? If you need to have them clear cache, then something’s not working properly

Populating Caches

Pull based – cache tool pulls from your server
Push based – your code has to push out data to the cache

Cache clearing

Staggered approaches
Clearing individual items
Using APIs and automation
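
Two of the ideas above can be sketched briefly. This is my own reading of "staggered", not something shown in the talk: spread expiry times so entries do not all go stale together, and purge individual keys in small batches rather than flushing everything. The function names, the plain dict standing in for the cache, and the 20% spread are assumptions.

```python
import random


def jittered_ttl(base_seconds: float, spread: float = 0.2, rng=random) -> float:
    """Return the base TTL plus or minus up to `spread` (as a fraction)."""
    return base_seconds * (1 + rng.uniform(-spread, spread))


def purge_in_batches(cache: dict, keys, batch_size: int, pause=lambda: None) -> int:
    """Delete specific keys a few at a time; return how many were removed."""
    keys = list(keys)
    removed = 0
    for start in range(0, len(keys), batch_size):
        for key in keys[start:start + batch_size]:
            if key in cache:
                del cache[key]
                removed += 1
        pause()  # e.g. a short sleep so the origin can refill gradually
    return removed


cache = {f"/products/{i}": f"page {i}" for i in range(10)}
stale = ["/products/1", "/products/2", "/products/3", "/missing"]
removed = purge_in_batches(cache, stale, batch_size=2)
print(removed, len(cache))  # 3 7

ttl = jittered_ttl(300)
print(240 <= ttl <= 360)    # True
```

A function like purge_in_batches is the kind of thing a deploy script or CMS publish hook could call, so nobody has to clear caches by hand.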
