How to avoid a page being cached

All web programmers have probably had trouble with browsers caching pages they ought not to. So what can we do about it? Well, in good old HTTP 1.0 we had a nice header that simply said:

Pragma: no-cache

Easy, huh? Yes. Probably too easy. If not browsers, then surely some proxy server will disobey that simple command and require that we explain it to them more thoroughly. This brings on the next HTTP header:

Expires: -1

Actually any invalid date format will do; the meaning should be interpreted as “this page has ceased to be” [mental image of John Cleese banging a parrot on the desk]. The only problem is that some misbehaving browsers and proxies still interpret this as “well, you might have written an erroneous date, so we play nice and cache the page for you anyway”. Cue HTTP 1.1 and we have another header:

Cache-control: no-cache

Oh, remember this directive? Easy, huh? Heard it before. Yes, it’s too easy to be true as well. The problem with this one is that some misbehaving reverse proxies apparently fail to deliver these pages at all, seemingly unable to forward a page they are not allowed to save. At least in my case it was a reverse proxy that seemed to think very little of pages it wasn’t allowed to keep. We had to give it “Cache-control: private” in order for it to actually pass the page on. The obvious problem with this is that it no longer prohibits the end user agent (as opposed to a proxy in the middle) from caching the page.

Now all available headers have failed in some way. Add to this that someone using HTTP 1.0 might try to send a Cache-control header, which will fail because it is not part of 1.0, or, in reverse, someone using 1.1 might send a Pragma header, which might be ignored because it was replaced by Cache-control in 1.1.
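One way to sidestep the version mismatch is to send all three headers on every response and let each client or proxy pick the ones it understands. Here is a minimal sketch using only Python's standard http.server module. The names no_cache_headers, NoCacheHandler and serve, the page body and the port are all made up for illustration; treat it as a rough outline, not as the way any particular server should be configured.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def no_cache_headers():
    """Return the three headers discussed above as (name, value) pairs."""
    return [
        ("Pragma", "no-cache"),         # aimed at HTTP 1.0 clients and proxies
        ("Expires", "-1"),              # invalid date, meant to read as "already expired"
        ("Cache-Control", "no-cache"),  # aimed at HTTP 1.1 clients and proxies
    ]


class NoCacheHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that attaches every no-cache header to each GET."""

    def do_GET(self):
        body = b"<html><body>Fresh every time (hopefully)</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        for name, value in no_cache_headers():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start a local test server; the port is an arbitrary assumption."""
    HTTPServer(("127.0.0.1", port), NoCacheHandler).serve_forever()
```

Calling serve() starts the server locally. If a reverse proxy in front of it behaves like the one I ran into, swapping the Cache-Control value for "private" is a one-line change in no_cache_headers.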

What is a programmer to do? Well, since proxies have taught me not to rely on normal HTTP headers, the next step is into HTML and the http-equiv META tags. Let’s blast the browser with everything we have:

<meta http-equiv="Expires" content="-1">
<meta http-equiv="Pragma" content="no-cache">
<meta http-equiv="Cache-Control" content="no-cache">

Now no proxy should ever interfere with our headers. The problem with Cache-control and Pragma remains: if you use HTTP 1.0 the former is ignored, and in 1.1 the latter. If we include both we are safe, at least until they decide to change the whole thing in a future 1.2 version. We also send the Expires tag, which should make its way all the way to the browser without being cached. Hopefully at least one of these will be treated with respect by the browser; this is even partly recommended in an old KB article from Microsoft. Still, http-equiv is not as safe as real HTTP headers, since it requires the browser to support it. Some browsers support it better than others (the article is old but still sends my head spinning in disbelief).
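If the pages are generated by a script anyway, the three tags can be produced from one list so they never drift apart. A small sketch, assuming Python as the page generator; NO_CACHE_META and render_no_cache_meta are hypothetical names, and the output still has to be placed inside the page's head element by whatever builds the page.

```python
from html import escape

# The same name/value pairs as the META tags above.
NO_CACHE_META = [
    ("Expires", "-1"),
    ("Pragma", "no-cache"),
    ("Cache-Control", "no-cache"),
]


def render_no_cache_meta(pairs=NO_CACHE_META):
    """Render one <meta http-equiv=...> line per (name, value) pair."""
    return "\n".join(
        '<meta http-equiv="{}" content="{}">'.format(
            escape(name, quote=True), escape(value, quote=True)
        )
        for name, value in pairs
    )
```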

Disillusioned by the current state of cache control (not the header, the subject), I ended up doing what most people are probably doing already: appending a random 10-character string to every call I ever make, effectively fooling the browser into thinking this information might be important and making it update the page properly. Just append it to the end of every GET and include a random field in every POST.

http://www.fireflake.com/?cache=ds4R3HYh4

http://www.fireflake.com/?cache=BawqEw42cf

Not the same page. Obviously. Please don’t tell any browser developer this or they might include a “random cache of everything in the known universe”-feature in their next build.
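For the record, the random-string trick can be sketched in a few lines of Python using only the standard library. The function names random_token and add_cache_buster are my own invention here, and "cache" as the parameter name is simply taken from the example URLs above; any name the server ignores would do.

```python
import secrets
import string
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ALPHABET = string.ascii_letters + string.digits


def random_token(length=10):
    """Return a random string of letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def add_cache_buster(url, param="cache", length=10):
    """Return url with a fresh random value in the given query parameter."""
    parts = urlsplit(url)
    # Keep existing parameters, but drop any old cache-buster value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, random_token(length)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Calling add_cache_buster("http://www.fireflake.com/") twice should give two URLs that differ only in the value after ?cache=. For a POST, the same random_token() value can go into an extra form field.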