I was working on a site using the Magento platform, and had the temporary staging site located at a subdomain of the main site. When the time came to move it to production, I:
- Took a snapshot of the EC2 instance hosting the current iteration of the site,
- Created a DNS A record pointing www.site.com to the site’s IP address,
- Changed the existing testing.site.com DNS A record to a CNAME record pointing to www.site.com,
- Changed the Unsecure Base URL and Secure Base URL (System > Configuration > Web) to www.site.com,
- Cleared all caches within the Magento dashboard,
- Crossed my fingers and gritted my teeth.
Despite my best efforts, after the DNS servers updated we began seeing redirect-loop errors and other general browser failures. I tried clearing cookies and cache on the affected workstations, but no luck. I tried Private Browsing mode to ensure a completely sterile session, and the site appeared to work, but it still experienced failures.
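To take the browser out of the picture entirely, a redirect loop can be traced one hop at a time from a script. The following is only a minimal sketch: the function names `fetch_location` and `trace_redirects` are made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)

def fetch_location(url):
    """Return the absolute redirect target for url, or None if it is not a redirect."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        if resp.status in REDIRECT_CODES:
            location = resp.getheader("Location")
            # Location may be relative, so resolve it against the current URL.
            return urljoin(url, location) if location else None
        return None
    finally:
        conn.close()

def trace_redirects(start_url, fetch=fetch_location, max_hops=20):
    """Follow redirects hop by hop and return (chain, looped).

    looped is True when a URL repeats, or when max_hops is exhausted
    (treated here as a probable loop).
    """
    chain = [start_url]
    seen = {start_url}
    url = start_url
    for _ in range(max_hops):
        nxt = fetch(url)
        if nxt is None:
            return chain, False
        chain.append(nxt)
        if nxt in seen:
            return chain, True
        seen.add(nxt)
        url = nxt
    return chain, True

# Example (performs real network requests):
# chain, looped = trace_redirects("http://www.site.com/")
# print(" -> ".join(chain), "(loop)" if looped else "(ok)")
```

Printing the chain shows exactly which pair of URLs bounce off each other, which is usually more informative than the browser's generic "too many redirects" page.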
Then I SSHed into the server and cleared the contents of the magento/var/cache directory (rm -rf magento/var/cache/*). As soon as I did that, the site started working again.
…at least, for a little bit. After clearing the cache, the site would appear to work for about 10 minutes before popping up a whole host of redirection errors all over again. After another 10 minutes or so, it would appear to function again. Another 10 minutes, and it’s down again.
It doesn’t follow a strict schedule, but it is definitely regular.
So far I have cleared out every tmp directory I could find, deleted the contents of magento/var/session/, checked all my .htaccess files, and even poked around in the Apache config. The last time I saw something like this, it was an .htaccess error where a URL rewrite was performed incorrectly, but I cannot find anywhere in my current install where that might apply.
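Since the cache and session directories end up being emptied over and over during this kind of debugging, the step is easy to script. This is a rough sketch only; `clear_directory` is a hypothetical helper, and the paths in the usage comment are the relative ones used in this post, so adjust them to the actual install location.

```python
import shutil
from pathlib import Path

def clear_directory(path):
    """Delete everything inside path but keep the directory itself.

    Unlike `rm -rf path/*`, this also removes dotfiles.
    Returns the number of top-level entries removed.
    """
    root = Path(path)
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real subdirectory: remove recursively
        else:
            entry.unlink()         # regular file or symlink
        removed += 1
    return removed

# Example (destructive, run from the directory that contains magento/):
# for name in ("magento/var/cache", "magento/var/session"):
#     print(name, clear_directory(name), "entries removed")
```

Keep in mind that emptying the session directory logs out every active visitor, so it is best done on a site that is not yet taking real traffic.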
Re-enabling the cache within Magento seemed to fix the problem: I was able to browse for more than 10 minutes, but accessing the site from another computer did not work at all.
So, I set my sights on DNS. An nslookup showed two different A records for the www subdomain. Oops.
I deleted the incorrect one (not even sure what it was pointing to) and am waiting for propagation. We’ll see if that fixes it.
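A quick way to check whether a hostname still resolves to more than one address is to ask the resolver from a script. The sketch below uses Python's standard `socket.getaddrinfo`; the helper name `ipv4_addresses` is my own, and note that it goes through the system resolver (hosts file and local caches included), so it may not match what nslookup reports when pointed at a specific DNS server.

```python
import socket

def ipv4_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses hostname resolves to."""
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})

# Example (performs a real DNS lookup):
# addrs = ipv4_addresses("www.site.com")
# if len(addrs) > 1:
#     print("Multiple A records:", ", ".join(addrs))
# else:
#     print("Single address:", addrs[0])
```

If the stray record pointed at a different machine, clients being handed one address or the other as cached answers expired could plausibly explain the on-again, off-again behavior, though I have not confirmed that yet.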