When we built our Premier Agent Websites, we decided at the beginning to base the product on the WordPress core. We knew we wanted to host it ourselves for ultimate control, we wanted to run a bare minimum of off-the-shelf plugins for security reasons, we needed it to scale to tens and even hundreds of thousands of sites, and we wanted it to be faster than any other provider out there. In order to meet these criteria and our high standards, we knew we had to make some smart infrastructure and architecture decisions.
We did things a bit differently from what many of the current WordPress best practices recommend and many existing plugins offer, and we feel great about what we ended up with. With that in mind, we’d like to share some of our differentiating secret sauce with the WordPress community.
It All Starts With DNS
Before a user’s browser even hits a server for the initial load, an IP address needs to be resolved via DNS. DNS performance is far more important than many people realize, and this part of the infrastructure should never be glossed over.
In our case, we host nearly all of our clients’ DNS records on Amazon’s high-performance, globally distributed Route 53 service. Each of our clients can use the free domain they get through us or one they already own, and we manage the DNS for that domain on Route 53 via its API and a custom UI. We also host the DNS for our CDN domains (more on that later) on Route 53. The result is that lookups on our clients’ domains and the CDN domains are incredibly fast!
Multiple Caching Layers
We cache at two different layers for two different reasons.
Our second caching layer is simply a Memcached cache facilitated by the excellent Memcached Object Cache plugin (one of the only two non-core plugins we use). For the requests that Varnish doesn’t handle, this helps keep things fast. In addition to having the WordPress core use this caching layer, we make sure to consider and use the wp_cache_* functions within all of our custom functionality.
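To make the wp_cache_* usage pattern concrete, here is a minimal sketch of the usual "check the cache, fall back to the source, store the result" flow. It is written in Python purely for illustration: the in-memory dictionary stands in for Memcached, and cache_get, cache_set, and cached are hypothetical names that only loosely mirror wp_cache_get and wp_cache_set. It is not our production code.

```python
# Illustrative cache-aside sketch. The dict below is a stand-in for Memcached;
# all function names here are hypothetical, not part of WordPress.
_MISSING = object()   # sentinel so that a cached None still counts as a hit
_store = {}


def cache_get(key, group="default"):
    """Return the cached value, or _MISSING if nothing is stored."""
    return _store.get((group, key), _MISSING)


def cache_set(key, value, group="default"):
    """Store a value under (group, key)."""
    _store[(group, key)] = value


def cached(key, group, loader):
    """Return the cached value, calling loader() only on a cache miss."""
    value = cache_get(key, group)
    if value is _MISSING:
        value = loader()            # the expensive work, e.g. a database query
        cache_set(key, value, group)
    return value
```

Custom functionality written this way only pays for the expensive lookup once per key and group until the entry is evicted or invalidated.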
Auto App Server Scaling
Our entire infrastructure is backed by Amazon Web Services (AWS), so we make extensive use of the functionality it offers that we couldn’t otherwise easily replicate. In the case of our application servers that actually handle PHP and process the WordPress requests, this means that we can use AWS auto-scaling to increase and decrease the number of servers we use depending on the load at any given time. The idea here is that we can scale in real time based on the demand against our servers instead of any other criteria that our clients frankly don’t care about.
Scaling Out Database Servers
The only other non-core plugin we use is HyperDB. In order to keep a good balance of clients between our database servers and allow us to add new servers into the mix with as little overhead as possible, we use the Flexihash consistent hashing library and the MD5 algorithm to determine the database server to use per client. The great thing about this setup is that we can add new servers as needed and only have to rebalance the data for a progressively smaller portion of our clients with each new server we add.
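The rebalancing property described above can be sketched with a small hash ring. This is an illustration in Python, not Flexihash itself and not our actual configuration: the HashRing class, the server names, the client IDs, and the number of virtual points per server are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring using MD5 (illustrative only)."""

    def __init__(self, replicas=64):
        self.replicas = replicas   # virtual points per server (assumed value)
        self._points = []          # sorted positions on the ring
        self._owner = {}           # ring position -> server name

    @staticmethod
    def _hash(key):
        # Map a string to a large integer position on the ring.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_server(self, server):
        # Each server is placed at several points to even out the load.
        for i in range(self.replicas):
            point = self._hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def lookup(self, client_id):
        # A client belongs to the first server point clockwise from its hash.
        if not self._points:
            raise LookupError("no servers on the ring")
        idx = bisect.bisect_right(self._points, self._hash(client_id))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing()
for name in ("db1", "db2", "db3"):        # hypothetical server names
    ring.add_server(name)

clients = [f"client-{n}" for n in range(1000)]
before = {c: ring.lookup(c) for c in clients}

ring.add_server("db4")
after = {c: ring.lookup(c) for c in clients}
moved = [c for c in clients if before[c] != after[c]]
```

In this sketch, every client in moved now maps to db4 and nobody is shuffled between the existing servers, which is why only the data for that subset would need to be migrated when a server is added.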
One additional benefit of mod_pagespeed and the CDN-enabling functionality it offers is that we can tune our Web application servers for serving almost exclusively PHP requests. Focusing on PHP only allows us to reduce overall complexity and make smarter decisions about server scaling based on load.
Serving User Media At Breakneck Speeds
With a liquid, EC2-based fleet of Web app servers that can scale up or down at any time, we knew we couldn’t store our users’ files on those servers. What we did instead was create functionality to store uploaded files directly in Amazon S3 – which is not a CDN by the way – and then rewrite the outgoing URLs to use multiple hostnames pointed to Amazon’s CloudFront CDN.
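As a rough illustration of the URL rewriting step, the sketch below picks one of several CDN hostnames for a given file path. The hostnames are placeholders, and the choice of a CRC32 checksum to select the hostname is an assumption made for the example rather than a description of our implementation. The point it shows is that the selection should be deterministic, so the same file is always served from the same URL and stays cacheable.

```python
import zlib

# Placeholder hostnames; in practice each would point at the CDN distribution.
CDN_HOSTS = ["cdn1.example.com", "cdn2.example.com", "cdn3.example.com"]


def cdn_url(path):
    """Rewrite a media path to a URL on a deterministically chosen CDN host."""
    key = path.lstrip("/")
    # Hash the path so a given file always lands on the same hostname.
    index = zlib.crc32(key.encode("utf-8")) % len(CDN_HOSTS)
    return f"https://{CDN_HOSTS[index]}/{key}"
```

Choosing the hostname at random instead would give the same file several URLs, and each one would have to be cached separately by browsers.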
We additionally insert a unique hash of the actual file content into the file name and then set a 10-year cache header on the file during upload. Doing this has two benefits. First, the long cache duration allows the files to be almost indefinitely cached by CloudFront itself and by users’ browsers. Second, the content hash allows our clients to modify their files (through the image editor for example) and have those changes reflected instantly in any browser.
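A minimal sketch of the naming scheme, in Python for illustration: the hash algorithm (SHA-256 here), the length of the hash fragment, its position in the file name, and the exact header value are all assumptions, since the approach only requires that the name change whenever the content does.

```python
import hashlib
import os

TEN_YEARS_SECONDS = 10 * 365 * 24 * 60 * 60   # 315,360,000 seconds


def hashed_name(filename, content, digest_len=12):
    """Insert a hash of the file content before the extension."""
    stem, ext = os.path.splitext(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{stem}.{digest}{ext}"


def cache_headers():
    """Headers to attach at upload time so caches keep the file for 10 years."""
    return {"Cache-Control": f"public, max-age={TEN_YEARS_SECONDS}"}


original = hashed_name("logo.png", b"original image bytes")
edited = hashed_name("logo.png", b"edited image bytes")
```

Because original and edited are different names, an edited file is effectively a brand new URL. No cache needs to be invalidated, and the old copy can stay cached for as long as anything still refers to it.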
We designed our platform and infrastructure with performance as a high-priority goal, not as a nice-to-have side feature. We spent a lot of time investigating our options before settling on our choices, and we know we could still improve a few little things, but we think we made the right decisions overall.
Do you want to work on world-class platforms such as this? Do you think you could do a better job? If so, we’re hiring!