Just under three weeks ago we moved the Discourse server from Digital Ocean to Bytemark. I thought I should write up a little post to give a bit of background and a bit more information about where we are now.
Server performing better. Discourse performing better. Bytemark pretty cool.
A year and a half ago one of the team members, Roland, contacted Digital Ocean, who provided a generous starting credit which, at the time, would have covered several years of forum activity. The reasons for choosing Discourse were covered at the time of the switch so I’ll not go into them here.
However, over the following year the small droplet simply wasn’t able to cope with Discourse’s requirements. Discourse is designed to scale to support hundreds of thousands of active users, and with that design choice comes “stepping” of hardware requirements. That is, performance will remain consistent for a range of users, then drop off sharply until more resources are made available to support the next “step” of users.
For the past few months there had been many concerns raised over poor forum performance, timeouts, and gateway errors (“500 errors”). Essentially, this was because interest in the forum had grown to the point where we needed to “step up” the available resources.
For us, this meant stepping the server up from a 2GB instance to a 4GB instance, which doubled the server’s monthly cost, though it did reduce the number of 500 errors people were hitting. However, possibly due to “noisy neighbours” (or the fact we had been running with high disk I/O for the past year so we were the noisy neighbour…), latency of the site remained quite high.
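For anyone who finds the “stepping” idea easier to read as code, here is a rough sketch of it. It is only a toy model: the tier capacities, the quadratic fall-off and the function names are all invented for illustration, and none of it is measured from our forum or taken from Discourse itself.

```python
# Toy model of "stepped" hardware requirements.
# ASSUMPTION: the capacities below are invented for illustration only;
# they are not measurements from the forum or figures from Discourse.
STEPS = [
    (2, 100),  # (memory in GB, users the tier serves comfortably)
    (4, 250),
    (8, 600),
]


def relative_performance(users, memory_gb):
    """Return 1.0 within the tier's capacity, dropping sharply beyond it."""
    capacity = next(cap for mem, cap in STEPS if mem == memory_gb)
    if users <= capacity:
        return 1.0  # flat part of the step: performance stays consistent
    # ASSUMPTION: a quadratic fall-off stands in for the "sharp drop"
    return (capacity / users) ** 2


def smallest_tier_for(users):
    """Return the smallest memory tier (GB) that fits, or None if none does."""
    for memory_gb, capacity in STEPS:
        if users <= capacity:
            return memory_gb
    return None


if __name__ == "__main__":
    for users in (80, 200, 500):
        print(users, relative_performance(users, 2), smallest_tier_for(users))
```

In this toy model, a load that overwhelms the 2GB tier sits comfortably on the 4GB tier, which is the “step up” described above.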
I was made aware of Bytemark several years ago. I work with several people who know the founders personally, along with other members of their team. I also kept seeing Bytemark logos on several high-profile/big name OSS project sites and services, and they provide mirrors for a good number of Linux distros. It seemed pretty obvious to me to approach them and ask whether they could help our project too.
After only a couple of emails over a couple of days we had a shiny new VPS matching the DO droplet, and Phil migrated Discourse over. It wasn’t plain sailing, as almost immediately we were hit by a Discourse bug which basically brought the site to a grinding halt… not a great start.
Once this had been corrected, and the server had received a few tweaks to make Redis happy, not only were we seeing almost no 500 errors (even when the Yandex spider hits the site) but page latencies were also vastly reduced.
Below is a poorly-illustrated view of the last couple of months of Discourse stats. The top chart shows the total number of views per day, the bottom chart shows the number of 500 errors per day.
With the 2GB DO droplet, the server registered 500 errors whenever page views went above about 50,000. When a web spider hit the site (see spikes) the forum went offline as Discourse restarted parts of its stack. Much frustration for users.
With the 4GB DO droplet, the number of 500 errors was much lower, but web spidering pushed views above 50,000 on only three occasions. The site latency effectively prevented any larger number of views.
With the move to Bytemark, the nominal page view count has increased to 50,000 per day with no corresponding 500 errors, and as others have commented, the overall responsiveness of the forum is much better than before. It has also been able to cope with several intensive web spiderings (spikes at 70k) without so much as a squeak (except one bump which I think was a full container wipe and rebuild after noticing a Docker issue).
The forum is in a much better place than it was two months ago. Not only is hosting secure for the foreseeable future, but it’s performing better, and we get to work with a pretty awesome company that does a lot for OSS projects.
I’m looking forward to seeing what else Bytemark will let us accomplish.