

Interesting read. It does seem like time to start looking at horizontally scaling the workers based on request pressure. A dynamic number of workers that each do their database updates and then quickly release the mutex should increase throughput.
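To make that concrete, here is a rough sketch of what I have in mind, not anything from the article. Everything in it is an assumption for illustration: the names `desired_workers`, `drain` and `process`, the "one worker per 10 queued requests" rule, and the dict standing in for the database. It also uses threads in one process as stand-ins for workers; real horizontal scaling would spread them across processes or machines and use a distributed lock.

```python
import math
import queue
import threading


def desired_workers(backlog, per_worker=10, min_workers=1, max_workers=8):
    """Hypothetical scaling rule: one worker per `per_worker` queued requests,
    clamped to the range [min_workers, max_workers]."""
    return max(min_workers, min(max_workers, math.ceil(backlog / per_worker)))


def drain(requests, db, db_lock):
    """Worker loop: do the preparation outside the lock, hold the mutex only for the write."""
    while True:
        try:
            key, value = requests.get_nowait()
        except queue.Empty:
            return  # nothing left to do, let the worker exit
        prepared = value * 2  # stand-in for the expensive work, done without the lock
        with db_lock:  # critical section kept as short as possible
            db[key] = prepared


def process(items):
    """Size the pool from the current backlog, then let the workers drain the queue."""
    requests = queue.Queue()
    for item in items:
        requests.put(item)

    db, db_lock = {}, threading.Lock()  # the dict is a stand-in for the database
    n = desired_workers(requests.qsize())
    workers = [
        threading.Thread(target=drain, args=(requests, db, db_lock))
        for _ in range(n)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return n, db
```

Calling `process([(i, i) for i in range(35)])` would start 4 workers under this rule. The point is only that the lock wraps the write and nothing else, so adding workers actually helps instead of having them all queue up on the mutex.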
Matrix is generally very nice for chat, and Discourse for forums.