Nextcloud Highly Available Cluster Notes
Nextcloud is an on-premise Dropbox alternative. It’s a competent piece of software with great features. The documentation has a great deal of information about installation, but if you are after a highly available cluster, the documentation vanishes and you get redirected to the paid support site. We created this guide to complement the official documentation for Nextcloud clustering.
1. Topology and Design
1.1 Ports and Firewall
Before we get started with the installation, let's look at the system topology and the communication requirements between the parts of the system. Unless all of your services are on the same VLAN, you will need to allow traffic between the following components on the firewall.
The web servers share the load between them and provide high availability. Each web server also runs an APCu instance.
An external Redis is used as the shared cache. This Redis does not need to be replicated, because with the correct configuration the system still functions without it. Without this node, Nextcloud just slows down.
PostgreSQL high availability is achieved with connection pooling and a WAL-shipping-based hot standby.
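As a rough illustration of the flows described above, the following sketch lists the connections a firewall would have to allow and probes a single one of them. The host names in the usage comment are placeholders, and the ports are the upstream defaults (443 for HTTPS, 22 for SSH between web nodes, 6379 for Redis, 5432 for PostgreSQL), so adjust both if your services listen elsewhere.

```shell
#!/bin/bash
# Sketch only: flows implied by the topology, using default ports.

# Print one "source destination port purpose" line per required flow.
required_flows() {
    cat <<'EOF'
loadbalancer web 443 https
web web 22 ssh-sync
web redis 6379 shared-cache
web postgresql 5432 database
EOF
}

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are needed.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example (hypothetical host names), run from a web node:
#   port_open redis-host.example.com 6379 && echo "Redis reachable"
#   port_open db-host.example.com 5432 && echo "PostgreSQL reachable"
```

Running `port_open` from each web node after a firewall change is a quick way to confirm that the rule set matches the list.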
1.2 System Design
We need a web server and a database to get a basic Nextcloud working.
A separate Redis server (for the memory cache) is recommended for a production cluster. Managing a single-server Nextcloud is pretty straightforward.
For a production cluster, we often need multiple web servers, all behind a load balancer. There are some tricks to a healthy web cluster. When we make a configuration change through the web GUI, config/config.php gets updated on a single node only. For example, when you add LDAP integration from the admin panel on a two-node web cluster, only one of the servers gets updated. This is highly problematic; we need to keep the codebase and configuration of all installations in sync. We use a custom script called sync.sh for this task.
2. Web Servers
We start with a single web server, follow the official tutorial, and configure everything we need. Once the configuration is complete, we clone the codebase to the other nodes. When we need to make another config change, we access the first web server directly by its IP address, bypassing the load balancer, and then sync the Nextcloud codebase and configs to the other servers. From now on, we will call this node the master node, even though all nodes are identical. We created the following script for the slave nodes to replicate changes. For this to work, the slave nodes need passwordless SSH key access to the master node.
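#!/bin/bash
# Replicate the Nextcloud codebase and config from the master web node.
# masterwebnodeIP and thisnodewebIP are placeholders for real addresses.
set -euo pipefail
# Rebuild the archive on the master; -f keeps rm quiet if none exists yet.
ssh masterwebnodeIP rm -fv /tmp/moveinstallation.tar
ssh masterwebnodeIP tar -cpf /tmp/moveinstallation.tar /var/www/html
cd /var/www/
scp -v masterwebnodeIP:/tmp/moveinstallation.tar .
# Stop the web server and delete the old copy only once the archive is here.
systemctl stop httpd
rm -rf /var/www/html
# tar stores the path as var/www/html, so extract here and move it up.
tar --same-owner -xvf moveinstallation.tar -C /var/www
mv var/www/html/ .
# Double quotes keep the single quotes around the addresses literal.
sed -i "s/ 0 => 'masterwebnodeIP',/ 0 => 'thisnodewebIP',/" /var/www/html/config/config.php
systemctl start httpd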
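The script assumes the slave node can log in to the master without a password prompt. One common way to set that up is a passphrase-less key pair; the sketch below assumes OpenSSH and that the script runs as the same user on both nodes. The function name `make_sync_key` is ours, not part of any tool.

```shell
#!/bin/bash
# Sketch: create a passphrase-less key for the sync job (OpenSSH assumed).

# Generate an ed25519 key pair at the given path unless it already exists.
make_sync_key() {
    key_path="$1"
    [ -f "$key_path" ] || ssh-keygen -q -t ed25519 -N "" -f "$key_path"
}

# Example, run once on every slave node:
#   make_sync_key ~/.ssh/id_ed25519
#   ssh-copy-id masterwebnodeIP      # asks for the password one last time
#   ssh masterwebnodeIP true         # must now succeed without a prompt
```

Using the default key path means the plain `ssh masterwebnodeIP` calls in the script pick the key up without extra options.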
This script can also be used with Docker and Kubernetes. To create stateless containers, run the following version of the script in the entry point, before the services start.
#!/bin/bash
# Container entry-point variant: no service management, just fetch the code.
set -euo pipefail
ssh masterwebnodeIP rm -fv /tmp/moveinstallation.tar
ssh masterwebnodeIP tar -cpf /tmp/moveinstallation.tar /var/www/html
cd /var/www/
scp -v masterwebnodeIP:/tmp/moveinstallation.tar .
rm -rf /var/www/html
tar --same-owner -xvf moveinstallation.tar -C /var/www
mv var/www/html/ .
# Double quotes keep the single quotes around the addresses literal.
sed -i "s/ 0 => 'masterwebnodeIP',/ 0 => 'thisnodewebIP',/" /var/www/html/config/config.php
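The final `sed` line in both scripts swaps the master's address for the local one in `config.php`. A minimal sketch of that step as a function follows. It wraps the expression in double quotes so the inner single quotes reach `sed` intact, and it escapes dots so an IP address is matched literally. The function name `retarget_config` is ours, and the sketch assumes GNU `sed` (`-i` without a suffix) and addresses that contain no `/` characters.

```shell
#!/bin/bash
# Sketch: replace the master's address with this node's address in config.php.

# retarget_config FILE OLD_ADDRESS NEW_ADDRESS
retarget_config() {
    file="$1"
    # Escape dots so "10.0.0.1" does not also match "10x0y0z1".
    old=$(printf '%s' "$2" | sed 's/\./\\./g')
    new="$3"
    # Double quotes let the shell expand $old and $new while keeping the
    # literal single quotes that surround the value in config.php.
    sed -i "s/ 0 => '$old',/ 0 => '$new',/" "$file"
}

# Example (placeholder addresses):
#   retarget_config /var/www/html/config/config.php 10.0.0.11 10.0.0.12
```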
When we add new apps or extensions to Nextcloud, we need to make sure all the slave nodes run the script again. This can be done manually on servers, or by recreating the containers.
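Re-running the script by hand on every node is easy to forget. A small loop run from an admin host can trigger it everywhere. The sketch below assumes the script is installed as `/usr/local/bin/sync.sh` on each slave and that the node names in `SLAVES` are reachable over SSH; both are placeholders, not part of the original setup.

```shell
#!/bin/bash
# Sketch: trigger sync.sh on every slave node, one at a time.

SLAVES="${SLAVES:-web2 web3}"                    # placeholder node names
SYNC_CMD="${SYNC_CMD:-/usr/local/bin/sync.sh}"   # assumed install path
SSH="${SSH:-ssh}"                                # overridable for dry runs

# Run the sync script on each node in turn. Failures are reported but do
# not stop the loop; the return value is the number of failed nodes.
sync_all() {
    failed=0
    for node in $SLAVES; do
        if "$SSH" "$node" "$SYNC_CMD"; then
            echo "synced: $node"
        else
            echo "FAILED: $node" >&2
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

# Dry run that only prints what would be executed:
#   SSH=echo sync_all
```

Going through the nodes one at a time matters here: the server version of the script stops `httpd` while it swaps the codebase, so a sequential loop keeps the rest of the cluster serving traffic.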
3. Database Servers
Nextcloud supports most of the major databases, and any of them is a reasonable choice. We chose PostgreSQL. Galera Cluster for MySQL is also very popular with Nextcloud.
While designing the system, it's important to consider how it will scale. The most important thing about the database is to think ahead and estimate our maximum load, because a single-master database cannot be scaled infinitely.
2ndQuadrant BDR1 (multi-master replication for PostgreSQL) is not compatible with Nextcloud.
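With a WAL-shipping hot standby, as described in the topology section, it helps to confirm which node is currently the primary before pointing the connection pooler at it. PostgreSQL exposes this through `pg_is_in_recovery()`, which returns true on a standby. The sketch below assumes `psql` is installed and that the `postgres` role can connect without a prompt; the host names are placeholders.

```shell
#!/bin/bash
# Sketch: report whether a PostgreSQL node is a primary or a standby.

PSQL="${PSQL:-psql}"   # overridable, e.g. for testing

# pg_role HOST  ->  prints "primary", "standby", or "unreachable"
pg_role() {
    # -A unaligned, -t tuples only: the output is just "t" or "f".
    answer=$("$PSQL" -h "$1" -U postgres -Atc 'SELECT pg_is_in_recovery()' 2>/dev/null) || answer=""
    case "$answer" in
        t) echo standby ;;
        f) echo primary ;;
        *) echo unreachable ;;
    esac
}

# Example (placeholder host names):
#   for h in db1 db2; do echo "$h: $(pg_role "$h")"; done
```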
4. Redis
For performance, we need an in-memory cache. We chose Redis, which can scale up if needed.
When we look at the configuration samples, we can see how to configure a memory cache. The official documentation says that for Redis we need the following:
'memcache.distributed' => '\OC\Memcache\Redis',
'redis' => [
    'host' => 'redis-host.example.com',
    'port' => 6379,
],
But with this config, if we happen to lose the external Redis, our web servers stop along with it. We find the following config much better for production use.
'memcache.distributed' => '\\OC\\Memcache\\Redis',
'memcache.locking' => '\\OC\\Memcache\\Redis',
'memcache.local' => '\\OC\\Memcache\\APCu',
'redis' =>
array (
    'host' => 'redis-host.example.com',
    'port' => 6379,
),
Using APCu for the local cache is the better idea, because talking to an external server slows things down. The distributed cache needs to be the same for all cluster nodes, so the external Redis server is used for it. When Redis stops, the cluster slows down but keeps going.
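Because the cluster keeps running without Redis, an outage can go unnoticed until users complain about speed. A simple probe helps. The sketch below assumes `redis-cli` is installed on the web node and that Redis accepts unauthenticated connections, which the configuration above implies but does not state.

```shell
#!/bin/bash
# Sketch: check that the shared Redis answers PING.

REDIS_CLI="${REDIS_CLI:-redis-cli}"   # overridable, e.g. for testing

# redis_alive HOST [PORT]  ->  exit status 0 if Redis replies PONG
redis_alive() {
    reply=$("$REDIS_CLI" -h "$1" -p "${2:-6379}" ping 2>/dev/null) || reply=""
    [ "$reply" = "PONG" ]
}

# Example, e.g. from cron on each web node:
#   redis_alive redis-host.example.com 6379 || echo "Redis is not answering" >&2
```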
5. Filesystem
We preferred using S3 storage with Nextcloud. The pitfall of this method is that once Nextcloud is installed on S3, we cannot migrate to a filesystem; migrating the other way is also impossible.
NFS can also be used as storage, but it requires serious effort to configure for good performance. Putting a single NFS server between a storage appliance and the web servers creates a single point of failure.
GlusterFS is also a good option for file storage. For a bare-metal setup, consider installing the GlusterFS nodes on the web servers for extra performance.
We need a strong filesystem behind Nextcloud. Make sure there is enough space and enough inodes, because:
- Deleted files might remain in the system for a while.
- Revisions can remain on the system.
- Logs are also saved to the storage.
- Backups of the system are written to the storage during upgrades.
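Both free space and free inodes can be watched with `df`. The following sketch prints the percentage in use for a path and flags it once it passes a limit. The data path and the 80% limit are arbitrary examples, and the check only applies to a mounted filesystem, not to S3 object storage.

```shell
#!/bin/bash
# Sketch: warn when a filesystem runs low on space or inodes.

# Percentage of blocks in use on the filesystem holding the given path.
disk_used_pct() {
    df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}

# Percentage of inodes in use; prints 0 if the filesystem reports none.
inode_used_pct() {
    df -Pi "$1" | awk 'NR==2 { sub("%", "", $5); print ($5 == "-" ? 0 : $5) }'
}

# over_limit VALUE LIMIT  ->  exit status 0 if VALUE is above LIMIT
over_limit() {
    [ "$1" -gt "$2" ]
}

# Example (hypothetical data path, arbitrary 80% limit):
#   over_limit "$(disk_used_pct /srv/nextcloud-data)" 80 && echo "low on space"
#   over_limit "$(inode_used_pct /srv/nextcloud-data)" 80 && echo "low on inodes"
```

Checking inodes separately is worthwhile because a store full of small files, such as many file revisions, can run out of inodes long before it runs out of bytes.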
6. Security
Enable two-factor authentication.
Nextcloud can generate extra passwords for mobile devices. In LDAP-integrated environments this has security advantages, because the LDAP password itself does not have to be stored on the device.
Consider enabling SELinux if you are really serious about security.
Never place the data folder under the web root.
Public installations can use the Nextcloud Security Scan.
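Whether the data folder sits under the web root can be checked mechanically. The sketch below compares two paths as plain strings, so it does not resolve symlinks or `..` components. The `occ` call in the usage comment is one way to read the configured data directory; it assumes Nextcloud lives in `/var/www/html` and that the web server runs as `apache`, both guesses based on the `httpd` setup in the web server section.

```shell
#!/bin/bash
# Sketch: detect a data directory that lives inside the web root.

# is_under DIR ROOT  ->  exit status 0 if DIR equals ROOT or is below it
is_under() {
    dir="${1%/}"     # drop one trailing slash, if any
    root="${2%/}"
    case "$dir/" in
        "$root"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example:
#   datadir=$(sudo -u apache php /var/www/html/occ config:system:get datadirectory)
#   is_under "$datadir" /var/www/html && echo "WARNING: data folder is under the web root" >&2
```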