Nextcloud – Operations Guide

Nextcloud 33 on K3s (AlmaLinux 9) · Last updated: May 2026

Overview

Nextcloud runs on a single-node K3s cluster on AlmaLinux 9. The entire infrastructure is set up and managed with an Ansible playbook (nextcloud-k3s.yml). The playbook is idempotent – it can be re-run at any time without modifying data or unnecessarily interrupting services.

Nextcloud: 33.x (nextcloud:33-fpm)
Operating System: AlmaLinux 9
Kubernetes: K3s (single node)
Database: MariaDB 10.11 (host)
Cache / Locking: Redis (host)
Online Office: Collabora CODE 25.04
Design Decision: Single Node
The server runs as a single Kubernetes node. There is no high availability. Updates and restarts cause a brief downtime (typically 30–60 seconds). The Recreate strategy in the deployment ensures that only one pod accesses the HostPath volumes at a time.
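
The strategy can be confirmed on a running cluster (a quick read-only check):

k3s kubectl get deployment nextcloud -n nextcloud \
  -o jsonpath='{.spec.strategy.type}'
# Expected output: Recreate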

Architecture

Overall Overview

Internet / Browser
        │
        │ HTTPS :443 / HTTP :80
        ▼
┌─────────────────────────────────────────────────────┐
│ nginx-ingress F5 (K3s DaemonSet, hostNetwork=true)  │
│ TLS termination via cert-manager (Let's Encrypt)    │
│ Rate limiting: nc_login 5r/s (HTTP layer)           │
└───────────────┬─────────────────────┬───────────────┘
                │ HTTP                │ HTTP
   nextcloud.*  │         collabora.* │ (WebSocket)
                ▼                     ▼
┌──────────────────────┐   ┌──────────────────────┐
│ Pod: nextcloud       │   │ Pod: collabora       │
│ ┌────────────────┐   │   │ Collabora CODE       │
│ │ nginx sidecar  │   │   │ 25.04                │
│ │ :80 (static +  │   │   │ :9980                │
│ │ PHP proxy)     │   │   └──────────────────────┘
│ └───────┬────────┘   │
│         │ fastcgi    │
│         │ :9000      │
│ ┌───────▼────────┐   │
│ │ nextcloud-fpm  │   │
│ │ PHP 8.4 + FPM  │   │
│ └────────────────┘   │
└──────┬───────┬───────┘
       │       │
       │ TCP   │ TCP
       │ :3306 │ :6379
       ▼       ▼
┌──────────────┐  ┌───────────┐
│ MariaDB      │  │ Redis     │
│ 10.11 Host   │  │ 7.x Host  │
└──────────────┘  └───────────┘

┌───────────────────────────────────────────┐
│ HostPath volumes (on /dev/vda4, 119 GB)   │
│ /srv/nextcloud-www → /var/www/html        │
│ /data              → /var/www/html/data   │
└───────────────────────────────────────────┘

Why This Approach?

Component | Where does it run? | Rationale
Nextcloud PHP-FPM + nginx | K3s pod (containers) | Simple updates via image swap; no php-fpm on the host
Collabora CODE | K3s pod (container) | Isolated; own TLS termination through nginx-ingress (F5)
MariaDB | Host service | Database data resides directly on the filesystem; no container volume layer
Redis | Host service | Simple, no persistence needed; binds to the host IP for pod access
nginx-ingress (F5) | K3s DaemonSet (hostNetwork) | Binds directly to ports 80/443 on the host; no external load balancer needed
cert-manager | K3s container | Automatic Let's Encrypt certificates; replaces certbot

Nextcloud Pod in Detail

Both containers share the same pod (= the same network namespace and the same volumes). This allows nginx to communicate with PHP-FPM via 127.0.0.1:9000 without needing a Kubernetes service.
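
Because both containers share one network namespace, the localhost link can be verified from inside the pod (a quick sanity check; php is guaranteed to exist in the fpm container):

k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  php -r 'var_dump((bool) @fsockopen("127.0.0.1", 80));'
# bool(true) → the nginx sidecar answers on 127.0.0.1:80 inside the same pod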

Init Container: init-dirs

Runs once before the main containers start. Sets the ownership of the HostPath directories to UID 33 (www-data in the container) and configures permissions. Without this step, Nextcloud cannot write anything to /var/www/html.
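
A minimal sketch of what init-dirs does (the authoritative commands live in nextcloud-deployment.yml.j2; paths and modes here follow the storage table below and are illustrative only):

# busybox init container, runs as root before the main containers start
chown -R 33:33 /var/www/html/data     # user data: www-data in the container
chown 33:33 /var/www/html/config      # config/ must be writable for Nextcloud
chmod 0770 /var/www/html/data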

Container 1: nextcloud-fpm

The official nextcloud:33-fpm image. Runs PHP-FPM on port 9000. If /var/www/html is empty, the Docker entrypoint automatically installs Nextcloud based on the environment variables (DB, Admin, Redis). During an update, occ upgrade runs automatically.

Container 2: nginx

nginx sidecar (nginx:1.30-alpine) on port 80. Serves static requests directly from the shared volume; PHP requests are forwarded via FastCGI to port 9000. TLS is already terminated by nginx-ingress (F5) – nginx only speaks plain HTTP.

Storage & Data

Important: All directories reside on the same partition /dev/vda4 (119 GB). There are no separate volumes for app and data.
Path on Host | Mount Point in Container | Contents | Owner
/srv/nextcloud-www | /var/www/html | PHP app, apps, themes, config/ | root:root 0755 (dirs); 33:33 for config/
/data | /var/www/html/data | User files, nextcloud.log | 33:33 (www-data in container)
/var/lib/mysql | – (runs directly on the host) | MariaDB database | mysql:mysql

Why Are App and Data Separate?

Nextcloud app files (/srv/nextcloud-www) and user data (/data) are intentionally split into two separate HostPath volumes:

  - Backups handle them differently: the app tree is rsynced with data/ excluded, user data is synced on its own (see Backup & Restore).
  - Ownership and permissions differ: the app tree is root:root 0755 (only config/ is 33:33), the data tree is 33:33 throughout.
  - Updates only touch the app tree – an image change or occ upgrade never rewrites user data.

Important Configuration File: config.php

Nextcloud automatically reads all *.php files from /var/www/html/config/. There are two sources:

  1. config.php – written by Nextcloud itself on first start (instanceid, DB connection, installation secrets); lives on the HostPath volume and is not managed by Ansible.
  2. custom.config.php – mounted from the ConfigMap nextcloud-custom-config and re-rendered on every playbook run.

Ownership Pitfall with ConfigMap Mount
When Kubernetes mounts a ConfigMap subPath, it may create the parent directory as root:root. Therefore /srv/nextcloud-www/config/ must be pre-created with owner 33 (Ansible task next_k3s_deploy/tasks/dirs.yml).

Network & TLS

IP Address Ranges

Range | Purpose | Relevant for
10.42.0.0/16 | K3s pod CIDR (Flannel) | iptables rules, trusted_proxies, WOPI allowlist, MariaDB user grant
10.43.0.0/16 | K3s service CIDR | ClusterIP addresses of K8s services
82.165.165.230 | Public host IP (test server) | DNS, WOPI allowlist, ExternalService endpoints

Required DNS Records

DNS records must be created at the domain registrar before the first playbook run. cert-manager requires them for the Let's Encrypt HTTP-01 challenge – without valid records certificate issuance will fail.

Name (Hostname) | Type | Value | Purpose
nextcloud.example.de | A | Public IP of the server | Nextcloud web interface, Let's Encrypt TLS
collabora.example.de | A | Public IP of the server | Collabora CODE (online office editing), own TLS certificate

Collabora Requires Its Own Hostname
Collabora runs as a separate ingress on its own subdomain. If only one A record is created (without Collabora), the WOPI callback from Nextcloud to Collabora cannot work and documents cannot be edited online.

IPv6 (Optional)
If the server has a public IPv6 address, AAAA records can additionally be created for both hostnames. The firewall rules (fw_ipv6.sh) are already configured for this.
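
Both records can be verified before the first run (dig ships with the bind-utils package on AlmaLinux):

dig +short nextcloud.example.de A
dig +short collabora.example.de A
# Both must return the public IP of the server, otherwise the HTTP-01 challenge fails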

How Pods Reach the Host (MariaDB / Redis)

MariaDB and Redis run as host services. Pods connect to them via K8s ExternalServices: a service without a selector with manually maintained endpoints pointing to the public IP of the host.

Pod → mariadb.nextcloud.svc.cluster.local (10.43.x.x)
   → Endpoint: 82.165.165.230:3306
   → iptables ACCEPT for 10.42.0.0/16 on :3306
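
A minimal sketch of such a selector-less Service plus Endpoints pair (the real manifests are rendered from external-services.yml.j2; shown here as a kubectl heredoc for illustration):

k3s kubectl apply -n nextcloud -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mariadb          # pods resolve mariadb.nextcloud.svc.cluster.local
spec:
  ports:
    - port: 3306
---
apiVersion: v1
kind: Endpoints
metadata:
  name: mariadb          # must match the Service name
subsets:
  - addresses:
      - ip: 82.165.165.230   # public host IP
    ports:
      - port: 3306
EOF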

The MariaDB database user is restricted to the pod CIDR ('nextcloud'@'10.42.%'). A direct connection from the host only works as root via Unix socket.
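
The grant can be inspected on the host (as root via the Unix socket, credentials in /root/.my.cnf):

mysql -e "SELECT User, Host FROM mysql.user WHERE User = 'nextcloud';"
# Host must read 10.42.% – the pod CIDR, not %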

TLS and Proxy Chain

Internet
  → nginx-ingress (F5) (Port 443, TLS terminated, sets X-Forwarded-For / -Proto)
    → nginx sidecar (Port 80, plain HTTP)
      → PHP-FPM (Port 9000, FastCGI)
        fastcgi_param HTTPS on
        fastcgi_param HTTP_SCHEME https

Nextcloud needs to know that requests are coming through a proxy:

'trusted_proxies'       => ['10.42.0.0/16'],
'forwarded_for_headers' => ['HTTP_X_FORWARDED_FOR'],
'overwriteprotocol'     => 'https',

Important: trusted_proxies Must Contain the Pod CIDR
If this entry is missing, Nextcloud shows the warning "Reverse proxy header configuration is incorrect" in the admin panel. nginx-ingress (F5) forwards requests with its pod IP (10.42.x.x) – this must be registered as a trusted proxy.

WOPI (Collabora)

Collabora calls Nextcloud's WOPI API for document editing. These callbacks come from the pod IP of the Collabora pod (10.42.x.x) – not from the service IP. The wopi_allowlist must therefore contain the pod CIDR:

# inventory/host_vars/<host>/vars.yml
wopi_allowed_hosts:
  - "10.42.0.0/16"   # Pod CIDR (Collabora pod → Nextcloud WOPI)
  - "10.43.0.0/16"   # Service CIDR
  - "collabora.inspired-as-code.de"

Important Files in the Repository

Inventory & Variables

File | Contents
inventory/group_vars/all.yml | Global settings: K3s CIDRs, container image tags, Helm chart versions, monitoring versions
inventory/host_vars/<host>/vars.yml | Host-specific: hostnames, IPs, DB name, paths, WOPI allowlist
inventory/host_vars/<host>/vault.yml | Encrypted! Passwords: DB root, DB user, admin, Redis, SMTP, Grafana. Encrypt before git commit: ansible-vault encrypt inventory/host_vars/test/vault.yml

Ansible Roles (Nextcloud-specific)

Role | Task
next_packages | Repositories (MariaDB 10.11), packages (MariaDB, Redis, SELinux tools)
next_mariadb | MariaDB setup, create database and user, bind-address 0.0.0.0
next_redis | Redis bind configuration (localhost + host IP)
next_selinux | SELinux enforcing, container_file_t for HostPath directories
next_k3s_deploy | K8s manifests: namespace, secrets, ConfigMaps, deployments, ingress, CronJob
next_config | Post-deploy via kubectl exec occ: trusted_domains, apps, WOPI config

Templates (generating the K8s manifests)

Template | Generates | Important because
nextcloud-deployment.yml.j2 | Deployment + Service | Image tags, volumes, readiness probes, environment variables
custom-config-cm.yml.j2 | ConfigMap nextcloud-custom-config | Nextcloud PHP configuration: Redis, proxy, locale, session
nginx-cm.yml.j2 | ConfigMap nextcloud-nginx-config | nginx routing, .well-known redirects, MIME types, PHP FastCGI
secret.yml.j2 | Secret nextcloud-secrets | DB credentials, admin password, SMTP data – injected into the pod as environment variables
external-services.yml.j2 | Services + Endpoints for MariaDB, Redis, Grafana | Connection from the pod to the host services
collabora-deployment.yml.j2 | Collabora Deployment + Service | aliasgroup1 env restricts allowed Nextcloud instances
cronjob.yml.j2 | K8s CronJob | Runs php cron.php every 5 minutes

Variables & Secrets

Where Credentials Are Stored

All passwords are stored in inventory/host_vars/<host>/vault.yml, encrypted with Ansible Vault. This file must never be committed unencrypted to git.

# Encrypt before committing:
ansible-vault encrypt inventory/host_vars/test/vault.yml

# Decrypt for editing:
ansible-vault decrypt inventory/host_vars/test/vault.yml
# (re-encrypt immediately afterwards!)

# Open directly in editor (preferred):
ansible-vault edit inventory/host_vars/test/vault.yml

Password File for Unattended Runs

# .vault_pass (never commit – listed in .gitignore)
echo "my-vault-password" > .vault_pass
# Uncomment in ansible.cfg:
# vault_password_file = .vault_pass

Secrets in the K8s Cluster

The vault variables flow into the K8s secret nextcloud-secrets in the namespace nextcloud during the playbook run. The secret is injected as environment variables into the pod.

# Show secret (decoded):
k3s kubectl get secret nextcloud-secrets -n nextcloud \
  -o jsonpath='{.data}' | python3 -c \
  "import sys,json,base64; [print(k,base64.b64decode(v).decode()) \
   for k,v in json.load(sys.stdin).items()]"

Container Images & Versions

All image tags are centrally defined in inventory/group_vars/all.yml. Since imagePullPolicy defaults to IfNotPresent for tagged images, K3s only pulls a new image automatically when the tag in the manifest has changed.

# inventory/group_vars/all.yml
nextcloud_image_fpm:       "nextcloud:33-fpm"      # Nextcloud PHP-FPM
nextcloud_image_nginx:     "nginx:1.30-alpine"          # nginx Sidecar
nextcloud_image_collabora: "collabora/code:25.04.9.4.1" # Collabora CODE
nextcloud_image_busybox:   "busybox:1"             # Init container

Tag Strategy

Tag Format | Meaning | Example
33-fpm | Follows minor and patch updates within Nextcloud 33.x automatically on pull | nextcloud:33-fpm
34-fpm | Next major version – must be set explicitly | Change required in all.yml
1.30-alpine | nginx minor track; patch updates come automatically on pull | nginx:1.30-alpine
25.04 | Collabora release cycle (quarterly) | collabora/code:25.04 → 25.08

Patch Update Without Tag Change
When Docker Hub updates e.g. nextcloud:33-fpm from 33.0.2 to 33.0.3, nothing happens automatically on the server – the tag is cached locally. Update manually:

crictl pull docker.io/library/nextcloud:33-fpm
k3s kubectl rollout restart deployment/nextcloud -n nextcloud

Nextcloud Configuration

Configuration is done in two ways:

  1. custom.config.php (via Ansible ConfigMap) – static settings that are updated with every playbook run
  2. occ commands (via kubectl exec) – dynamic settings such as trusted_domains, WOPI URL, app installation

custom.config.php – Overview of Key Values

// Language
'default_language'     => 'de',
'default_locale'       => 'de_DE',
'force_language'       => 'de',

// Session (1 hour, no "stay logged in")
'session_lifetime'               => 3600,
'auto_logout'                    => true,
'remember_login_cookie_lifetime' => 0,

// Redis (cache, locking, sessions)
'memcache.local'       => '\\OC\\Memcache\\Redis',
'memcache.distributed' => '\\OC\\Memcache\\Redis',
'memcache.locking'     => '\\OC\\Memcache\\Redis',
'redis' => ['host' => 'redis', 'port' => 6379],

// Reverse proxy
'trusted_proxies'       => ['10.42.0.0/16'],
'forwarded_for_headers' => ['HTTP_X_FORWARDED_FOR'],
'overwriteprotocol'     => 'https',

// Logging (in /data/nextcloud.log on the host)
'loglevel'        => 4,
'logtimezone'     => 'Europe/Berlin',
'log_rotate_size' => 104857600,   // 100 MB

// Maintenance window (1 a.m.)
'maintenance_window_start' => 1,

Verify Configuration After Playbook Run

k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ config:list system

k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ setupchecks

Run Playbook

Full Initial Run (New Server Installation)

  1. Set SSH port to 22 in inventory/host_vars/test/vars.yml: ansible_port: 22
  2. Enter vault passwords in vault.yml and encrypt
  3. Install collections: ansible-galaxy collection install -r requirements.yml
  4. Run playbook:
    ansible-playbook nextcloud-k3s.yml --limit test --ask-vault-pass
  5. Change SSH port in vars.yml to 10022 (after SSH hardening by common_ssh)

Idempotent Re-Run (e.g. After Config Change)

ansible-playbook nextcloud-k3s.yml --limit test --ask-vault-pass

The playbook detects what is already correctly configured and only changes what is necessary. Kubernetes manifests are only rolled out again when the rendered manifest has changed (changed: "configured" in the output).
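
Before a re-run, a dry run shows what would change without touching the server (standard Ansible flags; tasks that shell out, e.g. kubectl, may still report "changed" in check mode):

ansible-playbook nextcloud-k3s.yml --limit test --ask-vault-pass --check --diff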

Run Only Specific Roles

# Only Nextcloud configuration (occ commands)
ansible-playbook nextcloud-k3s.yml --limit test --tags next_config --ask-vault-pass

# Only update K8s deployments
ansible-playbook nextcloud-k3s.yml --limit test --tags next_k3s_deploy --ask-vault-pass

Important Operations Commands

Pod Status

# All pods in the nextcloud namespace
k3s kubectl get pods -n nextcloud

# Detailed status (events on problems)
k3s kubectl describe pod -l app=nextcloud -n nextcloud

# Logs
k3s kubectl logs deployment/nextcloud -c nextcloud-fpm -n nextcloud --tail=50
k3s kubectl logs deployment/nextcloud -c nginx -n nextcloud --tail=50
k3s kubectl logs deployment/collabora -n nextcloud --tail=50

occ Commands (Nextcloud CLI)

# Prefix for all occ commands:
k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ <COMMAND>

# Examples:
... occ status
... occ check
... occ setupchecks
... occ maintenance:mode --on
... occ maintenance:mode --off
... occ files:scan --all
... occ db:add-missing-indices
... occ app:list
... occ config:list system
... occ config:system:get trusted_domains

Restart Pod

# Nextcloud (Recreate strategy: the old pod stops before the new one starts – brief downtime)
k3s kubectl rollout restart deployment/nextcloud -n nextcloud
k3s kubectl rollout status  deployment/nextcloud -n nextcloud --timeout=300s

# Collabora
k3s kubectl rollout restart deployment/collabora -n nextcloud

Namespace Overview

k3s kubectl get all -n nextcloud

MariaDB (Host)

# As root via socket (credentials in /root/.my.cnf):
mysql test_nextcloud

# Create DB dump:
mysqldump --single-transaction test_nextcloud > /tmp/dump.sql

Redis (Host)

# Test connection:
redis-cli ping

# Flush cache (e.g. after configuration change):
redis-cli FLUSHALL

Monitoring

The monitoring stack (Prometheus + Grafana + Node Exporter + mysqld_exporter + php-fpm_exporter) runs partly as host services, partly as sidecar containers in the Nextcloud K3s pod. Grafana is accessible via K3s ingress on the Nextcloud hostname under /grafana/.

Grafana
https://nextcloud.*/grafana/

Dashboards: Node Exporter Full (1860), MySQL Overview (7362), PHP-FPM (4912)
Login: Grafana admin password from vault.yml

Prometheus
127.0.0.1:9090

Accessible locally only
Retention: 30 days

Node Exporter
127.0.0.1:9100

CPU, RAM, disk, network

mysqld_exporter
127.0.0.1:9104 (Host)

Host service (common_mysqld_exporter); connects via TCP to the host MariaDB as user prometheus
Prometheus job: mysqld_nextcloud

php-fpm_exporter
127.0.0.1:30254 (NodePort)

Sidecar in the Nextcloud pod; Prometheus scrapes via NodePort 30254
Prometheus job: php_fpm_nextcloud
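
All scrape targets can be checked in one go via the Prometheus API (local access only; same python3 one-liner style as used above for secrets):

curl -s http://127.0.0.1:9090/api/v1/targets | python3 -c \
  "import sys,json; [print(t['labels']['job'], t['health']) \
   for t in json.load(sys.stdin)['data']['activeTargets']]"
# Every job (mysqld_nextcloud, php_fpm_nextcloud, ...) should report: up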

MariaDB Slow Query Log

The host MariaDB writes slow queries (≥ 1 second) to /var/log/mariadb/slow.log. MariaDB creates the file automatically; it can be read directly on the host.

# Read slow query log live
tail -f /var/log/mariadb/slow.log

# Show slow statements together with their query times
grep -A 3 "Query_time" /var/log/mariadb/slow.log | grep -v "^--$"

# Check log size
ls -lh /var/log/mariadb/slow.log

Configuration: /etc/my.cnf.d/nextcloud-k3s.cnf (slow_query_log = 1, long_query_time = 1)

PHP-FPM Slow Log

PHP-FPM logs requests that take longer than 5 seconds to /proc/1/fd/2 (stderr of the pod's main process). Since ptrace is not available in containers, no stack traces are written – only the request timestamp.

# Monitor slow FPM requests
k3s kubectl logs -n nextcloud deployment/nextcloud -c nextcloud-fpm -f | grep -i "slow"

Configuration: ConfigMap nextcloud-fpm-pool-config (pm.status_path = /fpm-status, slowlog = /proc/1/fd/2, request_slowlog_timeout = 5s)

Grafana Dashboards

Node Exporter Full (ID 1860) – Host Metrics

Shows the health of the entire server in real time.

Panel | What to read
CPU Usage | Total utilization and breakdown per core (user / system / iowait). iowait > 20% indicates a disk bottleneck.
Load Average (1/5/15 min) | System load relative to the number of CPU cores. Sustained above core count → server overloaded.
RAM / Memory | Used RAM, buffers, cache and swap. Swap usage > 0 indicates RAM shortage – Nextcloud/Collabora may need more memory.
Disk I/O | Read and write rate per device, latency. High latency on /srv/nextcloud-www or /data slows uploads/downloads.
Disk Space | Filesystem usage. /data (user files) grows continuously – alert when > 80% used.
Network Traffic | Bytes in/out per interface. Unusual spikes may indicate attacks or data leaks.

MySQL Overview (ID 7362) – MariaDB Database Metrics

Shows the performance of the Nextcloud database (host MariaDB, port 9104).

Panel | What to read
Queries per Second (QPS) | Database activity. Sudden spikes may indicate inefficient queries or attacks.
Slow Queries | Queries taking longer than long_query_time (1 s). Sustained > 0 → missing indices or insufficient hardware.
InnoDB Buffer Pool | Cache utilization and hit ratio. Hit ratio < 95% → buffer pool too small; many reads go to disk (slow).
Connections | Active and maximum connections. Approaching max → raise the connection limit or optimize queries.
Table Locks / Threads Running | Lock contention and parallel queries. Many waiting locks = transaction problem.
Aborted Connections | Interrupted connections (timeouts, network errors). Persistently elevated → check network or app config.

PHP-FPM (ID 4912) – PHP Process Pool

Shows how well PHP-FPM handles incoming Nextcloud requests (K3s pod, NodePort 30254).

Panel | What to read
Active Processes | PHP workers processing simultaneously. If this persistently reaches pm.max_children → enlarge the pool or upgrade the server.
Idle Processes | Free workers. Always 0 plus a non-empty queue = PHP-FPM is overloaded.
Request Queue | Requests waiting for a free worker. Any value > 0 means latency for users.
Requests per Second | Throughput of the PHP backend. Correlates with user numbers and sync clients.
Slow Requests | PHP requests taking longer than request_slowlog_timeout (5 s). Indicates inefficient Nextcloud operations (e.g. large file scans).
Max Active (Peak) | Highest value since FPM started. Helps with sizing pm.max_children.

Grafana Dashboard Variables Are Initialized Automatically
Dashboard 1860 (Node Exporter Full) is in the modern Grafana 11 format without an __inputs section. The variables ds_prometheus, job, nodename and node are therefore set after each import via a Python script in the common_grafana task. Without this step all panels show "No data". If a dashboard is empty: re-run the playbook or manually select the dropdown variables at the top of the dashboard in Grafana.

mysqld_exporter – Check Status

# Service status
systemctl status mysqld_exporter

# Logs
journalctl -u mysqld_exporter -n 50

# Fetch metrics directly (test)
curl -s http://127.0.0.1:9104/metrics | grep mysql_up

Adjust the Grafana Hostname When Switching Scenario
When the server is switched from Nextcloud to WordPress (blog), grafana_proxy_hostname in host_vars/<host>/vars.yml must be changed to the blog hostname – otherwise the /grafana/ path points to the wrong ingress.

Updates

Operating System (AlmaLinux 9)

Security updates are automatically applied daily by dnf-automatic. For a full system update:

  1. ssh -p 10022 root@server
    dnf upgrade -y
  2. Check if a reboot is needed:
    needs-restarting -r
  3. If needed: reboot the server. K3s starts automatically after the reboot, pods come back up on their own.
    reboot
    # Check afterwards:
    k3s kubectl get pods -n nextcloud

Container Update: Patch Within the Same Tag

When Docker Hub updates e.g. nextcloud:33-fpm from 33.0.2 to 33.0.3 and the tag in Ansible remains unchanged:

  1. Pull new image manually:
    crictl pull docker.io/library/nextcloud:33-fpm
  2. Restart pod (now uses the freshly pulled image):
    k3s kubectl rollout restart deployment/nextcloud -n nextcloud
    k3s kubectl rollout status  deployment/nextcloud -n nextcloud --timeout=300s
  3. Ensure DB migrations:
    ansible-playbook nextcloud-k3s.yml --limit test --tags next_config --ask-vault-pass

Container Update: Major Version Jump (e.g. 33 → 34)

Backup before every major update! A failed DB schema upgrade can damage the installation. Always run nextcloud-backup.sh first.
  1. Change tag in inventory/group_vars/all.yml:
    nextcloud_image_fpm: "nextcloud:34-fpm"
    # if applicable also:
    nextcloud_image_collabora: "collabora/code:25.08"
  2. Run playbook – K3s pulls the new image automatically and rolls out the pod:
    ansible-playbook nextcloud-k3s.yml --limit test --ask-vault-pass
  3. The Nextcloud container runs occ upgrade automatically on startup. The playbook run then executes the next_config role (DB indices, WOPI config, app settings).
  4. Check admin panel for warnings:
    k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
      runuser -u www-data -- php /var/www/html/occ setupchecks

Why No Web Updater?
The built-in Nextcloud web updater (admin panel → update button) does not work in container setups. It would attempt to overwrite PHP files in place – but in this setup the app files come from the Docker image. Always update via an image tag change.

Where to Check for New Versions

Component | URL
Nextcloud | https://hub.docker.com/_/nextcloud/tags (filter: *-fpm)
Collabora | https://hub.docker.com/r/collabora/code/tags
nginx | https://hub.docker.com/_/nginx/tags (filter: *-alpine)
nginx-ingress (F5) Helm | https://artifacthub.io/packages/helm/nginx-stable/nginx-ingress
cert-manager Helm | https://artifacthub.io/packages/helm/cert-manager/cert-manager

Backup & Restore

What Is Backed Up

What | Path (server) | Destination (local) | Method
User data | /data | /data/nextcloud/<env>/data/ | rsync
App files | /srv/nextcloud-www | /data/nextcloud/<env>/nextcloud/ | rsync (data/ excluded)
Database | MariaDB <env>_nextcloud (host) | /data/nextcloud/<env>/db.sql | mysqldump --single-transaction

MariaDB Runs on the Host – Not in the Container
The Nextcloud DB user is restricted to 'nextcloud'@'10.42.%' (pod IPs only). mysqldump therefore runs as root via the Unix socket. The credentials are in /root/.my.cnf (created by Ansible). The backup script uses this connection automatically.

Run Backup

The script expects the instance name as an argument. The instance controls which server is contacted and where the backup is stored locally.

Instance | Command | Local destination
Test server | ./scripts/nextcloud-backup.sh test | /data/nextcloud/test/
Sofie | ./scripts/nextcloud-backup.sh sofie | /data/nextcloud/sofie/
CVJM | ./scripts/nextcloud-backup.sh cvjm | /data/nextcloud/cvjm/
# Example: back up test instance
cd ~/www_k3s
./scripts/nextcloud-backup.sh test

# Process (applies to all instances):
# 1. Enable maintenance mode  (kubectl exec occ maintenance:mode --on)
# 2. mysqldump --single-transaction → /data/nextcloud/<env>/db.sql
# 3. rsync /srv/nextcloud-www/      → /data/nextcloud/<env>/nextcloud/
# 4. rsync /data/                   → /data/nextcloud/<env>/data/
# 5. Disable maintenance mode

The script is located in the repository directory scripts/nextcloud-backup.sh. Sofie and CVJM are only usable once IP and hostname are entered in inventory/host_vars/<env>/vars.yml – the script checks this and aborts with an error message if changeme is still present.

Restore Workflow

The restore script also expects the instance name as an argument:

Instance | Command | Source
Test server | ./scripts/nextcloud-restore.sh test | /data/nextcloud/test/
Sofie | ./scripts/nextcloud-restore.sh sofie | /data/nextcloud/sofie/
CVJM | ./scripts/nextcloud-restore.sh cvjm | /data/nextcloud/cvjm/
  1. Set up server (infrastructure) – replace test with instance name:
    ansible-playbook nextcloud-k3s.yml --limit test --ask-vault-pass
  2. Run restore script (overwrites the fresh installation):
    cd ~/www_k3s
    ./scripts/nextcloud-restore.sh test

    Process: enable maintenance → stop pod → rsync app files → rsync user data → SQL import → fix ownership → start pod → files:scan → disable maintenance

  3. Run Ansible again (ensures occ config: trusted_domains, WOPI, apps):
    ansible-playbook nextcloud-k3s.yml --limit test --ask-vault-pass

    This step is mandatory on a hostname change, otherwise recommended.

Manual Individual Commands (Reference)

# Maintenance mode
k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ maintenance:mode --on

# Database dump (on the server)
mysqldump --single-transaction <db_name> > /tmp/nc.sql

# Fix ownership after manual rsync
chown -R 33:33 /srv/nextcloud-www /data
chmod 0755 /srv/nextcloud-www && chmod 0770 /data

# Rebuild file index
k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ files:scan --all

Do Not Change instanceid After a Restore
The instanceid in config.php is set by Nextcloud itself on first start and links the app files with the appdata directory (/data/appdata_<instanceid>/). After a restore from backup it is automatically correct – never overwrite it manually.

Troubleshooting

Pod Does Not Start / Stays in CrashLoopBackOff

k3s kubectl describe pod -l app=nextcloud -n nextcloud
k3s kubectl logs deployment/nextcloud -c nextcloud-fpm -n nextcloud --previous

Typical causes:

  - HostPath ownership is wrong (init-dirs failed or was skipped) – Nextcloud cannot write to /var/www/html
  - MariaDB is unreachable from the pod (missing iptables ACCEPT for 10.42.0.0/16 on :3306, or a wrong endpoint IP)
  - Wrong credentials in the nextcloud-secrets secret (pod environment does not match the DB grant)

Nextcloud Page Loads But CSS/JS Is Missing (Page Looks Broken)

# Check MIME types of assets:
curl -sI "https://nextcloud.example.com/core/css/server.css" | grep content-type
# Must be: text/css

curl -sI "https://nextcloud.example.com/dist/core-main.js" | grep content-type
# Must be: application/javascript

In the past, the cause was a types {} block in the nginx server context that overrode all inherited MIME types. The current configuration uses a dedicated location block for .mjs.

Admin Warning: "Reverse Proxy Header Is Incorrect"

trusted_proxies is missing from custom.config.php or does not contain the K3s pod CIDR (10.42.0.0/16).

k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ config:system:get trusted_proxies
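
In this setup the value is managed via custom.config.php (ConfigMap), so the durable fix is a playbook run. As a stopgap, occ can set the array entry directly (written to config.php; note that a trusted_proxies value in custom.config.php takes precedence, since config files are merged alphabetically):

k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ \
  config:system:set trusted_proxies 0 --value="10.42.0.0/16"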

Collabora: "Unauthorized WOPI Host"

The wopi_allowlist does not contain the pod CIDR (10.42.x.x). Collabora pods connect to Nextcloud WOPI from their pod IP.

k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ \
  config:app:get richdocuments wopi_allowlist

# Fix:
k3s kubectl exec -n nextcloud deployment/nextcloud -c nextcloud-fpm -- \
  runuser -u www-data -- php /var/www/html/occ \
  config:app:set richdocuments wopi_allowlist \
  --value="10.42.0.0/16,10.43.0.0/16,collabora.example.de,82.165.165.230"

Certificate Is Not Issued

# Check cert-manager logs:
k3s kubectl logs -n cert-manager deployment/cert-manager --tail=50

# Check certificate object:
k3s kubectl describe certificate nextcloud-tls -n nextcloud

# ClusterIssuer status:
k3s kubectl describe clusterissuer letsencrypt-prod

Common cause: port 80 is blocked by iptables or the provider. cert-manager requires port 80 for the HTTP-01 challenge.
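
Whether port 80 is reachable from outside can be tested from any external machine (any path under the challenge prefix works; even a 404 status line proves reachability):

curl -sI http://nextcloud.example.de/.well-known/acme-challenge/test | head -1
# No response / timeout → port 80 is blocked; any HTTP status line → port is open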

CronJob Is Not Running

# Status of recent jobs:
k3s kubectl get cronjob nextcloud-cron -n nextcloud
k3s kubectl get jobs -n nextcloud --sort-by=.metadata.creationTimestamp | tail -5

# Logs of the last job pod:
k3s kubectl logs -n nextcloud -l job-name=nextcloud-cron-<id>
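
The CronJob can also be triggered manually to test it (standard kubectl; the job name suffix is arbitrary):

k3s kubectl create job --from=cronjob/nextcloud-cron nextcloud-cron-manual -n nextcloud
k3s kubectl logs -n nextcloud -l job-name=nextcloud-cron-manual --tail=20
k3s kubectl delete job nextcloud-cron-manual -n nextcloud   # clean up afterwards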

Security

Multi-Layered Protection Strategy

Layer | Measure | Role
Host network | iptables IPv4/IPv6 – default policy DROP, port scan blocker, ICMP rate limit, only 80/443/10022 open | common_firewall
SSH | Port 10022, key-only, hardened sshd_config, Fail2Ban | common_ssh, common_fail2ban
HTTP | Login rate limiting (5r/s), HSTS, security headers | common_k3s (nginx-ingress (F5) Helm values + ingress annotations)
Database | User restricted to pod CIDR (10.42.%), MariaDB bind 0.0.0.0 + iptables | next_mariadb, common_firewall
Redis | Bind to localhost + host IP, iptables on pod CIDR | next_redis, common_firewall
SELinux | Enforcing, container_file_t for HostPath volumes | next_selinux
Audit | auditd with hardening rules (sudo, SSH, cron, kernel modules) | common_auditd
Rootkits | rkhunter daily scan at 03:15 | common_rkhunter
Patches | dnf-automatic applies security updates daily | common_dnf_automatic
Secrets | Ansible Vault, K8s secrets (Opaque), no_log on sensitive tasks | all roles

Firewall in Detail

The script /root/fw/fw_ipv4.sh (from common_firewall) is applied via @reboot cron job after every restart. K3s then inserts its own KUBE-* chains at the top of the INPUT/OUTPUT chains.

Feature | Detail
Default policy DROP | INPUT, OUTPUT and FORWARD are set to DROP immediately after the flush – no window during which unprotected traffic passes
Port scan blocker | Any packet that matches no ACCEPT rule adds the source IP to /proc/net/xt_recent/portscan. On the next packet from that IP, --rcheck fires at the top → 24 h block. Check: cat /proc/net/xt_recent/portscan
ICMP rate limit | Echo requests max. 5/s, burst 10 – protects against ICMP floods from many sources. Existing ping sessions continue via ESTABLISHED.
No server_ipv4 whitelist | The former rule -s server_ipv4 -j ACCEPT was removed – it accepted external packets with a spoofed server IP. Local traffic safely goes over loopback (-i lo -j ACCEPT).
K3s OUTPUT CIDRs | OUTPUT explicitly allows traffic to the pod CIDR (10.42.0.0/16) and service CIDR (10.43.0.0/16) – needed for kubelet probes, kube-proxy and the metrics server. Without these rules, K3s-internal port 10250 breaks.
IPv6 DROP policy | ip6tables starts directly with policy DROP, ICMPv6 rate-limited, echo-reply only for ESTABLISHED/RELATED
# Monitor blocked packets live
journalctl -k -f --grep "IPTables-Dropped"

# Show port scan blocklist
cat /proc/net/xt_recent/portscan | awk '{print $1}'

# Reload firewall rules (happens automatically after reboot)
/root/fw/fw_ipv4.sh && /root/fw/fw_ipv6.sh

Important Security Notes

Never Commit vault.yml Unencrypted
Before every git commit, check: ansible-vault encrypt inventory/host_vars/test/vault.yml

MariaDB Binds to 0.0.0.0
MariaDB listens on all interfaces so that pods can connect via the host IP. Protection comes exclusively from iptables (only the pod CIDR is allowed) and the DB user grant ('nextcloud'@'10.42.%'). Never expose port 3306 publicly – verify with the check below.
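
A quick verification on the host (the exact rule wording may differ from the generated firewall script):

# MariaDB listens on all interfaces:
ss -tlnp | grep 3306
# iptables must only ACCEPT the pod CIDR on :3306:
iptables -S INPUT | grep 3306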

StrictHostKeyChecking=no in ansible.cfg
Disables host key verification when establishing SSH connections. Acceptable for test servers with changing IPs, but not recommended for production servers. For production: set StrictHostKeyChecking=accept-new.