Gnome Online Accounts - connection to Nextcloud stops working

I stumbled over this a few times already: contacts or calendar silently not being synchronized.

When I become aware of the problem, by missing out on something or not finding a number I saved to my Nextcloud from a different device, I look at gnome-control-center and find this situation:

I tap Sign In and have to provide the credentials to log in to my Nextcloud anew. After that the account works again for an unknown time before I stumble over that problem again.

I searched for the problem and found a somewhat old posting describing some background.

Thinking back with this knowledge I remember having restarted phosh.service several times on my Librem5.

Also, my desktop running the same version of PureOS has its gnome-shell crashing from time to time. I stay logged in to the system on a tty so that I can restart the desktop by restarting gdm.service when a crash happens.

The situation is comparable except that the phone is running Phosh and the desktop Gnome.

On both systems there is the same systemd setting that might be related to the problem:

$ systemd-analyze cat-config systemd/logind.conf | grep -i killuserprocesses
#KillUserProcesses=no

Working hypothesis:

  • after restarting the gui on the phone gnome-online-accounts has problems talking to the running gnome-keyring

Does that make sense? And if so, why am I not running into the same situation on my notebook (or do I not correctly remember the circumstances of the last gnome crash)?

Any help to solve this would be very much appreciated. Having an up-to-date calendar and phone book is one of the few services on the Librem5 I really need to rely on.

Update:

  • checked the status of my Nextcloud account in gnome-control-center: o.k.
  • logged in to my Librem5 via ssh
  • ran systemctl restart phosh
  • checked my Nextcloud account in gnome-control-center (gcc) again: broken as described above
  • via ssh
    • sudo systemctl stop phosh
    • sudo systemctl stop user@1000.service
    • sudo systemctl stop user-1000.slice
    • got disconnected, because systemd shut down my login
  • after reconnect via ssh:
    • checked that neither goa-daemons nor a keyring is running anymore
    • started phosh without an active login (to avoid anything systemd might have left over from my actual user session) by scheduling it as root and logging out before the job started: root@pureos:~# echo /usr/bin/systemctl start phosh | at now+2m
  • checked again in gcc: Nextcloud account works again

This seems to support the hypothesis that the Nextcloud gnome online account stops working when restarting phosh.

purism@pureos:~$ systemctl status `pgrep gnome-keyring`
Failed to get unit for PID 107225: PID 107225 does not belong to any loaded unit.

purism@pureos:~$ systemctl status `pgrep goa-daemon`
● dbus.service - D-Bus User Message Bus
     Loaded: loaded (/usr/lib/systemd/user/dbus.service; static)
     Active: active (running) since Fri 2023-04-21 10:09:01 CEST; 32min ago
TriggeredBy: ● dbus.socket
[...]

gnome-keyring doesn’t run under the control of systemd, while goa-daemon does (as part of dbus.service). goa-daemon needs to be able to communicate with gnome-keyring to get the credentials of an account.

After restarting phosh.service, gnome-keyring-daemon is running with a new PID while goa-daemon is still running with the same PID.

The phosh.service unit also seems to stop and start gnome-keyring-daemon. The keyring daemon being stopped and started seems to break goa-daemon.
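The PID observation can be turned into a quick check. This is only a minimal sketch, assuming each daemon runs exactly once for the current user; the function names are made up for illustration, and pgrep matches on the process name the same way as in the commands above.

```shell
#!/bin/bash
# Sketch: was gnome-keyring-daemon (re)started after goa-daemon?
# Assumption: exactly one instance of each daemon runs for this user.

# Print how long (in seconds) the oldest process matching the given
# name pattern has been running; fail if there is no such process.
elapsed_seconds() {
	oldest_pid=$(pgrep -o -u "$(id -u)" "$1") || return 1
	ps -o etimes= -p "$oldest_pid" | tr -d ' '
}

# Succeed if the keyring is younger than goa-daemon, i.e. the keyring was
# restarted while goa-daemon kept running.
# Arguments: goa-daemon age, keyring age (both in seconds).
keyring_younger_than_goa() {
	[ "$2" -lt "$1" ]
}

goa_age=$(elapsed_seconds 'goa-daemon' || true)
keyring_age=$(elapsed_seconds 'gnome-keyring' || true)

if [ -n "$goa_age" ] && [ -n "$keyring_age" ]; then
	if keyring_younger_than_goa "$goa_age" "$keyring_age"; then
		echo "keyring was started after goa-daemon: credential lookups may fail"
	else
		echo "goa-daemon was started after the keyring: order looks fine"
	fi
else
	echo "goa-daemon or gnome-keyring-daemon is not running"
fi
```

Running it once before and once after systemctl restart phosh should show the order flipping if the hypothesis holds.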

Update:

gnome-keyring-daemon seems to be started from /etc/xdg/autostart/gnome-keyring-ssh.desktop when phosh is started via phosh.service (I changed the Exec= line to test this, and after restarting phosh the running daemon reflected that change).

Other installations start gnome-keyring-daemon via a systemd user service (Arch) or via pam (gdm).

I do not know what the root cause of the issue is, but the Nextcloud connection is not stable for me either. I wrote about that before here. In my opinion there should first be an up-to-date evolution-data-server before diving deeper into these issues. The current version on the Librem 5 is 3.38.3-1+deb11u1. The latest upstream version is 3.48.x.

Well, same here. But it occurred to me that the last time I was affected by this I had restarted phosh on my phone beforehand. I can’t say for sure whether that caused the incident, but it is a possible cause, because restarting phosh on the phone reproducibly breaks my Nextcloud gnome-online-account and shows me the same error in the GUI.

I’ll keep looking into this.

This much I’d say at this point: it seems wrong to me that restarting a service breaks any function. A systemd service should be designed so that stopping and starting it leaves it running the same way as before, without losing any functionality.

This is not the case with phosh.service. Restarting it breaks my nextcloud account - at least on my phone.

That leaves the following questions:

  • why does it break goa if the gnome-keyring-daemon is restarted?
  • why doesn’t gnome-keyring-daemon run as a user service as well?
    • or why doesn’t goa* stuff restart the same way with phosh?
  • how can the integration be changed to make it work?

New information: it stopped working again for me and I looked briefly into it.

I killed goa-daemon and started it anew and got this message:

goa-daemon-Message: 20:25:09.352: /org/gnome/OnlineAccounts/Accounts/account_1611668450_0: Setting AttentionNeeded to TRUE because EnsureCredentials() failed with: Invalid password with username “purism” (goa-error-quark, 0): Cannot resolve hostname (goa-error-quark, 4)

I tried to resolve the hostname of my Nextcloud and it didn’t resolve. This reminded me that the nameservers in use on a mobile connection, and sometimes later on a cabled or wifi connection as well, are not always what I’d expect.

I switched off the modem via the hardware kill switch (HKS) and tried to resolve my Nextcloud server’s LAN FQDN again: worked.

Looking at gnome-control-center the exclamation mark beside the online account vanished and I got the following message in the window running goa-daemon:

goa-daemon-Message: 20:26:51.320: /org/gnome/OnlineAccounts/Accounts/account_1611668450_0: Setting AttentionNeeded to FALSE because EnsureCredentials() succeded
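The AttentionNeeded flag from these messages can also be read directly over D-Bus, without opening gnome-control-center. A minimal sketch, assuming goa-daemon owns the name org.gnome.OnlineAccounts on the session bus and exposes the flag on the org.gnome.OnlineAccounts.Account interface; the object path is the one from my log and will differ on other systems, and the helper names are only for illustration.

```shell
#!/bin/bash
# Sketch: read the AttentionNeeded flag of a GNOME Online Accounts account.
# The object path below is taken from the log messages; yours will differ.
ACCOUNT_PATH=/org/gnome/OnlineAccounts/Accounts/account_1611668450_0

# busctl prints a boolean property as "b true" or "b false".
# Print just the value, fail on anything else.
parse_bool_property() {
	case "$1" in
		"b true") echo true ;;
		"b false") echo false ;;
		*) return 1 ;;
	esac
}

# Query the property on the session bus of the current user.
attention_needed() {
	prop=$(busctl --user get-property org.gnome.OnlineAccounts "$1" \
		org.gnome.OnlineAccounts.Account AttentionNeeded 2>/dev/null) || return 1
	parse_bool_property "$prop"
}

if state=$(attention_needed "$ACCOUNT_PATH"); then
	echo "AttentionNeeded: $state"
else
	echo "could not query goa-daemon on the session bus"
fi
```

Over ssh this needs the session bus of the logged-in user, so it has to run as that user with XDG_RUNTIME_DIR set.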

In this case it seems that the NetworkManager magic didn’t prefer the nameserver provided by DHCP on the LAN connection over the one that had already been configured for the mobile connection.

I’d need to look into that…

Update 2023-05-17:

NetworkManager (as set in my Librem5) adds DNS received for a newly activated connection to the end of /etc/resolv.conf.

Leaving the connection to my lan leaves the nameservers of my mobile provider as the only ones in /etc/resolv.conf.

Coming back to my lan (connecting to wifi or ethernet through my dock) the nameserver for my lan gets added to the end of /etc/resolv.conf - below the nameserver of my mobile provider.

When I try to resolve the hostname of my Nextcloud server, the Librem5 asks the first nameserver in /etc/resolv.conf (mobile provider), which answers that the host doesn’t exist. Because that is a definitive answer, the lan nameserver further down is never asked.
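To see which nameserver gets asked first, the order in /etc/resolv.conf can be printed with a few lines of shell. This is just a sketch for inspecting the file; the function names are made up for illustration.

```shell
#!/bin/bash
# Sketch: show the nameservers of a resolv.conf-style file in lookup order.

# Print one nameserver address per line, in the order they are listed.
nameservers_in_order() {
	awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# The first listed nameserver is the one that gets asked first.
first_nameserver() {
	nameservers_in_order "$1" | head -n 1
}

if [ -r /etc/resolv.conf ]; then
	echo "lookup order:"
	nameservers_in_order /etc/resolv.conf
	echo "asked first: $(first_nameserver /etc/resolv.conf)"
fi
```

Run once on the mobile connection only and once after coming back to the lan, this makes it visible whether the lan nameserver ended up below the one of the mobile provider.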

I’ll try to solve this issue by adding information about nameserver priority or domains the nameserver should be used for on the connection profile for my wifi.
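What I have in mind could look roughly like the following. This is an untested sketch: the profile name and the search domain are hypothetical placeholders, and it assumes that ipv4.dns-priority (a lower value means a higher priority) and ipv4.dns-search are honoured by the way NetworkManager writes /etc/resolv.conf on the phone. The script only prints the nmcli calls, it changes nothing by itself.

```shell
#!/bin/bash
# Sketch: prefer the DNS servers of one NetworkManager connection profile.
# PROFILE and SEARCH_DOMAIN are hypothetical placeholders.
PROFILE="${PROFILE:-my-home-wifi}"
SEARCH_DOMAIN="${SEARCH_DOMAIN:-lan.example}"

# Print the nmcli calls for review; nothing is changed unless you run them.
dns_preference_commands() {
	# A lower ipv4.dns-priority value should sort this connection's
	# nameservers first (assumption: other profiles keep their default).
	printf 'nmcli connection modify "%s" ipv4.dns-priority 10\n' "$1"
	# Search domains handed out for this connection.
	printf 'nmcli connection modify "%s" ipv4.dns-search "%s"\n' "$1" "$2"
	# Re-activate the profile so the changed settings are applied.
	printf 'nmcli connection up "%s"\n' "$1"
}

dns_preference_commands "$PROFILE" "$SEARCH_DOMAIN"
```

As far as I understand it, with a plain /etc/resolv.conf the search domain only fills the search list and does not route queries per domain, so the priority setting is the part that matters here.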

If this doesn’t work, I know that using dnsmasq to resolve hostnames locally does. But I’d prefer not to run another service.


Just remembered this thread.

I tried to put a new appointment into my calendar on the L5 running Byzantium and couldn’t - same error messages when I got home and looked into it.

I took the following approach:

a script to act on error messages found in the journal

purism@pureos:~/bin$ cat monitor-goa-daemon.sh 
#!/bin/bash
# 2024-08-23 goa-daemon sometimes stops working and the calendar doesn't sync nor save new entries
# Aug 23 09:37:33 pureos goa-daemon[1198]: /org/gnome/OnlineAccounts/Accounts/account_1611668450_0: Setting AttentionNeeded to TRUE because EnsureCredentials() failed with: Failed to retrieve credentials from the keyring (goa-error-quark, 4)
# all matching journal lines arrive as one argument from monitor-goa-daemon.service
tag="$(basename "$0")"
logger -t "$tag" "started: $*"
if grep -q 'Failed to retrieve credentials from the keyring' <<<"$*"; then
	# goa-daemon could not get the credentials from the keyring: replace the running daemon
	notify-send "$tag: keyring error" "executing goa-daemon --replace"
	logger -t "$tag" "keyring error: executing goa-daemon --replace"
	nohup /usr/libexec/goa-daemon --replace >/dev/null &
else
	notify-send "$tag: unknown" "$*"
	logger -t "$tag" "unknown: $*"
fi
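The matching logic can be tried out without touching the running daemon. A small sketch with a made-up helper name, fed with the two error messages quoted in this thread (shortened to the relevant part):

```shell
#!/bin/bash
# Sketch: classify a goa-daemon journal message the way the script's `if` does.
classify_goa_message() {
	case "$1" in
		*'Failed to retrieve credentials from the keyring'*) echo keyring ;;
		*) echo unknown ;;
	esac
}

# Keyring failure: this is the case that triggers goa-daemon --replace.
classify_goa_message 'EnsureCredentials() failed with: Failed to retrieve credentials from the keyring (goa-error-quark, 4)'

# DNS failure from earlier in the thread: ends up in the "unknown" branch.
classify_goa_message 'EnsureCredentials() failed with: Invalid password with username “purism” (goa-error-quark, 0): Cannot resolve hostname (goa-error-quark, 4)'
```

So the name resolution problem only produces a notification and a log entry, and goa-daemon is left alone in that case.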

a systemd user service to start the script

purism@pureos:~$ cat .config/systemd/user/monitor-goa-daemon.service 
[Unit]
Description=act on goa-daemon errors found in journalctl
RequiresMountsFor=/run/user/1000

[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'if J_MSGS=$(journalctl --no-pager -b0 --cursor-file=${XDG_RUNTIME_DIR}/monitor-goa-daemon-journal-cursor -g "Setting AttentionNeeded to TRUE" GLIB_DOMAIN=goa-daemon); then /home/purism/bin/monitor-goa-daemon.sh "$J_MSGS"; fi'
Nice=19
IOSchedulingClass=best-effort
IOSchedulingPriority=7

The journalctl call stores a cursor marking the position up to which the journal has been read, so each run only sees new messages. The cursor file lives on a tmpfs and vanishes at shutdown; on the first run after boot the call reads the journal from the start of the current boot and sets a new cursor.

a timer to start the script regularly

purism@pureos:~$ cat .config/systemd/user/monitor-goa-daemon.timer 
[Unit]
Description=Monitor goa-daemon

[Timer]
OnCalendar=daily
Persistent=true
OnStartupSec=3m
OnUnitActiveSec=5m

[Install]
WantedBy=timers.target

activated those changes

# run as your user (probably `purism`)
systemctl --user daemon-reload
systemctl --user enable monitor-goa-daemon.timer
# monitor-goa-daemon.service has no [Install] section and is started by the timer, so it needs no enable
systemctl --user start monitor-goa-daemon.timer

experiences

None. The phone works nice and smooth like before and the error didn’t occur, yet:

  • no problems using the calendar
  • no notification hinting that the script had been running

Update: The script fired a few times and --replaced the goa-daemon. I didn’t have problems using the calendar since I configured this. I updated the logging in the script which is reflected above. @janvlug: Still having the same problem?
