This updating seems like it’s a solved problem within CDNs. The updates should be atomic: serve the old file as long as the new one isn’t completely available.
Well, I can only assume that the missing file might lead to an error page. CDNs are designed for files that have a unique filename and are cached at the nodes. However, since DB files never change their names, only their content, we have to purge them on a regular basis. Even a gap of just a millisecond might trigger the error.
There is also a way to pre-fetch important files from the file server. However, due to the high request volume, the CDN explicitly told us to stop pre-fetching and let the system fetch on demand; otherwise our requests effectively DDoS the nodes. That already happened many times with the Arch AUR server in the past, due to the huge number of Manjaro users fetching that file.
Arch updates the file every 5 minutes; we fetch it every 10 minutes and redistribute it through the CDN network. Even there we create a lot of traffic, as already documented. So I can check whether there is a solution on the backend side, or pamac has to jump through hoops.
That’s why people use a “create a temporary file and rename it over the old one” approach in those cases…
(But yeah, true, it’s a lot harder to do that with a CDN distributing from the mirror side, I guess.)
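The rename trick is easy to sketch in shell. This is a minimal illustration with made-up filenames, relying on the fact that mv within one filesystem is an atomic rename(2):

```shell
#!/bin/sh
# Sketch of "write a temp file, then rename it over the old one".
# rename(2) is atomic on POSIX filesystems, so a reader always sees
# either the complete old file or the complete new one.
set -e
db="packages.db"                           # file that clients download (illustrative)
tmp=$(mktemp "${db}.XXXXXX")               # temp file on the SAME filesystem
printf 'new database contents\n' > "$tmp"  # stand-in for the real payload
mv -f "$tmp" "$db"                         # atomic replace
```

The key design point is that the temp file must live on the same filesystem as the target, otherwise mv falls back to copy-and-delete and is no longer atomic.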
Won’t work, as you have to purge the cached file to get the updated one. The file server works as a regular storage system, so we have no issue distributing the normal files via regular rsync. This is the regular way, as we also do with normal mirrors. Whatever happens within the CDN network is out of our control.
- We provide the files to the storage via rsync
- We keep the CERTs up-to-date
- We maintain the cron jobs on our servers that instruct the API to purge the DB files and other files which keep the same name in the server structure
All the magic and quirks happen on the CDN side, which we can trigger with API commands.
Just trying to help, keep that in mind
rsync should already be using the rename-over-old-file technique.
Do you use the --delay-updates option? If not, try that to see if it makes a difference.
(If you already use that option on your end, check with the CDN whether they do too.)
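For reference, --delay-updates stages every transferred file in a hidden .~tmp~/ directory and only renames the staged files into place at the very end of the run, which shrinks the window where a client can see a mismatched set. A minimal local sketch (paths invented, guarded in case rsync is absent):

```shell
#!/bin/sh
# --delay-updates: rsync stages incoming files in per-directory .~tmp~/
# folders and renames them into place only once the whole transfer is done.
set -e
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed"; exit 0; }
mkdir -p src dst
printf 'v2\n' > src/packages.db        # stand-in for the updated DB file
rsync -a --delay-updates src/ dst/
cat dst/packages.db
```

Note this narrows but does not eliminate the inconsistency window across many files; each rename is still a separate operation.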
I’m just wondering what you actually mean by that:
Are you removing the DB-files before you upload a new version?
If so why??
Maybe switch to systemd timers, and stop the timer/service while your side sets things up, then restart it afterwards when you are done.
That way the job in the cron entry won’t have any chance to fire in-between your work.
Just like inhibiting the sleep on a system
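A hypothetical timer unit for the purge job could look like this (unit name and schedule are invented for illustration):

```ini
# /etc/systemd/system/mirror-purge.timer  (hypothetical unit name)
[Unit]
Description=Trigger CDN purge of DB files every 10 minutes

[Timer]
OnCalendar=*:0/10
Persistent=true

[Install]
WantedBy=timers.target
```

Then `systemctl stop mirror-purge.timer` before maintenance and `systemctl start mirror-purge.timer` when done; unlike a cron entry, a stopped timer cannot fire mid-work.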
Again, I’m just trying to help and find out where things go wrong.
Well, the issue is this: if the file doesn’t change its name, like the AUR DBs, you have to purge it from the CDN nodes so they cache the new version. Otherwise the nodes keep serving the stale file, since from their perspective nothing changed. After chatting with support a little, we learned the following:
- they will never show an error page for a file that’s being purged; instead, the CDN will either serve the old version or, once the purge is done, fetch the new one from the origin. There are only those two outcomes
- the error page is only triggered when the file is not cached and the origin returns 404 when fetching the file
- 404 responses are returned through the CNAME in use, with our own certificate, the same one we also use for manjaro.org
They are pretty confident that whatever happened has nothing to do with error pages. That leaves us with changes around the SSL transition, so we are checking any recent changes to the SNI …
Searching the forums for “https://aur.manjaro.org/packages-meta-ext-v1.json.gz” returned several threads. I see the tag “unstable” is applied here; I am on Manjaro stable (KDE).
Error messages thrown by pamac upgrade --force-refresh are not consistent; results vary (at least) between Socket I/O timed out and Unacceptable TLS certificate.
A direct download of the package via browser is fast and fine, no issues or warnings.
Checking the SSL certificate (SSL Server Test: aur.manjaro.org (Powered by Qualys SSL Labs)) takes a long time, more than 85 seconds.
Hope this info can help with finding the cause.
There might be some issue with libsoup3, which was introduced with GNOME 43, and SHA-256 certificates, as @codesardine pointed out once. Similar to this one: PBB radio: Handle SSL certificate error with untrusted certificate (#128) · Issues · Goodvibes / Goodvibes · GitLab
(I have changed your bullets into numbered ones for ease of responding)
I would check what they claim in (1), because what they intend might not be the reality.
(They are human system admins like all of us.)
Because maybe your server returns a proper 404 page (3), but they present the user with their own 404 page (2), because yours returned a 404 code.
In other words their code:
- Tries to fetch the requested resource from their cache, which fails because you purged it (1).
[Not serving the old version]
- Then tries to fetch it from the source but gets a 404-code from your server (3).
- Then returns their own 404 page with bad certificate. (2)
[Not proxying your 404 page]
Sure, they can be “pretty confident” and try to wave away their own fault as a first defensive reaction, but that happens with every problem any customer reports, right?
So I would suggest checking their claims for yourself using test files:
- Upload test files and purge them on the CDN without uploading a new version.
- Remove the test files on your side after they have been uploaded to the CDN, then purge them on the CDN again.
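Those two checks could be run with curl roughly like this; the URL is a placeholder, and the purge itself is provider-specific (typically an API call like the cron jobs mentioned earlier), so substitute your real hostname and purge command:

```shell
#!/bin/sh
# Manual verification of the CDN's claims with a throwaway test file.
# cdn.example.org is a placeholder hostname.
URL="https://cdn.example.org/purge-test.txt"

# Case 1: file still exists at the origin. Purge it, then fetch immediately
# and repeatedly: per claim (1) the status should stay 200, never an error page.
curl -sS -o /dev/null -w 'status=%{http_code}\n' "$URL"

# Case 2: file deleted at the origin, then purged. Per claims (2)/(3) you
# should now see whose 404 page is served, and with which certificate.
curl -sSv -o /dev/null "$URL" 2>&1 | grep -Ei '^\* *subject|^< HTTP'
```

Running case 1 in a tight loop during the purge would also catch any millisecond-sized error window mentioned earlier.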
I just checked the certificate on manjaro.org (visually, in the browser) and did some digging on the internet and found:
Maybe it is related to your main domain name being put in the CN instead of Manjaro GmbH & Co KG, which trips up the implementation that checks the certificates?
(Just thinking out loud here )
No. We only sync from our server to their storage server, from which the nodes fetch the files. Also, the purges happen within the CDN network; we only trigger those from our end via the API. The main domain name shouldn’t be an issue when put into the CN. We have used the certificate on different servers too, and it worked just fine. There are some nodes @guinux found issues with when using pamac as a client. One node is in London, the other I can’t ping …
OK, then the hunt continues, I guess.
There has to be something more. When I posted the message, pamac was consistently giving the error (while curl and Firefox retrieved the same file without complaints). I also know 3 more people around the globe with the same problem. I still have the “problem”, so I don’t think this is just bad luck hitting a server that is in “transition”.
Interestingly, my laptop also presents the same behavior at the same time, so it’s probably something connection-related. I also did some tests with tcpdump and pamac, and at some point I got this error from my router:
ICMP6, time exceeded in-transit for 2a02:6ea0:c500::3, length 92
That address is one of the IPv6 addresses for aur.manjaro.org. I’m not sure whether this means something or is just normal behavior alongside this TLS error.
I had the exact same combination of elements. The TLS error started appearing a short while after I’d had a set of packages failing with “PKGBUILD not found” (a load of dotnet stuff in my case). I ended up doing the following as a workaround.
- I first did the refresh with pamac upgrade --force-refresh - this didn’t work for me.
- I then ran the upgrade with yay - pamac is still moaning about the TLS certificate, but yay doesn’t, and it happily let me upgrade my AUR packages. yay also fixed the “PKGBUILD not found” error (I assume there’s an inter-relationship in the dotnet 7.x upgrade packages that yay can handle but pamac can’t, from reading above).
It would be great if anyone else could verify this as a viable workaround while the bug is being looked at, as I’m still a bit noob-ish here - I’ve been stuck with this error for about a week now.
If you’re happy enough with yay, disable AUR in pamac and let yay handle those packages.
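For reference, AUR support in pamac is controlled by /etc/pamac.conf; leaving the relevant options commented out disables it. Excerpt below, assuming a default config:

```ini
# /etc/pamac.conf -- excerpt
# With EnableAUR commented out, pamac ignores the AUR entirely,
# so yay can manage those packages on its own.
#EnableAUR
#CheckAURUpdates
```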
Never mind, I just went to /var/lib/pacman and deleted the db.lck file, and that resolved it.
Worked for me once before.
Also, what worked for me this time:
sudo pacman -S ca-certificates
Reinstalling this often fixes this error, which also seems to affect other package managers like Flatpak, see htxxx://itsfoss[.]com/unacceptable-tls-certificate-error-linux/ and htxxx://bbs.archlinux[.]org/viewtopic.php?id=185014
No amount of force-refresh fixed this for me; I believe for those where it did, it was just luck. These two solutions most likely fix 90% of occurrences in any given package manager. Maybe some operation performed by the reinstall of ca-certificates should be done by pamac when it encounters this error?
I also don’t recommend using yay for updating packages, since it also interacts with pacman. Pamac is smart and knows what Manjaro needs and which packages need to be switched out during updates; I’ve learned that after breaking a lot of installs by regularly updating with yay.
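For what it’s worth, the useful side effect of reinstalling ca-certificates is that its install hook rebuilds the consolidated trust store; on Arch-based systems the same rebuild can be triggered directly:

```shell
# Rebuild the consolidated system trust store (this is what the
# ca-certificates install hook runs); requires root.
sudo update-ca-trust
```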
Censored the links since I wanted to post references to help out but apparently I’m not allowed to post links…
We looked a little deeper into the issue. It seems our CDN provider staples OCSP responses for both our certificate and theirs. If the client ignores that request, it works (Options: OCSP status request [ignored]); when the other response is presented, it doesn’t match our domain and pamac errors out …
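The stapling behavior can be inspected from any client with openssl; the -status flag asks the server to staple an OCSP response and prints what comes back, so you can check whether the stapled response matches the served certificate (hostname taken from this thread):

```shell
# Request an OCSP staple during the handshake and print the response block.
openssl s_client -connect aur.manjaro.org:443 \
        -servername aur.manjaro.org -status </dev/null \
  | grep -A 5 'OCSP response'
```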
Tried running sudo pacman -S ca-certificates, but it doesn’t solve my issue.
I did it behind a VPN and without it but it’s the same. I can’t update anything from AUR right now.
Is there a way to do this on my local?
If you have this problem, there is nothing you can do right now; the devs have to update pamac. But you can use something like yay if you are in a hurry. The problem will also go away by itself; it can take from a few hours to a few days.