Add a nicer fallback page to Traefik #115
Reference: coop-cloud/organising#115
Currently, once you've successfully deployed Traefik, unless you've enabled the dashboard (and sometimes, even if you have), the default "is it up" check fails.
Then, all URLs, except the Traefik dashboard (if it's configured) give SSL errors.
Wondering if it's possible to have a default page up as a fallback? Could be tricky with SSL.
Thanks to @timewarp for the report
Yes! Would be such a huge usability win.
First guess: an additional nginx "sidecar" service running inside the traefik stack which holds the root domain with a nice "this is co-op cloud" page (like Cloudron?). This would be disabled if you run `compose.headless.yml` with traefik. It looks like we can set a priority on a rule (see docs), so the root domain could always be overruled by a "real" app?

@decentral1se did you ever get the custom error pages working? I made an attempt at a805f5de26 (not working yet) before finding this. Would be useful to be able to display nicer errors when an app is down...

@mayel didn't, sadly. Please let us know if you get anywhere. Would be amazing to have this sorted.
I also tried & failed. Maybe it'd be easier with Caddy? #388
found a way with caddy to display an error when an app is down, guess it needs to be translated to use with docker labels:
For traefik maybe this plugin can be used: https://github.com/jdel/staticresponse
Here's another attempt using that plugin, not sure what I'm missing to get it working: https://git.coopcloud.tech/coop-cloud/traefik/src/branch/error-messages-attempt/compose.error-pages.yml
Some victories! 🎉
Working:
This is using the same `tarampampam/error-pages` image as Mayel's first attempt.

@mayel, what @decentral1se and I found is that Traefik's custom error pages aren't used for anything unless the error `middleware` is assigned to an entrypoint or a router. This, of course, isn't mentioned in their documentation 🤬

It's possible to add an error `middleware` to the `web-secure` entrypoint (using e.g. `entrypoints.web-secure.http.middlewares` in `file-provider.yml`), BUT this has the very unfortunate side-effect of overriding all app error pages, including ones that are meant to be machine-readable, or convey app-specific useful information.

It feels like Traefik should have a way to say "only override Traefik's own bare-bones default error pages", but I'm pretty sure it doesn't (as per plaintive post on Traefik forums with zero replies in 2 years).
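A minimal sketch of the entrypoint-level setup described above, not the recipe's actual configuration. It assumes Traefik v2 syntax, the Docker provider in Swarm mode, and that `tarampampam/error-pages` serves its pages on port 8080 at `/{status}.html`. The `error-pages` service and middleware names are hypothetical. The entrypoint option is shown in Traefik's static configuration, which is where upstream documents it; adjust if the recipe templates it elsewhere.

```yaml
# --- Traefik static configuration (sketch) ---
entryPoints:
  web-secure:
    address: ":443"
    http:
      middlewares:
        # Hypothetical middleware name, defined by the Docker labels below.
        - error-pages@docker
---
# --- Compose fragment for the error page service (sketch) ---
services:
  error-pages:
    image: tarampampam/error-pages
    deploy:
      labels:
        - "traefik.enable=true"
        # Assumed container port for the error-pages image.
        - "traefik.http.services.error-pages.loadbalancer.server.port=8080"
        # Replace any 5xx response with a page fetched from the service above.
        - "traefik.http.middlewares.error-pages.errors.status=500-599"
        - "traefik.http.middlewares.error-pages.errors.service=error-pages"
        - "traefik.http.middlewares.error-pages.errors.query=/{status}.html"
```

Because the middleware sits on the entrypoint, every matching response passing through `web-secure` is replaced, including errors generated by the apps themselves. That is the side-effect noted above.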
So, I found a custom Traefik plugin called `traefik-error-pages` which allows conditionally overriding errors only if they're blank, hacked it to make showing errors conditional on an error response body matching specified text, and configured it to only override 502s which exactly match "Bad Gateway".

This is a bit of a cursed roundabout way of targeting Traefik's built-in error (there are no Traefik-specific headers or other content which would allow identifying it), but if it matches anything else then it's guaranteed to not be JSON or include any more useful details.
404s are handled using the same low-priority-router approach from Mayel's attempts.
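The low-priority-router approach might look roughly like the fragment below. This is a sketch under assumptions rather than the configuration from the attempts linked above: it uses Traefik v2 rule syntax (the `HostRegexp` form differs in v3, as far as I know), Swarm-style `deploy.labels`, and hypothetical `fallback` router, middleware, and service names.

```yaml
services:
  error-pages:
    image: tarampampam/error-pages
    deploy:
      labels:
        - "traefik.enable=true"
        # Assumed container port for the error-pages image.
        - "traefik.http.services.fallback.loadbalancer.server.port=8080"
        # Catch-all router: matches any host, but only when nothing else does.
        - "traefik.http.routers.fallback.rule=HostRegexp(`{host:.+}`)"
        # Very low priority so any "real" app router always wins.
        - "traefik.http.routers.fallback.priority=1"
        - "traefik.http.routers.fallback.entrypoints=web-secure"
        - "traefik.http.routers.fallback.tls=true"
        - "traefik.http.routers.fallback.service=fallback"
        # Swap error responses from the fallback service for styled pages.
        - "traefik.http.routers.fallback.middlewares=fallback-errors"
        - "traefik.http.middlewares.fallback-errors.errors.status=400-599"
        - "traefik.http.middlewares.fallback-errors.errors.service=fallback"
        - "traefik.http.middlewares.fallback-errors.errors.query=/{status}.html"
```

Here the error middleware is attached only to the catch-all router, so it does not touch responses from routers that belong to deployed apps.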
What's not yet working is showing a nice page instead of SSL errors (e.g. when Traefik is waiting for a newly-deployed app's healthcheck to pass, or when someone visits a random nonexistent domain/subdomain that points to the Co-op Cloud server).
My best suggestion for waiting-for-healthcheck apps is having a separate daemon with access to the Docker socket that listens for `health: starting` apps and creates temporary services with low-priority Traefik routing rules to display errors. I think it might be worth breaking this out into a separate ticket.

I can't think of any reasonable way of solving "random nonexistent domain/subdomain" right now 🤔 Even switching to Caddy wouldn't be a super-easy fix; the project that I'm involved in that switched to using on-demand SSL started facing rate-limiting issues from ZeroSSL / LetsEncrypt pretty quickly, due to folks scanning for subdomains. Another separate ticket for this case, maybe, or declaring defeat on it for now?
Last steps before closing this ticket, maybe:

- `traefik-error-page` changes upstream (@decentral1se pls halp with code review if you have a sec? no idea what I'm doing with golang, etc)
- … (`jdel/staticresponse`)

Mega-thanks @mayel and @decentral1se for helping push on this 💪
Unbelievable plumbing work here @3wordchant 👷♀️
I can find literally 0 docs on this but I thought that traefik will wait for the healthcheck to work (status: healthy) before trying to auto-tls. when the healthcheck is starting, it's 404. when the healthcheck is up and it's auto-tls'in, it's 301? could we also hijack these codes inside the traefik middleware?
Is this not also 301 response hijacking?
I can't believe I'm saying this but I could co-hack on this again one day soon...
As an example, if someone runs `abra app deploy` a new `gitea` instance (the recipe has healthcheck on `app`) and hammers F5 in a browser on https://git.example.com, they'll see:

… `status: healthy`)

Or a new `wallabag` (no healthcheck), hitting https://wallabag.example.com

I'm not aware of a situation where it's 301, unless you mean 301 from `http` to `https`, in which case I think Traefik will currently do that conditionally on all URLs (e.g. http://foobar.coopcloud.tech). Halp? 🤯
Fack this is gruelling. Thanks for that explanation. It does seem like this is just a real time sink and we could try to zoom back out again and do a design sprint on a "web app" portal type app which can provide a lot more information and be the stepping stone to the actual web interface. Because we could speak to traefik via the API or even just inspect the swarm ourselves for information about what's going on... to discuss!
Yeah, this is the only way I can think of to be able to do anything except an SSL error while an app is deploying.
I still think it's worth finishing off this ticket to abolish the unstyled confusing Traefik errors that pop up, though the SSL errors definitely seem like what people are more likely to run into more often.