Make internal networks not overlay networks #62

Closed
opened 2021-05-07 09:26:10 +00:00 by decentral1se · 12 comments
Owner

Apps in stacks shouldn't be able to see other apps in other stacks.

We're suspecting that `internal` is not quite so internal after all.

/cc @3wordchant

decentral1se added this to the (deleted) milestone 2021-05-07 09:26:10 +00:00
decentral1se added the security label 2021-05-07 09:26:10 +00:00
Author
Owner

[image: screenshot]

From https://docs.docker.com/compose/compose-file/compose-file-v3/#driver-1

And...

[image: screenshot]

From https://docs.docker.com/network/bridge/

Makes me think this is what we wanted all along?

```yaml
networks:
  proxy:
    external: true
  internal:
    driver: bridge
```
Owner

> Makes me think this is what we wanted all along?

Sounds legit! Let's try it 👌

*noises of everything breaking*
Author
Owner

Oh sheyat, I saw @roxxers also used this https://docs.docker.com/compose/compose-file/compose-file-v3/#internal in the mastodon configs! Another thing to ponder on!

Author
Owner

I tried:

```yaml
internal:
  driver: bridge
```

And saw:

> failed to create service foo: Error response from daemon: The network foo_internal cannot be used with services. Only networks scoped to the swarm can be used, such as those created with the overlay driver.

`internal: true` did work.
Owner

I ran into this issue with coop-cloud/levelfly -- if it's on the same swarm box with another app that has a `db` container, it eventually loses track of which one it's meant to point at.

From `app`:

```
$ ping db
PING db (10.0.5.2) 56(84) bytes of data.
...
```

In `db`:

```
$ ip addr
...
16975: eth0@if16976: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
    link/ether 02:42:0a:00:05:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.5.3/24 brd 10.0.5.255 scope global eth0
       valid_lft forever preferred_lft forever
...
```

I tried changing the `internal` network to both `internal: true` and `driver: bridge`. Redeploying doesn't seem to do anything on a regular `deploy`; I got the same error message with `driver: bridge` when I did `undeploy` then re-`deploy`. `internal: true` has gone through OK, and I see it showing up as a `bridge` network now. This problem seems to take a while to crop up, so I'll check back in a day or so.

If that fixes it, I recommend we mass-update (and probably redeploy..) all our apps.
Author
Owner

OK nice @3wordchant! I've been low-key using `internal: true` on all things I package since and it has been doing what I think we originally intended. So, this sounds like the move alright. I think the mass-update would have to include a deletion of the unused overlay network to close the loop. Probably some `network ls -f ...` machinery can pull out only the overlay networks for a deletion command in a script.
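A rough sketch of that `network ls -f ...` machinery, assuming the leftover networks are the overlay ones whose names end in `_internal` (the `<stack>_internal` pattern seen in the `foo_internal` error earlier in this thread). The function names are made up for illustration; `--filter driver=overlay` and `--format` are regular `docker network ls` flags. It only prints candidates, so the list can be reviewed before anything is removed.

```shell
#!/bin/sh
# Keep only names that look like a stack's internal network (<stack>_internal).
# The naming pattern is an assumption based on the "foo_internal" error message.
select_internal_networks() {
    grep -E '_internal$' || true   # "|| true": no match is not an error here
}

# List the overlay networks known to this node, one name per line.
list_overlay_networks() {
    docker network ls --filter driver=overlay --format '{{.Name}}'
}

# Print deletion candidates; nothing is removed by this function.
list_stale_internal_networks() {
    list_overlay_networks | select_internal_networks
}
```

After checking the output of `list_stale_internal_networks`, each name could be passed to `docker network rm`. Networks that are still in use by a deployed stack should be refused by the daemon, but undeploying first is the safer order.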
decentral1se changed title from Investigate overlay internal network issue to Use `internal: true` to create bridge networks instead of overlay networks 2021-07-05 12:00:38 +00:00
Author
Owner

OK, I'm gonna try to write a small wrapper script which helps automate these migrations over in https://git.autonomic.zone/coop-cloud/tyop. I've done enough mass typos to know that we'll be doing these as long as this project exists :)
decentral1se self-assigned this 2021-07-06 11:23:37 +00:00
Author
Owner

Done! `tyop` is pretty broken but I think it will do the job 🚀

(I might have missed some, so keep an eye out for this)

https://git.autonomic.zone/coop-cloud/tyop
Author
Owner

Actually, we got this totally wrong. `internal: true` is still making overlay networks and in fact, it means that the container has no access to the internet! Not just the stacks on the other machines. We need to make a bridge network that is isolated from the other stacks but still has internet access. The saga continues. A new mass update coming soon...
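One way to check what a stack's network actually is, rather than going by the compose file, is to ask the daemon: `docker network inspect` reports the driver, the scope, and whether the network is flagged internal. A minimal sketch; `foo_internal` is the placeholder name from the earlier error message and both helper names are hypothetical.

```shell
#!/bin/sh
# Print "<driver> <scope> <internal>" for one network, as reported by the daemon.
describe_network() {
    docker network inspect --format '{{.Driver}} {{.Scope}} {{.Internal}}' "$1"
}

# Turn those three fields into a short explanation.
# Pure string handling, so it can be exercised without Docker.
explain_network() {
    driver=$1 scope=$2 internal=$3
    if [ "$scope" != "swarm" ]; then
        echo "local scope: cannot be used by swarm services"
    elif [ "$internal" = "true" ]; then
        echo "swarm-scoped $driver, internal: no external connectivity"
    else
        echo "swarm-scoped $driver, not internal: external connectivity available"
    fi
}
```

On a swarm node this could be run as `explain_network $(describe_network foo_internal)`, leaving the substitution unquoted on purpose so the three fields become three arguments.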
decentral1se changed title from Use `internal: true` to create bridge networks instead of overlay networks to Make `internal` networks not overlay networks 2021-07-14 15:23:12 +00:00
Author
Owner

> https://docs.docker.com/compose/compose-file/compose-file-v3/#name-1

[image: screenshot]

Maybe!
Author
Owner

Summary is then:

- Docker doesn't give us the stack-level network encapsulation we need without some serious workarounds, which are not workable for us (if new apps get added to the proxy network then Traefik needs to be restarted).
- `internal: true` seems to be required for some apps (levelfly) but not for all. So reverting that mass update seems like the right thing to do now.
- Any container on the proxy network can see the rest and there is not much we can do about that! When we reference services in configs on those proxy networks, we need to do prefixing to avoid namespace conflicts (e.g. `${STACK_NAME}_app` / `{{ env "STACK_NAME" }}_app`).
- We should write out some docs on how we understand our networking setup to be working.

I'll try to squash this and then close this off once and for all 🙈
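The prefixing in the third point can be sketched like this. In a swarm stack, services are named `<stack>_<service>`, so the full name stays unambiguous on the shared proxy network, while a bare name such as `app` or `db` may exist in several stacks at once. The helper and the stack name below are invented for illustration.

```shell
#!/bin/sh
# Build the stack-qualified name of a service, mirroring ${STACK_NAME}_app.
qualified_service_name() {
    # $1 = stack name, $2 = service name inside that stack
    printf '%s_%s\n' "$1" "$2"
}

# Example with a made-up stack name:
STACK_NAME=foo
qualified_service_name "$STACK_NAME" app   # prints "foo_app"
```

Two stacks that both define `app` then end up with distinct targets (`foo_app` and, say, `bar_app`), which is the name a config on the proxy network should point at.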
Author
Owner

Mass revert has taken place!

Annnddd https://docs.coopcloud.tech/networking/.

Reference: coop-cloud/organising#62