Kyle Swenson | 8d8f654 | 2021-03-15 11:02:55 -0600

To support containers, we now allow multiple instances of devpts filesystem,
such that indices of ptys allocated in one instance are independent of indices
allocated in other instances of devpts.

To preserve backward compatibility, this support for multiple instances is
enabled only if:

        - CONFIG_DEVPTS_MULTIPLE_INSTANCES=y, and
        - '-o newinstance' mount option is specified while mounting devpts

In other words, devpts now supports both single-instance and multi-instance
semantics.

If CONFIG_DEVPTS_MULTIPLE_INSTANCES=n, there is no change in behavior and
this is referred to as the "legacy" mode. In this mode, the new mount options
(-o newinstance and -o ptmxmode) will be ignored with a 'bogus option' message
on the console.

If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and devpts is mounted without the
'newinstance' option (as in current start-up scripts), the new mount binds
to the initial kernel mount of devpts. This mode is referred to as the
'single-instance' mode and the current, single-instance semantics are
preserved, i.e. PTYs are common across the system.

The only difference between this single-instance mode and the legacy mode
is the presence of a new '/dev/pts/ptmx' node with permissions 0000, which
can safely be ignored.

If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and the 'newinstance' option is
specified, the mount is considered to be in the multi-instance mode and a new
instance of the devpts fs is created. Any ptys created in this instance are
independent of ptys in other instances of devpts. As in the single-instance
mode, the /dev/pts/ptmx node is present. To use the multi-instance mode
effectively, opens of /dev/ptmx must be redirected to '/dev/pts/ptmx' using a
symlink or bind-mount.
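
The three modes above depend on only two inputs: the kernel configuration
value and whether 'newinstance' appears among the mount options. The following
is a minimal sketch of that decision, not kernel code; the helper name
devpts_mode and its two arguments are assumptions made for illustration.

```shell
# devpts_mode CONFIG_VALUE MOUNT_OPTIONS   (hypothetical helper, illustration only)
#   CONFIG_VALUE   value of CONFIG_DEVPTS_MULTIPLE_INSTANCES, "y" or "n"
#   MOUNT_OPTIONS  comma-separated mount options, may be empty
# Prints the name of the mode described in the text above.
devpts_mode() {
    config=$1
    opts=$2
    # Without the config option, newinstance is ignored: legacy mode.
    if [ "$config" != "y" ]; then
        echo legacy
        return 0
    fi
    # Wrap the list in commas so "newinstance" only matches as a whole option.
    case ",$opts," in
        *,newinstance,*) echo multi-instance ;;
        *)               echo single-instance ;;
    esac
}
```

For example, 'devpts_mode y newinstance,ptmxmode=0666' reports multi-instance,
while 'devpts_mode n newinstance' reports legacy.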

E.g.: A container startup script could do the following:

        $ chmod 0666 /dev/pts/ptmx
        $ rm /dev/ptmx
        $ ln -s pts/ptmx /dev/ptmx
        $ ns_exec -cm /bin/bash

        # We are now in new container

        $ umount /dev/pts
        $ mount -t devpts -o newinstance lxcpts /dev/pts
        $ sshd -p 1234

where 'ns_exec -cm /bin/bash' calls clone() with CLONE_NEWNS flag and execs
/bin/bash in the child process. A pty created by the sshd is not visible in
the original mount of /dev/pts.
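
One way to observe this isolation is to list the pty indices present in each
mount and compare them. The sketch below assumes a hypothetical helper named
list_ptys; it only reads directory entries and does not create or open any pty.

```shell
# list_ptys [PTS_DIR]   (hypothetical helper, illustration only)
# Prints the numeric pty indices found in PTS_DIR, one per line, in
# ascending order. PTS_DIR defaults to /dev/pts.
list_ptys() {
    dir=${1:-/dev/pts}
    for entry in "$dir"/*; do
        name=${entry##*/}
        case $name in
            ''|*[!0-9]*) ;;                       # skip ptmx and other non-numeric names
            *) [ -e "$entry" ] && echo "$name" ;;  # keep entries that really exist
        esac
    done | sort -n
}
```

Running list_ptys in the original mount and again inside the container should
show two unrelated sets of indices once the container has mounted its own
instance.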

User-space changes
------------------

In multi-instance mode (i.e. the '-o newinstance' mount option is specified at
least once), the following user-space issues should be noted.

1. If the -o newinstance mount option is never used, /dev/pts/ptmx can be
   ignored and no change is needed to system-startup scripts.

2. To use multi-instance mode effectively (i.e. -o newinstance is specified),
   administrators or startup scripts should "redirect" opens of /dev/ptmx to
   /dev/pts/ptmx using either a bind mount or a symlink.

        $ mount -t devpts -o newinstance devpts /dev/pts

   followed by either

        $ rm /dev/ptmx
        $ ln -s pts/ptmx /dev/ptmx
        $ chmod 666 /dev/pts/ptmx
   or
        $ mount -o bind /dev/pts/ptmx /dev/ptmx

3. The '/dev/ptmx -> pts/ptmx' symlink is the preferred method since it
   enables better error-reporting and treats both single-instance and
   multi-instance mounts similarly.

   But this method requires that system-startup scripts set the mode of
   /dev/pts/ptmx correctly (default mode is 0000). The scripts can set the
   mode by either

        - adding the ptmxmode mount option to the devpts entry in /etc/fstab, or
        - using 'chmod 0666 /dev/pts/ptmx'

4. If a multi-instance mount is needed for containers, but the system
   startup scripts have not yet been updated, container-startup scripts
   should bind mount /dev/pts/ptmx onto /dev/ptmx to avoid breaking
   single-instance mounts.

   Or, in general, container-startup scripts should use:

        mount -t devpts -o newinstance -o ptmxmode=0666 devpts /dev/pts
        if [ ! -L /dev/ptmx ]; then
                mount -o bind /dev/pts/ptmx /dev/ptmx
        fi

   When all devpts mounts are multi-instance, /dev/ptmx can permanently be
   a symlink to pts/ptmx and the bind mount is no longer needed.

5. A multi-instance mount that is not accompanied by the /dev/ptmx to
   /dev/pts/ptmx redirection would result in an unusable/unreachable pty.

        mount -t devpts -o newinstance lxcpts /dev/pts

   immediately followed by:

        open("/dev/ptmx")

   would create a pty, say /dev/pts/7, in the initial kernel mount.
   But /dev/pts/7 would be invisible in the new mount.

6. The permissions for the /dev/pts/ptmx node should be specified when
   mounting /dev/pts, using the '-o ptmxmode=%o' mount option (default is
   0000).

        mount -t devpts -o newinstance -o ptmxmode=0644 devpts /dev/pts

   The permissions can later be changed as usual with 'chmod'.

        chmod 666 /dev/pts/ptmx

7. A mount of devpts without the 'newinstance' option results in binding to
   the initial kernel mount. This behavior, while preserving legacy semantics,
   does not provide strict isolation in a container environment. That is, by
   mounting devpts without the 'newinstance' option, a container could
   get visibility into the 'host' or root container's devpts.

   To work around this and have strict isolation, all mounts of devpts,
   including the mount in the root container, should use the newinstance
   option.
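
Items 2, 5 and 6 above can be turned into two small checks for a
container-startup script. The following is a minimal sketch under stated
assumptions: the helper names ptmx_redirected and devpts_mount_opts are
hypothetical, the '-ef' operator of 'test' is assumed to be available in the
shell in use, and the comma-separated option string is assumed to be passed to
mount with a single -o.

```shell
# ptmx_redirected [DEV_DIR]   (hypothetical helper, illustration only)
# Succeeds if DEV_DIR/ptmx resolves to the same node as DEV_DIR/pts/ptmx,
# which should hold for both the symlink and the bind-mount redirection.
# DEV_DIR defaults to /dev.
ptmx_redirected() {
    dev=${1:-/dev}
    # -ef compares device and inode numbers after following symlinks.
    [ -e "$dev/pts/ptmx" ] && [ "$dev/ptmx" -ef "$dev/pts/ptmx" ]
}

# devpts_mount_opts PTMXMODE   (hypothetical helper, illustration only)
# Prints a comma-separated option string for a multi-instance mount,
# or fails without output if PTMXMODE is not an octal number.
devpts_mount_opts() {
    case $1 in
        ''|*[!0-7]*) return 1 ;;
    esac
    echo "newinstance,ptmxmode=$1"
}
```

A script could refuse to start a container when ptmx_redirected fails after
the new instance is mounted, since ptys opened through /dev/ptmx would then
land in the initial kernel mount as described in item 5.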