Enforce per-user quota on /tmp/ and /dev/shm/ as user logs in (#36010)

There's finally quota on tmpfs, hence let's use it to make it harder for
users to DoS the system by consuming all disk space in /tmp/ and
/dev/shm/.

This enforces a default quota of 80% of the backing fs on these two dirs
for regular users, but this can be overridden in the user record, if
desired.

This also adds two other interesting features:

1. mount units gain GracefulOptions= which takes optional mount options
that are added only if supported by the kernel. (this is used to enable
usrquota on /tmp/, if available.)
2. The PAM logic in service management now supports reading passwords
from service credentials and via the askpw logic. This is used to make
testing easy (so that we can run0 into a homed user which strictly
requires a password).
Lennart Poettering
2025-01-24 12:52:27 +01:00
committed by GitHub
13 changed files with 445 additions and 70 deletions

README

@@ -61,9 +61,11 @@ REQUIREMENTS:
≥ 5.9 for close_range()
≥ 5.12 for idmapped mount
≥ 5.14 for cgroup.kill
≥ 5.14 for quotactl_fd()
≥ 6.3 for MFD_EXEC/MFD_NOEXEC_SEAL and tmpfs noswap option
≥ 6.5 for name_to_handle_at() AT_HANDLE_FID, SO_PEERPIDFD/SO_PASSPIDFD,
and MOVE_MOUNT_BENEATH
≥ 6.6 for quota support on tmpfs
≥ 6.9 for pidfs
✅ systemd utilizes several new kernel APIs, but will fall back gracefully

TODO

@@ -304,10 +304,6 @@ Features:
* pcrlock: add support for multi-profile UKIs
* logind: when logging in use new tmpfs quota support to configure quota on
/tmp/ + /dev/shm/. But do so only in case of tmpfs, because otherwise quota
is persistent and any persistent settings mean we don't have to reapply them.
* initrd: when transitioning from initrd to host, validate that
/lib/modules/`uname -r` exists, refuse otherwise
@@ -1480,8 +1476,6 @@ Features:
* rework recursive read-only remount to use new mount API
* PAM: pick up authentication token from credentials
* when mounting disk images: if IMAGE_ID/IMAGE_VERSION is set in os-release
data in the image, make sure the image filename actually matches this, so
that images cannot be misused.
@@ -1548,7 +1542,6 @@ Features:
- pass creds via keyring?
- pass creds via memfd?
- acquire + decrypt creds from pkcs11?
- make PAMName= acquire pw via creds logic
- make macsec code in networkd read key via creds logic (copy logic from
wireguard)
- make gatewayd/remote read key via creds logic
@@ -2414,7 +2407,6 @@ Features:
- maybe make automatic, read-only, time-based reflink-copies of LUKS disk
images (and btrfs snapshots of subvolumes) (think: time machine)
- distinguish destroy / remove (i.e. currently we can unregister a user, unregister+remove their home directory, but not just remove their home directory)
- in systemd's PAMName= logic: query passwords with ssh-askpassword, so that we can make "loginctl set-linger" mode work
- fingerprint authentication, pattern authentication, …
- make sure "classic" user records can also be managed by homed
- make size of $XDG_RUNTIME_DIR configurable in user record


@@ -619,6 +619,19 @@ is allowed to edit.
`selfModifiablePrivileged` → Similar to `selfModifiableFields`, but it lists fields in
the `privileged` section that the user is allowed to edit.
`tmpLimit` → A numeric value encoding a disk quota limit in bytes enforced on
`/tmp/` on login, in case it is backed by a volatile file system (such as
`tmpfs`).
`tmpLimitScale` → Similar, but encodes a relative value, normalized to
`UINT32_MAX` as 100%. This value is applied relative to the file system
size. If both `tmpLimit` and `tmpLimitScale` are set, the lower of the two
should be enforced. If neither field is set the implementation might apply a
default limit.
`devShmLimit`, `devShmLimitScale` → Similar to the previous two, but apply to
`/dev/shm/` rather than `/tmp/`.
`privileged` → An object, which contains the fields of the `privileged` section
of the user record, see below.
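
The interplay of the absolute and the relative field documented above can be sketched as follows. This is only an illustration, not the code from this commit: `scale_from_percent()` and `effective_limit_bytes()` are hypothetical helpers, the file system size is supplied by the caller, and `UINT64_MAX` is assumed to stand for "no absolute limit".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a percentage as a scale in which UINT32_MAX means 100%. */
static uint32_t scale_from_percent(unsigned percent) {
        assert(percent <= 100);
        return (uint32_t) (((uint64_t) UINT32_MAX * percent) / 100);
}

/* Hypothetical helper: resolve the limit in bytes for a file system of 'fs_size' bytes.
 * Passing UINT64_MAX as 'limit' is assumed to mean "no absolute limit". */
static uint64_t effective_limit_bytes(uint64_t fs_size, uint64_t limit, uint32_t scale) {
        uint64_t relative;

        if (scale == UINT32_MAX)
                relative = UINT64_MAX; /* 100%: the relative field imposes no restriction */
        else
                /* Floating point avoids the 64-bit overflow that fs_size * scale could hit */
                relative = (uint64_t) ((double) fs_size * scale / UINT32_MAX);

        return relative < limit ? relative : limit; /* the lower of the two is enforced */
}
```

With a 1 GiB file system, an absolute limit of 4096 bytes and a scale of 80%, the absolute limit is the lower one and hence the one that applies.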
@@ -761,22 +774,26 @@ These two are the only two fields specific to this section.
All other fields that may be used in this section are identical to the equally named ones in the
`regular` section (i.e. at the top-level object). Specifically, these are:
`blobDirectory`, `blobManifest`, `iconName`, `location`, `shell`, `umask`, `environment`, `timeZone`,
`preferredLanguage`, `additionalLanguages`, `niceLevel`, `resourceLimits`, `locked`, `notBeforeUSec`,
`notAfterUSec`, `storage`, `diskSize`, `diskSizeRelative`, `skeletonDirectory`,
`accessMode`, `tasksMax`, `memoryHigh`, `memoryMax`, `cpuWeight`, `ioWeight`,
`blobDirectory`, `blobManifest`, `iconName`, `location`, `shell`, `umask`,
`environment`, `timeZone`, `preferredLanguage`, `additionalLanguages`,
`niceLevel`, `resourceLimits`, `locked`, `notBeforeUSec`, `notAfterUSec`,
`storage`, `diskSize`, `diskSizeRelative`, `skeletonDirectory`, `accessMode`,
`tasksMax`, `memoryHigh`, `memoryMax`, `cpuWeight`, `ioWeight`,
`mountNoDevices`, `mountNoSuid`, `mountNoExecute`, `cifsDomain`,
`cifsUserName`, `cifsService`, `cifsExtraMountOptions`, `imagePath`, `uid`,
`gid`, `memberOf`, `fileSystemType`, `partitionUuid`, `luksUuid`,
`fileSystemUuid`, `luksDiscard`, `luksOfflineDiscard`, `luksCipher`,
`luksCipherMode`, `luksVolumeKeySize`, `luksPbkdfHashAlgorithm`,
`luksPbkdfType`, `luksPbkdfForceIterations`, `luksPbkdfTimeCostUSec`, `luksPbkdfMemoryCost`,
`luksPbkdfParallelThreads`, `luksSectorSize`, `autoResizeMode`, `rebalanceWeight`,
`rateLimitIntervalUSec`, `rateLimitBurst`, `enforcePasswordPolicy`,
`autoLogin`, `preferredSessionType`, `preferredSessionLauncher`, `stopDelayUSec`, `killProcesses`,
`luksPbkdfType`, `luksPbkdfForceIterations`, `luksPbkdfTimeCostUSec`,
`luksPbkdfMemoryCost`, `luksPbkdfParallelThreads`, `luksSectorSize`,
`autoResizeMode`, `rebalanceWeight`, `rateLimitIntervalUSec`, `rateLimitBurst`,
`enforcePasswordPolicy`, `autoLogin`, `preferredSessionType`,
`preferredSessionLauncher`, `stopDelayUSec`, `killProcesses`,
`passwordChangeMinUSec`, `passwordChangeMaxUSec`, `passwordChangeWarnUSec`,
`passwordChangeInactiveUSec`, `passwordChangeNow`, `pkcs11TokenUri`,
`fido2HmacCredential`, `selfModifiableFields`, `selfModifiableBlobs`, `selfModifiablePrivileged`.
`fido2HmacCredential`, `selfModifiableFields`, `selfModifiableBlobs`,
`selfModifiablePrivileged`, `tmpLimit`, `tmpLimitScale`, `devShmLimit`,
`devShmLimitScale`.
## Fields in the `binding` section


@@ -758,6 +758,22 @@
<xi:include href="version-info.xml" xpointer="v245"/></listitem>
</varlistentry>
<varlistentry>
<term><option>--tmp-limit=<replaceable>BYTES</replaceable></option></term>
<term><option>--tmp-limit=<replaceable>PERCENT</replaceable></option></term>
<term><option>--dev-shm-limit=<replaceable>BYTES</replaceable></option></term>
<term><option>--dev-shm-limit=<replaceable>PERCENT</replaceable></option></term>
<listitem><para>Controls the per-user quota on <filename>/tmp/</filename> and
<filename>/dev/shm/</filename> that is applied when the user logs in. Takes either an absolute value
in bytes (with the usual K, M, G, T suffixes to the base of 1024), or a percentage. In the latter
case the limit is applied relative to the size of the respective file system. This limit is only
applied if the relevant file system is <literal>tmpfs</literal> and has no effect otherwise. Note
that if these options are not used, a default quota might still be enforced (typically 80%).</para>
<xi:include href="version-info.xml" xpointer="v258"/></listitem>
</varlistentry>
<varlistentry>
<term><option>--storage=<replaceable>STORAGE</replaceable></option></term>


@@ -42,12 +42,13 @@
<citerefentry><refentrytitle>systemd.special</refentrytitle><manvolnum>7</manvolnum></citerefentry> for a
list of units that form the basis of the unit hierarchies of system and user units.</para>
<para><filename>user@<replaceable>UID</replaceable>.service</filename> is accompanied by the
system unit <filename>user-runtime-dir@<replaceable>UID</replaceable>.service</filename>, which
creates the user's runtime directory
<filename>/run/user/<replaceable>UID</replaceable></filename>, and then removes it when this
unit is stopped. <filename>user-runtime-dir@<replaceable>UID</replaceable>.service</filename>
executes the <filename>systemd-user-runtime-dir</filename> binary to do the actual work.</para>
<para><filename>user@<replaceable>UID</replaceable>.service</filename> is accompanied by the system unit
<filename>user-runtime-dir@<replaceable>UID</replaceable>.service</filename>, which creates the user's
runtime directory <filename>/run/user/<replaceable>UID</replaceable></filename> when started, and removes
it when it is stopped. It also might apply runtime quota settings on <filename>/tmp/</filename> and/or
<filename>/dev/shm/</filename> for the
user. <filename>user-runtime-dir@<replaceable>UID</replaceable>.service</filename> executes the
<filename>systemd-user-runtime-dir</filename> binary to do the actual work.</para>
<para>User processes may be started by the <filename>user@.service</filename> instance, in which
case they will be part of that unit in the system hierarchy. They may also be started elsewhere,


@@ -9,6 +9,9 @@
int parse_devnum(const char *s, dev_t *ret);
#define DEVNUM_MAJOR_MAX ((UINT32_C(1) << 12) - 1U)
#define DEVNUM_MINOR_MAX ((UINT32_C(1) << 20) - 1U)
/* glibc and the Linux kernel have different ideas about the major/minor size. These calls will check whether the
* specified major is valid by the Linux kernel's standards, not by glibc's. Linux has 20bits of minor, and 12 bits of
* major space. See MINORBITS in linux/kdev_t.h in the kernel sources. (If you wonder why we define _y here, instead of
@@ -18,14 +21,14 @@ int parse_devnum(const char *s, dev_t *ret);
#define DEVICE_MAJOR_VALID(x) \
({ \
typeof(x) _x = (x), _y = 0; \
_x >= _y && _x < (UINT32_C(1) << 12); \
_x >= _y && _x <= DEVNUM_MAJOR_MAX; \
\
})
#define DEVICE_MINOR_VALID(x) \
({ \
typeof(x) _x = (x), _y = 0; \
_x >= _y && _x < (UINT32_C(1) << 20); \
_x >= _y && _x <= DEVNUM_MINOR_MAX; \
})
int device_path_make_major_minor(mode_t mode, dev_t devnum, char **ret);
@@ -54,3 +57,6 @@ static inline char *format_devnum(dev_t d, char buf[static DEVNUM_STR_MAX]) {
static inline bool devnum_is_zero(dev_t d) {
return major(d) == 0 && minor(d) == 0;
}
#define DEVNUM_TO_PTR(u) ((void*) (uintptr_t) (u))
#define PTR_TO_DEVNUM(p) ((dev_t) ((uintptr_t) (p)))
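
The two conversion macros exist so that a device number can be stored directly as a key in a pointer-based set. A rough sketch of that idea, under the assumption of a plain array instead of a real set type; `seen_before()` is a hypothetical helper and the macros are repeated only so that the snippet compiles on its own:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Same definitions as in the hunk above, repeated so that the sketch is self-contained. */
#define DEVNUM_TO_PTR(u) ((void*) (uintptr_t) (u))
#define PTR_TO_DEVNUM(p) ((dev_t) ((uintptr_t) (p)))

/* Hypothetical helper: returns 1 if 'd' was recorded in 'seen' before, otherwise
 * records it and returns 0. A linear array stands in for a proper set here. */
static int seen_before(void **seen, size_t *n_seen, size_t capacity, dev_t d) {
        for (size_t i = 0; i < *n_seen; i++)
                if (PTR_TO_DEVNUM(seen[i]) == d)
                        return 1;

        assert(*n_seen < capacity);
        seen[(*n_seen)++] = DEVNUM_TO_PTR(d);
        return 0;
}
```

The round trip is lossless as long as the device number fits into a pointer. That holds on 64-bit systems, and it should also hold for device numbers within the kernel's 12-bit major and 20-bit minor range on 32-bit systems, though that is an assumption worth verifying for the target platform.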


@@ -2830,6 +2830,9 @@ static int help(int argc, char *argv[], void *userdata) {
" --memory-max=BYTES Set maximum memory limit\n"
" --cpu-weight=WEIGHT Set CPU weight\n"
" --io-weight=WEIGHT Set IO weight\n"
" --tmp-limit=BYTES|PERCENT Set limit on /tmp/\n"
" --dev-shm-limit=BYTES|PERCENT\n"
" Set limit on /dev/shm/\n"
"\n%4$sStorage User Record Properties:%5$s\n"
" --storage=STORAGE Storage type to use (luks, fscrypt, directory,\n"
" subvolume, cifs)\n"
@@ -2978,6 +2981,8 @@ static int parse_argv(int argc, char *argv[]) {
ARG_PROMPT_NEW_USER,
ARG_AVATAR,
ARG_LOGIN_BACKGROUND,
ARG_TMP_LIMIT,
ARG_DEV_SHM_LIMIT,
};
static const struct option options[] = {
@@ -3078,6 +3083,8 @@ static int parse_argv(int argc, char *argv[]) {
{ "blob", required_argument, NULL, 'b' },
{ "avatar", required_argument, NULL, ARG_AVATAR },
{ "login-background", required_argument, NULL, ARG_LOGIN_BACKGROUND },
{ "tmp-limit", required_argument, NULL, ARG_TMP_LIMIT },
{ "dev-shm-limit", required_argument, NULL, ARG_DEV_SHM_LIMIT },
{}
};
@@ -4511,6 +4518,56 @@ static int parse_argv(int argc, char *argv[]) {
break;
}
case ARG_TMP_LIMIT:
case ARG_DEV_SHM_LIMIT: {
const char *field =
c == ARG_TMP_LIMIT ? "tmpLimit" :
c == ARG_DEV_SHM_LIMIT ? "devShmLimit" : NULL;
const char *field_scale =
c == ARG_TMP_LIMIT ? "tmpLimitScale" :
c == ARG_DEV_SHM_LIMIT ? "devShmLimitScale" : NULL;
assert(field);
assert(field_scale);
if (isempty(optarg)) {
r = drop_from_identity(field);
if (r < 0)
return r;
r = drop_from_identity(field_scale);
if (r < 0)
return r;
break;
}
r = parse_permyriad(optarg);
if (r < 0) {
uint64_t u;
r = parse_size(optarg, 1024, &u);
if (r < 0)
return log_error_errno(r, "Failed to parse %s/%s parameter: %s", field, field_scale, optarg);
r = sd_json_variant_set_field_unsigned(&arg_identity_extra, field, u);
if (r < 0)
return log_error_errno(r, "Failed to set %s field: %m", field);
r = drop_from_identity(field_scale);
if (r < 0)
return r;
} else {
r = sd_json_variant_set_field_unsigned(&arg_identity_extra, field_scale, UINT32_SCALE_FROM_PERMYRIAD(r));
if (r < 0)
return log_error_errno(r, "Failed to set %s field: %m", field_scale);
r = drop_from_identity(field);
if (r < 0)
return r;
}
break;
}
case '?':
return -EINVAL;
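
The option handling above decides between the relative and the absolute field based on how the argument parses. A simplified, self-contained sketch of that decision follows. `parse_limit_arg()` and `ParsedLimit` are hypothetical stand-ins, not systemd APIs. Unlike the real parsers, this sketch accepts only whole percentages and the K, M, G, T suffixes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: either a scale (UINT32_MAX is 100%) or a size in bytes. */
typedef struct ParsedLimit {
        bool is_scale;
        uint64_t value;
} ParsedLimit;

/* Hypothetical parser: "80%" yields a scale, "50K" or "4096" yields bytes. */
static int parse_limit_arg(const char *s, ParsedLimit *ret) {
        char *end = NULL;
        unsigned long long n;
        unsigned shift = 0;

        if (!s || *s < '0' || *s > '9')
                return -EINVAL;

        errno = 0;
        n = strtoull(s, &end, 10);
        if (errno != 0)
                return -errno;

        if (strcmp(end, "%") == 0) {
                if (n > 100)
                        return -ERANGE;
                *ret = (ParsedLimit) { .is_scale = true, .value = (uint64_t) UINT32_MAX * n / 100 };
                return 0;
        }

        if (*end != '\0') {
                const char *suffixes = "KMGT", *p = strchr(suffixes, *end);
                if (!p || end[1] != '\0')
                        return -EINVAL;
                shift = 10 * (unsigned) (p - suffixes + 1); /* K=2^10, M=2^20, ... */
        }
        if (shift > 0 && n > (UINT64_MAX >> shift))
                return -ERANGE;

        *ret = (ParsedLimit) { .is_scale = false, .value = (uint64_t) n << shift };
        return 0;
}
```

A caller would then store the value in the matching field and drop the other one, mirroring what the hunk does with `field` and `field_scale`.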


@@ -8,15 +8,20 @@
#include "bus-error.h"
#include "bus-locator.h"
#include "dev-setup.h"
#include "devnum-util.h"
#include "fd-util.h"
#include "format-util.h"
#include "fs-util.h"
#include "label-util.h"
#include "limits-util.h"
#include "main-func.h"
#include "missing_magic.h"
#include "missing_syscall.h"
#include "mkdir-label.h"
#include "mount-util.h"
#include "mountpoint-util.h"
#include "path-util.h"
#include "quota-util.h"
#include "rm-rf.h"
#include "selinux-util.h"
#include "smack-util.h"
@@ -24,6 +29,7 @@
#include "string-util.h"
#include "strv.h"
#include "user-util.h"
#include "userdb.h"
static int acquire_runtime_dir_properties(uint64_t *ret_size, uint64_t *ret_inodes) {
_cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
@@ -92,39 +98,58 @@ static int user_mkdir_runtime_path(
uid, gid, runtime_dir_size, runtime_dir_inodes,
mac_smack_use() ? ",smackfsroot=*" : "");
_cleanup_free_ char *d = strdup(runtime_path);
if (!d)
return log_oom();
r = mkdir_label(runtime_path, 0700);
if (r < 0 && r != -EEXIST)
return log_error_errno(r, "Failed to create %s: %m", runtime_path);
_cleanup_(rmdir_and_freep) char *destroy = TAKE_PTR(d); /* auto-destroy */
r = mount_nofollow_verbose(LOG_DEBUG, "tmpfs", runtime_path, "tmpfs", MS_NODEV|MS_NOSUID, options);
if (r < 0) {
if (!ERRNO_IS_PRIVILEGE(r)) {
log_error_errno(r, "Failed to mount per-user tmpfs directory %s: %m", runtime_path);
goto fail;
}
if (!ERRNO_IS_PRIVILEGE(r))
return log_error_errno(r, "Failed to mount per-user tmpfs directory %s: %m", runtime_path);
log_debug_errno(r,
"Failed to mount per-user tmpfs directory %s.\n"
"Assuming containerized execution, ignoring: %m", runtime_path);
r = chmod_and_chown(runtime_path, 0700, uid, gid);
if (r < 0) {
log_error_errno(r, "Failed to change ownership and mode of \"%s\": %m", runtime_path);
goto fail;
}
if (r < 0)
return log_error_errno(r, "Failed to change ownership and mode of \"%s\": %m", runtime_path);
}
destroy = mfree(destroy); /* deactivate auto-destroy */
r = label_fix(runtime_path, 0);
if (r < 0)
log_warning_errno(r, "Failed to fix label of \"%s\", ignoring: %m", runtime_path);
}
return 0;
}
fail:
/* Try to clean up, but ignore errors */
(void) rmdir(runtime_path);
return r;
static int do_mount(UserRecord *ur) {
int r;
assert(ur);
if (!uid_is_valid(ur->uid) || !gid_is_valid(ur->gid))
return log_error_errno(SYNTHETIC_ERRNO(ENOMSG), "User '%s' lacks UID or GID, refusing.", ur->user_name);
uint64_t runtime_dir_size, runtime_dir_inodes;
r = acquire_runtime_dir_properties(&runtime_dir_size, &runtime_dir_inodes);
if (r < 0)
return r;
char runtime_path[STRLEN("/run/user/") + DECIMAL_STR_MAX(uid_t)];
xsprintf(runtime_path, "/run/user/" UID_FMT, ur->uid);
log_debug("Will mount %s owned by "UID_FMT":"GID_FMT, runtime_path, ur->uid, ur->gid);
return user_mkdir_runtime_path(runtime_path, ur->uid, ur->gid, runtime_dir_size, runtime_dir_inodes);
}
static int user_remove_runtime_path(const char *runtime_path) {
@@ -139,9 +164,9 @@ static int user_remove_runtime_path(const char *runtime_path) {
/* Ignore cases where the directory isn't mounted, as that's quite possible, if we lacked the permissions to
* mount something */
r = umount2(runtime_path, MNT_DETACH);
if (r < 0 && !IN_SET(errno, EINVAL, ENOENT))
log_debug_errno(errno, "Failed to unmount user runtime directory %s, ignoring: %m", runtime_path);
r = RET_NERRNO(umount2(runtime_path, MNT_DETACH));
if (r < 0 && !IN_SET(r, -EINVAL, -ENOENT))
log_debug_errno(r, "Failed to unmount user runtime directory %s, ignoring: %m", runtime_path);
r = rm_rf(runtime_path, REMOVE_ROOT);
if (r < 0 && r != -ENOENT)
@@ -150,31 +175,6 @@ static int user_remove_runtime_path(const char *runtime_path) {
return 0;
}
static int do_mount(const char *user) {
char runtime_path[STRLEN("/run/user/") + DECIMAL_STR_MAX(uid_t)];
uint64_t runtime_dir_size, runtime_dir_inodes;
uid_t uid;
gid_t gid;
int r;
r = get_user_creds(&user, &uid, &gid, NULL, NULL, 0);
if (r < 0)
return log_error_errno(r,
r == -ESRCH ? "No such user \"%s\"" :
r == -ENOMSG ? "UID \"%s\" is invalid or has an invalid main group"
: "Failed to look up user \"%s\": %m",
user);
r = acquire_runtime_dir_properties(&runtime_dir_size, &runtime_dir_inodes);
if (r < 0)
return r;
xsprintf(runtime_path, "/run/user/" UID_FMT, uid);
log_debug("Will mount %s owned by "UID_FMT":"GID_FMT, runtime_path, uid, gid);
return user_mkdir_runtime_path(runtime_path, uid, gid, runtime_dir_size, runtime_dir_inodes);
}
static int do_umount(const char *user) {
char runtime_path[STRLEN("/run/user/") + DECIMAL_STR_MAX(uid_t)];
uid_t uid;
@@ -198,6 +198,126 @@ static int do_umount(const char *user) {
return user_remove_runtime_path(runtime_path);
}
static int apply_tmpfs_quota(
char **paths,
uid_t uid,
uint64_t limit,
uint32_t scale) {
_cleanup_set_free_ Set *processed = NULL;
int r;
assert(uid_is_valid(uid));
STRV_FOREACH(p, paths) {
_cleanup_close_ int fd = open(*p, O_DIRECTORY|O_CLOEXEC);
if (fd < 0) {
log_warning_errno(errno, "Failed to open '%s' in order to set quota, ignoring: %m", *p);
continue;
}
struct stat st;
if (fstat(fd, &st) < 0) {
log_warning_errno(errno, "Failed to stat '%s' in order to set quota, ignoring: %m", *p);
continue;
}
/* Cover for bind mounted or symlinked /var/tmp/ + /tmp/ */
if (set_contains(processed, DEVNUM_TO_PTR(st.st_dev))) {
log_debug("Not setting quota on '%s', since already processed.", *p);
continue;
}
/* Remember we already dealt with this fs, even if the subsequent operation fails, since
* there's no point in applying quota twice, regardless of whether it succeeds or not. */
if (set_ensure_put(&processed, /* hash_ops= */ NULL, DEVNUM_TO_PTR(st.st_dev)) < 0)
return log_oom();
struct statfs sfs;
if (fstatfs(fd, &sfs) < 0) {
log_warning_errno(errno, "Failed to statfs '%s' in order to set quota, ignoring: %m", *p);
continue;
}
if (!is_fs_type(&sfs, TMPFS_MAGIC)) {
log_debug("Not setting quota on '%s', since not tmpfs.", *p);
continue;
}
struct dqblk req;
r = RET_NERRNO(quotactl_fd(fd, QCMD_FIXED(Q_GETQUOTA, USRQUOTA), uid, &req));
if (r == -ESRCH)
zero(req);
else if (ERRNO_IS_NEG_NOT_SUPPORTED(r)) {
log_debug_errno(r, "No UID quota support on %s, not setting quota: %m", *p);
continue;
} else if (ERRNO_IS_NEG_PRIVILEGE(r)) {
log_debug_errno(r, "Lacking privileges to query UID quota on %s, not setting quota: %m", *p);
continue;
} else if (r < 0) {
log_warning_errno(r, "Failed to query disk quota on %s for UID " UID_FMT ", ignoring: %m", *p, uid);
continue;
}
uint64_t v =
(scale == 0) ? 0 :
(scale == UINT32_MAX) ? UINT64_MAX :
/* scale is normalized to UINT32_MAX as 100%, hence multiply by it and divide by UINT32_MAX */
(uint64_t) ((double) (sfs.f_blocks * sfs.f_frsize) * scale / UINT32_MAX);
v = MIN(v, limit);
v /= QIF_DQBLKSIZE;
if (FLAGS_SET(req.dqb_valid, QIF_BLIMITS) && v == req.dqb_bhardlimit) {
/* Shortcut things if everything is set up properly already */
log_debug("Configured quota on '%s' already matches the intended setting, not updating quota.", *p);
continue;
}
req.dqb_valid = QIF_BLIMITS;
req.dqb_bsoftlimit = req.dqb_bhardlimit = v;
r = RET_NERRNO(quotactl_fd(fd, QCMD_FIXED(Q_SETQUOTA, USRQUOTA), uid, &req));
if (r == -ESRCH) {
log_debug_errno(r, "Not setting UID quota on %s since UID quota is not supported: %m", *p);
continue;
} else if (ERRNO_IS_NEG_PRIVILEGE(r)) {
log_debug_errno(r, "Lacking privileges to set UID quota on %s, skipping: %m", *p);
continue;
} else if (r < 0) {
log_warning_errno(r, "Failed to set disk quota on %s for UID " UID_FMT ", ignoring: %m", *p, uid);
continue;
}
log_info("Successfully configured disk quota for UID " UID_FMT " on %s to %s", uid, *p, FORMAT_BYTES(v * QIF_DQBLKSIZE));
}
return 0;
}
static int do_tmpfs_quota(UserRecord *ur) {
int r;
assert(ur);
if (user_record_is_root(ur)) {
log_debug("Not applying tmpfs quota to root user.");
return 0;
}
if (!uid_is_valid(ur->uid))
return log_error_errno(SYNTHETIC_ERRNO(ENOMSG), "User '%s' lacks UID, refusing.", ur->user_name);
r = apply_tmpfs_quota(STRV_MAKE("/tmp", "/var/tmp"), ur->uid, ur->tmp_limit.limit, user_record_tmp_limit_scale(ur));
if (r < 0)
return r;
r = apply_tmpfs_quota(STRV_MAKE("/dev/shm"), ur->uid, ur->dev_shm_limit.limit, user_record_dev_shm_limit_scale(ur));
if (r < 0)
return r;
return 0;
}
static int run(int argc, char *argv[]) {
int r;
@@ -206,7 +326,10 @@ static int run(int argc, char *argv[]) {
if (argc != 3)
return log_error_errno(SYNTHETIC_ERRNO(EINVAL),
"This program takes two arguments.");
if (!STR_IN_SET(argv[1], "start", "stop"))
const char *verb = argv[1], *user = argv[2];
if (!STR_IN_SET(verb, "start", "stop"))
return log_error_errno(SYNTHETIC_ERRNO(EINVAL),
"First argument must be either \"start\" or \"stop\".");
@@ -216,10 +339,26 @@ static int run(int argc, char *argv[]) {
if (r < 0)
return r;
if (streq(argv[1], "start"))
return do_mount(argv[2]);
if (streq(argv[1], "stop"))
return do_umount(argv[2]);
if (streq(verb, "start")) {
_cleanup_(user_record_unrefp) UserRecord *ur = NULL;
r = userdb_by_name(user, USERDB_PARSE_NUMERIC|USERDB_SUPPRESS_SHADOW, &ur);
if (r == -ESRCH)
return log_error_errno(r, "User '%s' does not exist: %m", user);
if (r < 0)
return log_error_errno(r, "Failed to resolve user '%s': %m", user);
/* We do two things here: mount the per-user XDG_RUNTIME_DIR, and set up tmpfs quota on /tmp/
* and /dev/shm/. */
r = 0;
RET_GATHER(r, do_mount(ur));
RET_GATHER(r, do_tmpfs_quota(ur));
return r;
}
if (streq(verb, "stop"))
return do_umount(user);
assert_not_reached();
}
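
The quota logic in this file only acts on paths that are backed by tmpfs. A minimal standalone sketch of such a check, assuming Linux and the `TMPFS_MAGIC` constant from `<linux/magic.h>`; `path_is_tmpfs()` is a hypothetical helper, not the `is_fs_type()` call used in the diff:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/magic.h>
#include <sys/vfs.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if 'path' is a directory on tmpfs, 0 if it is on
 * some other file system, and a negative errno value if it cannot be inspected. */
static int path_is_tmpfs(const char *path) {
        struct statfs sfs;
        int fd, r;

        fd = open(path, O_DIRECTORY|O_CLOEXEC);
        if (fd < 0)
                return -errno;

        /* Compute the result before close(), which might clobber errno */
        r = fstatfs(fd, &sfs) < 0 ? -errno : (sfs.f_type == TMPFS_MAGIC);
        close(fd);
        return r;
}
```

Working on a file descriptor rather than on the path keeps the type check and any later per-fd operation on the same object, which is presumably why the diff opens the directory first as well.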


@@ -7,6 +7,7 @@
#include "hashmap.h"
#include "hexdecoct.h"
#include "path-util.h"
#include "percent-util.h"
#include "pretty-print.h"
#include "process-util.h"
#include "rlimit-util.h"
@@ -54,6 +55,26 @@ static void show_self_modifiable(
printf("%13s %s\n", i == value ? heading : "", *i);
}
static void show_tmpfs_limit(const char *tmpfs, const TmpfsLimit *limit, uint32_t scale) {
assert(tmpfs);
assert(limit);
if (!limit->is_set)
return;
printf(" %s Limit:", tmpfs);
if (limit->limit != UINT64_MAX)
printf(" %s", FORMAT_BYTES(limit->limit));
if (limit->limit == UINT64_MAX || limit->limit_scale != UINT32_MAX) {
if (limit->limit != UINT64_MAX)
printf(" or");
printf(" %i%%", UINT32_SCALE_TO_PERCENT(scale));
}
printf("\n");
}
void user_record_show(UserRecord *hr, bool show_full_group_info) {
_cleanup_strv_free_ char **langs = NULL;
const char *hd, *ip, *shell;
@@ -368,6 +389,9 @@ void user_record_show(UserRecord *hr, bool show_full_group_info) {
if (hr->io_weight != UINT64_MAX)
printf(" IO Weight: %" PRIu64 "\n", hr->io_weight);
show_tmpfs_limit("TMP", &hr->tmp_limit, user_record_tmp_limit_scale(hr));
show_tmpfs_limit("SHM", &hr->dev_shm_limit, user_record_dev_shm_limit_scale(hr));
if (hr->access_mode != MODE_INVALID)
printf(" Access Mode: 0%03o\n", user_record_access_mode(hr));


@@ -15,6 +15,7 @@
#include "locale-util.h"
#include "memory-util.h"
#include "path-util.h"
#include "percent-util.h"
#include "pkcs11-util.h"
#include "rlimit-util.h"
#include "sha256.h"
@@ -95,6 +96,8 @@ UserRecord* user_record_new(void) {
.drop_caches = -1,
.auto_resize_mode = _AUTO_RESIZE_MODE_INVALID,
.rebalance_weight = REBALANCE_WEIGHT_UNSET,
.tmp_limit = TMPFS_LIMIT_NULL,
.dev_shm_limit = TMPFS_LIMIT_NULL,
};
return h;
@@ -982,6 +985,40 @@ static int dispatch_rebalance_weight(const char *name, sd_json_variant *variant,
return 0;
}
static int dispatch_tmpfs_limit(const char *name, sd_json_variant *variant, sd_json_dispatch_flags_t flags, void *userdata) {
TmpfsLimit *limit = ASSERT_PTR(userdata);
int r;
if (sd_json_variant_is_null(variant)) {
*limit = TMPFS_LIMIT_NULL;
return 0;
}
r = sd_json_dispatch_uint64(name, variant, flags, &limit->limit);
if (r < 0)
return r;
limit->is_set = true;
return 0;
}
static int dispatch_tmpfs_limit_scale(const char *name, sd_json_variant *variant, sd_json_dispatch_flags_t flags, void *userdata) {
TmpfsLimit *limit = ASSERT_PTR(userdata);
int r;
if (sd_json_variant_is_null(variant)) {
*limit = TMPFS_LIMIT_NULL;
return 0;
}
r = sd_json_dispatch_uint32(name, variant, flags, &limit->limit_scale);
if (r < 0)
return r;
limit->is_set = true;
return 0;
}
static int dispatch_privileged(const char *name, sd_json_variant *variant, sd_json_dispatch_flags_t flags, void *userdata) {
static const sd_json_dispatch_field privileged_dispatch_table[] = {
@@ -1275,6 +1312,10 @@ static int dispatch_per_machine(const char *name, sd_json_variant *variant, sd_j
{ "selfModifiableFields", SD_JSON_VARIANT_ARRAY, sd_json_dispatch_strv, offsetof(UserRecord, self_modifiable_fields), SD_JSON_STRICT },
{ "selfModifiableBlobs", SD_JSON_VARIANT_ARRAY, sd_json_dispatch_strv, offsetof(UserRecord, self_modifiable_blobs), SD_JSON_STRICT },
{ "selfModifiablePrivileged", SD_JSON_VARIANT_ARRAY, sd_json_dispatch_strv, offsetof(UserRecord, self_modifiable_privileged), SD_JSON_STRICT },
{ "tmpLimit", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit, offsetof(UserRecord, tmp_limit), 0, },
{ "tmpLimitScale", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit_scale, offsetof(UserRecord, tmp_limit), 0, },
{ "devShmLimit", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit, offsetof(UserRecord, dev_shm_limit), 0, },
{ "devShmLimitScale", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit_scale, offsetof(UserRecord, dev_shm_limit), 0, },
{},
};
@@ -1625,6 +1666,10 @@ int user_record_load(UserRecord *h, sd_json_variant *v, UserRecordLoadFlags load
{ "selfModifiableFields", SD_JSON_VARIANT_ARRAY, sd_json_dispatch_strv, offsetof(UserRecord, self_modifiable_fields), SD_JSON_STRICT },
{ "selfModifiableBlobs", SD_JSON_VARIANT_ARRAY, sd_json_dispatch_strv, offsetof(UserRecord, self_modifiable_blobs), SD_JSON_STRICT },
{ "selfModifiablePrivileged", SD_JSON_VARIANT_ARRAY, sd_json_dispatch_strv, offsetof(UserRecord, self_modifiable_privileged), SD_JSON_STRICT },
{ "tmpLimit", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit, offsetof(UserRecord, tmp_limit), 0, },
{ "tmpLimitScale", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit_scale, offsetof(UserRecord, tmp_limit), 0, },
{ "devShmLimit", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit, offsetof(UserRecord, dev_shm_limit), 0, },
{ "devShmLimitScale", _SD_JSON_VARIANT_TYPE_INVALID, dispatch_tmpfs_limit_scale, offsetof(UserRecord, dev_shm_limit), 0, },
{ "secret", SD_JSON_VARIANT_OBJECT, dispatch_secret, 0, 0 },
{ "privileged", SD_JSON_VARIANT_OBJECT, dispatch_privileged, 0, 0 },
@@ -2138,6 +2183,32 @@ int user_record_languages(UserRecord *h, char ***ret) {
return 0;
}
uint32_t user_record_tmp_limit_scale(UserRecord *h) {
assert(h);
if (h->tmp_limit.is_set)
return h->tmp_limit.limit_scale;
/* By default grant regular users only 80% quota */
if (user_record_disposition(h) == USER_REGULAR)
return UINT32_SCALE_FROM_PERCENT(80);
return UINT32_MAX;
}
uint32_t user_record_dev_shm_limit_scale(UserRecord *h) {
assert(h);
if (h->dev_shm_limit.is_set)
return h->dev_shm_limit.limit_scale;
/* By default grant regular users only 80% quota */
if (user_record_disposition(h) == USER_REGULAR)
return UINT32_SCALE_FROM_PERCENT(80);
return UINT32_MAX;
}
const char** user_record_self_modifiable_fields(UserRecord *h) {
/* As a rule of thumb: a setting is safe if it cannot be used by a
* user to give themselves some unfair advantage over other users on


@@ -230,6 +230,19 @@ typedef enum AutoResizeMode {
#define REBALANCE_WEIGHT_MAX UINT64_C(10000)
#define REBALANCE_WEIGHT_UNSET UINT64_MAX
typedef struct TmpfsLimit {
/* Absolute and relative tmpfs limits */
uint64_t limit;
uint32_t limit_scale;
bool is_set;
} TmpfsLimit;
#define TMPFS_LIMIT_NULL \
(TmpfsLimit) { \
.limit = UINT64_MAX, \
.limit_scale = UINT32_MAX, \
}
typedef struct UserRecord {
/* The following three fields are not part of the JSON record */
unsigned n_ref;
@@ -389,6 +402,8 @@ typedef struct UserRecord {
char **self_modifiable_blobs;
char **self_modifiable_privileged;
TmpfsLimit tmp_limit, dev_shm_limit;
sd_json_variant *json;
} UserRecord;
@@ -436,6 +451,8 @@ uint64_t user_record_rebalance_weight(UserRecord *h);
uint64_t user_record_capability_bounding_set(UserRecord *h);
uint64_t user_record_capability_ambient_set(UserRecord *h);
int user_record_languages(UserRecord *h, char ***ret);
uint32_t user_record_tmp_limit_scale(UserRecord *h);
uint32_t user_record_dev_shm_limit_scale(UserRecord *h);
const char **user_record_self_modifiable_fields(UserRecord *h);
const char **user_record_self_modifiable_blobs(UserRecord *h);
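
How the new structure and the two accessor functions fit together can be illustrated with a reduced model. The struct and the null initializer follow the header above. `SketchDisposition` and `sketch_limit_scale()` are hypothetical stand-ins for the user record disposition lookup and the accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct TmpfsLimit {
        uint64_t limit;        /* absolute limit in bytes, UINT64_MAX if unset */
        uint32_t limit_scale;  /* relative limit, UINT32_MAX is 100% */
        bool is_set;
} TmpfsLimit;

#define TMPFS_LIMIT_NULL (TmpfsLimit) { .limit = UINT64_MAX, .limit_scale = UINT32_MAX }

/* Hypothetical stand-in for the disposition of a user record. */
typedef enum { SKETCH_USER_SYSTEM, SKETCH_USER_REGULAR } SketchDisposition;

/* Hypothetical accessor: pick the scale to enforce for a given limit and user type. */
static uint32_t sketch_limit_scale(const TmpfsLimit *l, SketchDisposition d) {
        if (l->is_set)
                return l->limit_scale; /* explicit configuration wins */
        if (d == SKETCH_USER_REGULAR)
                return (uint32_t) ((uint64_t) UINT32_MAX * 80 / 100); /* default: 80% */
        return UINT32_MAX; /* all other users: no relative restriction */
}
```

One consequence visible in this model: once only an absolute limit is configured, `is_set` is true and the scale stays at 100%, so the 80% default no longer applies to that user.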


@@ -121,4 +121,21 @@ TEST(devnum_format_str) {
test_devnum_format_str_one(makedev(4095, 1048575), "4095:1048575");
}
TEST(devnum_to_ptr) {
dev_t m = makedev(0, 0);
ASSERT_EQ(major(m), 0U);
ASSERT_EQ(minor(m), 0U);
ASSERT_EQ(m, PTR_TO_DEVNUM(DEVNUM_TO_PTR(m)));
m = makedev(DEVNUM_MAJOR_MAX, DEVNUM_MINOR_MAX);
ASSERT_EQ(major(m), DEVNUM_MAJOR_MAX);
ASSERT_EQ(minor(m), DEVNUM_MINOR_MAX);
ASSERT_EQ(m, PTR_TO_DEVNUM(DEVNUM_TO_PTR(m)));
m = makedev(5, 8);
ASSERT_EQ(major(m), 5U);
ASSERT_EQ(minor(m), 8U);
ASSERT_EQ(m, PTR_TO_DEVNUM(DEVNUM_TO_PTR(m)));
}
DEFINE_TEST_MAIN(LOG_INFO);


@@ -652,6 +652,22 @@ getent passwd aliastest@myrealm
getent passwd aliastest2@myrealm
getent passwd aliastest3@myrealm
if findmnt -n -o options /tmp | grep -q usrquota ; then
NEWPASSWORD=quux homectl create tmpfsquota --storage=subvolume --dev-shm-limit=50K -P
run0 --property=SetCredential=pam.authtok.systemd-run0:quux -u tmpfsquota dd if=/dev/urandom of=/dev/shm/quotatestfile1 bs=1024 count=30
(! run0 --property=SetCredential=pam.authtok.systemd-run0:quux -u tmpfsquota dd if=/dev/urandom of=/dev/shm/quotatestfile2 bs=1024 count=30)
run0 --property=SetCredential=pam.authtok.systemd-run0:quux -u tmpfsquota rm /dev/shm/quotatestfile1 /dev/shm/quotatestfile2
run0 --property=SetCredential=pam.authtok.systemd-run0:quux -u tmpfsquota dd if=/dev/urandom of=/dev/shm/quotatestfile1 bs=1024 count=30
run0 --property=SetCredential=pam.authtok.systemd-run0:quux -u tmpfsquota rm /dev/shm/quotatestfile1
systemctl stop user@"$(id -u tmpfsquota)".service
wait_for_state tmpfsquota inactive
homectl remove tmpfsquota
fi
systemd-analyze log-level info
touch /testok