Rework cmdline printing to use unicode

The functions to retrieve and print process cmdlines were based on the assumption that they contain printable ASCII, and everything else should be filtered out. That assumption doesn't hold in today's world, where people are free to use Unicode everywhere.

This replaces the custom cmdline reading code with a more generic approach using utf8_escape_non_printable_full().

For kernel threads, truncation is done on the parenthesized name, so we'll get "[worker]", "[worker…]", …, "[w…]", "[…", "…" as we reduce the number of available columns.

This implementation is most likely slower for very long cmdlines, but I don't think this is very important. The common case is to have short cmdlines, and we should print those properly. Absurdly long cmdlines are the exception; they need to be handled correctly and safely, but speed is not too important.

Fixes #12532.

v2:
- use size_t for the number of columns. This change propagates into various other functions that call get_process_cmdline(), increasing the size of the patch, but the changes are rather trivial.
2026-04-14 16:37:19 +09:00 · 2019-05-15 11:20:26 +02:00
parent da88f542d9
commit bc28751ed2
10 changed files with 115 additions and 173 deletions
--- a/src/basic/string-util.h
+++ b/src/basic/string-util.h
@@ -257,3 +257,16 @@ static inline void *memory_startswith_no_case(const void *p, size_t sz, const ch

        return (uint8_t*) p + n;
 }
+
+static inline char* str_realloc(char **p) {
+        /* Reallocate *p to actual size */
+
+        if (!*p)
+                return NULL;
+
+        char *t = realloc(*p, strlen(*p) + 1);
+        if (!t)
+                return NULL;
+
+        return (*p = t);
+}
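
The str_realloc() helper added above shrinks a heap string to its actual length. The following is a minimal sketch of how a caller might use it, not code from this patch: make_label() and the 4096-byte scratch size are hypothetical, and the helper is copied verbatim so the example compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy of the helper from the hunk above, repeated so this sketch is self-contained. */
static inline char *str_realloc(char **p) {
        /* Reallocate *p to actual size */

        if (!*p)
                return NULL;

        char *t = realloc(*p, strlen(*p) + 1);
        if (!t)
                return NULL;

        return (*p = t);
}

/* Hypothetical caller (not in the patch): format a name into an oversized
 * scratch buffer, then give the unused tail back to the allocator. */
static char *make_label(const char *name) {
        const size_t scratch = 4096; /* assumed scratch size, for illustration only */

        char *buf = malloc(scratch);
        if (!buf)
                return NULL;

        snprintf(buf, scratch, "[%s]", name);

        /* If realloc() fails, str_realloc() returns NULL but leaves buf
         * untouched, so the larger buffer is still valid and can be returned. */
        (void) str_realloc(&buf);

        return buf;
}
```

Because the helper only assigns *p on success, ignoring its return value is safe when the caller merely wants a best-effort shrink; the caller still owns the string and frees it with free().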