Rework cmdline printing to use unicode

The functions to retrieve and print process cmdlines were based on the
assumption that they contain printable ASCII, and everything else
should be filtered out. That assumption doesn't hold in today's world,
where people are free to use Unicode everywhere.

This replaces the custom cmdline reading code with a more generic approach
using utf8_escape_non_printable_full().
For kernel threads, truncation is done on the parenthesized name, so we'll
get "[worker]", "[worker…]", …, "[w…]", "[…", "…" as we reduce the number of
available columns.
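
The truncation sequence described above can be sketched roughly as follows.
This is only an illustration of the listed outputs, not the code in this
commit: bracket_and_truncate() is a hypothetical helper, and it assumes the
thread name is plain ASCII, so one byte occupies one column.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* U+2026 HORIZONTAL ELLIPSIS: 3 bytes in UTF-8, but only 1 column wide */
#define ELLIPSIS "\xe2\x80\xa6"

/* Hypothetical helper, not part of the patch. Assumes 'name' is ASCII.
 * Returns a malloc()ed string that fits in 'columns', or NULL on OOM. */
static char *bracket_and_truncate(const char *name, size_t columns) {
        size_t n, keep, e = strlen(ELLIPSIS);
        char *out, *q;

        assert(name);

        n = strlen(name);

        /* Worst case: '[' + name + ellipsis + ']' + NUL */
        out = malloc(n + e + 3);
        if (!out)
                return NULL;
        q = out;

        if (columns >= n + 2) {
                /* Everything fits: "[name]" */
                *q++ = '[';
                memcpy(q, name, n);
                q += n;
                *q++ = ']';
        } else if (columns >= 3) {
                /* Keep both brackets, shorten the name: "[na…]" */
                keep = columns - 3;
                *q++ = '[';
                memcpy(q, name, keep);
                q += keep;
                memcpy(q, ELLIPSIS, e);
                q += e;
                *q++ = ']';
        } else if (columns == 2) {
                /* No room for the closing bracket: "[…" */
                *q++ = '[';
                memcpy(q, ELLIPSIS, e);
                q += e;
        } else if (columns == 1) {
                /* Only the ellipsis fits */
                memcpy(q, ELLIPSIS, e);
                q += e;
        }
        /* columns == 0 yields the empty string */

        *q = '\0';
        return out;
}
```

The ellipsis is counted as one column even though it takes three bytes,
which is why the buffer is sized in bytes while the branches compare columns.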

This implementation is most likely slower for very long cmdlines, but I don't
think this is very important. The common case is to have short cmdlines, and
we should print those properly. Absurdly long cmdlines are the exception: they
need to be handled correctly and safely, but speed is not too important.

Fixes #12532.

v2:
- use size_t for the number of columns. This change propagates into various
  other functions that call get_process_cmdline(), increasing the size of the
  patch, but the changes are rather trivial.
Zbigniew Jędrzejewski-Szmek
2019-05-15 11:20:26 +02:00
parent da88f542d9
commit bc28751ed2
10 changed files with 115 additions and 173 deletions

@@ -257,3 +257,16 @@ static inline void *memory_startswith_no_case(const void *p, size_t sz, const ch
        return (uint8_t*) p + n;
}

static inline char* str_realloc(char **p) {
        /* Reallocate *p to actual size */

        if (!*p)
                return NULL;

        char *t = realloc(*p, strlen(*p) + 1);
        if (!t)
                return NULL;

        return (*p = t);
}
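
A minimal sketch of how a caller might use str_realloc(): fill a generously
sized buffer, then shrink it to the actual string length. make_label() and
the 4096-byte scratch size are hypothetical and only serve the illustration.
The helper is repeated so that the example compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy of the helper from the hunk above */
static inline char* str_realloc(char **p) {
        /* Reallocate *p to actual size */

        if (!*p)
                return NULL;

        char *t = realloc(*p, strlen(*p) + 1);
        if (!t)
                return NULL;

        return (*p = t);
}

/* Hypothetical caller: builds "[name]" in an oversized buffer, then trims
 * the allocation. Returns a malloc()ed string or NULL on OOM. */
static char *make_label(const char *name) {
        const size_t scratch = 4096; /* arbitrary size chosen for this sketch */
        char *buf;

        assert(name);

        buf = malloc(scratch);
        if (!buf)
                return NULL;

        /* snprintf() never writes more than 'scratch' bytes and always
         * NUL-terminates, so overly long names are cut off safely */
        snprintf(buf, scratch, "[%s]", name);

        /* If shrinking fails, str_realloc() returns NULL but leaves buf
         * untouched and valid, so the result can simply be ignored here */
        (void) str_realloc(&buf);

        return buf;
}
```

Because the helper only updates *p on success, the caller keeps ownership of
the original buffer whether or not the shrink succeeds.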