Skip to content

Implement length modifiers for printf %n conv spec #1278

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

GabrielRavier
Copy link
Collaborator

@GabrielRavier GabrielRavier commented Sep 4, 2024

The C Standard specifies that, when a conversion specification specifies
a conversion specifier of n, the type of the passed pointer is specified
by the length modifier (if any), i.e. that e.g. the argument for %hhn is
of type signed char *, but Cosmopolitan currently does not handle this -
instead always simply assuming that the pointer always points to an int.

This patch implements, and tests, length modifiers with the n conversion
specifier, with the tests testing all of the available length modifiers.

@github-actions github-actions bot added the libc label Sep 4, 2024
@GabrielRavier GabrielRavier force-pushed the feat/printf-add-length-modifiers-n-conversion-specifier branch from b26edcc to a54634b Compare September 4, 2024 01:19
libc/stdio/fmt.c Outdated
@@ -1125,7 +1125,34 @@ int __fmt(void *fn, void *arg, const char *format, va_list va, int *wrote) {
}
break;
case 'n':
*va_arg(va, int *) = *wrote;
// this technically invokes undefined behavior by using the wrong types
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this UB? Cosmopolitan only supports an LP64 data model. Until that changes, I would assume this is kosher. Just remove this whole comment if you agree.

Where you're more likely to run into undefined behavior is if the user passes a pointer that isn't aligned. Does the standard say anything about this? My guess is we should probably do nothing, and just let UBSAN report it to the user.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual sizes of the data types involved don't really matter w.r.t. the C Standard's definition of illegal aliasing. Even if sizeof(long) == sizeof(long long), aliasing the two (i.e. using a pointer of one type to access an object of the other) still invokes undefined behavior, so that the data model is LP64 doesn't matter from that POV, and compilers will take advantage of this from what I know (though here it appears unrealistic, at least for now, that they would be able to statically analyse the behavior of printf functions all the way through to this part of __fmt to such an extent).

The standard makes no provision for storing to an unaligned pointer, and as such I expect passing an unaligned pointer to %n is UB, and no libc I know of makes any provisions for those, so I agree UBSAN is sufficient for this.

I suppose the concern is small enough that the comment is not particularly useful though, so I'll remove it.

PS: well, I would guess is there's a 50% chance or so that by 2030 some compiler will throw out random printf calls because of this and this conversation will look very stupid but for now it's fine, I suppose.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's UB because different platforms define them differently. We don't support any environments where sizeof(long) != sizeof(long long) yet so it isn't an issue for us. We also only support three compilers too. We know they won't do anything funny. If it can be proven that gcc, clang, or chibicc can do what you're describing, then I'll care.

You'd be proud of the kprintf() implementation though. It actually does have a separate case for long long since I was smarter and more experienced when I wrote that. But it's not worth refactoring printf() to do something like that unless there's tangible issues.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aliasing long and long long is UB regardless of the platform, regardless of whether sizeof(long) == sizeof(long long), sizeof(long) < sizeof(long long) or sizeof(long) > sizeof(long long), the way the standard sees it, the size of the type has absolutely nothing to do with whether or not it aliases another type and long and long long do not alias, regardless of their size.

This can be easily shown for both GCC and Clang: https://godbolt.org/z/e1naq8zqY (if long and long long could alias, one would expect that program to either fail the assertion or print 3 - you can see it prints 2 on both GCC and Clang, and if you add -fno-strict-aliasing to their options, you can see it prints 3 when strict aliasing is deactivated)

Of course, as mentioned earlier in practice this isn't an issue (at least for now), this is just to clarify the way the standard specifies these things.

@GabrielRavier GabrielRavier force-pushed the feat/printf-add-length-modifiers-n-conversion-specifier branch 2 times, most recently from f009d23 to 9329b26 Compare September 6, 2024 17:33
The C Standard specifies that, when a conversion specification specifies
a conversion specifier of n, the type of the passed pointer is specified
by the length modifier (if any), i.e. that e.g. the argument for %hhn is
of type signed char *, but Cosmopolitan currently does not handle this -
instead always simply assuming that the pointer always points to an int.

This patch implements, and tests, length modifiers with the n conversion
specifier, with the tests testing all of the available length modifiers.
@GabrielRavier GabrielRavier force-pushed the feat/printf-add-length-modifiers-n-conversion-specifier branch from 9329b26 to a46ad53 Compare September 7, 2024 02:23
@jart jart merged commit c66abd7 into jart:master Sep 7, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants