Ruby 3.3.7p123 (2025-01-15 revision be31f993d7fa0219d85f7b3c694d454da4ecc10b)
A custom strpbrk implementation.
Functions
const uint8_t * pm_strpbrk (const pm_parser_t *parser, const uint8_t *source, const uint8_t *charset, ptrdiff_t length)
    Here we have rolled our own version of strpbrk.
Definition in file pm_strpbrk.h.
const uint8_t * pm_strpbrk (const pm_parser_t *parser, const uint8_t *source, const uint8_t *charset, ptrdiff_t length)
Here we have rolled our own version of strpbrk.
The standard library strpbrk has undefined behavior when the source string is not null-terminated. We want to support such strings because pm_parse does not require its input to be null-terminated. (This is desirable because it means the extension can call pm_parse with the result of a call to mmap.)
The standard library strpbrk also does not support passing a maximum length to search. We want to support this for the reason mentioned above, but we also don't want it to stop on null bytes. Ruby actually allows null bytes within strings, comments, regular expressions, etc. So we need to be able to skip past them.
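As a rough illustration of the two requirements above (a length bound and no stopping at null bytes), the following is a minimal sketch. It is not the code in pm_strpbrk.c: the name bounded_strpbrk is hypothetical, the parser argument is omitted, and the charset is assumed to be a null-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only, not Prism's implementation.
 * Returns a pointer to the first byte in source[0..length) that appears in
 * the null-terminated charset, or NULL if there is none. Null bytes in the
 * source are treated as ordinary data and skipped. */
static const uint8_t *
bounded_strpbrk(const uint8_t *source, const char *charset, ptrdiff_t length) {
    assert(charset != NULL);

    for (ptrdiff_t index = 0; index < length; index++) {
        uint8_t byte = source[index];

        /* strchr would match the terminator of charset when given a null
         * byte, so null bytes are excluded explicitly. */
        if (byte != '\0' && strchr(charset, byte) != NULL) {
            return source + index;
        }
    }

    /* Reached the length bound without finding a match. */
    return NULL;
}
```

Because the loop is driven by the length bound rather than by a terminator, the source buffer can be an unterminated region such as a memory-mapped file.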
Finally, we want to support encodings wherein the charset could contain characters that are trailing bytes of multi-byte characters. For example, in Shift-JIS, the backslash character can be a trailing byte. In that case we need to take a slower path and iterate one multi-byte character at a time.
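The slower path can be pictured with the sketch below, which steps through the source one character at a time so that a trailing byte is never mistaken for a charset member. Everything here is an assumption made for illustration: multibyte_strpbrk and sjis_char_width are hypothetical names, the width check is a simplified Shift-JIS rule that does not validate trailing bytes, and the real function obtains its encoding information from the parser instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified Shift-JIS width check (illustrative assumption): bytes in
 * 0x81-0x9F and 0xE0-0xFC are treated as the lead byte of a two-byte
 * character; everything else is a single byte. Trailing bytes are not
 * validated. */
static size_t
sjis_char_width(const uint8_t *bytes, ptrdiff_t remaining) {
    uint8_t lead = bytes[0];
    int is_lead = (lead >= 0x81 && lead <= 0x9F) || (lead >= 0xE0 && lead <= 0xFC);
    return (is_lead && remaining >= 2) ? 2 : 1;
}

/* Hypothetical slow path: only single-byte characters are compared against
 * the charset, so a 0x5C that is the second byte of a two-byte character is
 * skipped along with its lead byte. */
static const uint8_t *
multibyte_strpbrk(const uint8_t *source, const char *charset, ptrdiff_t length) {
    assert(charset != NULL);

    ptrdiff_t index = 0;
    while (index < length) {
        size_t width = sjis_char_width(source + index, length - index);

        if (width == 1) {
            uint8_t byte = source[index];
            if (byte != '\0' && strchr(charset, byte) != NULL) {
                return source + index;
            }
        }

        index += (ptrdiff_t) width;
    }

    return NULL;
}
```

For example, the Shift-JIS bytes 0x83 0x5C encode a single katakana character; a byte-wise search for a backslash would stop on the second byte, whereas this loop advances past both bytes together.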
Parameters
parser | The parser.
source | The source to search.
charset | The charset to search for.
length | The maximum number of bytes to search.
Definition at line 64 of file pm_strpbrk.c.