class: Regex

extends: object

A `GRegex` is a compiled form of a regular expression. After instantiating a `GRegex`, you can use its methods to find matches in a string, replace matches within a string, or split the string at matches.

`GRegex` implements regular expression pattern matching using syntax and semantics (such as character classes, quantifiers, and capture groups) similar to Perl regular expressions. See the PCRE documentation for details.

A typical scenario for regex pattern matching is to check if a string matches a pattern. The following statements implement this scenario.

``` { .c }
const char *regex_pattern = ".*GLib.*";
const char *string_to_search = "You will love the GLib implementation of regex";
g_autoptr(GMatchInfo) match_info = NULL;
g_autoptr(GRegex) regex = NULL;

/* Compile the pattern with default compile and match flags. */
regex = g_regex_new (regex_pattern, G_REGEX_DEFAULT, G_REGEX_MATCH_DEFAULT, NULL);
g_assert (regex != NULL);

if (g_regex_match (regex, string_to_search, G_REGEX_MATCH_DEFAULT, &match_info))
  {
    int start_pos, end_pos;

    /* Group 0 is the overall match; positions are byte offsets. */
    g_match_info_fetch_pos (match_info, 0, &start_pos, &end_pos);
    g_print ("Match successful! Overall pattern matches bytes %d to %d\n", start_pos, end_pos);
  }
else
  {
    g_print ("No match!\n");
  }
```

The constructor for `GRegex` includes two sets of bitmapped flags:

* [flags@GLib.RegexCompileFlags]: These flags control how GLib compiles the regex. There are options for case sensitivity, multiline, ignoring whitespace, etc.
* [flags@GLib.RegexMatchFlags]: These flags control `GRegex`'s matching behavior, such as anchoring and customizing definitions for newline characters.

Some regex patterns include backslash assertions, such as `\d` (digit) or `\D` (non-digit). In a C string literal, those backslashes must themselves be escaped. For example, the pattern `"\\d\\D"` matches a digit followed by a non-digit.

GLib's implementation of pattern matching includes a `start_position` argument for some of the match, replace, and split methods.
Specifying a start position provides flexibility when you want to ignore the first *n* characters of a string, but want to incorporate backslash assertions at character *n* - 1. For example, a database field contains inconsistent spelling for a job title: `healthcare provider` and `health-care provider`. The database manager wants to make the spelling consistent by adding a hyphen when it is missing. The following regex pattern tests for the string `care` at a position that is not a word boundary (that is, preceded by a word character instead of a hyphen) and followed by a space.

``` { .c }
const char *regex_pattern = "\\Bcare\\s";
```

To match with this pattern, start examining at `start_position` 6 in the string `healthcare` or `health-care`.

``` { .c }
const char *regex_pattern = "\\Bcare\\s";
const char *string_to_search = "healthcare provider";
g_autoptr(GMatchInfo) match_info = NULL;
g_autoptr(GRegex) regex = NULL;

regex = g_regex_new (
  regex_pattern,
  G_REGEX_DEFAULT,
  G_REGEX_MATCH_DEFAULT,
  NULL);
g_assert (regex != NULL);

g_regex_match_full (
  regex,
  string_to_search,
  -1, // string length; -1 means the string is NUL-terminated.
  6, // position of 'c' in the test string.
  G_REGEX_MATCH_DEFAULT,
  &match_info,
  NULL);
```

The method [method@GLib.Regex.match_full], like the other methods that take a `start_position`, allows lookback before the start position to determine whether the previous character satisfies an assertion.

Unless you set [flags@GLib.RegexCompileFlags.RAW] as one of the `GRegexCompileFlags`, all the strings passed to `GRegex` methods must be encoded in UTF-8. The lengths and the positions inside the strings are in bytes and not in characters, so, for instance, `\xc3\xa0` (i.e., `à`) is two bytes long but it is treated as a single character. If you set `G_REGEX_RAW`, the strings can be non-valid UTF-8 strings and a byte is treated as a character, so `\xc3\xa0` is two bytes and two characters long.

Regarding line endings, `\n` matches a `\n` character, and `\r` matches a `\r` character. More generally, `\R` matches all typical line endings: CR + LF (`\r\n`), LF (linefeed, U+000A, `\n`), VT (vertical tab, U+000B, `\v`), FF (formfeed, U+000C, `\f`), CR (carriage return, U+000D, `\r`), NEL (next line, U+0085), LS (line separator, U+2028), and PS (paragraph separator, U+2029).

The behaviour of the dot, circumflex, and dollar metacharacters is affected by newline characters. By default, `GRegex` matches any newline character matched by `\R`. You can limit the matched newline characters by specifying the [flags@GLib.RegexCompileFlags.NEWLINE_CR], [flags@GLib.RegexCompileFlags.NEWLINE_LF], and [flags@GLib.RegexCompileFlags.NEWLINE_CRLF] compile options, and with the [flags@GLib.RegexMatchFlags.NEWLINE_ANY], [flags@GLib.RegexMatchFlags.NEWLINE_CR], [flags@GLib.RegexMatchFlags.NEWLINE_LF], and [flags@GLib.RegexMatchFlags.NEWLINE_CRLF] match options.

These settings are also relevant when compiling a pattern if [flags@GLib.RegexCompileFlags.EXTENDED] is set and an unescaped `#` outside a character class is encountered. This indicates a comment that lasts until after the next newline.

Because `GRegex` does not modify its internal state between creation and destruction, you can use the same `GRegex` instance from different threads. In contrast, [struct@GLib.MatchInfo] is not thread safe.

The regular expression low-level functionalities are obtained through the excellent [PCRE](http://www.pcre.org/) library written by Philip Hazel.
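
Because lengths and positions are counted in bytes, a caller that thinks in characters has to convert a character index into a byte offset before passing it as `start_position`. The following is a minimal sketch of that conversion in plain C, assuming valid, NUL-terminated UTF-8 input. The helper name `utf8_char_index_to_byte_offset` is hypothetical and not part of GLib; GLib's own `g_utf8_offset_to_pointer()` serves a comparable purpose.

``` { .c }
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GLib): returns the byte offset of the
 * character at index `char_index` in a valid, NUL-terminated UTF-8 string.
 * If the string has fewer characters than `char_index`, the byte length of
 * the string is returned instead. */
size_t
utf8_char_index_to_byte_offset (const char *utf8, size_t char_index)
{
  size_t offset = 0;

  assert (utf8 != NULL);

  while (utf8[offset] != '\0' && char_index > 0)
    {
      /* Step over the lead byte of the current character... */
      offset++;

      /* ...and over its continuation bytes, which have the form 10xxxxxx. */
      while (((unsigned char) utf8[offset] & 0xC0) == 0x80)
        offset++;

      char_index--;
    }

  return offset;
}
```

For the string `àbc`, character index 1 (the `b`) corresponds to byte offset 2, because `à` occupies two bytes. For an ASCII-only string such as `healthcare provider`, character index and byte offset coincide, which is why 6 can be used directly in the example above.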

#### Members
- **handleObj**
- **lib**
- **retainedCallbacks**
- **signalHandlerNames**
- **signalSetterHandlers**

#### Methods

- **Regex** (`Handle = null`)

	> Creates a new `Regex` by wrapping a native handle or another wrapper.

	- **@p** `Handle` is the native handle or another wrapper whose handle to adopt.


- **toNativeHandle** (`Source`)

	> Normalizes a constructor argument into a raw pointer carrier. Accepts a raw NativeHandle, a raw NativeBuffer returned from `fn.call(...)`, another generated wrapper exposing `handle()`, or null. Returns null when the argument carries no pointer.

	- **@p** `Source` is the raw handle, raw buffer, wrapper, or null.
	- **@r** A raw pointer carrier, or null when no pointer is present.


- **getLib** ()

	> Returns the opened native library for this generated wrapper.

	- **@r** The opened native library.


- **handle** ()

	> Returns the wrapped NativeHandle.

	- **@r** The wrapped NativeHandle.


- **isNull** ()

	> Returns true when the wrapped handle is null.

	- **@r** A bool.


- **describe** ()

	> Returns a small string for debugging generated wrappers.

	- **@r** A string.