xdev.patterns module
An encapsulation of regex and glob (and maybe other) patterns.
Note
This implementation is maintained in kwutil and xdev. These versions should be kept in sync.
- See:
~/code/kwutil/kwutil/util_pattern.py ~/code/xdev/xdev/patterns.py
Todo
rectify with xdev / whatever package this goes in
- xdev.patterns._maybe_expandable_glob(pat)
Determine whether a string might be an expandable glob pattern by looking for the special glob characters *, ?, and [].
Note
! is also special, but it only appears inside a [] bracket, so we don't need to check for it.
- Returns:
If False, the input is definitely not an expandable glob pattern (it could still be a glob pattern, but one that is equivalent to strict matching). If True, there are special glob characters in the string, but it is not guaranteed to be a valid glob pattern.
- Return type:
bool
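The check described above can be sketched in a few lines. This is a minimal illustration, not the xdev source: the function name `maybe_expandable_glob_sketch` is hypothetical, and testing only for the opening `[` (rather than a matched `[]` pair) is an assumption.

```python
def maybe_expandable_glob_sketch(pat):
    """Hypothetical stand-in for the private helper documented above."""
    # '!' is not tested: it is only special inside a [...] bracket,
    # and the '[' test already covers that case.
    return any(char in pat for char in '*?[')


print(maybe_expandable_glob_sketch('foo*'))     # contains '*'
print(maybe_expandable_glob_sketch('foo.txt'))  # no special characters
```

A True result only says the string contains a special character; it does not validate the pattern.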
- class xdev.patterns.Pattern(pattern, backend)
Bases: PatternBase, NiceRepr
Provides a common API to several common pattern matching syntaxes.
A general patterns class, which can use a backend from BACKENDS
- Parameters:
pattern (str | object) – The pattern text or a precompiled backend pattern object
backend (str) – Code indicating which backend the pattern text should be interpreted with. See BACKENDS for available choices.
Notes
# BACKENDS
The glob backend uses the fnmatch module [fnmatch_docs]. The regex backend uses the Python re module. The strict backend uses the “==” string equality testing. The parse backend uses the parse module.
References
Example
>>> # Test Regex backend
>>> repat = Pattern.coerce('foo.*', 'regex')
>>> assert repat.match('foobar')
>>> assert not repat.match('barfoo')
>>> match = repat.search('baz-biz-foobar')
>>> match = repat.match('baz-biz-foobar')
>>> # Test Glob backend
>>> globpat = Pattern.coerce('foo*', 'glob')
>>> assert globpat.match('foobar')
>>> assert not globpat.match('barfoo')
>>> globpat = Pattern.coerce('[foo|bar]', 'glob')
>>> globpat.match('foo')
Example
>>> # xdoctest: +REQUIRES(module:parse)
>>> # Test parse backend
>>> import ubelt as ub
>>> pattern1 = Pattern.coerce('A {adjective} pattern', 'parse')
>>> result1 = pattern1.match('A cool pattern')
>>> print(f'result1.named = {ub.urepr(result1.named, nl=1)}')
>>> pattern2 = pattern1.to_regex()
>>> result2 = pattern2.match('A cool pattern')
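To make the backend notes above concrete, here is a rough sketch of how the glob, regex, and strict backends map onto standard-library calls. The function `match_with_backend` is hypothetical and is not how Pattern is implemented; using `fnmatch.fnmatchcase` for the glob case is an assumption, and the parse backend is left out because it needs the third-party parse package.

```python
import fnmatch
import re


def match_with_backend(pattern, text, backend):
    """Hypothetical dispatcher illustrating three of the documented backends."""
    if backend == 'strict':
        # Plain string equality.
        return pattern == text
    elif backend == 'glob':
        # Shell-style wildcards via the fnmatch module (case-sensitive variant).
        return fnmatch.fnmatchcase(text, pattern)
    elif backend == 'regex':
        # re.match anchors at the start of the text only.
        return re.match(pattern, text) is not None
    else:
        raise KeyError(backend)


print(match_with_backend('foo*', 'foobar', 'glob'))
print(match_with_backend('foo.*', 'foobar', 'regex'))
print(match_with_backend('foo*', 'foobar', 'strict'))
```

The same text `foo*` is a wildcard under glob, a quantifier under regex, and a literal under strict, which is why the backend code has to travel with the pattern text.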
- to_regex()
Returns an equivalent pattern with the regular expression backend
- Returns:
Pattern
Example
>>> globpat = Pattern.coerce('foo*', 'glob')
>>> strictpat = Pattern.coerce('foo*', 'strict')
>>> repat1 = strictpat.to_regex()
>>> repat2 = globpat.to_regex()
>>> print(f'repat1={repat1}')
>>> print(f'repat2={repat2}')
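One plausible way to build such an equivalent regular expression with only the standard library is `fnmatch.translate` for glob text and `re.escape` for strict text. Whether to_regex does exactly this is an assumption; `to_regex_sketch` is a hypothetical helper, not part of xdev.

```python
import fnmatch
import re


def to_regex_sketch(pattern, backend):
    """Hypothetical conversion of glob / strict pattern text to a compiled regex."""
    if backend == 'glob':
        # fnmatch.translate turns shell wildcards into regex syntax.
        return re.compile(fnmatch.translate(pattern))
    elif backend == 'strict':
        # Escape every special character so the text only matches itself.
        return re.compile(re.escape(pattern))
    elif backend == 'regex':
        return re.compile(pattern)
    raise ValueError(f'unsupported backend: {backend!r}')


glob_re = to_regex_sketch('foo*', 'glob')
strict_re = to_regex_sketch('foo*', 'strict')
print(glob_re.pattern)
print(strict_re.pattern)
```

In this sketch the strict conversion is only equivalent to `==` when used with `fullmatch`, because `match` alone would accept any text that merely starts with the literal.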
- classmethod from_regex(data, flags=0, multiline=False, dotall=False, ignorecase=False)
Create a Pattern object with a regex backend.
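The boolean keywords in this signature presumably act as shorthand for the corresponding `re` flags. The sketch below shows that folding under that assumption; `combine_regex_flags` is a hypothetical helper, and the mapping of each keyword to `re.MULTILINE`, `re.DOTALL`, and `re.IGNORECASE` is inferred from the names, not confirmed by this page.

```python
import re


def combine_regex_flags(flags=0, multiline=False, dotall=False, ignorecase=False):
    """Hypothetical helper: fold boolean keywords into an re flags value."""
    if multiline:
        flags |= re.MULTILINE
    if dotall:
        flags |= re.DOTALL
    if ignorecase:
        flags |= re.IGNORECASE
    return flags


flags = combine_regex_flags(ignorecase=True, dotall=True)
compiled = re.compile('foo.bar', flags=flags)
# With DOTALL the '.' also matches a newline; IGNORECASE ignores letter case.
print(compiled.match('FOO\nBAR') is not None)
```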
- classmethod coerce_backend(data, hint='auto')
Example
>>> import re
>>> assert Pattern.coerce_backend('foo', hint='auto') == 'strict'
>>> assert Pattern.coerce_backend('foo*', hint='auto') == 'glob'
>>> assert Pattern.coerce_backend(re.compile('foo*'), hint='auto') == 'regex'
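The three assertions above suggest an inference rule along the following lines. This is a sketch reconstructed from the example only: `coerce_backend_sketch` is hypothetical, and both the set of characters tested and the handling of a non-'auto' hint are assumptions.

```python
import re


def coerce_backend_sketch(data, hint='auto'):
    """Hypothetical backend inference consistent with the example above."""
    if isinstance(data, re.Pattern):
        # A precompiled regular expression already carries its interpretation.
        return 'regex'
    if hint != 'auto':
        # Assumption: an explicit hint is taken at face value for plain text.
        return hint
    if isinstance(data, str) and any(char in data for char in '*?['):
        return 'glob'
    return 'strict'


print(coerce_backend_sketch('foo'))
print(coerce_backend_sketch('foo*'))
print(coerce_backend_sketch(re.compile('foo*')))
```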
- classmethod coerce(data, hint='auto')
Attempt to automatically determine the appropriate pattern type for the input data. If it cannot be determined, fall back to the hint.
- Parameters:
data (str | Pattern | PathLike)
hint (str) – can be ‘glob’, ‘regex’, ‘strict’ or ‘auto’. In ‘auto’ we will use ‘glob’ if the input is a string and ‘*’ is in the pattern, otherwise we will use strict. Pattern inputs keep their existing interpretation.
Example
>>> import ubelt as ub
>>> pat = Pattern.coerce('foo*', 'glob')
>>> pat2 = Pattern.coerce(pat, 'regex')
>>> print('pat = {}'.format(ub.urepr(pat, nl=1)))
>>> print('pat2 = {}'.format(ub.urepr(pat2, nl=1)))
- class xdev.patterns.MultiPattern(patterns, predicate)
Bases: PatternBase, NiceRepr
Example
>>> import ubelt as ub
>>> dpath = ub.Path.appdir('xdev/tests/multipattern_paths').ensuredir().delete().ensuredir()
>>> (dpath / 'file0.txt').touch()
>>> (dpath / 'data0.dat').touch()
>>> (dpath / 'other0.txt').touch()
>>> ((dpath / 'dir1').ensuredir() / 'file1.txt').touch()
>>> ((dpath / 'dir2').ensuredir() / 'file2.txt').touch()
>>> ((dpath / 'dir2').ensuredir() / 'file3.txt').touch()
>>> ((dpath / 'dir1').ensuredir() / 'data.dat').touch()
>>> ((dpath / 'dir2').ensuredir() / 'data.dat').touch()
>>> ((dpath / 'dir2').ensuredir() / 'data.dat').touch()
>>> pat = MultiPattern.coerce(['*.txt'], 'glob')
>>> print(list(pat.paths(cwd=dpath)))
>>> pat = MultiPattern.coerce(['*0*', '**/*.txt'], 'glob')
>>> print(list(pat.paths(cwd=dpath, recursive=1)))
>>> pat = MultiPattern.coerce(['*.txt', '**/*.txt', '**/*.dat'], 'glob')
>>> print(list(pat.paths(cwd=dpath)))
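The `predicate` argument in the class signature is not described in this section. A reasonable reading is that it decides how the results of the individual patterns are combined. The sketch below illustrates that idea with glob patterns; `multi_match_sketch` and the `PREDICATES` table are hypothetical, and mapping the names 'any' and 'all' to the Python builtins is an assumption.

```python
import fnmatch

# Assumed mapping from predicate names to combining functions.
PREDICATES = {'any': any, 'all': all}


def multi_match_sketch(patterns, text, predicate='any'):
    """Hypothetical multi-pattern match: combine per-pattern results."""
    combine = PREDICATES[predicate]
    return combine(fnmatch.fnmatchcase(text, pat) for pat in patterns)


# 'any': one matching sub-pattern is enough.
print(multi_match_sketch(['*.txt', '*.dat'], 'file0.txt'))
# 'all': every sub-pattern has to match.
print(multi_match_sketch(['*.txt', '*.dat'], 'file0.txt', predicate='all'))
```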
- classmethod coerce(data, hint='auto', predicate='any')
- Parameters:
data (str | List | Pattern | PathLike | MultiPattern)
hint (str) – can be ‘glob’, ‘regex’, ‘strict’ or ‘auto’. In ‘auto’ we will use ‘glob’ if the input is a string and ‘*’ is in the pattern, otherwise we will use strict. Pattern inputs keep their existing interpretation.
- Returns:
MultiPattern
Example
>>> import ubelt as ub
>>> pat = MultiPattern.coerce('foo*', 'glob')
>>> pat2 = MultiPattern.coerce(pat, 'regex')
>>> pat3 = MultiPattern.coerce([pat, pat], 'regex')
>>> pat4 = MultiPattern.coerce([ub.Path('bar*'), pat], 'regex')
>>> print('pat = {}'.format(ub.urepr(pat, nl=1)))
>>> print('pat2 = {}'.format(ub.urepr(pat2, nl=1)))
>>> print('pat3 = {!r}'.format(pat3))
>>> print('pat4 = {!r}'.format(pat4))
>>> pat00 = MultiPattern.coerce('foo', 'glob')
>>> pat01 = MultiPattern.coerce('foo*', 'glob')
>>> pat02 = MultiPattern.coerce('foo*', 'regex')
>>> pat5 = MultiPattern.coerce(['foo', 'foo*', pat, pat00, pat01, pat02])
>>> print(f'pat5={pat5}')
Example
>>> # Test all acceptable input types
>>> import itertools as it
>>> import ubelt as ub
>>> str_pat = 'pattern*'
>>> scalar_inputs = {
>>>     'str': str_pat,
>>>     'path': ub.Path(str_pat),
>>>     'pat': Pattern.coerce(str_pat),
>>>     'mpat': MultiPattern.coerce(str_pat)
>>> }
>>> # Test scalar input types
>>> scalar_outputs = {}
>>> for k, v in scalar_inputs.items():
>>>     scalar_outputs[k] = MultiPattern.coerce(v)
>>> print('scalar_outputs = {}'.format(ub.urepr(scalar_outputs, nl=1)))
>>> #
>>> # Test iterable input types
>>> multi_outputs = []
>>> for v in it.combinations(scalar_inputs.values(), 2):
>>>     multi_outputs.append(MultiPattern.coerce(v))
>>> for v in it.combinations(scalar_inputs.values(), 3):
>>>     multi_outputs.append(MultiPattern.coerce(v))
>>> # Higher order nesting test
>>> higher_order_output = MultiPattern.coerce(multi_outputs)
>>> print('higher_order_output = {}'.format(ub.urepr(higher_order_output, nl=1)))