| author | Alexei Șerșun <alexei.sersun@yahoo.com> | 2025-02-17 13:50:15 +0200 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-02-17 20:50:15 +0900 |
| commit | 01d9d9c8c895de728e7abcad1723db019280452d (patch) | |
| tree | 00fe007ef911a5017a3a98d59c24cac563394379 /src | |
| parent | 1eafc4e5d917e3b77f18807337b4ad770048a22a (diff) | |
Normalize char before pattern lookup (#4252)
There is an edge case in the backward scan of FuzzyMatchV1, related to
normalization: if the string contains a non-normalized char (e.g. an
accented Unicode letter), the backward scan does not match it and
proceeds to the next char; however, when the score is computed, the
string is normalized first and then scanned against the pattern. The
pattern is therefore exhausted before the end of the substring, and the
next lookup is an out-of-bounds pattern index access, resulting in a
panic.
To illustrate the process, here's the sequence of operations when a
search is performed:
1. during the backward scan with the "minim" pattern
```
xxxxx Minímal example
^^^^^^^^^^^^
||||||||||||
miniiiiiiiim <- compute score for this substring
```
2. during score computation with the "minim" pattern
```
Minímal exam
minimal exam <- normalize chars before computing the score
^^^^^^
||||||
minim <- at this point the pattern is already fully scanned and the
index is out of bounds
```
This commit normalizes each char during the backward scan as well, so
that the boundaries of the match are detected correctly.
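The mechanics above can be reproduced with a minimal, self-contained sketch. It is not fzf's code: `window`, `overruns`, `prep`, and the tiny `normalizeRune` table below are hypothetical stand-ins that only mimic the two-pass scan for the `forward=false` case used in the new test. Pass 1 always normalizes; the flag controls whether pass 2 (the "backward scan" of the commit message) does too.

```go
package main

import (
	"fmt"
	"unicode"
)

// normalizeRune is a hypothetical stand-in for fzf's normalizeRune;
// it maps only a few accented letters so the sketch stays small.
func normalizeRune(r rune) rune {
	switch r {
	case 'í', 'ì', 'î', 'ï':
		return 'i'
	case 'á', 'à', 'â', 'ä':
		return 'a'
	}
	return r
}

// prep lowercases a rune and optionally normalizes it.
func prep(r rune, normalize bool) rune {
	r = unicode.ToLower(r)
	if normalize {
		r = normalizeRune(r)
	}
	return r
}

// window returns the [start, end) rune range that would be handed to scoring.
// Pass 1 walks right to left (always normalized) to find a range containing
// the pattern. Pass 2 walks back from start to tighten end; whether it
// normalizes is what the commit changes.
func window(text, pattern []rune, normalizePass2 bool) (start, end int) {
	start, end = -1, -1
	pidx := len(pattern) - 1
	for i := len(text) - 1; i >= 0; i-- {
		if prep(text[i], true) == pattern[pidx] {
			if end < 0 {
				end = i + 1
			}
			if pidx--; pidx < 0 {
				start = i
				break
			}
		}
	}
	if start < 0 {
		return -1, -1
	}
	pidx = 0
	for i := start; i < end; i++ {
		if prep(text[i], normalizePass2) == pattern[pidx] {
			if pidx++; pidx == len(pattern) {
				end = i + 1
				break
			}
		}
	}
	return start, end
}

// overruns reports whether a scoring loop that reads pattern[pidx] for every
// rune in [start, end) would index past the end of the pattern.
func overruns(text, pattern []rune, start, end int) bool {
	pidx := 0
	for i := start; i < end; i++ {
		if pidx >= len(pattern) {
			return true // an unchecked pattern[pidx] would panic here
		}
		if prep(text[i], true) == pattern[pidx] {
			pidx++
		}
	}
	return false
}

func main() {
	text := []rune("xxxxx Minímal example")
	pattern := []rune("minim")
	for _, normalize := range []bool{false, true} {
		s, e := window(text, pattern, normalize)
		fmt.Printf("normalize pass 2=%v: window=%q overrun=%v\n",
			normalize, string(text[s:e]), overruns(text, pattern, s, e))
	}
}
```

Without normalization in pass 2 the window stays at "Minímal exam" and the scoring stand-in runs past the pattern; with it, the window shrinks to "Miním" and scoring ends exactly at the last pattern char.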
Diffstat (limited to 'src')
| -rw-r--r-- | src/algo/algo.go | 3 |
| -rw-r--r-- | src/algo/algo_test.go | 9 |
2 files changed, 12 insertions, 0 deletions
diff --git a/src/algo/algo.go b/src/algo/algo.go
index c0022475..d6a9a663 100644
--- a/src/algo/algo.go
+++ b/src/algo/algo.go
@@ -767,6 +767,9 @@ func FuzzyMatchV1(caseSensitive bool, normalize bool, forward bool, text *util.C
 					char = unicode.To(unicode.LowerCase, char)
 				}
 			}
+			if normalize {
+				char = normalizeRune(char)
+			}
 			pidx_ := indexAt(pidx, lenPattern, forward)
 			pchar := pattern[pidx_]
 
diff --git a/src/algo/algo_test.go b/src/algo/algo_test.go
index b5ed0e77..aab03b0a 100644
--- a/src/algo/algo_test.go
+++ b/src/algo/algo_test.go
@@ -200,3 +200,12 @@ func TestLongString(t *testing.T) {
 	bytes[math.MaxUint16] = 'z'
 	assertMatch(t, FuzzyMatchV2, true, true, string(bytes), "zx", math.MaxUint16, math.MaxUint16+2, scoreMatch*2+bonusConsecutive)
 }
+
+func TestLongStringWithNormalize(t *testing.T) {
+	bytes := make([]byte, 30000)
+	for i := range bytes {
+		bytes[i] = 'x'
+	}
+	unicodeString := string(bytes) + " Minímal example"
+	assertMatch2(t, FuzzyMatchV1, false, true, false, unicodeString, "minim", 30001, 30006, 140)
+}
