Revise ranking algorithm

This commit is contained in:
Junegunn Choi 2016-09-07 09:58:18 +09:00
parent 8ef2420677
commit 2fc7c18747
No known key found for this signature in database
GPG Key ID: 254BC280FEF9C627
20 changed files with 961 additions and 460 deletions

View File

@ -1,6 +1,13 @@
CHANGELOG CHANGELOG
========= =========
0.15.0
------
- Improved fuzzy search algorithm
- Added `--algo=[v1|v2]` option so one can still choose the old algorithm
which values the search performance over the quality of the result
- Advanced scoring criteria
0.13.5 0.13.5
------ ------
- Memory and performance optimization - Memory and performance optimization

View File

@ -2,8 +2,8 @@
set -u set -u
[[ "$@" =~ --pre ]] && version=0.13.5 pre=1 || [[ "$@" =~ --pre ]] && version=0.15.0 pre=1 ||
version=0.13.5 pre=0 version=0.15.0 pre=0
auto_completion= auto_completion=
key_bindings= key_bindings=

View File

@ -21,7 +21,7 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE. THE SOFTWARE.
.. ..
.TH fzf-tmux 1 "Aug 2016" "fzf 0.13.5" "fzf-tmux - open fzf in tmux split pane" .TH fzf-tmux 1 "Sep 2016" "fzf 0.15.0" "fzf-tmux - open fzf in tmux split pane"
.SH NAME .SH NAME
fzf-tmux - open fzf in tmux split pane fzf-tmux - open fzf in tmux split pane

View File

@ -21,7 +21,7 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE. THE SOFTWARE.
.. ..
.TH fzf 1 "Aug 2016" "fzf 0.13.5" "fzf - a command-line fuzzy finder" .TH fzf 1 "Sep 2016" "fzf 0.15.0" "fzf - a command-line fuzzy finder"
.SH NAME .SH NAME
fzf - a command-line fuzzy finder fzf - a command-line fuzzy finder
@ -47,6 +47,16 @@ Case-insensitive match (default: smart-case match)
.TP .TP
.B "+i" .B "+i"
Case-sensitive match Case-sensitive match
.TP
.BI "--algo=" TYPE
Fuzzy matching algorithm (default: v2)
.br
.BR v2 " Optimal scoring algorithm (quality)"
.br
.BR v1 " Faster but not guaranteed to find the optimal result (performance)"
.br
.TP .TP
.BI "-n, --nth=" "N[,..]" .BI "-n, --nth=" "N[,..]"
Comma-separated list of field index expressions for limiting search scope. Comma-separated list of field index expressions for limiting search scope.

View File

@ -61,6 +61,23 @@ make install
make linux make linux
``` ```
Test
----
Unit tests can be run with `make test`. Integration tests are written in Ruby
script that should be run on tmux.
```sh
# Unit tests
make test
# Install the executable to ../bin directory
make install
# Integration tests
ruby ../test/test_go.rb
```
Third-party libraries used Third-party libraries used
-------------------------- --------------------------

View File

@ -1,19 +1,91 @@
package algo package algo
/*
Algorithm
---------
FuzzyMatchV1 finds the first "fuzzy" occurrence of the pattern within the given
text in O(n) time where n is the length of the text. Once the position of the
last character is located, it traverses backwards to see if there's a shorter
substring that matches the pattern.
a_____b___abc__ To find "abc"
*-----*-----*> 1. Forward scan
<*** 2. Backward scan
The algorithm is simple and fast, but as it only sees the first occurrence,
it is not guaranteed to find the occurrence with the highest score.
a_____b__c__abc
*-----*--* ***
FuzzyMatchV2 implements a modified version of Smith-Waterman algorithm to find
the optimal solution (highest score) according to the scoring criteria. Unlike
the original algorithm, omission or mismatch of a character in the pattern is
not allowed.
Performance
-----------
The new V2 algorithm is slower than V1 as it examines all occurrences of the
pattern instead of stopping immediately after finding the first one. The time
complexity of the algorithm is O(nm) if a match is found and O(n) otherwise
where n is the length of the item and m is the length of the pattern. Thus, the
performance overhead may not be noticeable for a query with high selectivity.
However, if the performance is more important than the quality of the result,
you can still choose v1 algorithm with --algo=v1.
Scoring criteria
----------------
- We prefer matches at special positions, such as the start of a word, or
uppercase character in camelCase words.
- That is, we prefer an occurrence of the pattern with more characters
matching at special positions, even if the total match length is longer.
e.g. "fuzzyfinder" vs. "fuzzy-finder" on "ff"
````````````
- Also, if the first character in the pattern appears at one of the special
positions, the bonus point for the position is multiplied by a constant
as it is extremely likely that the first character in the typed pattern
has more significance than the rest.
e.g. "fo-bar" vs. "foob-r" on "br"
``````
- But since fzf is still a fuzzy finder, not an acronym finder, we should also
consider the total length of the matched substring. This is why we have the
gap penalty. The gap penalty increases as the length of the gap (distance
between the matching characters) increases, so the effect of the bonus is
eventually cancelled at some point.
e.g. "fuzzyfinder" vs. "fuzzy-blurry-finder" on "ff"
```````````
- Consequently, it is crucial to find the right balance between the bonus
and the gap penalty. The parameters were chosen that the bonus is cancelled
when the gap size increases beyond 8 characters.
- The bonus mechanism can have the undesirable side effect where consecutive
matches are ranked lower than the ones with gaps.
e.g. "foobar" vs. "foo-bar" on "foob"
```````
- To correct this anomaly, we also give extra bonus point to each character
in a consecutive matching chunk.
e.g. "foobar" vs. "foo-bar" on "foob"
``````
- The amount of consecutive bonus is primarily determined by the bonus of the
first character in the chunk.
e.g. "foobar" vs. "out-of-bound" on "oob"
````````````
*/
import ( import (
"fmt"
"strings" "strings"
"unicode" "unicode"
"github.com/junegunn/fzf/src/util" "github.com/junegunn/fzf/src/util"
) )
/* var DEBUG bool
* String matching algorithms here do not use strings.ToLower to avoid
* performance penalty. And they assume pattern runes are given in lowercase
* letters when caseSensitive is false.
*
* In short: They try to do as little work as possible.
*/
func indexAt(index int, max int, forward bool) int { func indexAt(index int, max int, forward bool) int {
if forward { if forward {
@ -24,14 +96,48 @@ func indexAt(index int, max int, forward bool) int {
// Result contains the results of running a match function. // Result contains the results of running a match function.
type Result struct { type Result struct {
// TODO int32 should suffice
Start int Start int
End int End int
Score int
// Items are basically sorted by the lengths of matched substrings.
// But we slightly adjust the score with bonus for better results.
Bonus int
} }
const (
scoreMatch = 16
scoreGapStart = -3
scoreGapExtention = -1
// We prefer matches at the beginning of a word, but the bonus should not be
// too great to prevent the longer acronym matches from always winning over
// shorter fuzzy matches. The bonus point here was specifically chosen that
// the bonus is cancelled when the gap between the acronyms grows over
// 8 characters, which is approximately the average length of the words found
// in web2 dictionary and my file system.
bonusBoundary = scoreMatch / 2
// Although bonus point for non-word characters is non-contextual, we need it
// for computing bonus points for consecutive chunks starting with a non-word
// character.
bonusNonWord = scoreMatch / 2
// Edge-triggered bonus for matches in camelCase words.
// Compared to word-boundary case, they don't accompany single-character gaps
// (e.g. FooBar vs. foo-bar), so we deduct bonus point accordingly.
bonusCamel123 = bonusBoundary + scoreGapExtention
// Minimum bonus point given to characters in consecutive chunks.
// Note that bonus points for consecutive matches shouldn't have needed if we
// used fixed match score as in the original algorithm.
bonusConsecutive = -(scoreGapStart + scoreGapExtention)
// The first character in the typed pattern usually has more significance
// than the rest so it's important that it appears at special positions where
// bonus points are given. e.g. "to-go" vs. "ongoing" on "og" or on "ogo".
// The amount of the extra bonus should be limited so that the gap penalty is
// still respected.
bonusFirstCharMultiplier = 2
)
type charClass int type charClass int
const ( const (
@ -42,39 +148,299 @@ const (
charNumber charNumber
) )
func evaluateBonus(caseSensitive bool, text util.Chars, pattern []rune, sidx int, eidx int) int { func posArray(withPos bool, len int) *[]int {
var bonus int if withPos {
pidx := 0 pos := make([]int, 0, len)
lenPattern := len(pattern) return &pos
consecutive := false
prevClass := charNonWord
for index := util.Max(0, sidx-1); index < eidx; index++ {
char := text.Get(index)
var class charClass
if unicode.IsLower(char) {
class = charLower
} else if unicode.IsUpper(char) {
class = charUpper
} else if unicode.IsLetter(char) {
class = charLetter
} else if unicode.IsNumber(char) {
class = charNumber
} else {
class = charNonWord
} }
return nil
}
var point int func alloc16(offset int, slab *util.Slab, size int, clear bool) (int, []int16) {
if slab != nil && cap(slab.I16) > offset+size {
slice := slab.I16[offset : offset+size]
if clear {
for idx := range slice {
slice[idx] = 0
}
}
return offset + size, slice
}
return offset, make([]int16, size)
}
func alloc32(offset int, slab *util.Slab, size int, clear bool) (int, []int32) {
if slab != nil && cap(slab.I32) > offset+size {
slice := slab.I32[offset : offset+size]
if clear {
for idx := range slice {
slice[idx] = 0
}
}
return offset + size, slice
}
return offset, make([]int32, size)
}
func charClassOfAscii(char rune) charClass {
if char >= 'a' && char <= 'z' {
return charLower
} else if char >= 'A' && char <= 'Z' {
return charUpper
} else if char >= '0' && char <= '9' {
return charNumber
}
return charNonWord
}
func charClassOfNonAscii(char rune) charClass {
if unicode.IsLower(char) {
return charLower
} else if unicode.IsUpper(char) {
return charUpper
} else if unicode.IsNumber(char) {
return charNumber
} else if unicode.IsLetter(char) {
return charLetter
}
return charNonWord
}
func charClassOf(char rune) charClass {
if char <= unicode.MaxASCII {
return charClassOfAscii(char)
}
return charClassOfNonAscii(char)
}
func bonusFor(prevClass charClass, class charClass) int16 {
if prevClass == charNonWord && class != charNonWord { if prevClass == charNonWord && class != charNonWord {
// Word boundary // Word boundary
point = 2 return bonusBoundary
} else if prevClass == charLower && class == charUpper || } else if prevClass == charLower && class == charUpper ||
prevClass != charNumber && class == charNumber { prevClass != charNumber && class == charNumber {
// camelCase letter123 // camelCase letter123
point = 1 return bonusCamel123
} else if class == charNonWord {
return bonusNonWord
} }
return 0
}
func bonusAt(input util.Chars, idx int) int16 {
if idx == 0 {
return bonusBoundary
}
return bonusFor(charClassOf(input.Get(idx-1)), charClassOf(input.Get(idx)))
}
type Algo func(caseSensitive bool, forward bool, input util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int)
func FuzzyMatchV2(caseSensitive bool, forward bool, input util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int) {
// Assume that pattern is given in lowercase if case-insensitive.
// First check if there's a match and calculate bonus for each position.
// If the input string is too long, consider finding the matching chars in
// this phase as well (non-optimal alignment).
N := input.Length()
M := len(pattern)
switch M {
case 0:
return Result{0, 0, 0}, posArray(withPos, M)
case 1:
return ExactMatchNaive(caseSensitive, forward, input, pattern[0:1], withPos, slab)
}
// Since O(nm) algorithm can be prohibitively expensive for large input,
// we fall back to the greedy algorithm.
if slab != nil && N*M > cap(slab.I16) {
return FuzzyMatchV1(caseSensitive, forward, input, pattern, withPos, slab)
}
// Reuse pre-allocated integer slice to avoid unnecessary sweeping of garbages
offset := 0
// Bonus point for each position
offset, B := alloc16(offset, slab, N, false)
// The first occurrence of each character in the pattern
offset, F := alloc16(offset, slab, M, false)
// Rune array
_, T := alloc32(0, slab, N, false)
// Phase 1. Check if there's a match and calculate bonus for each point
pidx, lastIdx, prevClass := 0, 0, charNonWord
for idx := 0; idx < N; idx++ {
char := input.Get(idx)
var class charClass
if char <= unicode.MaxASCII {
class = charClassOfAscii(char)
} else {
class = charClassOfNonAscii(char)
}
if !caseSensitive && class == charUpper {
if char <= unicode.MaxASCII {
char += 32
} else {
char = unicode.To(unicode.LowerCase, char)
}
}
T[idx] = char
B[idx] = bonusFor(prevClass, class)
prevClass = class prevClass = class
if index >= sidx { if pidx < M {
if char == pattern[pidx] {
lastIdx = idx
F[pidx] = int16(idx)
pidx++
}
} else {
if char == pattern[M-1] {
lastIdx = idx
}
}
}
if pidx != M {
return Result{-1, -1, 0}, nil
}
// Phase 2. Fill in score matrix (H)
// Unlike the original algorithm, we do not allow omission.
width := lastIdx - int(F[0]) + 1
offset, H := alloc16(offset, slab, width*M, false)
// Possible length of consecutive chunk at each position.
offset, C := alloc16(offset, slab, width*M, false)
maxScore, maxScorePos := int16(0), 0
for i := 0; i < M; i++ {
I := i * width
inGap := false
for j := int(F[i]); j <= lastIdx; j++ {
j0 := j - int(F[0])
var s1, s2, consecutive int16
if j > int(F[i]) {
if inGap {
s2 = H[I+j0-1] + scoreGapExtention
} else {
s2 = H[I+j0-1] + scoreGapStart
}
}
if pattern[i] == T[j] {
var diag int16
if i > 0 && j0 > 0 {
diag = H[I-width+j0-1]
}
s1 = diag + scoreMatch
b := B[j]
if i > 0 {
// j > 0 if i > 0
consecutive = C[I-width+j0-1] + 1
// Break consecutive chunk
if b == bonusBoundary {
consecutive = 1
} else if consecutive > 1 {
b = util.Max16(b, util.Max16(bonusConsecutive, B[j-int(consecutive)+1]))
}
} else {
consecutive = 1
b *= bonusFirstCharMultiplier
}
if s1+b < s2 {
s1 += B[j]
consecutive = 0
} else {
s1 += b
}
}
C[I+j0] = consecutive
inGap = s1 < s2
score := util.Max16(util.Max16(s1, s2), 0)
if i == M-1 && (forward && score > maxScore || !forward && score >= maxScore) {
maxScore, maxScorePos = score, j
}
H[I+j0] = score
}
if DEBUG {
if i == 0 {
fmt.Print(" ")
for j := int(F[i]); j <= lastIdx; j++ {
fmt.Printf(" " + string(input.Get(j)) + " ")
}
fmt.Println()
}
fmt.Print(string(pattern[i]) + " ")
for idx := int(F[0]); idx < int(F[i]); idx++ {
fmt.Print(" 0 ")
}
for idx := int(F[i]); idx <= lastIdx; idx++ {
fmt.Printf("%2d ", H[i*width+idx-int(F[0])])
}
fmt.Println()
fmt.Print(" ")
for idx, p := range C[I : I+width] {
if idx+int(F[0]) < int(F[i]) {
p = 0
}
fmt.Printf("%2d ", p)
}
fmt.Println()
}
}
// Phase 3. (Optional) Backtrace to find character positions
pos := posArray(withPos, M)
j := int(F[0])
if withPos {
i := M - 1
j = maxScorePos
preferMatch := true
for {
I := i * width
j0 := j - int(F[0])
s := H[I+j0]
var s1, s2 int16
if i > 0 && j >= int(F[i]) {
s1 = H[I-width+j0-1]
}
if j > int(F[i]) {
s2 = H[I+j0-1]
}
if s > s1 && (s > s2 || s == s2 && preferMatch) {
*pos = append(*pos, j)
if i == 0 {
break
}
i--
}
preferMatch = C[I+j0] > 1 || I+width+j0+1 < len(C) && C[I+width+j0+1] > 0
j--
}
}
// Start offset we return here is only relevant when begin tiebreak is used.
// However finding the accurate offset requires backtracking, and we don't
// want to pay extra cost for the option that has lost its importance.
return Result{j, maxScorePos + 1, int(maxScore)}, pos
}
// Implement the same sorting criteria as V2
func calculateScore(caseSensitive bool, text util.Chars, pattern []rune, sidx int, eidx int, withPos bool) (int, *[]int) {
pidx, score, inGap, consecutive, firstBonus := 0, 0, false, 0, int16(0)
pos := posArray(withPos, len(pattern))
prevClass := charNonWord
if sidx > 0 {
prevClass = charClassOf(text.Get(sidx - 1))
}
for idx := sidx; idx < eidx; idx++ {
char := text.Get(idx)
class := charClassOf(char)
if !caseSensitive { if !caseSensitive {
if char >= 'A' && char <= 'Z' { if char >= 'A' && char <= 'Z' {
char += 32 char += 32
@ -82,45 +448,50 @@ func evaluateBonus(caseSensitive bool, text util.Chars, pattern []rune, sidx int
char = unicode.To(unicode.LowerCase, char) char = unicode.To(unicode.LowerCase, char)
} }
} }
pchar := pattern[pidx] if char == pattern[pidx] {
if pchar == char { if withPos {
// Boost bonus for the first character in the pattern *pos = append(*pos, idx)
if pidx == 0 {
point *= 2
} }
// Bonus to consecutive matching chars score += scoreMatch
if consecutive { bonus := bonusFor(prevClass, class)
point++ if consecutive == 0 {
} firstBonus = bonus
bonus += point
if pidx++; pidx == lenPattern {
break
}
consecutive = true
} else { } else {
consecutive = false // Break consecutive chunk
if bonus == bonusBoundary {
firstBonus = bonus
} }
bonus = util.Max16(util.Max16(bonus, firstBonus), bonusConsecutive)
} }
if pidx == 0 {
score += int(bonus * bonusFirstCharMultiplier)
} else {
score += int(bonus)
} }
return bonus inGap = false
consecutive++
pidx++
} else {
if inGap {
score += scoreGapExtention
} else {
score += scoreGapStart
}
inGap = true
consecutive = 0
firstBonus = 0
}
prevClass = class
}
return score, pos
} }
// FuzzyMatch performs fuzzy-match // FuzzyMatchV1 performs fuzzy-match
func FuzzyMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune) Result { func FuzzyMatchV1(caseSensitive bool, forward bool, text util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int) {
if len(pattern) == 0 { if len(pattern) == 0 {
return Result{0, 0, 0} return Result{0, 0, 0}, nil
} }
// 0. (FIXME) How to find the shortest match?
// a_____b__c__abc
// ^^^^^^^^^^ ^^^
// 1. forward scan (abc)
// *-----*-----*>
// a_____b___abc__
// 2. reverse scan (cba)
// a_____b___abc__
// <***
pidx := 0 pidx := 0
sidx := -1 sidx := -1
eidx := -1 eidx := -1
@ -157,7 +528,8 @@ func FuzzyMatch(caseSensitive bool, forward bool, text util.Chars, pattern []run
if sidx >= 0 && eidx >= 0 { if sidx >= 0 && eidx >= 0 {
pidx-- pidx--
for index := eidx - 1; index >= sidx; index-- { for index := eidx - 1; index >= sidx; index-- {
char := text.Get(indexAt(index, lenRunes, forward)) tidx := indexAt(index, lenRunes, forward)
char := text.Get(tidx)
if !caseSensitive { if !caseSensitive {
if char >= 'A' && char <= 'Z' { if char >= 'A' && char <= 'Z' {
char += 32 char += 32
@ -166,7 +538,8 @@ func FuzzyMatch(caseSensitive bool, forward bool, text util.Chars, pattern []run
} }
} }
pchar := pattern[indexAt(pidx, lenPattern, forward)] pidx_ := indexAt(pidx, lenPattern, forward)
pchar := pattern[pidx_]
if char == pchar { if char == pchar {
if pidx--; pidx < 0 { if pidx--; pidx < 0 {
sidx = index sidx = index
@ -175,16 +548,14 @@ func FuzzyMatch(caseSensitive bool, forward bool, text util.Chars, pattern []run
} }
} }
// Calculate the bonus. This can't be done at the same time as the
// pattern scan above because 'forward' may be false.
if !forward { if !forward {
sidx, eidx = lenRunes-eidx, lenRunes-sidx sidx, eidx = lenRunes-eidx, lenRunes-sidx
} }
return Result{sidx, eidx, score, pos := calculateScore(caseSensitive, text, pattern, sidx, eidx, withPos)
evaluateBonus(caseSensitive, text, pattern, sidx, eidx)} return Result{sidx, eidx, score}, pos
} }
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
// ExactMatchNaive is a basic string searching algorithm that handles case // ExactMatchNaive is a basic string searching algorithm that handles case
@ -192,23 +563,28 @@ func FuzzyMatch(caseSensitive bool, forward bool, text util.Chars, pattern []run
// of strings.ToLower + strings.Index for typical fzf use cases where input // of strings.ToLower + strings.Index for typical fzf use cases where input
// strings and patterns are not very long. // strings and patterns are not very long.
// //
// We might try to implement better algorithms in the future: // Since 0.15.0, this function searches for the match with the highest
// http://en.wikipedia.org/wiki/String_searching_algorithm // bonus point, instead of stopping immediately after finding the first match.
func ExactMatchNaive(caseSensitive bool, forward bool, text util.Chars, pattern []rune) Result { // The solution is much cheaper since there is only one possible alignment of
// the pattern.
func ExactMatchNaive(caseSensitive bool, forward bool, text util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int) {
if len(pattern) == 0 { if len(pattern) == 0 {
return Result{0, 0, 0} return Result{0, 0, 0}, nil
} }
lenRunes := text.Length() lenRunes := text.Length()
lenPattern := len(pattern) lenPattern := len(pattern)
if lenRunes < lenPattern { if lenRunes < lenPattern {
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
// For simplicity, only look at the bonus at the first character position
pidx := 0 pidx := 0
bestPos, bonus, bestBonus := -1, int16(0), int16(-1)
for index := 0; index < lenRunes; index++ { for index := 0; index < lenRunes; index++ {
char := text.Get(indexAt(index, lenRunes, forward)) index_ := indexAt(index, lenRunes, forward)
char := text.Get(index_)
if !caseSensitive { if !caseSensitive {
if char >= 'A' && char <= 'Z' { if char >= 'A' && char <= 'Z' {
char += 32 char += 32
@ -216,33 +592,51 @@ func ExactMatchNaive(caseSensitive bool, forward bool, text util.Chars, pattern
char = unicode.To(unicode.LowerCase, char) char = unicode.To(unicode.LowerCase, char)
} }
} }
pchar := pattern[indexAt(pidx, lenPattern, forward)] pidx_ := indexAt(pidx, lenPattern, forward)
pchar := pattern[pidx_]
if pchar == char { if pchar == char {
if pidx_ == 0 {
bonus = bonusAt(text, index_)
}
pidx++ pidx++
if pidx == lenPattern { if pidx == lenPattern {
var sidx, eidx int if bonus > bestBonus {
if forward { bestPos, bestBonus = index, bonus
sidx = index - lenPattern + 1
eidx = index + 1
} else {
sidx = lenRunes - (index + 1)
eidx = lenRunes - (index - lenPattern + 1)
} }
return Result{sidx, eidx, if bonus == bonusBoundary {
evaluateBonus(caseSensitive, text, pattern, sidx, eidx)} break
}
index -= pidx - 1
pidx, bonus = 0, 0
} }
} else { } else {
index -= pidx index -= pidx
pidx = 0 pidx, bonus = 0, 0
} }
} }
return Result{-1, -1, 0} if bestPos >= 0 {
var sidx, eidx int
if forward {
sidx = bestPos - lenPattern + 1
eidx = bestPos + 1
} else {
sidx = lenRunes - (bestPos + 1)
eidx = lenRunes - (bestPos - lenPattern + 1)
}
score, _ := calculateScore(caseSensitive, text, pattern, sidx, eidx, false)
return Result{sidx, eidx, score}, nil
}
return Result{-1, -1, 0}, nil
} }
// PrefixMatch performs prefix-match // PrefixMatch performs prefix-match
func PrefixMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune) Result { func PrefixMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int) {
if len(pattern) == 0 {
return Result{0, 0, 0}, nil
}
if text.Length() < len(pattern) { if text.Length() < len(pattern) {
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
for index, r := range pattern { for index, r := range pattern {
@ -251,20 +645,24 @@ func PrefixMatch(caseSensitive bool, forward bool, text util.Chars, pattern []ru
char = unicode.ToLower(char) char = unicode.ToLower(char)
} }
if char != r { if char != r {
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
} }
lenPattern := len(pattern) lenPattern := len(pattern)
return Result{0, lenPattern, score, _ := calculateScore(caseSensitive, text, pattern, 0, lenPattern, false)
evaluateBonus(caseSensitive, text, pattern, 0, lenPattern)} return Result{0, lenPattern, score}, nil
} }
// SuffixMatch performs suffix-match // SuffixMatch performs suffix-match
func SuffixMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune) Result { func SuffixMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int) {
trimmedLen := text.Length() - text.TrailingWhitespaces() lenRunes := text.Length()
trimmedLen := lenRunes - text.TrailingWhitespaces()
if len(pattern) == 0 {
return Result{trimmedLen, trimmedLen, 0}, nil
}
diff := trimmedLen - len(pattern) diff := trimmedLen - len(pattern)
if diff < 0 { if diff < 0 {
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
for index, r := range pattern { for index, r := range pattern {
@ -273,28 +671,29 @@ func SuffixMatch(caseSensitive bool, forward bool, text util.Chars, pattern []ru
char = unicode.ToLower(char) char = unicode.ToLower(char)
} }
if char != r { if char != r {
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
} }
lenPattern := len(pattern) lenPattern := len(pattern)
sidx := trimmedLen - lenPattern sidx := trimmedLen - lenPattern
eidx := trimmedLen eidx := trimmedLen
return Result{sidx, eidx, score, _ := calculateScore(caseSensitive, text, pattern, sidx, eidx, false)
evaluateBonus(caseSensitive, text, pattern, sidx, eidx)} return Result{sidx, eidx, score}, nil
} }
// EqualMatch performs equal-match // EqualMatch performs equal-match
func EqualMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune) Result { func EqualMatch(caseSensitive bool, forward bool, text util.Chars, pattern []rune, withPos bool, slab *util.Slab) (Result, *[]int) {
// Note: EqualMatch always return a zero bonus. lenPattern := len(pattern)
if text.Length() != len(pattern) { if text.Length() != lenPattern {
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }
runesStr := text.ToString() runesStr := text.ToString()
if !caseSensitive { if !caseSensitive {
runesStr = strings.ToLower(runesStr) runesStr = strings.ToLower(runesStr)
} }
if runesStr == string(pattern) { if runesStr == string(pattern) {
return Result{0, len(pattern), 0} return Result{0, lenPattern, (scoreMatch+bonusBoundary)*lenPattern +
(bonusFirstCharMultiplier-1)*bonusBoundary}, nil
} }
return Result{-1, -1, 0} return Result{-1, -1, 0}, nil
} }

View File

@ -1,95 +1,154 @@
package algo package algo
import ( import (
"sort"
"strings" "strings"
"testing" "testing"
"github.com/junegunn/fzf/src/util" "github.com/junegunn/fzf/src/util"
) )
func assertMatch(t *testing.T, fun func(bool, bool, util.Chars, []rune) Result, caseSensitive, forward bool, input, pattern string, sidx int, eidx int, bonus int) { func assertMatch(t *testing.T, fun Algo, caseSensitive, forward bool, input, pattern string, sidx int, eidx int, score int) {
if !caseSensitive { if !caseSensitive {
pattern = strings.ToLower(pattern) pattern = strings.ToLower(pattern)
} }
res := fun(caseSensitive, forward, util.RunesToChars([]rune(input)), []rune(pattern)) res, pos := fun(caseSensitive, forward, util.RunesToChars([]rune(input)), []rune(pattern), true, nil)
if res.Start != sidx { var start, end int
t.Errorf("Invalid start index: %d (expected: %d, %s / %s)", res.Start, sidx, input, pattern) if pos == nil || len(*pos) == 0 {
start = res.Start
end = res.End
} else {
sort.Ints(*pos)
start = (*pos)[0]
end = (*pos)[len(*pos)-1] + 1
} }
if res.End != eidx { if start != sidx {
t.Errorf("Invalid end index: %d (expected: %d, %s / %s)", res.End, eidx, input, pattern) t.Errorf("Invalid start index: %d (expected: %d, %s / %s)", start, sidx, input, pattern)
} }
if res.Bonus != bonus { if end != eidx {
t.Errorf("Invalid bonus: %d (expected: %d, %s / %s)", res.Bonus, bonus, input, pattern) t.Errorf("Invalid end index: %d (expected: %d, %s / %s)", end, eidx, input, pattern)
}
if res.Score != score {
t.Errorf("Invalid score: %d (expected: %d, %s / %s)", res.Score, score, input, pattern)
} }
} }
func TestFuzzyMatch(t *testing.T) { func TestFuzzyMatch(t *testing.T) {
assertMatch(t, FuzzyMatch, false, true, "fooBarbaz", "oBZ", 2, 9, 2) for _, fn := range []Algo{FuzzyMatchV1, FuzzyMatchV2} {
assertMatch(t, FuzzyMatch, false, true, "foo bar baz", "fbb", 0, 9, 8) for _, forward := range []bool{true, false} {
assertMatch(t, FuzzyMatch, false, true, "/AutomatorDocument.icns", "rdoc", 9, 13, 4) assertMatch(t, fn, false, forward, "fooBarbaz1", "oBZ", 2, 9,
assertMatch(t, FuzzyMatch, false, true, "/man1/zshcompctl.1", "zshc", 6, 10, 7) scoreMatch*3+bonusCamel123+scoreGapStart+scoreGapExtention*3)
assertMatch(t, FuzzyMatch, false, true, "/.oh-my-zsh/cache", "zshc", 8, 13, 8) assertMatch(t, fn, false, forward, "foo bar baz", "fbb", 0, 9,
assertMatch(t, FuzzyMatch, false, true, "ab0123 456", "12356", 3, 10, 3) scoreMatch*3+bonusBoundary*bonusFirstCharMultiplier+
assertMatch(t, FuzzyMatch, false, true, "abc123 456", "12356", 3, 10, 5) bonusBoundary*2+2*scoreGapStart+4*scoreGapExtention)
assertMatch(t, fn, false, forward, "/AutomatorDocument.icns", "rdoc", 9, 13,
scoreMatch*4+bonusCamel123+bonusConsecutive*2)
assertMatch(t, fn, false, forward, "/man1/zshcompctl.1", "zshc", 6, 10,
scoreMatch*4+bonusBoundary*bonusFirstCharMultiplier+bonusBoundary*3)
assertMatch(t, fn, false, forward, "/.oh-my-zsh/cache", "zshc", 8, 13,
scoreMatch*4+bonusBoundary*bonusFirstCharMultiplier+bonusBoundary*3+scoreGapStart)
assertMatch(t, fn, false, forward, "ab0123 456", "12356", 3, 10,
scoreMatch*5+bonusConsecutive*3+scoreGapStart+scoreGapExtention)
assertMatch(t, fn, false, forward, "abc123 456", "12356", 3, 10,
scoreMatch*5+bonusCamel123*bonusFirstCharMultiplier+bonusCamel123*2+bonusConsecutive+scoreGapStart+scoreGapExtention)
assertMatch(t, fn, false, forward, "foo/bar/baz", "fbb", 0, 9,
scoreMatch*3+bonusBoundary*bonusFirstCharMultiplier+
bonusBoundary*2+2*scoreGapStart+4*scoreGapExtention)
assertMatch(t, fn, false, forward, "fooBarBaz", "fbb", 0, 7,
scoreMatch*3+bonusBoundary*bonusFirstCharMultiplier+
bonusCamel123*2+2*scoreGapStart+2*scoreGapExtention)
assertMatch(t, fn, false, forward, "foo barbaz", "fbb", 0, 8,
scoreMatch*3+bonusBoundary*bonusFirstCharMultiplier+bonusBoundary+
scoreGapStart*2+scoreGapExtention*3)
assertMatch(t, fn, false, forward, "fooBar Baz", "foob", 0, 4,
scoreMatch*4+bonusBoundary*bonusFirstCharMultiplier+bonusBoundary*3)
assertMatch(t, fn, false, forward, "xFoo-Bar Baz", "foo-b", 1, 6,
scoreMatch*5+bonusCamel123*bonusFirstCharMultiplier+bonusCamel123*2+
bonusNonWord+bonusBoundary)
assertMatch(t, FuzzyMatch, false, true, "foo/bar/baz", "fbb", 0, 9, 8) assertMatch(t, fn, true, forward, "fooBarbaz", "oBz", 2, 9,
assertMatch(t, FuzzyMatch, false, true, "fooBarBaz", "fbb", 0, 7, 6) scoreMatch*3+bonusCamel123+scoreGapStart+scoreGapExtention*3)
assertMatch(t, FuzzyMatch, false, true, "foo barbaz", "fbb", 0, 8, 6) assertMatch(t, fn, true, forward, "Foo/Bar/Baz", "FBB", 0, 9,
assertMatch(t, FuzzyMatch, false, true, "fooBar Baz", "foob", 0, 4, 8) scoreMatch*3+bonusBoundary*(bonusFirstCharMultiplier+2)+
assertMatch(t, FuzzyMatch, true, true, "fooBarbaz", "oBZ", -1, -1, 0) scoreGapStart*2+scoreGapExtention*4)
assertMatch(t, FuzzyMatch, true, true, "fooBarbaz", "oBz", 2, 9, 2) assertMatch(t, fn, true, forward, "FooBarBaz", "FBB", 0, 7,
assertMatch(t, FuzzyMatch, true, true, "Foo Bar Baz", "fbb", -1, -1, 0) scoreMatch*3+bonusBoundary*bonusFirstCharMultiplier+bonusCamel123*2+
assertMatch(t, FuzzyMatch, true, true, "Foo/Bar/Baz", "FBB", 0, 9, 8) scoreGapStart*2+scoreGapExtention*2)
assertMatch(t, FuzzyMatch, true, true, "FooBarBaz", "FBB", 0, 7, 6) assertMatch(t, fn, true, forward, "FooBar Baz", "FooB", 0, 4,
assertMatch(t, FuzzyMatch, true, true, "foo BarBaz", "fBB", 0, 8, 7) scoreMatch*4+bonusBoundary*bonusFirstCharMultiplier+bonusBoundary*2+
assertMatch(t, FuzzyMatch, true, true, "FooBar Baz", "FooB", 0, 4, 8) util.Max(bonusCamel123, bonusBoundary))
assertMatch(t, FuzzyMatch, true, true, "fooBarbaz", "fooBarbazz", -1, -1, 0)
// Consecutive bonus updated
assertMatch(t, fn, true, forward, "foo-bar", "o-ba", 2, 6,
scoreMatch*4+bonusBoundary*3)
// Non-match
assertMatch(t, fn, true, forward, "fooBarbaz", "oBZ", -1, -1, 0)
assertMatch(t, fn, true, forward, "Foo Bar Baz", "fbb", -1, -1, 0)
assertMatch(t, fn, true, forward, "fooBarbaz", "fooBarbazz", -1, -1, 0)
}
}
} }
func TestFuzzyMatchBackward(t *testing.T) { func TestFuzzyMatchBackward(t *testing.T) {
assertMatch(t, FuzzyMatch, false, true, "foobar fb", "fb", 0, 4, 4) assertMatch(t, FuzzyMatchV1, false, true, "foobar fb", "fb", 0, 4,
assertMatch(t, FuzzyMatch, false, false, "foobar fb", "fb", 7, 9, 5) scoreMatch*2+bonusBoundary*bonusFirstCharMultiplier+
scoreGapStart+scoreGapExtention)
assertMatch(t, FuzzyMatchV1, false, false, "foobar fb", "fb", 7, 9,
scoreMatch*2+bonusBoundary*bonusFirstCharMultiplier+bonusBoundary)
} }
func TestExactMatchNaive(t *testing.T) { func TestExactMatchNaive(t *testing.T) {
for _, dir := range []bool{true, false} { for _, dir := range []bool{true, false} {
assertMatch(t, ExactMatchNaive, false, dir, "fooBarbaz", "oBA", 2, 5, 3)
assertMatch(t, ExactMatchNaive, true, dir, "fooBarbaz", "oBA", -1, -1, 0) assertMatch(t, ExactMatchNaive, true, dir, "fooBarbaz", "oBA", -1, -1, 0)
assertMatch(t, ExactMatchNaive, true, dir, "fooBarbaz", "fooBarbazz", -1, -1, 0) assertMatch(t, ExactMatchNaive, true, dir, "fooBarbaz", "fooBarbazz", -1, -1, 0)
assertMatch(t, ExactMatchNaive, false, dir, "/AutomatorDocument.icns", "rdoc", 9, 13, 4) assertMatch(t, ExactMatchNaive, false, dir, "fooBarbaz", "oBA", 2, 5,
assertMatch(t, ExactMatchNaive, false, dir, "/man1/zshcompctl.1", "zshc", 6, 10, 7) scoreMatch*3+bonusCamel123+bonusConsecutive)
assertMatch(t, ExactMatchNaive, false, dir, "/.oh-my-zsh/cache", "zsh/c", 8, 13, 10) assertMatch(t, ExactMatchNaive, false, dir, "/AutomatorDocument.icns", "rdoc", 9, 13,
scoreMatch*4+bonusCamel123+bonusConsecutive*2)
assertMatch(t, ExactMatchNaive, false, dir, "/man1/zshcompctl.1", "zshc", 6, 10,
scoreMatch*4+bonusBoundary*(bonusFirstCharMultiplier+3))
assertMatch(t, ExactMatchNaive, false, dir, "/.oh-my-zsh/cache", "zsh/c", 8, 13,
scoreMatch*5+bonusBoundary*(bonusFirstCharMultiplier+4))
} }
} }
func TestExactMatchNaiveBackward(t *testing.T) { func TestExactMatchNaiveBackward(t *testing.T) {
assertMatch(t, ExactMatchNaive, false, true, "foobar foob", "oo", 1, 3, 1) assertMatch(t, ExactMatchNaive, false, true, "foobar foob", "oo", 1, 3,
assertMatch(t, ExactMatchNaive, false, false, "foobar foob", "oo", 8, 10, 1) scoreMatch*2+bonusConsecutive)
assertMatch(t, ExactMatchNaive, false, false, "foobar foob", "oo", 8, 10,
scoreMatch*2+bonusConsecutive)
} }
func TestPrefixMatch(t *testing.T) { func TestPrefixMatch(t *testing.T) {
score := (scoreMatch+bonusBoundary)*3 + bonusBoundary*(bonusFirstCharMultiplier-1)
for _, dir := range []bool{true, false} { for _, dir := range []bool{true, false} {
assertMatch(t, PrefixMatch, true, dir, "fooBarbaz", "Foo", -1, -1, 0) assertMatch(t, PrefixMatch, true, dir, "fooBarbaz", "Foo", -1, -1, 0)
assertMatch(t, PrefixMatch, false, dir, "fooBarBaz", "baz", -1, -1, 0) assertMatch(t, PrefixMatch, false, dir, "fooBarBaz", "baz", -1, -1, 0)
assertMatch(t, PrefixMatch, false, dir, "fooBarbaz", "Foo", 0, 3, 6) assertMatch(t, PrefixMatch, false, dir, "fooBarbaz", "Foo", 0, 3, score)
assertMatch(t, PrefixMatch, false, dir, "foOBarBaZ", "foo", 0, 3, 7) assertMatch(t, PrefixMatch, false, dir, "foOBarBaZ", "foo", 0, 3, score)
assertMatch(t, PrefixMatch, false, dir, "f-oBarbaz", "f-o", 0, 3, 8) assertMatch(t, PrefixMatch, false, dir, "f-oBarbaz", "f-o", 0, 3, score)
} }
} }
func TestSuffixMatch(t *testing.T) { func TestSuffixMatch(t *testing.T) {
for _, dir := range []bool{true, false} { for _, dir := range []bool{true, false} {
assertMatch(t, SuffixMatch, false, dir, "fooBarbaz", "Foo", -1, -1, 0)
assertMatch(t, SuffixMatch, false, dir, "fooBarbaz", "baz", 6, 9, 2)
assertMatch(t, SuffixMatch, false, dir, "fooBarBaZ", "baz", 6, 9, 5)
assertMatch(t, SuffixMatch, true, dir, "fooBarbaz", "Baz", -1, -1, 0) assertMatch(t, SuffixMatch, true, dir, "fooBarbaz", "Baz", -1, -1, 0)
assertMatch(t, SuffixMatch, false, dir, "fooBarbaz", "Foo", -1, -1, 0)
assertMatch(t, SuffixMatch, false, dir, "fooBarbaz", "baz", 6, 9,
scoreMatch*3+bonusConsecutive*2)
assertMatch(t, SuffixMatch, false, dir, "fooBarBaZ", "baz", 6, 9,
(scoreMatch+bonusCamel123)*3+bonusCamel123*(bonusFirstCharMultiplier-1))
} }
} }
func TestEmptyPattern(t *testing.T) { func TestEmptyPattern(t *testing.T) {
for _, dir := range []bool{true, false} { for _, dir := range []bool{true, false} {
assertMatch(t, FuzzyMatch, true, dir, "foobar", "", 0, 0, 0) assertMatch(t, FuzzyMatchV1, true, dir, "foobar", "", 0, 0, 0)
assertMatch(t, FuzzyMatchV2, true, dir, "foobar", "", 0, 0, 0)
assertMatch(t, ExactMatchNaive, true, dir, "foobar", "", 0, 0, 0) assertMatch(t, ExactMatchNaive, true, dir, "foobar", "", 0, 0, 0)
assertMatch(t, PrefixMatch, true, dir, "foobar", "", 0, 0, 0) assertMatch(t, PrefixMatch, true, dir, "foobar", "", 0, 0, 0)
assertMatch(t, SuffixMatch, true, dir, "foobar", "", 6, 6, 0) assertMatch(t, SuffixMatch, true, dir, "foobar", "", 6, 6, 0)

View File

@ -9,7 +9,7 @@ import (
func TestChunkList(t *testing.T) { func TestChunkList(t *testing.T) {
// FIXME global // FIXME global
sortCriteria = []criterion{byMatchLen, byLength} sortCriteria = []criterion{byScore, byLength}
cl := NewChunkList(func(s []byte, i int) *Item { cl := NewChunkList(func(s []byte, i int) *Item {
return &Item{text: util.ToChars(s), index: int32(i * 2)} return &Item{text: util.ToChars(s), index: int32(i * 2)}

View File

@ -8,7 +8,7 @@ import (
const ( const (
// Current version // Current version
version = "0.13.5" version = "0.15.0"
// Core // Core
coordinatorDelayMax time.Duration = 100 * time.Millisecond coordinatorDelayMax time.Duration = 100 * time.Millisecond
@ -24,11 +24,17 @@ const (
spinnerDuration = 200 * time.Millisecond spinnerDuration = 200 * time.Millisecond
// Matcher // Matcher
numPartitionsMultiplier = 8
maxPartitions = 32
progressMinDuration = 200 * time.Millisecond progressMinDuration = 200 * time.Millisecond
// Capacity of each chunk // Capacity of each chunk
chunkSize int = 100 chunkSize int = 100
// Pre-allocated memory slices to minimize GC
slab16Size int = 100 * 1024 // 200KB * 32 = 12.8MB
slab32Size int = 2048 // 8KB * 32 = 256KB
// Do not cache results of low selectivity queries // Do not cache results of low selectivity queries
queryCacheMax int = chunkSize / 5 queryCacheMax int = chunkSize / 5

View File

@ -143,8 +143,8 @@ func Run(opts *Options) {
} }
patternBuilder := func(runes []rune) *Pattern { patternBuilder := func(runes []rune) *Pattern {
return BuildPattern( return BuildPattern(
opts.Fuzzy, opts.Extended, opts.Case, forward, opts.Filter == nil, opts.Fuzzy, opts.FuzzyAlgo, opts.Extended, opts.Case, forward,
opts.Nth, opts.Delimiter, runes) opts.Filter == nil, opts.Nth, opts.Delimiter, runes)
} }
matcher := NewMatcher(patternBuilder, sort, opts.Tac, eventBox) matcher := NewMatcher(patternBuilder, sort, opts.Tac, eventBox)
@ -158,11 +158,12 @@ func Run(opts *Options) {
found := false found := false
if streamingFilter { if streamingFilter {
slab := util.MakeSlab(slab16Size, slab32Size)
reader := Reader{ reader := Reader{
func(runes []byte) bool { func(runes []byte) bool {
item := chunkList.trans(runes, 0) item := chunkList.trans(runes, 0)
if item != nil { if item != nil {
if result, _ := pattern.MatchItem(item); result != nil { if result, _, _ := pattern.MatchItem(item, false, slab); result != nil {
fmt.Println(item.text.ToString()) fmt.Println(item.text.ToString())
found = true found = true
} }

View File

@ -26,6 +26,7 @@ type Matcher struct {
eventBox *util.EventBox eventBox *util.EventBox
reqBox *util.EventBox reqBox *util.EventBox
partitions int partitions int
slab []*util.Slab
mergerCache map[string]*Merger mergerCache map[string]*Merger
} }
@ -37,13 +38,15 @@ const (
// NewMatcher returns a new Matcher // NewMatcher returns a new Matcher
func NewMatcher(patternBuilder func([]rune) *Pattern, func NewMatcher(patternBuilder func([]rune) *Pattern,
sort bool, tac bool, eventBox *util.EventBox) *Matcher { sort bool, tac bool, eventBox *util.EventBox) *Matcher {
partitions := util.Min(numPartitionsMultiplier*runtime.NumCPU(), maxPartitions)
return &Matcher{ return &Matcher{
patternBuilder: patternBuilder, patternBuilder: patternBuilder,
sort: sort, sort: sort,
tac: tac, tac: tac,
eventBox: eventBox, eventBox: eventBox,
reqBox: util.NewEventBox(), reqBox: util.NewEventBox(),
partitions: util.Min(8*runtime.NumCPU(), 32), partitions: partitions,
slab: make([]*util.Slab, partitions),
mergerCache: make(map[string]*Merger)} mergerCache: make(map[string]*Merger)}
} }
@ -153,12 +156,15 @@ func (m *Matcher) scan(request MatchRequest) (*Merger, bool) {
for idx, chunks := range slices { for idx, chunks := range slices {
waitGroup.Add(1) waitGroup.Add(1)
go func(idx int, chunks []*Chunk) { if m.slab[idx] == nil {
m.slab[idx] = util.MakeSlab(slab16Size, slab32Size)
}
go func(idx int, slab *util.Slab, chunks []*Chunk) {
defer func() { waitGroup.Done() }() defer func() { waitGroup.Done() }()
count := 0 count := 0
allMatches := make([][]*Result, len(chunks)) allMatches := make([][]*Result, len(chunks))
for idx, chunk := range chunks { for idx, chunk := range chunks {
matches := request.pattern.Match(chunk) matches := request.pattern.Match(chunk, slab)
allMatches[idx] = matches allMatches[idx] = matches
count += len(matches) count += len(matches)
if cancelled.Get() { if cancelled.Get() {
@ -178,7 +184,7 @@ func (m *Matcher) scan(request MatchRequest) (*Merger, bool) {
} }
} }
resultChan <- partialResult{idx, sliceMatches} resultChan <- partialResult{idx, sliceMatches}
}(idx, chunks) }(idx, m.slab[idx], chunks)
} }
wait := func() bool { wait := func() bool {

View File

@ -8,6 +8,7 @@ import (
"strings" "strings"
"unicode/utf8" "unicode/utf8"
"github.com/junegunn/fzf/src/algo"
"github.com/junegunn/fzf/src/curses" "github.com/junegunn/fzf/src/curses"
"github.com/junegunn/go-shellwords" "github.com/junegunn/go-shellwords"
@ -19,6 +20,7 @@ const usage = `usage: fzf [options]
-x, --extended Extended-search mode -x, --extended Extended-search mode
(enabled by default; +x or --no-extended to disable) (enabled by default; +x or --no-extended to disable)
-e, --exact Enable Exact-match -e, --exact Enable Exact-match
--algo=TYPE Fuzzy matching algorithm: [v1|v2] (default: v2)
-i Case-insensitive match (default: smart-case match) -i Case-insensitive match (default: smart-case match)
+i Case-sensitive match +i Case-sensitive match
-n, --nth=N[,..] Comma-separated list of field index expressions -n, --nth=N[,..] Comma-separated list of field index expressions
@ -94,8 +96,7 @@ const (
type criterion int type criterion int
const ( const (
byMatchLen criterion = iota byScore criterion = iota
byBonus
byLength byLength
byBegin byBegin
byEnd byEnd
@ -129,6 +130,7 @@ type previewOpts struct {
// Options stores the values of command-line options // Options stores the values of command-line options
type Options struct { type Options struct {
Fuzzy bool Fuzzy bool
FuzzyAlgo algo.Algo
Extended bool Extended bool
Case Case Case Case
Nth []Range Nth []Range
@ -172,6 +174,7 @@ type Options struct {
func defaultOptions() *Options { func defaultOptions() *Options {
return &Options{ return &Options{
Fuzzy: true, Fuzzy: true,
FuzzyAlgo: algo.FuzzyMatchV2,
Extended: true, Extended: true,
Case: CaseSmart, Case: CaseSmart,
Nth: make([]Range, 0), Nth: make([]Range, 0),
@ -179,7 +182,7 @@ func defaultOptions() *Options {
Delimiter: Delimiter{}, Delimiter: Delimiter{},
Sort: 1000, Sort: 1000,
Tac: false, Tac: false,
Criteria: []criterion{byMatchLen, byBonus, byLength}, Criteria: []criterion{byScore, byLength},
Multi: false, Multi: false,
Ansi: false, Ansi: false,
Mouse: true, Mouse: true,
@ -322,6 +325,18 @@ func isAlphabet(char uint8) bool {
return char >= 'a' && char <= 'z' return char >= 'a' && char <= 'z'
} }
func parseAlgo(str string) algo.Algo {
switch str {
case "v1":
return algo.FuzzyMatchV1
case "v2":
return algo.FuzzyMatchV2
default:
errorExit("invalid algorithm (expected: v1 or v2)")
}
return algo.FuzzyMatchV2
}
func parseKeyChords(str string, message string) map[int]string { func parseKeyChords(str string, message string) map[int]string {
if len(str) == 0 { if len(str) == 0 {
errorExit(message) errorExit(message)
@ -407,7 +422,7 @@ func parseKeyChords(str string, message string) map[int]string {
} }
func parseTiebreak(str string) []criterion { func parseTiebreak(str string) []criterion {
criteria := []criterion{byMatchLen, byBonus} criteria := []criterion{byScore}
hasIndex := false hasIndex := false
hasLength := false hasLength := false
hasBegin := false hasBegin := false
@ -834,6 +849,8 @@ func parseOptions(opts *Options, allArgs []string) {
case "-f", "--filter": case "-f", "--filter":
filter := nextString(allArgs, &i, "query string required") filter := nextString(allArgs, &i, "query string required")
opts.Filter = &filter opts.Filter = &filter
case "--algo":
opts.FuzzyAlgo = parseAlgo(nextString(allArgs, &i, "algorithm required (v1|v2)"))
case "--expect": case "--expect":
opts.Expect = parseKeyChords(nextString(allArgs, &i, "key names required"), "key names required") opts.Expect = parseKeyChords(nextString(allArgs, &i, "key names required"), "key names required")
case "--tiebreak": case "--tiebreak":
@ -962,7 +979,9 @@ func parseOptions(opts *Options, allArgs []string) {
case "--version": case "--version":
opts.Version = true opts.Version = true
default: default:
if match, value := optString(arg, "-q", "--query="); match { if match, value := optString(arg, "--algo="); match {
opts.FuzzyAlgo = parseAlgo(value)
} else if match, value := optString(arg, "-q", "--query="); match {
opts.Query = value opts.Query = value
} else if match, value := optString(arg, "-f", "--filter="); match { } else if match, value := optString(arg, "-f", "--filter="); match {
opts.Filter = &value opts.Filter = &value

View File

@ -40,6 +40,7 @@ type termSet []term
// Pattern represents search pattern // Pattern represents search pattern
type Pattern struct { type Pattern struct {
fuzzy bool fuzzy bool
fuzzyAlgo algo.Algo
extended bool extended bool
caseSensitive bool caseSensitive bool
forward bool forward bool
@ -48,7 +49,7 @@ type Pattern struct {
cacheable bool cacheable bool
delimiter Delimiter delimiter Delimiter
nth []Range nth []Range
procFun map[termType]func(bool, bool, util.Chars, []rune) algo.Result procFun map[termType]algo.Algo
} }
var ( var (
@ -74,7 +75,7 @@ func clearChunkCache() {
} }
// BuildPattern builds Pattern object from the given arguments // BuildPattern builds Pattern object from the given arguments
func BuildPattern(fuzzy bool, extended bool, caseMode Case, forward bool, func BuildPattern(fuzzy bool, fuzzyAlgo algo.Algo, extended bool, caseMode Case, forward bool,
cacheable bool, nth []Range, delimiter Delimiter, runes []rune) *Pattern { cacheable bool, nth []Range, delimiter Delimiter, runes []rune) *Pattern {
var asString string var asString string
@ -116,6 +117,7 @@ func BuildPattern(fuzzy bool, extended bool, caseMode Case, forward bool,
ptr := &Pattern{ ptr := &Pattern{
fuzzy: fuzzy, fuzzy: fuzzy,
fuzzyAlgo: fuzzyAlgo,
extended: extended, extended: extended,
caseSensitive: caseSensitive, caseSensitive: caseSensitive,
forward: forward, forward: forward,
@ -124,9 +126,9 @@ func BuildPattern(fuzzy bool, extended bool, caseMode Case, forward bool,
cacheable: cacheable, cacheable: cacheable,
nth: nth, nth: nth,
delimiter: delimiter, delimiter: delimiter,
procFun: make(map[termType]func(bool, bool, util.Chars, []rune) algo.Result)} procFun: make(map[termType]algo.Algo)}
ptr.procFun[termFuzzy] = algo.FuzzyMatch ptr.procFun[termFuzzy] = fuzzyAlgo
ptr.procFun[termEqual] = algo.EqualMatch ptr.procFun[termEqual] = algo.EqualMatch
ptr.procFun[termExact] = algo.ExactMatchNaive ptr.procFun[termExact] = algo.ExactMatchNaive
ptr.procFun[termPrefix] = algo.PrefixMatch ptr.procFun[termPrefix] = algo.PrefixMatch
@ -234,7 +236,7 @@ func (p *Pattern) CacheKey() string {
} }
// Match returns the list of matches Items in the given Chunk // Match returns the list of matches Items in the given Chunk
func (p *Pattern) Match(chunk *Chunk) []*Result { func (p *Pattern) Match(chunk *Chunk, slab *util.Slab) []*Result {
// ChunkCache: Exact match // ChunkCache: Exact match
cacheKey := p.CacheKey() cacheKey := p.CacheKey()
if p.cacheable { if p.cacheable {
@ -260,7 +262,7 @@ Loop:
} }
} }
matches := p.matchChunk(chunk, space) matches := p.matchChunk(chunk, space, slab)
if p.cacheable { if p.cacheable {
_cache.Add(chunk, cacheKey, matches) _cache.Add(chunk, cacheKey, matches)
@ -268,18 +270,18 @@ Loop:
return matches return matches
} }
func (p *Pattern) matchChunk(chunk *Chunk, space []*Result) []*Result { func (p *Pattern) matchChunk(chunk *Chunk, space []*Result, slab *util.Slab) []*Result {
matches := []*Result{} matches := []*Result{}
if space == nil { if space == nil {
for _, item := range *chunk { for _, item := range *chunk {
if match, _ := p.MatchItem(item); match != nil { if match, _, _ := p.MatchItem(item, false, slab); match != nil {
matches = append(matches, match) matches = append(matches, match)
} }
} }
} else { } else {
for _, result := range space { for _, result := range space {
if match, _ := p.MatchItem(result.item); match != nil { if match, _, _ := p.MatchItem(result.item, false, slab); match != nil {
matches = append(matches, match) matches = append(matches, match)
} }
} }
@ -288,62 +290,75 @@ func (p *Pattern) matchChunk(chunk *Chunk, space []*Result) []*Result {
} }
// MatchItem returns true if the Item is a match // MatchItem returns true if the Item is a match
func (p *Pattern) MatchItem(item *Item) (*Result, []Offset) { func (p *Pattern) MatchItem(item *Item, withPos bool, slab *util.Slab) (*Result, []Offset, *[]int) {
if p.extended { if p.extended {
if offsets, bonus, trimLen := p.extendedMatch(item); len(offsets) == len(p.termSets) { if offsets, bonus, trimLen, pos := p.extendedMatch(item, withPos, slab); len(offsets) == len(p.termSets) {
return buildResult(item, offsets, bonus, trimLen), offsets return buildResult(item, offsets, bonus, trimLen), offsets, pos
} }
return nil, nil return nil, nil, nil
} }
offset, bonus, trimLen := p.basicMatch(item) offset, bonus, trimLen, pos := p.basicMatch(item, withPos, slab)
if sidx := offset[0]; sidx >= 0 { if sidx := offset[0]; sidx >= 0 {
offsets := []Offset{offset} offsets := []Offset{offset}
return buildResult(item, offsets, bonus, trimLen), offsets return buildResult(item, offsets, bonus, trimLen), offsets, pos
} }
return nil, nil return nil, nil, nil
} }
func (p *Pattern) basicMatch(item *Item) (Offset, int, int) { func (p *Pattern) basicMatch(item *Item, withPos bool, slab *util.Slab) (Offset, int, int, *[]int) {
input := p.prepareInput(item) input := p.prepareInput(item)
if p.fuzzy { if p.fuzzy {
return p.iter(algo.FuzzyMatch, input, p.caseSensitive, p.forward, p.text) return p.iter(p.fuzzyAlgo, input, p.caseSensitive, p.forward, p.text, withPos, slab)
} }
return p.iter(algo.ExactMatchNaive, input, p.caseSensitive, p.forward, p.text) return p.iter(algo.ExactMatchNaive, input, p.caseSensitive, p.forward, p.text, withPos, slab)
} }
func (p *Pattern) extendedMatch(item *Item) ([]Offset, int, int) { func (p *Pattern) extendedMatch(item *Item, withPos bool, slab *util.Slab) ([]Offset, int, int, *[]int) {
input := p.prepareInput(item) input := p.prepareInput(item)
offsets := []Offset{} offsets := []Offset{}
var totalBonus int var totalScore int
var totalTrimLen int var totalTrimLen int
var allPos *[]int
if withPos {
allPos = &[]int{}
}
for _, termSet := range p.termSets { for _, termSet := range p.termSets {
var offset Offset var offset Offset
var bonus int var currentScore int
var trimLen int var trimLen int
matched := false matched := false
for _, term := range termSet { for _, term := range termSet {
pfun := p.procFun[term.typ] pfun := p.procFun[term.typ]
off, pen, tLen := p.iter(pfun, input, term.caseSensitive, p.forward, term.text) off, score, tLen, pos := p.iter(pfun, input, term.caseSensitive, p.forward, term.text, withPos, slab)
if sidx := off[0]; sidx >= 0 { if sidx := off[0]; sidx >= 0 {
if term.inv { if term.inv {
continue continue
} }
offset, bonus, trimLen = off, pen, tLen offset, currentScore, trimLen = off, score, tLen
matched = true matched = true
if withPos {
if pos != nil {
*allPos = append(*allPos, *pos...)
} else {
for idx := off[0]; idx < off[1]; idx++ {
*allPos = append(*allPos, int(idx))
}
}
}
break break
} else if term.inv { } else if term.inv {
offset, bonus, trimLen = Offset{0, 0}, 0, 0 offset, currentScore, trimLen = Offset{0, 0}, 0, 0
matched = true matched = true
continue continue
} }
} }
if matched { if matched {
offsets = append(offsets, offset) offsets = append(offsets, offset)
totalBonus += bonus totalScore += currentScore
totalTrimLen += trimLen totalTrimLen += trimLen
} }
} }
return offsets, totalBonus, totalTrimLen return offsets, totalScore, totalTrimLen, allPos
} }
func (p *Pattern) prepareInput(item *Item) []Token { func (p *Pattern) prepareInput(item *Item) []Token {
@ -362,14 +377,18 @@ func (p *Pattern) prepareInput(item *Item) []Token {
return ret return ret
} }
func (p *Pattern) iter(pfun func(bool, bool, util.Chars, []rune) algo.Result, func (p *Pattern) iter(pfun algo.Algo, tokens []Token, caseSensitive bool, forward bool, pattern []rune, withPos bool, slab *util.Slab) (Offset, int, int, *[]int) {
tokens []Token, caseSensitive bool, forward bool, pattern []rune) (Offset, int, int) {
for _, part := range tokens { for _, part := range tokens {
if res := pfun(caseSensitive, forward, *part.text, pattern); res.Start >= 0 { if res, pos := pfun(caseSensitive, forward, *part.text, pattern, withPos, slab); res.Start >= 0 {
sidx := int32(res.Start) + part.prefixLength sidx := int32(res.Start) + part.prefixLength
eidx := int32(res.End) + part.prefixLength eidx := int32(res.End) + part.prefixLength
return Offset{sidx, eidx}, res.Bonus, int(part.trimLength) if pos != nil {
for idx := range *pos {
(*pos)[idx] += int(part.prefixLength)
} }
} }
return Offset{-1, -1}, 0, -1 return Offset{sidx, eidx}, res.Score, int(part.trimLength), pos
}
}
return Offset{-1, -1}, 0, -1, nil
} }

View File

@ -8,6 +8,12 @@ import (
"github.com/junegunn/fzf/src/util" "github.com/junegunn/fzf/src/util"
) )
var slab *util.Slab
func init() {
slab = util.MakeSlab(slab16Size, slab32Size)
}
func TestParseTermsExtended(t *testing.T) { func TestParseTermsExtended(t *testing.T) {
terms := parseTerms(true, CaseSmart, terms := parseTerms(true, CaseSmart,
"| aaa 'bbb ^ccc ddd$ !eee !'fff !^ggg !hhh$ | ^iii$ ^xxx | 'yyy | | zzz$ | !ZZZ |") "| aaa 'bbb ^ccc ddd$ !eee !'fff !^ggg !hhh$ | ^iii$ ^xxx | 'yyy | | zzz$ | !ZZZ |")
@ -69,26 +75,32 @@ func TestParseTermsEmpty(t *testing.T) {
func TestExact(t *testing.T) { func TestExact(t *testing.T) {
defer clearPatternCache() defer clearPatternCache()
clearPatternCache() clearPatternCache()
pattern := BuildPattern(true, true, CaseSmart, true, true, pattern := BuildPattern(true, algo.FuzzyMatchV2, true, CaseSmart, true, true,
[]Range{}, Delimiter{}, []rune("'abc")) []Range{}, Delimiter{}, []rune("'abc"))
res := algo.ExactMatchNaive( res, pos := algo.ExactMatchNaive(
pattern.caseSensitive, pattern.forward, util.RunesToChars([]rune("aabbcc abc")), pattern.termSets[0][0].text) pattern.caseSensitive, pattern.forward, util.RunesToChars([]rune("aabbcc abc")), pattern.termSets[0][0].text, true, nil)
if res.Start != 7 || res.End != 10 { if res.Start != 7 || res.End != 10 {
t.Errorf("%s / %d / %d", pattern.termSets, res.Start, res.End) t.Errorf("%s / %d / %d", pattern.termSets, res.Start, res.End)
} }
if pos != nil {
t.Errorf("pos is expected to be nil")
}
} }
func TestEqual(t *testing.T) { func TestEqual(t *testing.T) {
defer clearPatternCache() defer clearPatternCache()
clearPatternCache() clearPatternCache()
pattern := BuildPattern(true, true, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("^AbC$")) pattern := BuildPattern(true, algo.FuzzyMatchV2, true, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("^AbC$"))
match := func(str string, sidxExpected int, eidxExpected int) { match := func(str string, sidxExpected int, eidxExpected int) {
res := algo.EqualMatch( res, pos := algo.EqualMatch(
pattern.caseSensitive, pattern.forward, util.RunesToChars([]rune(str)), pattern.termSets[0][0].text) pattern.caseSensitive, pattern.forward, util.RunesToChars([]rune(str)), pattern.termSets[0][0].text, true, nil)
if res.Start != sidxExpected || res.End != eidxExpected { if res.Start != sidxExpected || res.End != eidxExpected {
t.Errorf("%s / %d / %d", pattern.termSets, res.Start, res.End) t.Errorf("%s / %d / %d", pattern.termSets, res.Start, res.End)
} }
if pos != nil {
t.Errorf("pos is expected to be nil")
}
} }
match("ABC", -1, -1) match("ABC", -1, -1)
match("AbC", 0, 3) match("AbC", 0, 3)
@ -97,17 +109,17 @@ func TestEqual(t *testing.T) {
func TestCaseSensitivity(t *testing.T) { func TestCaseSensitivity(t *testing.T) {
defer clearPatternCache() defer clearPatternCache()
clearPatternCache() clearPatternCache()
pat1 := BuildPattern(true, false, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("abc")) pat1 := BuildPattern(true, algo.FuzzyMatchV2, false, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("abc"))
clearPatternCache() clearPatternCache()
pat2 := BuildPattern(true, false, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("Abc")) pat2 := BuildPattern(true, algo.FuzzyMatchV2, false, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("Abc"))
clearPatternCache() clearPatternCache()
pat3 := BuildPattern(true, false, CaseIgnore, true, true, []Range{}, Delimiter{}, []rune("abc")) pat3 := BuildPattern(true, algo.FuzzyMatchV2, false, CaseIgnore, true, true, []Range{}, Delimiter{}, []rune("abc"))
clearPatternCache() clearPatternCache()
pat4 := BuildPattern(true, false, CaseIgnore, true, true, []Range{}, Delimiter{}, []rune("Abc")) pat4 := BuildPattern(true, algo.FuzzyMatchV2, false, CaseIgnore, true, true, []Range{}, Delimiter{}, []rune("Abc"))
clearPatternCache() clearPatternCache()
pat5 := BuildPattern(true, false, CaseRespect, true, true, []Range{}, Delimiter{}, []rune("abc")) pat5 := BuildPattern(true, algo.FuzzyMatchV2, false, CaseRespect, true, true, []Range{}, Delimiter{}, []rune("abc"))
clearPatternCache() clearPatternCache()
pat6 := BuildPattern(true, false, CaseRespect, true, true, []Range{}, Delimiter{}, []rune("Abc")) pat6 := BuildPattern(true, algo.FuzzyMatchV2, false, CaseRespect, true, true, []Range{}, Delimiter{}, []rune("Abc"))
if string(pat1.text) != "abc" || pat1.caseSensitive != false || if string(pat1.text) != "abc" || pat1.caseSensitive != false ||
string(pat2.text) != "Abc" || pat2.caseSensitive != true || string(pat2.text) != "Abc" || pat2.caseSensitive != true ||
@ -120,7 +132,7 @@ func TestCaseSensitivity(t *testing.T) {
} }
func TestOrigTextAndTransformed(t *testing.T) { func TestOrigTextAndTransformed(t *testing.T) {
pattern := BuildPattern(true, true, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("jg")) pattern := BuildPattern(true, algo.FuzzyMatchV2, true, CaseSmart, true, true, []Range{}, Delimiter{}, []rune("jg"))
tokens := Tokenize(util.RunesToChars([]rune("junegunn")), Delimiter{}) tokens := Tokenize(util.RunesToChars([]rune("junegunn")), Delimiter{})
trans := Transform(tokens, []Range{Range{1, 1}}) trans := Transform(tokens, []Range{Range{1, 1}})
@ -133,24 +145,29 @@ func TestOrigTextAndTransformed(t *testing.T) {
transformed: trans}, transformed: trans},
} }
pattern.extended = extended pattern.extended = extended
matches := pattern.matchChunk(&chunk, nil) // No cache matches := pattern.matchChunk(&chunk, nil, slab) // No cache
if matches[0].item.text.ToString() != "junegunn" || string(*matches[0].item.origText) != "junegunn.choi" || if !(matches[0].item.text.ToString() == "junegunn" &&
!reflect.DeepEqual(matches[0].item.transformed, trans) { string(*matches[0].item.origText) == "junegunn.choi" &&
reflect.DeepEqual(matches[0].item.transformed, trans)) {
t.Error("Invalid match result", matches) t.Error("Invalid match result", matches)
} }
match, offsets := pattern.MatchItem(chunk[0]) match, offsets, pos := pattern.MatchItem(chunk[0], true, slab)
if match.item.text.ToString() != "junegunn" || string(*match.item.origText) != "junegunn.choi" || if !(match.item.text.ToString() == "junegunn" &&
offsets[0][0] != 0 || offsets[0][1] != 5 || string(*match.item.origText) == "junegunn.choi" &&
!reflect.DeepEqual(match.item.transformed, trans) { offsets[0][0] == 0 && offsets[0][1] == 5 &&
t.Error("Invalid match result", match) reflect.DeepEqual(match.item.transformed, trans)) {
t.Error("Invalid match result", match, offsets, extended)
}
if !((*pos)[0] == 4 && (*pos)[1] == 0) {
t.Error("Invalid pos array", *pos)
} }
} }
} }
func TestCacheKey(t *testing.T) { func TestCacheKey(t *testing.T) {
test := func(extended bool, patStr string, expected string, cacheable bool) { test := func(extended bool, patStr string, expected string, cacheable bool) {
pat := BuildPattern(true, extended, CaseSmart, true, true, []Range{}, Delimiter{}, []rune(patStr)) pat := BuildPattern(true, algo.FuzzyMatchV2, extended, CaseSmart, true, true, []Range{}, Delimiter{}, []rune(patStr))
if pat.CacheKey() != expected { if pat.CacheKey() != expected {
t.Errorf("Expected: %s, actual: %s", expected, pat.CacheKey()) t.Errorf("Expected: %s, actual: %s", expected, pat.CacheKey())
} }

View File

@ -19,8 +19,7 @@ type colorOffset struct {
} }
type rank struct { type rank struct {
// byMatchLen, byBonus, ... points [4]uint16
points [5]uint16
index int32 index int32
} }
@ -29,51 +28,36 @@ type Result struct {
rank rank rank rank
} }
func buildResult(item *Item, offsets []Offset, bonus int, trimLen int) *Result { func buildResult(item *Item, offsets []Offset, score int, trimLen int) *Result {
if len(offsets) > 1 { if len(offsets) > 1 {
sort.Sort(ByOrder(offsets)) sort.Sort(ByOrder(offsets))
} }
result := Result{item: item, rank: rank{index: item.index}} result := Result{item: item, rank: rank{index: item.index}}
matchlen := 0
prevEnd := 0
minBegin := math.MaxInt32
numChars := item.text.Length() numChars := item.text.Length()
minBegin := math.MaxUint16
maxEnd := 0
validOffsetFound := false
for _, offset := range offsets { for _, offset := range offsets {
begin := int(offset[0]) b, e := int(offset[0]), int(offset[1])
end := int(offset[1]) if b < e {
if prevEnd > begin { minBegin = util.Min(b, minBegin)
begin = prevEnd maxEnd = util.Max(e, maxEnd)
} validOffsetFound = true
if end > prevEnd {
prevEnd = end
}
if end > begin {
if begin < minBegin {
minBegin = begin
}
matchlen += end - begin
} }
} }
for idx, criterion := range sortCriteria { for idx, criterion := range sortCriteria {
var val uint16 val := uint16(math.MaxUint16)
switch criterion { switch criterion {
case byMatchLen: case byScore:
if matchlen == 0 {
val = math.MaxUint16
} else {
val = util.AsUint16(matchlen)
}
case byBonus:
// Higher is better // Higher is better
val = math.MaxUint16 - util.AsUint16(bonus) val = math.MaxUint16 - util.AsUint16(score)
case byLength: case byLength:
// If offsets is empty, trimLen will be 0, but we don't care // If offsets is empty, trimLen will be 0, but we don't care
val = util.AsUint16(trimLen) val = util.AsUint16(trimLen)
case byBegin: case byBegin:
// We can't just look at item.offsets[0][0] because it can be an inverse term if validOffsetFound {
whitePrefixLen := 0 whitePrefixLen := 0
for idx := 0; idx < numChars; idx++ { for idx := 0; idx < numChars; idx++ {
r := item.text.Get(idx) r := item.text.Get(idx)
@ -83,12 +67,10 @@ func buildResult(item *Item, offsets []Offset, bonus int, trimLen int) *Result {
} }
} }
val = util.AsUint16(minBegin - whitePrefixLen) val = util.AsUint16(minBegin - whitePrefixLen)
}
case byEnd: case byEnd:
if prevEnd > 0 { if validOffsetFound {
val = util.AsUint16(1 + numChars - prevEnd) val = util.AsUint16(1 + numChars - maxEnd)
} else {
// Empty offsets due to inverse terms.
val = 1
} }
} }
result.rank.points[idx] = val result.rank.points[idx] = val
@ -106,7 +88,7 @@ func (result *Result) Index() int32 {
} }
func minRank() rank { func minRank() rank {
return rank{index: 0, points: [5]uint16{0, math.MaxUint16, 0, 0, 0}} return rank{index: 0, points: [4]uint16{math.MaxUint16, 0, 0, 0}}
} }
func (result *Result) colorOffsets(matchOffsets []Offset, color int, bold bool, current bool) []colorOffset { func (result *Result) colorOffsets(matchOffsets []Offset, color int, bold bool, current bool) []colorOffset {
@ -245,7 +227,7 @@ func (a ByRelevanceTac) Less(i, j int) bool {
} }
func compareRanks(irank rank, jrank rank, tac bool) bool { func compareRanks(irank rank, jrank rank, tac bool) bool {
for idx := 0; idx < 5; idx++ { for idx := 0; idx < 4; idx++ {
left := irank.points[idx] left := irank.points[idx]
right := jrank.points[idx] right := jrank.points[idx]
if left < right { if left < right {

View File

@ -26,7 +26,7 @@ func TestOffsetSort(t *testing.T) {
func TestRankComparison(t *testing.T) { func TestRankComparison(t *testing.T) {
rank := func(vals ...uint16) rank { rank := func(vals ...uint16) rank {
return rank{ return rank{
points: [5]uint16{vals[0], 0, vals[1], vals[2], vals[3]}, points: [4]uint16{vals[0], vals[1], vals[2], vals[3]},
index: int32(vals[4])} index: int32(vals[4])}
} }
if compareRanks(rank(3, 0, 0, 0, 5), rank(2, 0, 0, 0, 7), false) || if compareRanks(rank(3, 0, 0, 0, 5), rank(2, 0, 0, 0, 7), false) ||
@ -47,11 +47,15 @@ func TestRankComparison(t *testing.T) {
// Match length, string length, index // Match length, string length, index
func TestResultRank(t *testing.T) { func TestResultRank(t *testing.T) {
// FIXME global // FIXME global
sortCriteria = []criterion{byMatchLen, byBonus, byLength} sortCriteria = []criterion{byScore, byLength}
strs := [][]rune{[]rune("foo"), []rune("foobar"), []rune("bar"), []rune("baz")} strs := [][]rune{[]rune("foo"), []rune("foobar"), []rune("bar"), []rune("baz")}
item1 := buildResult(&Item{text: util.RunesToChars(strs[0]), index: 1}, []Offset{}, 2, 3) item1 := buildResult(&Item{text: util.RunesToChars(strs[0]), index: 1}, []Offset{}, 2, 3)
if item1.rank.points[0] != math.MaxUint16 || item1.rank.points[1] != math.MaxUint16-2 || item1.rank.points[2] != 3 || item1.item.index != 1 { if item1.rank.points[0] != math.MaxUint16-2 || // Bonus
item1.rank.points[1] != 3 || // Length
item1.rank.points[2] != 0 || // Unused
item1.rank.points[3] != 0 || // Unused
item1.item.index != 1 {
t.Error(item1.rank) t.Error(item1.rank)
} }
// Only differ in index // Only differ in index
@ -71,16 +75,16 @@ func TestResultRank(t *testing.T) {
} }
// Sort by relevance // Sort by relevance
item3 := buildResult(&Item{index: 2}, []Offset{Offset{1, 3}, Offset{5, 7}}, 0, 0) item3 := buildResult(&Item{index: 2}, []Offset{Offset{1, 3}, Offset{5, 7}}, 3, 0)
item4 := buildResult(&Item{index: 2}, []Offset{Offset{1, 2}, Offset{6, 7}}, 0, 0) item4 := buildResult(&Item{index: 2}, []Offset{Offset{1, 2}, Offset{6, 7}}, 4, 0)
item5 := buildResult(&Item{index: 2}, []Offset{Offset{1, 3}, Offset{5, 7}}, 0, 0) item5 := buildResult(&Item{index: 2}, []Offset{Offset{1, 3}, Offset{5, 7}}, 5, 0)
item6 := buildResult(&Item{index: 2}, []Offset{Offset{1, 2}, Offset{6, 7}}, 0, 0) item6 := buildResult(&Item{index: 2}, []Offset{Offset{1, 2}, Offset{6, 7}}, 6, 0)
items = []*Result{item1, item2, item3, item4, item5, item6} items = []*Result{item1, item2, item3, item4, item5, item6}
sort.Sort(ByRelevance(items)) sort.Sort(ByRelevance(items))
if items[0] != item6 || items[1] != item4 || if !(items[0] == item6 && items[1] == item5 &&
items[2] != item5 || items[3] != item3 || items[2] == item4 && items[3] == item3 &&
items[4] != item2 || items[5] != item1 { items[4] == item2 && items[5] == item1) {
t.Error(items) t.Error(items, item1, item2, item3, item4, item5, item6)
} }
} }

View File

@ -18,6 +18,8 @@ import (
"github.com/junegunn/go-runewidth" "github.com/junegunn/go-runewidth"
) )
// import "github.com/pkg/profile"
type jumpMode int type jumpMode int
const ( const (
@ -73,6 +75,7 @@ type Terminal struct {
initFunc func() initFunc func()
suppress bool suppress bool
startChan chan bool startChan chan bool
slab *util.Slab
} }
type selectedItem struct { type selectedItem struct {
@ -276,6 +279,7 @@ func NewTerminal(opts *Options, eventBox *util.EventBox) *Terminal {
eventBox: eventBox, eventBox: eventBox,
mutex: sync.Mutex{}, mutex: sync.Mutex{},
suppress: true, suppress: true,
slab: util.MakeSlab(slab16Size, slab32Size),
startChan: make(chan bool, 1), startChan: make(chan bool, 1),
initFunc: func() { initFunc: func() {
C.Init(opts.Theme, opts.Black, opts.Mouse) C.Init(opts.Theme, opts.Black, opts.Mouse)
@ -674,14 +678,25 @@ func (t *Terminal) printHighlighted(result *Result, bold bool, col1 int, col2 in
text := make([]rune, item.text.Length()) text := make([]rune, item.text.Length())
copy(text, item.text.ToRunes()) copy(text, item.text.ToRunes())
matchOffsets := []Offset{} matchOffsets := []Offset{}
var pos *[]int
if t.merger.pattern != nil { if t.merger.pattern != nil {
_, matchOffsets = t.merger.pattern.MatchItem(item) _, matchOffsets, pos = t.merger.pattern.MatchItem(item, true, t.slab)
}
charOffsets := matchOffsets
if pos != nil {
charOffsets = make([]Offset, len(*pos))
for idx, p := range *pos {
offset := Offset{int32(p), int32(p + 1)}
charOffsets[idx] = offset
}
sort.Sort(ByOrder(charOffsets))
} }
var maxe int var maxe int
for _, offset := range matchOffsets { for _, offset := range charOffsets {
maxe = util.Max(maxe, int(offset[1])) maxe = util.Max(maxe, int(offset[1]))
} }
offsets := result.colorOffsets(matchOffsets, col2, bold, current)
offsets := result.colorOffsets(charOffsets, col2, bold, current)
maxWidth := t.window.Width - 3 maxWidth := t.window.Width - 3
maxe = util.Constrain(maxe+util.Min(maxWidth/2-2, t.hscrollOff), 0, len(text)) maxe = util.Constrain(maxe+util.Min(maxWidth/2-2, t.hscrollOff), 0, len(text))
if overflow(text, maxWidth) { if overflow(text, maxWidth) {
@ -876,6 +891,7 @@ func (t *Terminal) current() string {
// Loop is called to start Terminal I/O // Loop is called to start Terminal I/O
func (t *Terminal) Loop() { func (t *Terminal) Loop() {
// prof := profile.Start(profile.ProfilePath("/tmp/"))
<-t.startChan <-t.startChan
{ // Late initialization { // Late initialization
intChan := make(chan os.Signal, 1) intChan := make(chan os.Signal, 1)
@ -953,6 +969,7 @@ func (t *Terminal) Loop() {
if code <= exitNoMatch && t.history != nil { if code <= exitNoMatch && t.history != nil {
t.history.append(string(t.input)) t.history.append(string(t.input))
} }
// prof.Stop()
os.Exit(code) os.Exit(code)
} }

12
src/util/slab.go Normal file
View File

@ -0,0 +1,12 @@
package util
type Slab struct {
I16 []int16
I32 []int32
}
func MakeSlab(size16 int, size32 int) *Slab {
return &Slab{
I16: make([]int16, size16),
I32: make([]int32, size32)}
}

View File

@ -18,6 +18,22 @@ func Max(first int, second int) int {
return second return second
} }
// Max16 returns the largest integer
func Max16(first int16, second int16) int16 {
if first >= second {
return first
}
return second
}
// Max32 returns the largest 32-bit integer
func Max32(first int32, second int32) int32 {
if first > second {
return first
}
return second
}
// Min returns the smallest integer // Min returns the smallest integer
func Min(first int, second int) int { func Min(first int, second int) int {
if first <= second { if first <= second {
@ -34,14 +50,6 @@ func Min32(first int32, second int32) int32 {
return second return second
} }
// Max32 returns the largest 32-bit integer
func Max32(first int32, second int32) int32 {
if first > second {
return first
}
return second
}
// Constrain32 limits the given 32-bit integer with the upper and lower bounds // Constrain32 limits the given 32-bit integer with the upper and lower bounds
func Constrain32(val int32, min int32, max int32) int32 { func Constrain32(val int32, min int32, max int32) int32 {
if val < min { if val < min {

View File

@ -517,162 +517,91 @@ class TestGoFZF < TestBase
assert_equal input, `#{FZF} -f"!z" -x --tiebreak end < #{tempname}`.split($/) assert_equal input, `#{FZF} -f"!z" -x --tiebreak end < #{tempname}`.split($/)
end end
# Since 0.11.2 def test_tiebreak_index_begin
def test_tiebreak_list writelines tempname, [
input = %w[ 'xoxxxxxoxx',
f-o-o-b-a-r 'xoxxxxxox',
foobar---- 'xxoxxxoxx',
--foobar 'xxxoxoxxx',
----foobar 'xxxxoxox',
foobar-- ' xxoxoxxx',
--foobar--
foobar
] ]
writelines tempname, input
assert_equal %w[ assert_equal [
foobar---- 'xxxxoxox',
--foobar ' xxoxoxxx',
----foobar 'xxxoxoxxx',
foobar-- 'xxoxxxoxx',
--foobar-- 'xoxxxxxox',
foobar 'xoxxxxxoxx',
f-o-o-b-a-r ], `#{FZF} -foo < #{tempname}`.split($/)
], `#{FZF} -ffb --tiebreak=index < #{tempname}`.split($/)
by_length = %w[ assert_equal [
foobar 'xxxoxoxxx',
--foobar 'xxxxoxox',
foobar-- ' xxoxoxxx',
foobar---- 'xxoxxxoxx',
----foobar 'xoxxxxxoxx',
--foobar-- 'xoxxxxxox',
f-o-o-b-a-r ], `#{FZF} -foo --tiebreak=index < #{tempname}`.split($/)
]
assert_equal by_length, `#{FZF} -ffb < #{tempname}`.split($/)
assert_equal by_length, `#{FZF} -ffb --tiebreak=length < #{tempname}`.split($/)
assert_equal %w[ # Note that --tiebreak=begin is now based on the first occurrence of the
foobar # first character on the pattern
foobar-- assert_equal [
--foobar ' xxoxoxxx',
foobar---- 'xxxoxoxxx',
--foobar-- 'xxxxoxox',
----foobar 'xxoxxxoxx',
f-o-o-b-a-r 'xoxxxxxoxx',
], `#{FZF} -ffb --tiebreak=length,begin < #{tempname}`.split($/) 'xoxxxxxox',
], `#{FZF} -foo --tiebreak=begin < #{tempname}`.split($/)
assert_equal %w[ assert_equal [
foobar ' xxoxoxxx',
--foobar 'xxxoxoxxx',
foobar-- 'xxxxoxox',
----foobar 'xxoxxxoxx',
--foobar-- 'xoxxxxxox',
foobar---- 'xoxxxxxoxx',
f-o-o-b-a-r ], `#{FZF} -foo --tiebreak=begin,length < #{tempname}`.split($/)
], `#{FZF} -ffb --tiebreak=length,end < #{tempname}`.split($/)
assert_equal %w[
foobar----
foobar--
foobar
--foobar
--foobar--
----foobar
f-o-o-b-a-r
], `#{FZF} -ffb --tiebreak=begin < #{tempname}`.split($/)
by_begin_end = %w[
foobar
foobar--
foobar----
--foobar
--foobar--
----foobar
f-o-o-b-a-r
]
assert_equal by_begin_end, `#{FZF} -ffb --tiebreak=begin,length < #{tempname}`.split($/)
assert_equal by_begin_end, `#{FZF} -ffb --tiebreak=begin,end < #{tempname}`.split($/)
assert_equal %w[
--foobar
----foobar
foobar
foobar--
--foobar--
foobar----
f-o-o-b-a-r
], `#{FZF} -ffb --tiebreak=end < #{tempname}`.split($/)
by_begin_end = %w[
foobar
--foobar
----foobar
foobar--
--foobar--
foobar----
f-o-o-b-a-r
]
assert_equal by_begin_end, `#{FZF} -ffb --tiebreak=end,begin < #{tempname}`.split($/)
assert_equal by_begin_end, `#{FZF} -ffb --tiebreak=end,length < #{tempname}`.split($/)
end end
def test_tiebreak_white_prefix def test_tiebreak_end
writelines tempname, [ writelines tempname, [
'f o o b a r', 'xoxxxxxxxx',
' foo bar', 'xxoxxxxxxx',
' foobar', 'xxxoxxxxxx',
'----foo bar', 'xxxxoxxxx',
'----foobar', 'xxxxxoxxx',
' foo bar', ' xxxxoxxx',
' foobar--',
' foobar',
'--foo bar',
'--foobar',
'foobar',
] ]
assert_equal [ assert_equal [
' foobar', ' xxxxoxxx',
' foobar', 'xxxxoxxxx',
'foobar', 'xxxxxoxxx',
' foobar--', 'xoxxxxxxxx',
'--foobar', 'xxoxxxxxxx',
'----foobar', 'xxxoxxxxxx',
' foo bar', ], `#{FZF} -fo < #{tempname}`.split($/)
' foo bar',
'--foo bar',
'----foo bar',
'f o o b a r',
], `#{FZF} -ffb < #{tempname}`.split($/)
assert_equal [ assert_equal [
' foobar', 'xxxxxoxxx',
' foobar--', ' xxxxoxxx',
' foobar', 'xxxxoxxxx',
'foobar', 'xxxoxxxxxx',
'--foobar', 'xxoxxxxxxx',
'----foobar', 'xoxxxxxxxx',
' foo bar', ], `#{FZF} -fo --tiebreak=end < #{tempname}`.split($/)
' foo bar',
'--foo bar',
'----foo bar',
'f o o b a r',
], `#{FZF} -ffb --tiebreak=begin < #{tempname}`.split($/)
assert_equal [ assert_equal [
' foobar', ' xxxxoxxx',
' foobar', 'xxxxxoxxx',
'foobar', 'xxxxoxxxx',
' foobar--', 'xxxoxxxxxx',
'--foobar', 'xxoxxxxxxx',
'----foobar', 'xoxxxxxxxx',
' foo bar', ], `#{FZF} -fo --tiebreak=end,length,begin < #{tempname}`.split($/)
' foo bar',
'--foo bar',
'----foo bar',
'f o o b a r',
], `#{FZF} -ffb --tiebreak=begin,length < #{tempname}`.split($/)
end end
def test_tiebreak_length_with_nth def test_tiebreak_length_with_nth
@ -748,17 +677,6 @@ class TestGoFZF < TestBase
assert_equal output, `#{FZF} -fi -n2,1..2 < #{tempname}`.split($/) assert_equal output, `#{FZF} -fi -n2,1..2 < #{tempname}`.split($/)
end end
def test_tiebreak_end_backward_scan
input = %w[
foobar-fb
fubar
]
writelines tempname, input
assert_equal input.reverse, `#{FZF} -f fb < #{tempname}`.split($/)
assert_equal input, `#{FZF} -f fb --tiebreak=end < #{tempname}`.split($/)
end
def test_invalid_cache def test_invalid_cache
tmux.send_keys "(echo d; echo D; echo x) | #{fzf '-q d'}", :Enter tmux.send_keys "(echo d; echo D; echo x) | #{fzf '-q d'}", :Enter
tmux.until { |lines| lines[-2].include? '2/3' } tmux.until { |lines| lines[-2].include? '2/3' }