KMP algorithm

字串比對問題:

給定一個字串(txt)和另一個字串(pattern),確認pattern字串是否存在txt字串內

EX:

txt: ABABAC , pattern: BAC => pattern存在txt內

txt: ABABAC , pattern: BAD => pattern不存在txt內

法一) Brute force

txt從index = 0開始比對pattern,如果遇到某個字元不match,就從 index += 1整個pattern重新比對,一直比到index = m-n個位置

// Time complexity: O((m-n)*n)
// Note: m為txt字串長度, n為pattern字串長度
func checkPattern(txt: String, pattern: String) -> Bool {
    let txtArr = Array(txt)
    let patternArr = Array(pattern)
    for i in 0 ..< (txtArr.count - patternArr.count) {
        var found = true
        for j in 0 ..<  patternArr.count {
            if patternArr[j] != txtArr[i + j] {
                found = false
                break
            }
        }
        if found {
            return true
        }
    }
    return false
}

print(checkPattern(txt: "AABAABAAA", pattern: "BAAB"))
print(checkPattern(txt: "AABAABAAA", pattern: "BAAC"))

法二) KMP algorithm

概念是利用pattern先去建立prefix table, 如果遇到字元不匹配,則根據前一個字元在prefix table中的值,把這個值當作檢查pattern的第x個位置,從這個位置繼續往下檢查。如果這個值為-1,則要從pattern的第0個位置檢查起,且txt的index += 1。

Note: 建立prefix table的方式與原理說明請參考下面兩個影片

func establishPrefixTable(pattern: String) -> [Int] {
    var i = 1
    var j = 0
    let patternArr = Array(pattern)
    var prefixTable = Array(repeating: 0, count: pattern.count)
    while i < patternArr.count {
        if patternArr[i] == patternArr[j] {
            j += 1
            prefixTable[i] = j
            i += 1
        } else {
            if j > 0 {
                j = prefixTable[j-1]
            } else {
                prefixTable[i] = 0
                i += 1
            }
        }
    }
    return prefixTable
}

func checkPatternByKMP(txt: String, pattern: String) -> Bool {
    let txtArr = Array(txt)
    let patternArr = Array(pattern)
    let prefixTable = establishPrefixTable(pattern: pattern)
    var i = 0 // for txt
    var j = 0 // for pattern
    while i < txtArr.count {
        if txtArr[i] == patternArr[j] {
            if j == (patternArr.count - 1) {
                return true
            } else {
                i += 1
                j += 1
            }
        } else {
            if j > 0 {
                j = prefixTable[j-1]
            } else {
                i += 1
            }
        }
    }
    return false
}

print(checkPatternByKMP(txt: "AABAABAAA", pattern: "BAAB"))
print(checkPatternByKMP(txt: "AABAABAAA", pattern: "BAAC"))