問題描述

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.
For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]
You should return the indices:[0,9].
(order does not matter).

思路分析

//前面有一些彎路分析。
LeetCode上此題標注的難度是Hard，所以直接用暴力的方法應該無法滿足時間限制。考慮到單詞的長度是相同的（k），如果指針每次向后移動k位，那么之前的匹配結果在此次匹配時可以利用，例如

s = 'abacadabc'
words = ['a', 'b', 'c']

當匹配到s2時，發(fā)現(xiàn)‘a(chǎn)ba’不滿足要求（因為a的出現(xiàn)次數(shù)超過限制），但知道之前的‘a(chǎn)b’滿足要求而且s[2]位置的‘a(chǎn)’也在words中，因此可以推出‘ba’是滿足要求的，這樣在s[0]位置開始而在s[2]匹配失效后，可以直接從s[3]繼續(xù)進行下面的匹配，而不需再返回s[1]處重新開始；
當匹配到s5時，發(fā)現(xiàn)‘d’不在words中，因此匹配失效，那么下次應從s[6]位置重新開始匹配；
上面兩條規(guī)律通過保證“s不回溯”來提高匹配效率。
然而這么做存在問題。考慮下面兩個的例子：

s = 'aaaaaaaa'
words = ['aa', 'aa', 'aa']

由于words中可能存在重復的單詞（尤其是單詞全部相同的情況），指針每次向后移動k位會導致部分解的丟失，在上面例子中，如果每次向后移動2位，會丟失s[1]位置開始的匹配。

s =‘a(chǎn)babaab’
words = ['ab', 'ba', 'ba']

當匹配到s[2]（ab）處發(fā)現(xiàn)匹配失效但‘a(chǎn)b’在words中，從s[4]開始繼續(xù)匹配，會丟失s[1]位置開始的匹配。
因此，直接采取指針每次向后移動k位的方法是錯誤的。但如果不能每次移動k位，那么就無法保證s不回溯，相當于暴力解題，我嘗試了使用Hash的暴力解法，果然會"Time Limit Exceeded"。
然后我看到了九章算法中的解答。其精妙之處在于，它將任務分解為k種情況，考慮從[0, k-1]開始匹配，在每次匹配的時候，就可以保證不回溯。這種方法相當于暴力和技巧的結合。

AC代碼

#encoding=utf-8
class Solution(object):
    def findSubstring(self, s, words):
        """
        :type s: str
        :type words: List[str]
        :rtype: List[int]
        """
        result = []
        m = len(s) #待匹配串的長度
        n = len(words) #單詞個數(shù)
        if n == 0: 
            return result
        k = len(words[0]) #單詞長度
        dic = {}
        for word in words:
            if dic.has_key(word):
                dic[word] += 1
            else:
                dic[word] = 1
        dup = {}
        for begin in range(k):
            w_used = 0 #記錄已經(jīng)使用的單詞數(shù)
            p1 = begin
            p2 = p1 + k
            while m - p1 + 1 >= n * k: #s剩余長度大于等于單詞表各單詞總長度 
                #匹配時控制單詞的出現(xiàn)次數(shù)不超過限制，因此當w_used==n時，就是匹配成功了
                if w_used == n: 
                    result.append(p1)
                if p2 > m + k + 1:
                    break
                w = s[p2-k:p2]
                if w in dic:
                    if w in dup and dup[w] != 0:
                        #超過單詞次數(shù)限制，把p1移到已經(jīng)匹配的第一個w之后
                        if dup[w] == dic[w]: 
                            end = s.index(w, p1, p2)
                            for i in range(p1, end, k): 
                                rmv = s[i:i+k]
                                dup[rmv] -= 1
                                w_used -= 1
                            p1 = end + k
                            p2 += k
                        else:
                            dup[w] += 1
                            w_used += 1
                            p2 += k
                    else:
                        dup[w] = 1
                        w_used += 1
                        p2 += k
                else: #出現(xiàn)不在word里的單詞
                    p1 = p2
                    p2 = p1 + k
                    dup.clear()
                    w_used = 0
        return result

Runtime: 84 ms, which beats 89.72% of python submissions.

三个男躁一个女,国精产品一区一手机的秘密,麦子交换系列最经典十句话,欧美国产综合欧美视频

30. Substring with Concatenation of All Words

30. Substring with Concatenation of All Words

問題描述

思路分析

AC代碼

推薦閱讀更多精彩內(nèi)容

三个男躁一个女,国精产品一区一手机的秘密,麦子交换系列最经典十句话,欧美 国产 综合 欧美 视频

30. Substring with Concatenation of All Words

問題描述

思路分析

AC代碼

推薦閱讀更多精彩內(nèi)容

三个男躁一个女,国精产品一区一手机的秘密,麦子交换系列最经典十句话,欧美国产综合欧美视频