LeetCode刷题记录（字符串篇）

LeetCode刷题记录（字符串篇），个人向。
为了锻炼代码能力，会交替使用不同编程语言完成答案。

[344] Reverse String

难度：Easy
语言：C++

题目描述。
左右指针往中间靠拢即可，无难度。

  void reverseString(vector<char>& s) {
    char tmp;

    for (int left = 0, right = s.size() - 1; left < right; left++, right--) {
      tmp = s[left];
      s[left] = s[right];
      s[right] = tmp;
    }
  }

[541] Reverse String II

难度：Easy
语言：Python

题目描述。
按照题目要求反转字符串即可，基本无难度。

    def reverseStr(self, s: str, k: int) -> str:
        s = list(s)
        cur = 0
        size = len(s)
        while cur < size:
            left = cur
            right = min(left + k - 1, size - 1)
            while left < right:
                tmp = s[left]
                s[left] = s[right]
                s[right] = tmp
                left += 1
                right -= 1
            cur += 2 * k
        return "".join(s)

[151] Reverse Words in a String

难度：Medium
语言：Rust

题目描述。
最简单的方法是先分好词，然后翻转然后拼起来即可。但为了增加挑战性，我们希望能只用O(1)的额外空间，即in-place地解决问题。
那么如何in-place修改呢？我们可以考虑翻转两次。第一次翻转整个字符串，第二次再翻转各个单词，如此单词内部顺序正确，整体的单词顺序也被翻转过来了。除此之外，我们还需要先去除多余空格。由此，可得到以下步骤：

去除多余空格。
翻转字符串，然后翻转单词。

对于去除空格，我们使用了快慢指针的方式，即一个快指针在前面扫描，遇到多余的空格就不处理，否则把字符填到慢指针的地方，最后resize一下。
最后得到的代码如下。

impl Solution {
    pub fn reverse(s: &mut Vec<char>, mut begin: usize, mut end: usize) {
        while begin < end {
            let temp = s[begin];
            s[begin] = s[end];
            s[end] = temp;
            begin += 1;
            end -= 1;
        }
    }
    pub fn remove_extra_spaces(s: &mut Vec<char>) {
        let mut slow: usize = 0;
        let len = s.len();
        let mut i: usize = 0;
        while i < len {
            if !s[i].is_ascii_whitespace() {
                if slow != 0 {
                    s[slow] = ' ';
                    slow += 1;
                }
                while i < len && !s[i].is_ascii_whitespace() {
                    s[slow] = s[i];
                    slow += 1;
                    i += 1;
                }
            }
            i += 1;
        }
        s.resize(slow, ' ');
    }
    pub fn reverse_words(s: String) -> String {
        let mut s = s.chars().collect::<Vec<char>>();
        Self::remove_extra_spaces(&mut s);
        let len = s.len();
        Self::reverse(&mut s, 0, len - 1);
        let mut start = 0;
        for i in 0..=len {
            if i == len || s[i].is_ascii_whitespace() {
                Self::reverse(&mut s, start, i - 1);
                start = i + 1;
            }
        }
        s.iter().collect::<String>()
    }
}

[28] Find the Index of the First Occurrence in a String

难度：Easy
语言：Python

题目描述。
使用了KMP算法。关于KMP算法，具体可以看这篇文章。
简单来说，先构造next表，该表记录了needle从开始到index所处位置的最长公共前后缀的长度。之后在查找时，若needle和haystack当前位置匹配失败，则可根据next表找到needle中和后缀相同的前缀所处的末尾位置，然后再从该位置继续匹配，这样可省去前缀的重复匹配的开销。该算法的时间复杂度为O(m+n)。

class Solution:
    def getNext(self, needle: str) -> List:
        j = 0
        i = 1
        length = len(needle)
        next_table = [0] * length
        while i < length:
            while j > 0 and needle[i] != needle[j]:
                j = next_table[j - 1]
            if needle[i] == needle[j]:
                j += 1
            next_table[i] = j
            i += 1
        return next_table

    def strStr(self, haystack: str, needle: str) -> int:
        next_table = self.getNext(needle)
        j = 0
        i = 0
        haystack_length = len(haystack)
        needle_length = len(needle)
        while i < haystack_length:
            while j > 0 and haystack[i] != needle[j]:
                j = next_table[j - 1]
            if haystack[i] == needle[j]:
                j += 1
                if j == needle_length:
                    return i - needle_length + 1
            i += 1
        return -1

[459] Repeated Substring Pattern

难度：Easy
语言：C++

题目描述。
看见题目后先自己写了个简单解法。Naive的想法就是把pattern重复几次到s的长度，然后再看两者是否相等即可。其中又有一些小优化，例如pattern的长度最长只需要到total_length的一半，以及total_length % i == 0才需要继续判断。
这种方法是可以通过测试的，但还有更高效的算法。可以使用KMP算法来解决该问题。
在[28] Find the Index of the First Occurrence in a String中，我们提到了通过构建next表来存储最长公共前后缀的长度，在该题中也一样，先对字符串s构建next表。如果s存在最小重复子串单位，则total_length - common_length就是该最小单位的长度。最后，我们再检查是否total_length % diff == 0即可判断该字符串s是否由重复的子字符串构成。
更详细的解释可参考这篇文章。

class Solution {
 public:
  void getNext(int* next, string s) {
    int j = 0;
    int length = s.size();
    next[0] = j;
    for (int i = 1; i < length; ++i) {
      while (j > 0 && s[i] != s[j]) {
        j = next[j - 1];
      }
      if (s[i] == s[j]) {
        ++j;
      }
      next[i] = j;
    }
  }

  bool repeatedSubstringPattern(string s) {
    int total_length = s.size();
    vector<int> next(total_length);
    getNext(&next[0], s);
    int common_length = next[total_length - 1];

    if (common_length == 0) {
      return false;
    }

    int diff = total_length - common_length;
    if (total_length % diff == 0) {
      return true;
    } else {
      return false;
    }
  }

  bool repeatedSubstringPatternNaive(string s) {
    string pattern(1, s[0]);
    int total_length = s.size();
    int i = 1;

    while (i <= total_length / 2) {
      if (total_length % i == 0) {
        int times = total_length / i;
        string result = pattern;
        while (times-- > 1) {
          result += pattern;
        }
        if (result == s) {
          return true;
        }
      }
      pattern += s[i];
      ++i;
    }

    return false;
  }
};

总结

其实本质算法上看，字符串与数组并无区别，只是元素变成了字符而已。

字符串的一些题涉及到了双指针的方法。其中，双指针又可用于反转字符串。[151] Reverse Words in a String又在此基础上更进一步，需要把字符串翻转两次，其中第一次翻转整个字符串，第二次仅翻转单词来达到仅翻转所有单词顺序的效果。
另一部分题用到了KMP算法。KMP算法通过记录了从开始到字符串各个位置子串的最长公共前后缀的长度。通过该记录，可以在查找子串的过程中失败时方便回退到前缀末尾对应的位置，跳过已匹配的部分，降低时间复杂度。[459] Repeated Substring Pattern还通过数学原理判断了是否存在可重复的子串来构成原字符串。具体来说，通过total_length和common_length的差找到了可能的最小重复单位的长度，再通过取模判断原字符串是否是最小单位的倍数。

补充

[3] Longest Substring Without Repeating Characters

难度：Medium
语言：C++

题目描述。
使用滑动窗口即可。不断尝试移动右指针，若会导致重复则移动左指针使窗口缩小直到右指针指向元素的加入不会导致重复。重复这个过程一直到右指针到达边界，过程中记录窗口最大长度即可。
相比下官方题解优先移动左指针，然后再在不导致重复的前提下不断尝试移动右指针。其本质实际是一样的。

class Solution {
  public:
    int lengthOfLongestSubstring(string s) {
        int n = s.size();
        int maxLength = 0;
        int slow = 0, fast = 0;
        unordered_set<char> charSet;
        while (fast < n) {
            while (charSet.find(s[fast]) != charSet.end()) {
                charSet.erase(s[slow++]);
            }
            charSet.insert(s[fast]);
            maxLength = max(maxLength, fast - slow + 1);
            ++fast;
        }
        return maxLength;
    }
};

[13] Roman to Integer

难度：Easy
语言：C++

题目描述。
用unordered_map存储字符到数字的映射以简化代码。
根据规则，如果当前字符代表的数字比下一个字符代表的数字小，则减去当前数字而不是加上。

class Solution {
  public:
    unordered_map<char, int> hmap = {{'I', 1},   {'V', 5},   {'X', 10},
                                     {'L', 50},  {'C', 100}, {'D', 500},
                                     {'M', 1000}};

    int romanToInt(string s) {
        int n = s.size();
        int ret = 0;
        for (int i = 0; i < n; ++i) {
            int value = hmap[s[i]];
            if (i + 1 < n && value < hmap[s[i + 1]]) {
                ret -= value;
            } else {
                ret += value;
            }
        }
        return ret;
    }
};

[12] Integer to Roman

难度：Medium
语言：C++

题目描述。
直接打表，转换的时候（按照数字从大到小）查表即可。

const pair<int, string> hmap[] = {
    {1000, "M"}, {900, "CM"}, {500, "D"}, {400, "CD"}, {100, "C"},
    {90, "XC"},  {50, "L"},   {40, "XL"}, {10, "X"},   {9, "IX"},
    {5, "V"},    {4, "IV"},   {1, "I"}};
class Solution {
  public:
    string intToRoman(int num) {
        string result;
        for (auto &[value, symbol] : hmap) {
            while (num >= value) {
                result += symbol;
                num -= value;
            }
        }
        return result;
    }
};

[58] Length of Last Word

难度：Easy
语言：C++

题目描述。
容易想到遍历一遍，过程中记录最后遇到的单词的长度即可。但题目既然已经指定了是最后一个单词，那显然有性能更友好的方式，即从尾开始反向遍历，遍历完第一个单词即可，从而省去前面多余部分遍历。

class Solution {
  public:
    int lengthOfLastWord(string s) {
        int index = s.size() - 1;
        int length = 0;
        while (s[index] == ' ') {
            --index;
        }

        while (index >= 0 && s[index] != ' ') {
            length += 1;
            --index;
        }

        return length;
    }
};

[14] Longest Common Prefix

难度：Easy
语言：C++

题目描述。
先以strs[0]作为commenPrefix，然后用strs[1]来比较，比较后更新前缀长度，然后用strs[2]来比较，以此类推，中间一直更新前缀长度即可。

class Solution {
  public:
    string longestCommonPrefix(vector<string> &strs) {
        string commenPrefix = strs[0];
        int prefixLength = commenPrefix.size();
        int n = strs.size();
        for (int i = 1; i < n; ++i) {
            int length = min(prefixLength, (int)strs[i].size());
            int j = 0;
            while (j < length && commenPrefix[j] == strs[i][j]) {
                ++j;
            }
            prefixLength = j;
            if (prefixLength == 0) { // Early stop
                return commenPrefix.substr(0, 0);
            }
        }
        return commenPrefix.substr(0, prefixLength);
    }
};

[6] Zigzag Conversion

难度：Medium
语言：C++

题目描述。
可以看成是有numRows行string，模拟的过程就是不断在这numRows行string的后面追加。而追加的顺序是从上到下和从下到上不断变换，这一点用dir和curRow += dir来控制。于是有如下代码。
尽管，通过找规律的方式推导数学公式，可以直接写出答案，进一步把空间复杂度压缩到 $O(1)$ 。但找规律的方式始终不像是在考察代码能力，甚至都不像是在考察数学能力。个人认为比较偏题，因此没有采用。

class Solution {
  public:
    string convert(string s, int numRows) {
        if (numRows < 2) {
            return s;
        }

        vector<string> rows(numRows);
        string result = "";
        int dir = -1;
        int curRow = 0;
        for (char c : s) {
            rows[curRow] += c;
            if (curRow == 0 || curRow == numRows - 1) {
                dir = -dir;
            }
            curRow += dir;
        }

        for (string row : rows) {
            result += row;
        }
        return result;
    }
};

[68] Text Justification

难度：Hard
语言：C++

题目描述。
按照题目要求模拟即可。实际难度不高，主要是代码写起来麻烦。
要考虑以下情况：

最后一行：左对齐，单空格分隔，最后补空格。
一行只有一个单词：单词然后补空格
其他情况（不是最后一行且有多个单词）：除法计算平均空格数avgSpaces，取模得到左边有多少位置extraSpaces空格要+1。

此外，这里使用了left和right指针做优化，然后直接从words获取数据，避免了中间临时存储当前行words的开销。

class Solution {
    // Returns a string composed of n spaces.
    string blank(int n) { return string(n, ' '); }

    // Joins words in the range [left, right) using the separator sep.
    string join(vector<string> &words, int left, int right, string sep) {
        string s = words[left];
        for (int i = left + 1; i < right; ++i) {
            s += sep + words[i];
        }
        return s;
    }

  public:
    vector<string> fullJustify(vector<string> &words, int maxWidth) {
        vector<string> ans;
        int right = 0, n = words.size();
        while (true) {
            int left = right; // Index of the first word in the current line.
            int sumLen = 0;   // Total length of words in the current line.
            // Determine how many words can fit in the current line.
            // Note: At least one space is required between words.
            while (right < n &&
                   sumLen + words[right].length() + right - left <= maxWidth) {
                sumLen += words[right++].length();
            }

            // Last line: left-justified. Words are separated by a single space,
            // and extra spaces are appended at the end of the line.
            if (right == n) {
                string s = join(words, left, n, " ");
                ans.emplace_back(s + blank(maxWidth - s.length()));
                return ans;
            }

            int numWords = right - left;
            int numSpaces = maxWidth - sumLen;

            // If there is only one word in the line, left-justify it and pad
            // with spaces.
            if (numWords == 1) {
                ans.emplace_back(words[left] + blank(numSpaces));
                continue;
            }

            // For a line with multiple words, evenly distribute spaces.
            int avgSpaces = numSpaces / (numWords - 1);
            int extraSpaces = numSpaces % (numWords - 1);
            // Join words that receive an extra space.
            string s1 =
                join(words, left, left + extraSpaces + 1, blank(avgSpaces + 1));
            // Join the remaining words.
            string s2 =
                join(words, left + extraSpaces + 1, right, blank(avgSpaces));
            ans.emplace_back(s1 + blank(avgSpaces) + s2);
        }
    }
};

[125] Valid Palindrome

难度：Easy
语言：C++

题目描述。
容易想到我们可以先把字符串转成只包含字母数字并且纯小写的形式，然后再判断回文即可。但这样需要 $O(n)$ 的空间复杂度来存储中间字符串。
因此我们可以直接在原字符串上判断，遇到非字母数字的字符直接跳过即可，判断时再转一下小写。

class Solution {
  public:
    bool isPalindrome(string s) {
        int left = 0, right = s.size() - 1;
        while (left < right) {
            while (left < right && !isalnum(s[left])) {
                ++left;
            }
            while (left < right && !isalnum(s[right])) {
                --right;
            }
            if (tolower(s[left]) != tolower(s[right])) {
                return false;
            }
            ++left;
            --right;
        }
        return true;
    }
};

[392] Is Subsequence

难度：Easy
语言：C++

题目描述
双指针扫一遍即可，检查扫完t的时候是否扫完了s。基本无难度。

class Solution {
  public:
    bool isSubsequence(string s, string t) {
        int sp = 0, tp = 0;
        int sSize = s.size(), tSize = t.size();
        while (sp < sSize && tp < tSize) {
            if (s[sp] == t[tp]) {
                ++sp;
            }
            ++tp;
        }
        return sp >= sSize;
    }
};

[30] Substring with Concatenation of All Words

难度：Hard
语言：C++

题目描述。
使用滑动窗口，一次滑动单词长度n。
我们再分别选取i从0到n-1的位置作为滑动窗口的起点，即可覆盖所有结果。
因此代码是一个二重循环：

外循环负责分别选取0到n-1的位置作为滑动起点。
内循环负责进行滑动窗口的滑动，一次滑动长度n。

class Solution {
  public:
    vector<int> findSubstring(string s, vector<string> &words) {
        int m = words.size();
        int n = words[0].size();
        int ls = s.size();
        vector<int> result;
        for (int i = 0; i < n && i + m * n - 1 < ls; ++i) {
            unordered_map<string, int> diff;
            for (int j = 0; j < m; ++j) {
                diff[s.substr(i + j * n, n)] += 1;
            }
            for (string &word : words) {
                if (--diff[word] == 0) {
                    diff.erase(word);
                }
            }

            if (diff.empty()) {
                result.emplace_back(i);
            }

            for (int start = i + n; start + m * n - 1 < ls; start += n) {
                string word = s.substr(start - n, n);
                if (--diff[word] == 0) {
                    diff.erase(word);
                }
                word = s.substr(start + (m - 1) * n, n);
                if (++diff[word] == 0) {
                    diff.erase(word);
                }
                if (diff.empty()) {
                    result.emplace_back(start);
                }
            }
        }
        return result;
    }
};

AllenY's blog