Description #
Given a string, find the length of the longest substring without repeating characters.
*Example 1*: Input: "abcabcbb" Output: 3 Explanation: The answer is "abc", with the length of 3. *Example 2*: Input: "bbbbb" Output: 1 Explanation: The answer is "b", with the length of 1. *Example 3*: Input: "pwwkew" Output: 3 Explanation: The answer is "wke", with the length of 3. Note that the answer must be a substring, "pwke" is a subsequence and not a substring.
(From the Longest Substring Without Repeating Characters problem description)
Solutions #
Problem API #
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
Tests #
Test vector #1 (abcabcbb
) tells that the longest “longest substring without
repeating characters” is abc
. This reveals that the question can be
rephrased as “the longest substring with unique characters”.
The following edge cases need some special consideration:
- Multiple substrings with the same length
- Per definition this is not relevant because only the length of the longest string is needed. It is debatable if a longest substring exists when two different strings have the same length. Example 1 of the task implies that this is not relevant.
- The whole input is the same character
- Experience shows that these are sometimes edge cases for loops.
- The whole input is only on character long
- Experience shows that this is sometimes an edge case for loops.
- No input
- if
s == ""
there is no longest substring and0
is returned.
This leads to the following tests:
import os
import pytest
class BaseTestClass:
def test_vector_1(self):
assert self.lengthOfLongestSubstring("abcabcbb") == 3
def test_vector_2(self):
assert self.lengthOfLongestSubstring("bbbbb") == 1
def test_vector_3(self):
assert self.lengthOfLongestSubstring("pwwkew") == 3
def test_empty_vector_is_0(self):
assert self.lengthOfLongestSubstring("") == 0
def test_one_elem_vector_is_1(self):
""" Test the special case of 'starting' and 'ending' at once. """
assert self.lengthOfLongestSubstring("a") == 1
def test_detect_longest_substring_at_eof(self):
assert self.lengthOfLongestSubstring("ababc") == 3
Solution 1a: \(O(|s| * |longest\_substring|)\) #
A straight forward (but slow) algorithm is to check for each character in s
if
it starts the longest string. In other words: For each character (indexed by leftmost
) the
algorithm looks ahead until a duplicate char is found and the substring ends.
The longest substring wins.
Notes:
- Caching the length of
s
s
is constant, and so islen(s)
. Even iflen(s)
is very fast and \(O(1)\) (source) and premature optimization is the root of all evil (or at least most of it) in programming this low hanging fruit can be picked.
Source #
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
len_s = len(s) # len_s is constant, so we can cache the value
longest_substring_len = 0
for leftmost in range(0, len_s):
current_substring = set(s[leftmost])
lookahead = leftmost + 1
while lookahead < len_s and s[lookahead] not in current_substring:
current_substring.add(s[lookahead])
lookahead +=1
if len(current_substring) > longest_substring_len:
longest_substring_len = len(current_substring)
return longest_substring_len
Test results #
With the following test result:
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-3.10.1, py-1.8.0, pluggy-0.12.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jens/Documents/leetcode.com/003-longest_substr_without_repeating_chars, inifile:
plugins: benchmark-3.2.2, profiling-1.7.0
collected 6 items
test_1a-solution.py ...... [100%]
============================ slowest test durations ============================
(0.00 durations hidden. Use -vv to show these durations.)
=========================== 6 passed in 0.03 seconds ===========================
Tests ran in 0.1789331170002697 seconds
Runtime analysis #
This implementation boils down to two nested loops:
- An outer loop (
for leftmost in range(0, len_s)
) loopsn = len(s)
times: \(\rightarrow O(|s|)\) The outer loop nests the following- indexed access into
s
(s[leftmost]
): \(O(1)\) (source) set
creation (set(...)
): \(O(1)\)- A nested loop (
while lookahead < len_s and s[lookahead] not in current_substring
) is bound by the length of the longest substring: \(O(|longest\_substring|)\).
- indexed access into
\(\rightarrow\) the average case behavior is \(O(|s| * |longest\_substring|)\) \(\blacksquare\)
Solution 1b: Moving window: \(O(n)\) #
This builds on the previous solution. The idea is that the previous version can be optimized by removing the lookahead.
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
longest_substring_len = 0
current_substring = set()
index_of_start = 0
for rightmost_char in s:
if rightmost_char in current_substring:
# also end the substring but continue
longest_substring_len = max(longest_substring_len, len(current_substring))
# Shrink the window until we can add the next char
while True:
current_substring.remove(s[index_of_start])
index_of_start += 1
if not rightmost_char in current_substring:
break
current_substring.add(rightmost_char)
else:
current_substring.add(rightmost_char)
# end of input: The longest string could also end with s
longest_substring_len = max(longest_substring_len, len(current_substring))
return longest_substring_len
Test results #
With the following test result:
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-3.10.1, py-1.8.0, pluggy-0.12.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jens/Documents/leetcode.com/003-longest_substr_without_repeating_chars, inifile:
plugins: benchmark-3.2.2, profiling-1.7.0
collected 6 items
test_1b-solution.py ...... [100%]
============================ slowest test durations ============================
(0.00 durations hidden. Use -vv to show these durations.)
=========================== 6 passed in 0.02 seconds ===========================
Tests ran in 0.16662812398863025 seconds
Runtime analysis #
This implementation boils down to two nested loops:
set
creation (set(...)
): \(O(1)\)- An outer loop (
for rightmost_char in s
) over the whole strings
times \(\rightarrow O(|s|)\). The outer loop nests the following cases- Sliding window found a duplicate at the current position \(\rightarrow O(|longest\_substring|)\)
set
membership test (rightmost_char in current_substring
) \(\rightarrow O(1)\)- Reduction of the sliding window, max. \(O(|longest\_substring|)\) times. \(\rightarrow O(|longest\_substring|)\)
- Remove element from set \(O(1)\) (source although it is only guaranteed for
dict
) set
membership test (rightmost_char in current_substring
) \(\rightarrow O(1)\)- \(\rightarrow O(1)\) per iteration
- Remove element from set \(O(1)\) (source although it is only guaranteed for
- Extend the substring (
current_substring.add(rightmost_char)
): \(O(1)\)
- Sliding window found a duplicate at the current position \(\rightarrow O(|longest\_substring|)\)
\(\rightarrow\) superficially the average case behavior is \(O(|s| * |longest\_substring|)\).
A closer look at the implementation shows a different picture. Over all iterations the nested loop cannot drop more than \(|s|\) elements from the window.
In other words, the amortized runtime of the nested loop is \(O(1)\).
\(\rightarrow\) This brings the runtime of the whole algorithm to \(O(n)\) \(\blacksquare\)
This also explains why this implementation is more than 12 times faster than the previous version when using the leetcode test set.
Summary #
Solution | Correct | Link | Runtime | Memory | Language |
---|---|---|---|---|---|
#1, Solution 1a (quadratic runtime) | Yes | Submission | 544 ms, faster than 13.77% | 12.9 MB, less than 100.00% | Python3 |
#2, Solution 1b (linear runtime) | Yes | Submission | 44 ms, faster than 99.09% | 12.7 MB, less than 100.00% | Python3 |