# Implement strStr()

## Brute force

I'd not to be brainfucked and I write first brute force version. No surprisingly, Time Limit Exceeded.

```
foreach i in haystack
foreach j in needle
if haystack[i + j] != needle[j]
i = i + 1
retry
found
```

simple and easy but slow.

## KMP Step by Step

I only heard the KMP method, but I have no idea about how it goes. now, I will show how do I understand KMP step by step.

### Needless Comparing

lets say

```
0 1 2 3 4 5 6
A B C D E F G
A B C E
when
i = 0
j = 3
if haystack[i + j] != needle[j]
i = i + 1
retry
```

`i`

jump from `0`

to `1`

. but wait, why not jump to `3 (i + j)`

? comparing `1`

and `2`

are obviously needless.

### Where should `i`

jump to on earth?

e.g.

```
0 1 2 3 4 5 6
A B A B A C G
A B A C
A B A C
A B A C
i = 0
j = 3
```

retry from `i = 3`

? Wrong. it seems `i`

should jump to `2`

instead of `3`

```
0 1 2 3
A B A C
A B A C
i = 0
j = 3
```

mismatch at `3`

indicates that `haystack[3 - 1] == needle[3 - 1]`

.
In additional, `needle[3 - 1] == needle[0]`

, so `i`

jumps to `3`

and then goes back `1`

char to `2`

, in the formula: `0 + 3 - 1 (i + j - 1)`

more complex example

```
0 1 2 3 4 5
A B C A B C
A B C A B C
i = 0
j = 5
```

mismatch at `5`

indicates that `haystack[5 - 1] == needle[5 - 1]`

.
In additional, `needle[5 - 2] == needle[0]`

and `needle[5 - 1] == needle[1]`

, so `i`

jumps to `5 (i + j)`

and then goes back `2`

chars to `3`

, in the formula: `0 + 5 - 2 (i + j - 2)`

### How many chars does `needle[i]`

go back?

From above, `needle[i]`

should go back `P`

chars

such that

```
needle[i - P] == needle[0] and needle[i - (P - 1)] == needle[1] and ...
for n = 0 .. P - 1
needle[i - P + n] == needle[0 + n]
```

the `P`

here DO have name. In wikipedia or some other places, it was known as needle's longest proper suffix which is a proper prefix. I do not understand why they like brainfucking ppl :(.

now we can write some easy code to find all `P`

of `needle[i]`

to `P[i]`

, `P[i]`

here means how many chars should needle[i] go back.

The first brute force `P`

finding version. Because I do not know how to calculate the `P`

, I search from max possible value to zero. I hope codes below would help you understand what the `P`

is

```
here S.substring(i, j) == S[i] + S[i + 1] + ... S[j - 1]
for i in needle
Possible_P = i - 1
while Possible_P > 0 and needle.substring(0, Possible_P) != needle.substring(i - Possible_P, i)
Possible_P = Possible_P - 1
P[i] = Possible_P
```

as you can see

`needle.substring(0, Possible_P)`

is the prefix and `needle.substring(i - Possible_P, i)`

is the suffix

### Put them together (First working version KMP)

```
for i in needle
Possible_P = i - 1
while Possible_P > 0 and needle.substring(0, Possible_P) != needle.substring(i - Possible_P, i)
Possible_P = Possible_P - 1
P[i] = Possible_P
foreach i in haystack
foreach j in needle
if haystack[i + j] != needle[j]
i = i + j - P[i]
i = MAX(i, 1) // at least jump 1 char, think j = 0, infinite looping
retry
found
```

### Optimizing P finding

Time limited again, last P finding method is silly and takes a long time to run.

Each loop in needle only add `1`

char, so the maximum value of `P[i]`

is `P[i - 1] + 1`

```
for i in needle
Possible_P = P[i - 1] + 1
while Possible_P > 0 and needle.substring(0, Possible_P) != needle.substring(i - Possible_P, i)
Possible_P = Possible_P - 1
P[i] = Possible_P
```

and think can the case like `P[i - 1] = 10 and P[i] = 6`

happens?

No, becasue `P[i]`

depends on `P[i - 1]`

.

`P[i - 1] = 10`

means that in the string of `needle[0 .. i - 1 - 1]`

, that the fist `10 chars(prefix) of needle[0 .. i - 1 - 1] = last 10 chars(suffix) of needle[0 .. i - 1 - 1]`

.

`P[i] = 6`

means that in the string of `needle[0 .. i - 1]`

, that the fist `6 chars(prefix) of needle[0 .. i - 1] = last 6 chars(suffix) of needle[0 .. i - 1]`

.

So, `P[i]`

should be `P[i - 1] + 1`

or `0`

```
for i in needle
Possible_P = P[i - 1] + 1
if needle.substring(0, Possible_P) != needle.substring(i - Possible_P, i)
Possible_P = 0
P[i] = Possible_P
```

Now, substring is also needless. `P[i - 1]`

indicates that `needle.substring(0, P[i - 1 - 1]) == needle.substring(i - P[i - 1 - 1], i)`

.
if any new char(`needle[i - 1]`

) comes, checking `needle[0 + P[i - 1]] == needle[i - 1]`

is enough.

e.g.

```
0 1 2 3 4 5
A B C A B C
A B C A B C
^ ^
m n
when i = 5
P[5 - 1] = 1 (needle[0] == needle[3])
if needle[0 + 1] == needle[5 - 1]
P[5] = P[4] + 1
```

here we can see `m = 0 + P[i - 1]`

and `n = i - 1`

### P finding Final version

```
for i in needle
Possible_P = P[i - 1] + 1
if needle[0 + P[i - 1]] != needle[i - 1]
Possible_P = 0
P[i] = Possible_P
```

### Source code *Read on Github*

```
1 public class Solution {
2 public String strStr(String haystack, String needle) {
3
4 if(haystack == null) return null;
5 if(needle == null) return null;
6
7 char[] _haystack = haystack.toCharArray();
8 char[] _needle = needle.toCharArray();
9
10 if(_needle.length == 0) return haystack;
11 if(_haystack.length < _needle.length) return null;
12
13
14 int[] P = new int[_needle.length];
15
16 for(int j = 2; j < _needle.length; j++){
17
18 int k = 0;
19
20 if(_needle[0 + P[j - 1]] == _needle[j - 1]){
21 k = P[j - 1] + 1;
22 }
23
24 P[j] = k;
25 }
26
27 next:
28 for(int i = 0; i < _haystack.length; /*void*/){
29
30 for(int j = 0; j < _needle.length; j++){
31
32 if(i + j >= _haystack.length) return null;
33
34 if(_haystack[i + j] != _needle[j]){
35 i += Math.max(1, j - P[j]);
36 continue next;
37 }
38 }
39
40 return new String(_haystack, i, _haystack.length - i);
41 }
42
43 return null;
44 }
45 }
```