A gene string can be represented by an 8-character long string, with choices from "A", "C", "G", "T".
Suppose we need to investigate a mutation (from “start” to “end”), where ONE mutation is defined as ONE single character changed in the gene string.
For example, "AACCGGTT" -> "AACCGGTA" is 1 mutation.
Also, there is a given gene “bank”, which records all the valid gene mutations. A gene must be in the bank to make it a valid gene string.
Now, given 3 things (start, end, and bank), your task is to determine the minimum number of mutations needed to mutate from “start” to “end”. If there is no such mutation, return -1.
Note:
- Starting point is assumed to be valid, so it might not be included in the bank.
- If multiple mutations are needed, all mutations in the sequence must be valid.
- You may assume the start and end strings are not the same.
Example 1:
start: "AACCGGTT"
end: "AACCGGTA"
bank: ["AACCGGTA"]
return: 1
Example 2:
start: "AACCGGTT"
end: "AAACGGTA"
bank: ["AACCGGTA", "AACCGCTA", "AAACGGTA"]
return: 2
Example 3:
start: "AAAAACCC"
end: "AACCCCCC"
bank: ["AAAACCCC", "AAACCCCC", "AACCCCCC"]
return: 3
<Solution>
We can use BFS (Breadth-First Search) to solve this problem.
The search sequence of BFS is 「initial state -> all the states which need only one move -> all the states which need only two moves -> … -> the final state」
We will use a queue in the BFS algorithm to store every state that needs to be checked. States closer to the initial state are checked earlier because of the FIFO property of the queue. Therefore, we can use BFS to find the shortest path, i.e. the minimum number of steps.
For this question, start from the initial state and check every string in the bank array. If a string differs from the current state by exactly one character and has not been used yet, put it into the queue and mark the string as used. When we reach a string equal to the target string, the number of steps taken to reach it is the minimum.
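The approach described above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name `min_mutation` and the variable names are assumptions made for illustration, and it assumes every string in the bank has the same length as the start string.

```python
from collections import deque


def min_mutation(start, end, bank):
    """Return the minimum number of mutations from start to end, or -1."""
    # Each queue entry is (gene string, number of mutations used to reach it).
    queue = deque([(start, 0)])
    # Strings already placed in the queue; prevents revisiting a state.
    used = {start}

    while queue:
        gene, steps = queue.popleft()  # FIFO: closest states come out first
        if gene == end:
            return steps
        for candidate in bank:
            if candidate in used:
                continue
            # Count the positions where the two strings differ.
            diff = sum(a != b for a, b in zip(gene, candidate))
            if diff == 1:
                used.add(candidate)
                queue.append((candidate, steps + 1))

    # The queue emptied without reaching end: no valid mutation path exists.
    return -1
```

Because only strings from the bank are ever put into the queue, every intermediate gene is valid by construction, and an end string that is missing from the bank yields -1. Each dequeued state is compared against every bank entry, so the running time grows roughly with the square of the bank size, which is acceptable for a small bank.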