LeetCode: Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RepeatedDNASequences {

    public List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<>();
        if (s == null || s.isEmpty()) return result;

        // Count how often each 10-letter window occurs in s.
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            String sub = s.substring(i, i + 10);
            counts.put(sub, counts.getOrDefault(sub, 0) + 1);
        }

        // Keep only the windows that were seen more than once.
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= 2) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

}
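
As a possible variant (not the solution above), the counting map can be swapped for two sets: one for windows already seen, one for windows seen at least twice. The sketch below is self-contained and runs the example input from the problem statement. The class name RepeatedDnaDemo and the method name findRepeated are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RepeatedDnaDemo {

    // Hypothetical helper: returns every 10-letter substring that occurs
    // more than once in s. Result order is unspecified (HashSet-backed).
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new HashSet<>();
        if (s == null) return new ArrayList<>();

        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the window was already present,
            // which means this is at least its second occurrence.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // Example input from the problem statement.
        List<String> found = findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT");
        System.out.println(found);
    }
}
```

For the example input this should report the two sequences AAAAACCCCC and CCCCCAAAAA, though not necessarily in that order. Both versions do the same amount of work per window; the two-set form just avoids storing a count when only "more than once" matters.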