
3545: Minimum Deletions for At Most K Distinct Characters
Problem Statement
Given a string and maximum allowed distinct characters (k), determine the minimum deletions needed to achieve at most k unique characters.
Example 1
Input: s = "abc", k = 2
Output: 1
Explanation: Remove one character (e.g., 'c')
Example 2
Input: s = "aabb", k = 2
Output: 0
Explanation: Already has 2 distinct characters
Approach 1: Frequency-Based Greedy Algorithm
Prioritize keeping characters with highest frequencies to minimize deletions.
Algorithm Steps
- Calculate character frequencies
- Sort frequencies in descending order
- Keep top k frequencies, sum remaining
Complexity Analysis
Time Complexity | Space Complexity |
---|---|
O(n + m log m) | O(m) |
class Solution {
public:
int minDeletions(string s, int k) {
unordered_map<char, int> freq;
for(char c : s) freq[c]++;
vector<int> counts;
for(auto& [ch, cnt] : freq) counts.push_back(cnt);
sort(counts.rbegin(), counts.rend());
int total = 0;
for(int i = k; i < counts.size(); ++i) {
total += counts[i];
}
return total;
}
};
class Solution {
public int minDeletions(String s, int k) {
Map<Character, Integer> freq = new HashMap<>();
for(char c : s.toCharArray()) {
freq.put(c, freq.getOrDefault(c, 0) + 1);
}
List<Integer> counts = new ArrayList<>(freq.values());
Collections.sort(counts, Collections.reverseOrder());
int total = 0;
for(int i = k; i < counts.size(); i++) {
total += counts.get(i);
}
return total;
}
}
from collections import Counter
class Solution:
def minDeletions(self, s: str, k: int) -> int:
freq = Counter(s)
counts = sorted(freq.values(), reverse=True)
return sum(counts[k:]) if len(counts) > k else 0
Approach 2: Combinatorial Subset Selection
Brute-force approach suitable for small input constraints (n ≤ 16).
Algorithm Steps
- Generate all possible character subsets
- Filter subsets with size ≤ k
- Calculate deletions for each valid subset
- Track minimum deletions across all subsets
Complexity Analysis
Time Complexity | Space Complexity |
---|---|
O(2m * m) | O(m) |
class Solution {
public:
int minDeletions(string s, int k) {
unordered_map<char, int> freq;
for(char c : s) freq[c]++;
vector<char> chars;
for(auto& [ch, _] : freq) chars.push_back(ch);
int min_del = INT_MAX;
int m = chars.size();
for(int mask = 0; mask < (1 << m); ++mask) {
int bits = __builtin_popcount(mask);
if(bits > k) continue;
int sum = 0;
for(int i = 0; i < m; ++i) {
if(mask & (1 << i)) sum += freq[chars[i]];
}
min_del = min(min_del, (int)s.size() - sum);
}
return min_del;
}
};
class Solution {
public int minDeletions(String s, int k) {
Map<Character, Integer> freq = new HashMap<>();
for(char c : s.toCharArray()) {
freq.put(c, freq.getOrDefault(c, 0) + 1);
}
List<Character> chars = new ArrayList<>(freq.keySet());
int m = chars.size();
int minDel = Integer.MAX_VALUE;
for(int mask = 0; mask < (1 << m); mask++) {
int bits = Integer.bitCount(mask);
if(bits > k) continue;
int sum = 0;
for(int i = 0; i < m; i++) {
if((mask & (1 << i)) != 0) {
sum += freq.get(chars.get(i));
}
}
minDel = Math.min(minDel, s.length() - sum);
}
return minDel;
}
}
from itertools import combinations
from collections import Counter
class Solution:
def minDeletions(self, s: str, k: int) -> int:
freq = Counter(s)
distinct = list(freq.keys())
min_del = float('inf')
for r in range(0, k+1):
for subset in combinations(distinct, r):
current = sum(freq[c] for c in subset)
min_del = min(min_del, len(s) - current)
return min_del
Approach Comparison
Approach | Time | Space | Best For |
---|---|---|---|
Greedy Frequency | O(n + m log m) | O(m) | General case, large inputs |
Combinatorial | O(2m * m) | O(m) | Small inputs (m ≤ 16) |
Edge Case Analysis
1. All Characters Identical
Input: "aaaaa", k = 1 → Output: 0
2. k Equal to Distinct Count
Input: "abcde", k = 5 → Output: 0
3. Empty String Requirement
Input: "aab", k = 0 → Impossible (per constraints)
Frequently Asked Questions
1. Why does sorting frequencies guarantee optimal solution?
Sorting lets us prioritize characters contributing most to the string length, minimizing needed deletions.
2. How does the combinatorial approach handle duplicates?
It considers distinct characters only, as duplicates are handled through frequency counts.
3. What if k exceeds distinct character count?
Immediate return 0 since no deletions needed (already satisfies condition).
4. Why use bitmask in combinatorial approach?
Efficiently represents character subsets through binary encoding (each bit = character presence).
5. Can we use dynamic programming here?
Possible but unnecessary due to effective greedy solution. DP would add complexity without benefit.
6. How to handle frequency ties?
Arbitrary selection works since tied frequencies contribute equally to total length.
7. What's the worst-case scenario for greedy approach?
When all characters have equal frequency (still O(m log m) sorting time).
8. Why calculate sum(counts[k:])?
Represents frequencies of characters excluded after keeping top k frequent ones.
9. How does input size affect approach choice?
Greedy works for all sizes, combinatorial only viable for m ≤ 16 due to exponential growth.
10. Can we optimize space further?
Yes, by storing frequencies in-place without sorting, but time efficiency would suffer.