3784. Minimum Deletion Cost to Make All Characters Equal
Problem Statement
You are given a string s of length n and an integer array cost of the same length, where cost[i] is the cost to delete the ith character of s.
You may delete any number of characters from s (possibly none), such that the resulting string is non-empty and consists of equal characters.
Return an integer denoting the minimum total deletion cost required.
Example 1
Input: s = "aabaac", cost = [1,2,3,4,1,10]
Output: 11
Explanation:
Deleting the characters at indices 0, 1, 2, 3, 4 results in the string "c", which consists of equal characters, and the total cost is cost[0] + cost[1] + cost[2] + cost[3] + cost[4] = 1 + 2 + 3 + 4 + 1 = 11.
Example 2
Input: s = "abc", cost = [10,5,8]
Output: 13
Explanation:
Deleting the characters at indices 1 and 2 results in the string "a", which consists of equal characters, and the total cost is cost[1] + cost[2] = 5 + 8 = 13.
Example 3
Input: s = "zzzzz", cost = [67,67,67,67,67]
Output: 0
Explanation:
All characters in s are equal, so the deletion cost is 0.
Key Insight
The solution requires understanding these crucial observations:
- The resulting string must consist of only one character type (all 'a's, all 'b's, etc.)
- We can delete any characters, but must keep at least one character (non-empty result)
- To minimize deletion cost, we should keep the character with the highest total deletion cost
- If we choose character
chto keep, we must delete all characters that are notch - Equivalently, we can think: minimum deletion cost = total cost - max(cost_sum for each character)
Alternative approach: For each group of consecutive identical characters, keep only the most expensive one in that group and delete the rest.
Approach 1: Frequency Array (Your Approach - Optimal)
Calculate the total cost of deleting all characters, then for each character (a-z), calculate the total cost of keeping that character. The minimum deletion cost is total cost minus the maximum keep cost.
Algorithm Steps
- Initialize
total = 0andfreq[26] = {0} - For each character in s:
- Add
cost[i]tototal - Add
cost[i]tofreq[s[i] - 'a']
- Add
- Find
max_freq = max(freq[0..25]) - Return
total - max_freq
How It Works
For each character from 'a' to 'z', freq[ch] represents the cost of keeping all occurrences of that character. If we choose to keep character ch, we must delete all other characters at cost total - freq[ch]. We want to minimize this, which means maximizing freq[ch].
Complexity Analysis
| Time Complexity | Space Complexity |
|---|---|
| O(n) | O(1) |
Where n is the length of the string. Single pass through the string with constant extra space for the frequency array.
class Solution {
public:
long long minCost(string s, vector<int>& cost) {
long long total = 0;
vector<long long> freq(26, 0);
for (int i = 0; i < s.length(); i++) {
total += cost[i];
freq[s[i] - 'a'] += cost[i];
}
long long max_freq = 0;
for (int i = 0; i < 26; i++) {
if (freq[i] > max_freq) {
max_freq = freq[i];
}
}
return total - max_freq;
}
};
class Solution {
public long minCost(String s, int[] cost) {
long total = 0;
long[] freq = new long[26];
for (int i = 0; i < s.length(); i++) {
total += cost[i];
freq[s.charAt(i) - 'a'] += cost[i];
}
long maxFreq = 0;
for (int i = 0; i < 26; i++) {
if (freq[i] > maxFreq) {
maxFreq = freq[i];
}
}
return total - maxFreq;
}
}
class Solution:
def minCost(self, s: str, cost: list[int]) -> int:
total = 0
freq = [0] * 26
for i in range(len(s)):
total += cost[i]
freq[ord(s[i]) - ord('a')] += cost[i]
max_freq = max(freq)
return total - max_freq
Approach 2: Greedy by Character Groups
Process the string in groups of consecutive identical characters. For each group, keep only the character with the maximum cost and delete the rest.
Algorithm Steps
- Initialize
total_cost = 0andi = 0 - While
i < n:- Start new group with character
s[i] - Track
group_max = cost[i]andgroup_sum = cost[i] - Move
ito next character while same character - For each same character, update
group_maxandgroup_sum - Add
group_sum - group_maxtototal_cost(delete all but max)
- Start new group with character
- Return
total_cost
How It Works
This approach works because in the final string, all characters must be equal. If we have consecutive identical characters, we need to keep only one of them (the most expensive) to minimize cost. For different characters in different groups, we'll eventually need to delete entire groups except for one character type.
Complexity Analysis
| Time Complexity | Space Complexity |
|---|---|
| O(n) | O(1) |
Single pass through the string with constant extra variables.
class Solution {
public:
long long minCost(string s, vector<int>& cost) {
long long total = 0;
int n = s.length();
int i = 0;
while (i < n) {
char current_char = s[i];
long long group_sum = 0;
long long group_max = 0;
// Process all consecutive same characters
while (i < n && s[i] == current_char) {
group_sum += cost[i];
if (cost[i] > group_max) {
group_max = cost[i];
}
i++;
}
// Add cost of deleting all but the most expensive in this group
total += (group_sum - group_max);
}
return total;
}
};
class Solution {
public long minCost(String s, int[] cost) {
long total = 0;
int n = s.length();
int i = 0;
while (i < n) {
char currentChar = s.charAt(i);
long groupSum = 0;
long groupMax = 0;
// Process all consecutive same characters
while (i < n && s.charAt(i) == currentChar) {
groupSum += cost[i];
if (cost[i] > groupMax) {
groupMax = cost[i];
}
i++;
}
// Add cost of deleting all but the most expensive in this group
total += (groupSum - groupMax);
}
return total;
}
}
class Solution:
def minCost(self, s: str, cost: list[int]) -> int:
total = 0
n = len(s)
i = 0
while i < n:
current_char = s[i]
group_sum = 0
group_max = 0
# Process all consecutive same characters
while i < n and s[i] == current_char:
group_sum += cost[i]
group_max = max(group_max, cost[i])
i += 1
# Add cost of deleting all but the most expensive in this group
total += (group_sum - group_max)
return total
Approach 3: Dynamic Programming (Alternative)
Use dynamic programming to track the minimum cost to make all characters equal up to position i, considering which character we keep.
Algorithm Steps
- Initialize
dp[26]wheredp[ch]= minimum cost to make all characters equal to characterch - For each character
s[i]with costcost[i]:- If we keep character
s[i], we don't delete it (cost 0 for this character if it matches target) - If we change to different character, we must delete
s[i](addcost[i])
- If we keep character
- Update DP table accordingly
- Return minimum from
dp[0..25]
How It Works
For each character position, we consider keeping each possible character (a-z). If the current character matches our target, we don't pay deletion cost. Otherwise, we add the deletion cost.
Complexity Analysis
| Time Complexity | Space Complexity |
|---|---|
| O(n × 26) | O(26) |
Where n is the length of string. This is less efficient than Approach 1 but demonstrates the DP pattern.
class Solution {
public:
long long minCost(string s, vector<int>& cost) {
int n = s.length();
// dp[ch] = minimum cost to make all characters equal to character 'a' + ch
vector<long long> dp(26, 0);
for (int i = 0; i < n; i++) {
int current_char = s[i] - 'a';
// Temporary array for new DP values
vector<long long> new_dp(26, LLONG_MAX);
for (int ch = 0; ch < 26; ch++) {
if (dp[ch] == LLONG_MAX) continue;
if (ch == current_char) {
// Current character matches target, keep it (no cost)
new_dp[ch] = min(new_dp[ch], dp[ch]);
} else {
// Current character doesn't match, delete it
new_dp[ch] = min(new_dp[ch], dp[ch] + cost[i]);
}
}
dp = new_dp;
}
// Find minimum cost among all possible target characters
long long min_cost = LLONG_MAX;
for (int ch = 0; ch < 26; ch++) {
min_cost = min(min_cost, dp[ch]);
}
return min_cost;
}
};
class Solution {
public long minCost(String s, int[] cost) {
int n = s.length();
// dp[ch] = minimum cost to make all characters equal to character 'a' + ch
long[] dp = new long[26];
Arrays.fill(dp, Long.MAX_VALUE);
for (int i = 0; i < n; i++) {
int currentChar = s.charAt(i) - 'a';
long[] newDp = new long[26];
Arrays.fill(newDp, Long.MAX_VALUE);
for (int ch = 0; ch < 26; ch++) {
if (dp[ch] == Long.MAX_VALUE) continue;
if (ch == currentChar) {
// Current character matches target, keep it (no cost)
newDp[ch] = Math.min(newDp[ch], dp[ch]);
} else {
// Current character doesn't match, delete it
newDp[ch] = Math.min(newDp[ch], dp[ch] + cost[i]);
}
}
dp = newDp;
}
// Find minimum cost among all possible target characters
long minCost = Long.MAX_VALUE;
for (int ch = 0; ch < 26; ch++) {
minCost = Math.min(minCost, dp[ch]);
}
return minCost;
}
}
class Solution:
def minCost(self, s: str, cost: list[int]) -> int:
n = len(s)
# dp[ch] = minimum cost to make all characters equal to character ch
dp = [float('inf')] * 26
for i in range(n):
current_char = ord(s[i]) - ord('a')
new_dp = [float('inf')] * 26
for ch in range(26):
if dp[ch] == float('inf'):
continue
if ch == current_char:
# Current character matches target, keep it (no cost)
new_dp[ch] = min(new_dp[ch], dp[ch])
else:
# Current character doesn't match, delete it
new_dp[ch] = min(new_dp[ch], dp[ch] + cost[i])
dp = new_dp
# Find minimum cost among all possible target characters
min_cost = min(dp)
return min_cost
Visual Example Walkthrough
Let's trace through Approach 1 with s = "aabaac", cost = [1,2,3,4,1,10]:
| Step | Action | Total Cost | Frequency Array Updates | Explanation |
|---|---|---|---|---|
| 1 | Process 'a' at index 0 | total = 1 | freq[0] = 1 (for 'a') | Add cost 1 to total and freq['a'] |
| 2 | Process 'a' at index 1 | total = 3 | freq[0] = 3 | Add cost 2 to total and freq['a'] |
| 3 | Process 'b' at index 2 | total = 6 | freq[1] = 3 (for 'b') | Add cost 3 to total and freq['b'] |
| 4 | Process 'a' at index 3 | total = 10 | freq[0] = 7 | Add cost 4 to total and freq['a'] |
| 5 | Process 'a' at index 4 | total = 11 | freq[0] = 8 | Add cost 1 to total and freq['a'] |
| 6 | Process 'c' at index 5 | total = 21 | freq[2] = 10 (for 'c') | Add cost 10 to total and freq['c'] |
| 7 | Find max frequency | total = 21 | max_freq = max(8, 3, 10, ...) = 10 | Character 'c' has highest total cost (10) |
| 8 | Calculate result | result = 21 - 10 = 11 | Keep 'c', delete others | Minimum cost is 11 |
Character Frequencies
'a': 1 + 2 + 4 + 1 = 8
'b': 3 = 3
'c': 10 = 10
Total: 1 + 2 + 3 + 4 + 1 + 10 = 21
Optimal Solution
Keep character: 'c' (cost 10)
Delete characters: All 'a's and 'b'
Deletion cost: 1 + 2 + 3 + 4 + 1 = 11
Result: String "c"
Alternative Choice
If keep 'a': Delete 'b' and 'c'
Deletion cost: 3 + 10 = 13
If keep 'b': Delete all 'a's and 'c'
Deletion cost: 1+2+4+1+10 = 18
Keeping 'c' is optimal
Comparison of Approaches
| Approach | Time Complexity | Space Complexity | Key Insight | Best Use Case |
|---|---|---|---|---|
| Frequency Array | O(n) | O(1) | Keep character with maximum total cost | Optimal solution, simple implementation |
| Greedy by Groups | O(n) | O(1) | For each consecutive group, keep only the most expensive | When thinking about consecutive characters |
| Dynamic Programming | O(n × 26) | O(26) | Frame as decision problem at each position | Learning DP patterns, generalization |
Edge Cases and Testing
1. All Characters Same
Input: s = "aaaaa", cost = [1,2,3,4,5] → Output: 0
All characters are already equal, no deletion needed
Frequency array: 'a' = 15, total = 15, result = 15 - 15 = 0
2. Single Character
Input: s = "a", cost = [100] → Output: 0
Only one character, must keep it (non-empty result)
Frequency array: 'a' = 100, total = 100, result = 100 - 100 = 0
3. All Different Characters
Input: s = "abcde", cost = [10,20,30,40,50] → Output: 100
Total = 10+20+30+40+50 = 150
Max frequency: max(10,20,30,40,50) = 50 (for 'e')
Result = 150 - 50 = 100
Keep 'e', delete others
4. Two Characters with Equal Total Cost
Input: s = "abab", cost = [5,5,5,5] → Output: 10
Total = 20
'a' total = 5+5 = 10
'b' total = 5+5 = 10
Result = 20 - 10 = 10 (can keep either 'a' or 'b')
5. Very Large Costs
Input: s = "ab", cost = [10^9, 10^9] → Output: 10^9
Total = 2×10^9
'a' = 10^9, 'b' = 10^9
Result = 2×10^9 - 10^9 = 10^9
Must use 64-bit integers to avoid overflow
6. Consecutive Identical Characters
Input: s = "aaabbb", cost = [1,2,3,4,5,6] → Output: 10
Greedy approach:
Group 'aaa': keep max=3, delete 1+2 = 3
Group 'bbb': keep max=6, delete 4+5 = 9
Total = 3 + 9 = 12? Wait, this is wrong
Frequency approach:
Total = 1+2+3+4+5+6 = 21
'a' = 1+2+3 = 6
'b' = 4+5+6 = 15
Result = 21 - 15 = 6 (keep 'b', delete all 'a's)
FAQs
1. Why does keeping the character with maximum total cost give minimum deletion cost?
Deletion cost = total cost of all characters - cost of kept characters. Since total cost is fixed, minimizing deletion cost means maximizing the cost of kept characters. We must keep all occurrences of one character type, so we choose the character with maximum total cost across all its occurrences.
2. Can we keep a subset of a character rather than all occurrences?
Yes, but it wouldn't help. If we keep a subset of character 'a', we're deleting some 'a's, which adds to deletion cost without any benefit. The optimal strategy is to keep all occurrences of the chosen character type.
3. What if multiple characters have the same maximum total cost?
It doesn't matter which one we choose - the deletion cost will be the same. For example, if 'a' and 'b' both have total cost 100, then total - 100 = same result whether we keep 'a' or 'b'.
4. Why is the result guaranteed to be non-negative?
The maximum frequency is always ≤ total cost, so total - max_freq ≥ 0. It's 0 only when all characters are the same (max_freq = total).
5. How do we handle integer overflow with large costs?
Use 64-bit integers (long long in C++, long in Java, int in Python handles big integers automatically). Cost[i] ≤ 10^9 and n ≤ 10^5, so total ≤ 10^14, which fits in 64-bit.
6. What's wrong with the Greedy by Groups approach?
The Greedy approach minimizes cost within each group but doesn't consider that we might want to delete entire groups. It assumes we'll keep at least one character from each group, but the optimal solution might delete all characters of certain types entirely.
7. Can this problem be solved with sorting?
Sorting would break the connection between characters and their costs, so no. We need to preserve which cost belongs to which character.
8. What if the string could have uppercase letters or other characters?
We would need a larger frequency array (size 52 for uppercase+lowercase, or 256 for all ASCII). The algorithm remains the same: total cost - max(cost_sum for each character).
9. Is there a brute force approach?
Yes, try keeping each character type (26 possibilities), calculate deletion cost for each, take minimum. This is O(26 × n) which is essentially O(n) and similar to our frequency array approach.
10. Can we solve this with a hash map instead of array?
Yes, use a hash map to store total cost for each character. For lowercase letters only, array is more efficient. For arbitrary characters, hash map is better.