3784. Minimum Deletion Cost to Make All Characters Equal

Difficulty: Medium
Topics: String, Greedy, Array
Companies: Google, Amazon, Meta
Acceptance Rate: 54.5%
Minimum Deletion Cost visualization
Pro Tip: This problem can be solved optimally with O(n) time and O(1) space using a frequency array approach. The key insight is that to minimize deletion cost, we should keep the character with the highest total deletion cost and delete all others. Alternatively, we can use a greedy approach by keeping the most expensive character in each group of consecutive characters.

Problem Statement

You are given a string s of length n and an integer array cost of the same length, where cost[i] is the cost to delete the ith character of s.

You may delete any number of characters from s (possibly none), such that the resulting string is non-empty and consists of equal characters.

Return an integer denoting the minimum total deletion cost required.

Example 1

Input: s = "aabaac", cost = [1,2,3,4,1,10]

Output: 11

Explanation:

Deleting the characters at indices 0, 1, 2, 3, 4 results in the string "c", which consists of equal characters, and the total cost is cost[0] + cost[1] + cost[2] + cost[3] + cost[4] = 1 + 2 + 3 + 4 + 1 = 11.

Example 2

Input: s = "abc", cost = [10,5,8]

Output: 13

Explanation:

Deleting the characters at indices 1 and 2 results in the string "a", which consists of equal characters, and the total cost is cost[1] + cost[2] = 5 + 8 = 13.

Example 3

Input: s = "zzzzz", cost = [67,67,67,67,67]

Output: 0

Explanation:

All characters in s are equal, so the deletion cost is 0.

Problem Link: View on LeetCode ↗

Key Insight

The solution requires understanding these crucial observations:

  1. The resulting string must consist of only one character type (all 'a's, all 'b's, etc.)
  2. We can delete any characters, but must keep at least one character (non-empty result)
  3. To minimize deletion cost, we should keep the character with the highest total deletion cost
  4. If we choose character ch to keep, we must delete all characters that are not ch
  5. Equivalently, we can think: minimum deletion cost = total cost - max(cost_sum for each character)

Alternative approach: For each group of consecutive identical characters, keep only the most expensive one in that group and delete the rest.

Important: The problem requires the resulting string to be non-empty. This means we must keep at least one character. In cases where all characters are the same, we keep all of them (cost 0). In cases with different characters, we choose one character type to keep entirely.

Approach 1: Frequency Array (Your Approach - Optimal)

Calculate the total cost of deleting all characters, then for each character (a-z), calculate the total cost of keeping that character. The minimum deletion cost is total cost minus the maximum keep cost.

Algorithm Steps

  1. Initialize total = 0 and freq[26] = {0}
  2. For each character in s:
    • Add cost[i] to total
    • Add cost[i] to freq[s[i] - 'a']
  3. Find max_freq = max(freq[0..25])
  4. Return total - max_freq

How It Works

For each character from 'a' to 'z', freq[ch] represents the cost of keeping all occurrences of that character. If we choose to keep character ch, we must delete all other characters at cost total - freq[ch]. We want to minimize this, which means maximizing freq[ch].

Complexity Analysis

Time Complexity Space Complexity
O(n) O(1)

Where n is the length of the string. Single pass through the string with constant extra space for the frequency array.

Frequency Array Approach (Your Solution)
class Solution {
public:
    long long minCost(string s, vector<int>& cost) {
        long long total = 0;
        vector<long long> freq(26, 0);
        
        for (int i = 0; i < s.length(); i++) {
            total += cost[i];
            freq[s[i] - 'a'] += cost[i];
        }
        
        long long max_freq = 0;
        for (int i = 0; i < 26; i++) {
            if (freq[i] > max_freq) {
                max_freq = freq[i];
            }
        }
        
        return total - max_freq;
    }
};
class Solution {
    public long minCost(String s, int[] cost) {
        long total = 0;
        long[] freq = new long[26];
        
        for (int i = 0; i < s.length(); i++) {
            total += cost[i];
            freq[s.charAt(i) - 'a'] += cost[i];
        }
        
        long maxFreq = 0;
        for (int i = 0; i < 26; i++) {
            if (freq[i] > maxFreq) {
                maxFreq = freq[i];
            }
        }
        
        return total - maxFreq;
    }
}
class Solution:
    def minCost(self, s: str, cost: list[int]) -> int:
        total = 0
        freq = [0] * 26
        
        for i in range(len(s)):
            total += cost[i]
            freq[ord(s[i]) - ord('a')] += cost[i]
        
        max_freq = max(freq)
        return total - max_freq
Optimal Solution: This approach is optimal with O(n) time and O(1) space. It works by considering that we must choose exactly one character type to keep entirely. The character with the highest total cost is the most "expensive" to delete, so we keep it to minimize deletion cost.

Approach 2: Greedy by Character Groups

Process the string in groups of consecutive identical characters. For each group, keep only the character with the maximum cost and delete the rest.

Algorithm Steps

  1. Initialize total_cost = 0 and i = 0
  2. While i < n:
    • Start new group with character s[i]
    • Track group_max = cost[i] and group_sum = cost[i]
    • Move i to next character while same character
    • For each same character, update group_max and group_sum
    • Add group_sum - group_max to total_cost (delete all but max)
  3. Return total_cost

How It Works

This approach works because in the final string, all characters must be equal. If we have consecutive identical characters, we need to keep only one of them (the most expensive) to minimize cost. For different characters in different groups, we'll eventually need to delete entire groups except for one character type.

Complexity Analysis

Time Complexity Space Complexity
O(n) O(1)

Single pass through the string with constant extra variables.

Greedy Group Approach
class Solution {
public:
    long long minCost(string s, vector<int>& cost) {
        long long total = 0;
        int n = s.length();
        int i = 0;
        
        while (i < n) {
            char current_char = s[i];
            long long group_sum = 0;
            long long group_max = 0;
            
            // Process all consecutive same characters
            while (i < n && s[i] == current_char) {
                group_sum += cost[i];
                if (cost[i] > group_max) {
                    group_max = cost[i];
                }
                i++;
            }
            
            // Add cost of deleting all but the most expensive in this group
            total += (group_sum - group_max);
        }
        
        return total;
    }
};
class Solution {
    public long minCost(String s, int[] cost) {
        long total = 0;
        int n = s.length();
        int i = 0;
        
        while (i < n) {
            char currentChar = s.charAt(i);
            long groupSum = 0;
            long groupMax = 0;
            
            // Process all consecutive same characters
            while (i < n && s.charAt(i) == currentChar) {
                groupSum += cost[i];
                if (cost[i] > groupMax) {
                    groupMax = cost[i];
                }
                i++;
            }
            
            // Add cost of deleting all but the most expensive in this group
            total += (groupSum - groupMax);
        }
        
        return total;
    }
}
class Solution:
    def minCost(self, s: str, cost: list[int]) -> int:
        total = 0
        n = len(s)
        i = 0
        
        while i < n:
            current_char = s[i]
            group_sum = 0
            group_max = 0
            
            # Process all consecutive same characters
            while i < n and s[i] == current_char:
                group_sum += cost[i]
                group_max = max(group_max, cost[i])
                i += 1
            
            # Add cost of deleting all but the most expensive in this group
            total += (group_sum - group_max)
        
        return total
Greedy Insight: This approach works because in the optimal solution, for each group of consecutive identical characters, we keep only the most expensive one. This is actually equivalent to Approach 1 but provides a different perspective on the problem.

Approach 3: Dynamic Programming (Alternative)

Use dynamic programming to track the minimum cost to make all characters equal up to position i, considering which character we keep.

Algorithm Steps

  1. Initialize dp[26] where dp[ch] = minimum cost to make all characters equal to character ch
  2. For each character s[i] with cost cost[i]:
    • If we keep character s[i], we don't delete it (cost 0 for this character if it matches target)
    • If we change to different character, we must delete s[i] (add cost[i])
  3. Update DP table accordingly
  4. Return minimum from dp[0..25]

How It Works

For each character position, we consider keeping each possible character (a-z). If the current character matches our target, we don't pay deletion cost. Otherwise, we add the deletion cost.

Complexity Analysis

Time Complexity Space Complexity
O(n × 26) O(26)

Where n is the length of string. This is less efficient than Approach 1 but demonstrates the DP pattern.

Dynamic Programming Approach
class Solution {
public:
    long long minCost(string s, vector<int>& cost) {
        int n = s.length();
        
        // dp[ch] = minimum cost to make all characters equal to character 'a' + ch
        vector<long long> dp(26, 0);
        
        for (int i = 0; i < n; i++) {
            int current_char = s[i] - 'a';
            
            // Temporary array for new DP values
            vector<long long> new_dp(26, LLONG_MAX);
            
            for (int ch = 0; ch < 26; ch++) {
                if (dp[ch] == LLONG_MAX) continue;
                
                if (ch == current_char) {
                    // Current character matches target, keep it (no cost)
                    new_dp[ch] = min(new_dp[ch], dp[ch]);
                } else {
                    // Current character doesn't match, delete it
                    new_dp[ch] = min(new_dp[ch], dp[ch] + cost[i]);
                }
            }
            
            dp = new_dp;
        }
        
        // Find minimum cost among all possible target characters
        long long min_cost = LLONG_MAX;
        for (int ch = 0; ch < 26; ch++) {
            min_cost = min(min_cost, dp[ch]);
        }
        
        return min_cost;
    }
};
class Solution {
    public long minCost(String s, int[] cost) {
        int n = s.length();
        
        // dp[ch] = minimum cost to make all characters equal to character 'a' + ch
        long[] dp = new long[26];
        Arrays.fill(dp, Long.MAX_VALUE);
        
        for (int i = 0; i < n; i++) {
            int currentChar = s.charAt(i) - 'a';
            long[] newDp = new long[26];
            Arrays.fill(newDp, Long.MAX_VALUE);
            
            for (int ch = 0; ch < 26; ch++) {
                if (dp[ch] == Long.MAX_VALUE) continue;
                
                if (ch == currentChar) {
                    // Current character matches target, keep it (no cost)
                    newDp[ch] = Math.min(newDp[ch], dp[ch]);
                } else {
                    // Current character doesn't match, delete it
                    newDp[ch] = Math.min(newDp[ch], dp[ch] + cost[i]);
                }
            }
            
            dp = newDp;
        }
        
        // Find minimum cost among all possible target characters
        long minCost = Long.MAX_VALUE;
        for (int ch = 0; ch < 26; ch++) {
            minCost = Math.min(minCost, dp[ch]);
        }
        
        return minCost;
    }
}
class Solution:
    def minCost(self, s: str, cost: list[int]) -> int:
        n = len(s)
        
        # dp[ch] = minimum cost to make all characters equal to character ch
        dp = [float('inf')] * 26
        
        for i in range(n):
            current_char = ord(s[i]) - ord('a')
            new_dp = [float('inf')] * 26
            
            for ch in range(26):
                if dp[ch] == float('inf'):
                    continue
                
                if ch == current_char:
                    # Current character matches target, keep it (no cost)
                    new_dp[ch] = min(new_dp[ch], dp[ch])
                else:
                    # Current character doesn't match, delete it
                    new_dp[ch] = min(new_dp[ch], dp[ch] + cost[i])
            
            dp = new_dp
        
        # Find minimum cost among all possible target characters
        min_cost = min(dp)
        return min_cost
DP Insight: This dynamic programming approach, while less efficient, illustrates how we can frame the problem as making decisions at each character position. However, for this specific problem, the greedy/frequency array approaches are more optimal.

Visual Example Walkthrough

Let's trace through Approach 1 with s = "aabaac", cost = [1,2,3,4,1,10]:

a
a
b
a
a
c
1
2
3
4
1
10
Step Action Total Cost Frequency Array Updates Explanation
1 Process 'a' at index 0 total = 1 freq[0] = 1 (for 'a') Add cost 1 to total and freq['a']
2 Process 'a' at index 1 total = 3 freq[0] = 3 Add cost 2 to total and freq['a']
3 Process 'b' at index 2 total = 6 freq[1] = 3 (for 'b') Add cost 3 to total and freq['b']
4 Process 'a' at index 3 total = 10 freq[0] = 7 Add cost 4 to total and freq['a']
5 Process 'a' at index 4 total = 11 freq[0] = 8 Add cost 1 to total and freq['a']
6 Process 'c' at index 5 total = 21 freq[2] = 10 (for 'c') Add cost 10 to total and freq['c']
7 Find max frequency total = 21 max_freq = max(8, 3, 10, ...) = 10 Character 'c' has highest total cost (10)
8 Calculate result result = 21 - 10 = 11 Keep 'c', delete others Minimum cost is 11

Character Frequencies

'a': 1 + 2 + 4 + 1 = 8

'b': 3 = 3

'c': 10 = 10

Total: 1 + 2 + 3 + 4 + 1 + 10 = 21

Optimal Solution

Keep character: 'c' (cost 10)

Delete characters: All 'a's and 'b'

Deletion cost: 1 + 2 + 3 + 4 + 1 = 11

Result: String "c"

Alternative Choice

If keep 'a': Delete 'b' and 'c'

Deletion cost: 3 + 10 = 13

If keep 'b': Delete all 'a's and 'c'

Deletion cost: 1+2+4+1+10 = 18

Keeping 'c' is optimal

Visual Insight: The character with the highest total cost ('c' with cost 10) is the most "expensive" to delete, so we keep it and delete all other characters. This gives us the minimum total deletion cost of 11.

Comparison of Approaches

Approach Time Complexity Space Complexity Key Insight Best Use Case
Frequency Array O(n) O(1) Keep character with maximum total cost Optimal solution, simple implementation
Greedy by Groups O(n) O(1) For each consecutive group, keep only the most expensive When thinking about consecutive characters
Dynamic Programming O(n × 26) O(26) Frame as decision problem at each position Learning DP patterns, generalization
Recommendation: For interviews and practical use, the Frequency Array approach (Approach 1) is optimal in both time and space complexity. It's intuitive: to minimize deletion cost, keep the character that would be most expensive to delete. The Greedy approach provides an alternative perspective that's also efficient.

Edge Cases and Testing

1. All Characters Same

Input: s = "aaaaa", cost = [1,2,3,4,5] → Output: 0
All characters are already equal, no deletion needed
Frequency array: 'a' = 15, total = 15, result = 15 - 15 = 0

2. Single Character

Input: s = "a", cost = [100] → Output: 0
Only one character, must keep it (non-empty result)
Frequency array: 'a' = 100, total = 100, result = 100 - 100 = 0

3. All Different Characters

Input: s = "abcde", cost = [10,20,30,40,50] → Output: 100
Total = 10+20+30+40+50 = 150
Max frequency: max(10,20,30,40,50) = 50 (for 'e')
Result = 150 - 50 = 100
Keep 'e', delete others

4. Two Characters with Equal Total Cost

Input: s = "abab", cost = [5,5,5,5] → Output: 10
Total = 20
'a' total = 5+5 = 10
'b' total = 5+5 = 10
Result = 20 - 10 = 10 (can keep either 'a' or 'b')

5. Very Large Costs

Input: s = "ab", cost = [10^9, 10^9] → Output: 10^9
Total = 2×10^9
'a' = 10^9, 'b' = 10^9
Result = 2×10^9 - 10^9 = 10^9
Must use 64-bit integers to avoid overflow

6. Consecutive Identical Characters

Input: s = "aaabbb", cost = [1,2,3,4,5,6] → Output: 10
Greedy approach: 
  Group 'aaa': keep max=3, delete 1+2 = 3
  Group 'bbb': keep max=6, delete 4+5 = 9
  Total = 3 + 9 = 12? Wait, this is wrong
  
Frequency approach:
  Total = 1+2+3+4+5+6 = 21
  'a' = 1+2+3 = 6
  'b' = 4+5+6 = 15
  Result = 21 - 15 = 6 (keep 'b', delete all 'a's)
Important: The Greedy by Groups approach (Approach 2) and the Frequency Array approach (Approach 1) can give different results! The Greedy approach minimizes cost within each group, but doesn't consider that we might want to delete entire groups. The Frequency Array approach gives the correct global optimum.

FAQs

1. Why does keeping the character with maximum total cost give minimum deletion cost?

Deletion cost = total cost of all characters - cost of kept characters. Since total cost is fixed, minimizing deletion cost means maximizing the cost of kept characters. We must keep all occurrences of one character type, so we choose the character with maximum total cost across all its occurrences.

2. Can we keep a subset of a character rather than all occurrences?

Yes, but it wouldn't help. If we keep a subset of character 'a', we're deleting some 'a's, which adds to deletion cost without any benefit. The optimal strategy is to keep all occurrences of the chosen character type.

3. What if multiple characters have the same maximum total cost?

It doesn't matter which one we choose - the deletion cost will be the same. For example, if 'a' and 'b' both have total cost 100, then total - 100 = same result whether we keep 'a' or 'b'.

4. Why is the result guaranteed to be non-negative?

The maximum frequency is always ≤ total cost, so total - max_freq ≥ 0. It's 0 only when all characters are the same (max_freq = total).

5. How do we handle integer overflow with large costs?

Use 64-bit integers (long long in C++, long in Java, int in Python handles big integers automatically). Cost[i] ≤ 10^9 and n ≤ 10^5, so total ≤ 10^14, which fits in 64-bit.

6. What's wrong with the Greedy by Groups approach?

The Greedy approach minimizes cost within each group but doesn't consider that we might want to delete entire groups. It assumes we'll keep at least one character from each group, but the optimal solution might delete all characters of certain types entirely.

7. Can this problem be solved with sorting?

Sorting would break the connection between characters and their costs, so no. We need to preserve which cost belongs to which character.

8. What if the string could have uppercase letters or other characters?

We would need a larger frequency array (size 52 for uppercase+lowercase, or 256 for all ASCII). The algorithm remains the same: total cost - max(cost_sum for each character).

9. Is there a brute force approach?

Yes, try keeping each character type (26 possibilities), calculate deletion cost for each, take minimum. This is O(26 × n) which is essentially O(n) and similar to our frequency array approach.

10. Can we solve this with a hash map instead of array?

Yes, use a hash map to store total cost for each character. For lowercase letters only, array is more efficient. For arbitrary characters, hash map is better.

About Algopush: Algopush provides comprehensive algorithm solutions and explanations for coding interview preparation. Visit algopush.com for more problem solutions and coding resources.