Word Count Engine

Implement a document scanning function wordCountEngine, which receives a string document and returns a list of all unique words in it and their number of occurrences, sorted by the number of occurrences in a descending order. If two or more words have the same count, they should be sorted according to their order in the original sentence. Assume that all letters are in english alphabet. You function should be case-insensitive, so for instance, the words “Perfect” and “perfect” should be considered the same word.

The engine should strip out punctuation (even in the middle of a word) and use whitespaces to separate words.

Analyze the time and space complexities of your solution. Try to optimize for time while keeping a polynomial space complexity.

Examples:

input: document = “Practice makes perfect. you’ll only
get Perfect by practice. just practice!”

output: [ [“practice”, “3”], [“perfect”, “2”],
[“makes”, “1”], [“youll”, “1”], [“only”, “1”],
[“get”, “1”], [“by”, “1”], [“just”, “1”] ]
Important: please convert the occurrence integers in the output list to strings (e.g. “3” instead of 3). We ask this because in compiled languages such as C#, Java, C++, C etc., it’s not straightforward to create mixed-type arrays (as it is, for instance, in scripted languages like JavaScript, Python, Ruby etc.). The expected output will simply be an array of string arrays.

Constraints:

[time limit] 5000ms
[input] string document
[output] array.array.string

Award Budget Cuts

The awards committee of your alma mater (i.e. your college/university) asked for your assistance with a budget allocation problem they’re facing. Originally, the committee planned to give N research grants this year. However, due to spending cutbacks, the budget was reduced to newBudget dollars and now they need to reallocate the grants. The committee made a decision that they’d like to impact as few grant recipients as possible by applying a maximum cap on all grants. Every grant initially planned to be higher than cap will now be exactly cap dollars. Grants less or equal to cap, obviously, won’t be impacted.

Given an array grantsArray of the original grants and the reduced budget newBudget, write a function findGrantsCap that finds in the most efficient manner a cap such that the least number of recipients is impacted and that the new budget constraint is met (i.e. sum of the N reallocated grants equals to newBudget).

Analyze the time and space complexities of your solution.

Example:

1
2
3
4
5
input:  grantsArray = [2, 100, 50, 120, 1000], newBudget = 190

output: 47 # and given this cap the new grants array would be
# [2, 47, 47, 47, 47]. Notice that the sum of the
# new grants is indeed 190

Constraints:

[time limit] 5000ms

[input] array.double grantsArray

0 ≤ grantsArray.length ≤ 20
0 ≤ grantsArray[i]
[input] double newBudget

[output] double

Attempt

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
#include <iostream>
#include <vector>

using namespace std;

double findGrantsCap( vector<int> grantsArray, int newBudget )
{
//sort(grantsArray);
int sum = 0;
int len = grantsArray.size();

double mean = newBudget / len;
//cout << "mean = " << mean << endl;
// get count and sum of values less than mean

int count = 0;
int rem_sum = 0;
for (auto e: grantsArray) {
if (e < mean) {
count++;
rem_sum += e;
}
}
cout << (newBudget - rem_sum) / (len - count) << endl;
return ((newBudget - rem_sum) / (len - count));


}

int main() {
findGrantsCap({2, 100, 50, 120, 1000}, 190);
return 0;
}

Fails a few test cases.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Time: 8ms
Passed: 2
Failed: 4
Test Case #1
Input: [2,4], 3,Expected: 1.5,Actual: 1
Test Case #2
Input: [2,4,6], 3,Expected: 1,Actual: 1
Test Case #3
Input: [2,100,50,120,167], 400,Expected: 128,Actual: 116
Test Case #4
Input: [2,100,50,120,1000], 190,Expected: 47,Actual: 47
Test Case #5
Input: [21,100,50,120,130,110], 140,Expected: 23.8,Actual: 23
Test Case #6
Input: [210,200,150,193,130,110,209,342,117], 1530,Expected: 211,Actual: 204

Longest Harmonious Subsequence

Problem

We define a harmonious array is an array where the difference between its maximum value and its minimum value is exactly 1.

Now, given an integer array, you need to find the length of its longest harmonious subsequence among all its possible subsequences.

Example 1:

Input: [1,3,2,2,5,2,3,7]
Output: 5
Explanation: The longest harmonious subsequence is [3,2,2,2,3].

Note: The length of the input array will not exceed 20,000.

Analysis

The brute force approach would involve taking every possible combination. That would be horribly slow at time complexity of O(2^N).

Accepted

We use a map to store {item, item_count}. Then we iterate through the keys and look for key+1.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
class Solution {
public:
int findLHS(vector<int>& nums) {
unordered_map<int, int> counts_map;

for (auto e: nums) {
counts_map[e]++;
}
int max_len = 0;
for(auto it = counts_map.cbegin(); it != counts_map.cend(); it++) {
if (counts_map.find(it->first + 1) != counts_map.cend()) {
max_len = max(max_len, counts_map[it->first] + counts_map[it->first + 1]);
}
}

return max_len;
}
};

Chalkboard XOR Game 🚧

Problem

We are given non-negative integers nums[i] which are written on a chalkboard. Alice and Bob take turns erasing exactly one number from the chalkboard, with Alice starting first. If erasing a number causes the bitwise XOR of all the elements of the chalkboard to become 0, then that player loses. (Also, we’ll say the bitwise XOR of one element is that element itself, and the bitwise XOR of no elements is 0.)

Also, if any player starts their turn with the bitwise XOR of all the elements of the chalkboard equal to 0, then that player wins.

Return True if and only if Alice wins the game, assuming both players play optimally.

Example:
Input: nums = [1, 1, 2]
Output: false
Explanation:
Alice has two choices: erase 1 or erase 2.
If she erases 1, the nums array becomes [1, 2]. The bitwise XOR of all the elements of the chalkboard is 1 XOR 2 = 3. Now Bob can remove any element he wants, because Alice will be the one to erase the last element and she will lose.
If Alice erases 2 first, now nums becomes [1, 1]. The bitwise XOR of all the elements of the chalkboard is 1 XOR 1 = 0. Alice will lose.

Notes:

  • 1 <= N <= 1000.
  • 0 <= nums[i] <= 2^16.

Min Depth of Binary Tree

Problem

Given a binary tree, find its minimum depth.

The minimum depth is the number of nodes along the shortest path from the root node down to the nearest leaf node.

Note: A leaf is a node with no children.

Example:

Given binary tree [3,9,20,null,null,15,7],

1
2
3
4
5
  3
/ \
9 20
/ \
15 7

return its minimum depth = 2.

Attempt

We can traverse tree and record the min depth among leaf nodes.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
int min_depth = INT_MAX;
void traverse(TreeNode *root, int current_depth) {
if (root == nullptr) {
min_depth = min(min_depth, current_depth - 1);
return;
}
traverse(root->left, current_depth + 1);
traverse(root->right, current_depth + 1);
}
public:
int minDepth(TreeNode* root) {
traverse(root, 1);
return min_depth;
}
};

Aha! I need to read the question more carefully 😧 I need to count the number of nodes!!! Additionally, we need to exclude the paths where a leaf may be missing, since that would skew the result! This is a pretty good problem, even though it’s marked as easy. I also don’t need to add a separate recursive function. We can just use the given function recursively.

Accepted

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
int minDepth(TreeNode* root) {
if (!root) return 0;
int L = minDepth(root->left);
int R = minDepth(root->right);
// if one of them is zero then we need to return the non-zero value
return ( 1 + ((L && R)? min(L, R) : max(L, R)));
}
};

Expressive Words

Problem

Sometimes people repeat letters to represent extra feeling, such as “hello” -> “heeellooo”, “hi” -> “hiiii”. Here, we have groups, of adjacent letters that are all the same character, and adjacent characters to the group are different. A group is extended if that group is length 3 or more, so “e” and “o” would be extended in the first example, and “i” would be extended in the second example. As another example, the groups of “abbcccaaaa” would be “a”, “bb”, “ccc”, and “aaaa”; and “ccc” and “aaaa” are the extended groups of that string.

For some given string S, a query word is stretchy if it can be made to be equal to S by extending some groups. Formally, we are allowed to repeatedly choose a group (as defined above) of characters c, and add some number of the same character c to it so that the length of the group is 3 or more. Note that we cannot extend a group of size one like “h” to a group of size two like “hh” - all extensions must leave the group extended - ie., at least 3 characters long.

Given a list of query words, return the number of words that are stretchy.

Example:
Input:
S = “heeellooo”
words = [“hello”, “hi”, “helo”]
Output: 1
Explanation:
We can extend “e” and “o” in the word “hello” to get “heeellooo”.
We can’t extend “helo” to get “heeellooo” because the group “ll” is not extended.

Notes:

  • 0 <= len(S) <= 100.
  • 0 <= len(words) <= 100.
  • 0 <= len(words[i]) <= 100.
  • S and all words in words consist only of lowercase letters

Analysis

We generate a mapping of character to counts and compare them.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
class Solution {
using counts_t = pair<char, int>;
using counts_vec_t = vector<counts_t>;

void gen_counts(const string& s, counts_vec_t& v) {
v.push_back({s[0], 1});
char last = s[0];
for(int i = 0; i < (int) s.size(); i++) {
if (last == s[i]) {
v.back().second++;
}
else {
v.push_back({s[i], 1});
last = s[i];
}
}
}
public:
int expressiveWords(string s, vector<string>& words) {
counts_vec_t vec;
gen_counts(s, vec);
int matched_count = 0;
for (auto w: words) {
counts_vec_t wc;
gen_counts(w, wc);
bool matched = false;
for (int i = 0; i < vec.size(); i++) {
if (i >= wc.size()) break;
if (vec[i].first != wc[i].first) break;
if (vec[i].second >= 3) matched = true;
else if (vec[i].second == wc[i].second) matched = true;
}
if (matched) matched_count++;
}
return matched_count;
}
};

Accepted

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
class Solution {
using counts_t = pair<char, int>;
using counts_vec_t = vector<counts_t>;

void gen_counts(const string& s, counts_vec_t& v) {
v.push_back({s[0], 1});
char last = s[0];
for(int i = 1; i < (int) s.size(); i++) {
if (last == s[i]) {
v.back().second++;
}
else {
v.push_back({s[i], 1});
last = s[i];
}
}
}
public:
int expressiveWords(string s, vector<string>& words) {
counts_vec_t vec;
gen_counts(s, vec);
int matched_count = 0;
for (auto w: words) {
counts_vec_t wc;
gen_counts(w, wc);

bool matched = true;
for (int i = 0; i < vec.size(); i++) {
if ((i > wc.size()) ||
//cout << wc[i].first << ", " << wc[i].second << endl;
(vec[i].first != wc[i].first) ||

(vec[i].second < wc[i].second) ||
(wc[i].second == 1 && vec[i].second == 2) )
{ matched = false; break; }
}
if (matched) {
//cout << w << " matched\n";
matched_count++;
}
}
return matched_count;
}
};

Reverse Words 🚧

Problem

Given an input string, reverse the string word by word.

Example:

Input: “the sky is blue”,
Output: “blue is sky the”.

Note:

A word is defined as a sequence of non-space characters.
Input string may contain leading or trailing spaces. However, your reversed string should not contain leading or trailing spaces.
You need to reduce multiple spaces between two words to a single space in the reversed string.

Follow up: For C programmers, try to solve it in-place in O(1) space.

Analysis

For an in-place O(1) space solution we can solve in three steps.

  1. Trim extra spaces
  2. Reverse whole string
  3. reverse each word

However, this is takes longer. If we have string with m words of average length x and is n characters long, it would take O(n+m*x)? 🚧

First Attempt

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
class Solution {
private:
void trim(string& s, int i) {
while (i < s.size()) {
if (s[i] == ' ') s.erase(i, 1);
else break;
}
}

string::iterator rword(string &s, string::iterator start) {
auto end = start;
while (*end != ' ') {
end++;
}
reverse(start, end);
return end;
}

public:
void reverseWords(string &s) {
// trim leading spaces
trim(s, 0);

// trim remaining extra spaces
int i = 0;
while (i < s.size()) {
if (s[i] != ' ') {
i++;
} else {
trim(s, i+1);
}

}

//reverse the whole input
reverse(s.begin(), s.end());

//reverse each word
auto it = s.begin();
while (it != s.end()) {
it = rword(s, it);
it++; //skip spaces between words
}
}
};

This one goes into an infinite loop.

Accepted (slow)

Used Visual Studio Debugger to step through and find the cycle. Looks like we need to be careful about moving iterators while maintaining single space between words. All in all this is a very complex solution.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
// ReverseWords.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <string>
#include <algorithm>
#include <iostream>

using namespace std;

class Solution {
private:
int trim(string& s, int i) {
while (i < (int) s.size()) {
if (s[i] == ' ') s.erase(i, 1);
else break;
}
return i + 1; // skip to non-space char (may be past end of string)
}

string::iterator rword(string &s, string::iterator start) {
auto end = start;
while (end != s.end() && *end != ' ') {
end++;
}
reverse(start, end);
return end;
}

public:
void reverseWords(string &s) {
// trim leading spaces
trim(s, 0);

// trim remaining extra spaces
int i = 0;
while (i < (int) s.size()) {
if (s[i] != ' ') {
i++;
}
else {
i = trim(s, i + 1);
}

}

//reverse the whole input
reverse(s.begin(), s.end());

// trim any spaces left over from last run
trim(s, 0);

//reverse each word
auto it = s.begin();
while (it != s.end()) {
it = rword(s, it);
if (it != s.end()) it++; //skip spaces between words
}
}
};

int main()
{
Solution sol;
string str = " This is a test ";
sol.reverseWords(str);
cout << "Reversed: " << str << endl;
return 0;
}

Accepted(improved) [TODO]

With C++ stringstream is an effective option for problems such as these.

1
2


First Bad Version

Problem

You are a product manager and currently leading a team to develop a new product. Unfortunately, the latest version of your product fails the quality check. Since each version is developed based on the previous version, all the versions after a bad version are also bad.

Suppose you have n versions [1, 2, …, n] and you want to find out the first bad one, which causes all the following ones to be bad.

You are given an API bool isBadVersion(version) which will return whether version is bad. Implement a function to find the first bad version. You should minimize the number of calls to the API.

Example:

Given n = 5, and version = 4 is the first bad version.

call isBadVersion(3) -> false
call isBadVersion(5) -> true
call isBadVersion(4) -> true

Then 4 is the first bad version.

Analysis

We should use binary search to get a logarithmatic time.

One important thing to keep in mind is that (left + right)/2 can actually cause overflow! Interesting discussion of that in the solution as follows:

If you are setting mid=left+right2mid = \frac{left + right}{2}mid=2left+right​, you have to be very careful. Unless you are using a language that does not overflow such as Python, left+rightleft + rightleft+right could overflow. One way to fix this is to use left+right−left2left + \frac{right - left}{2}left+2right−left​ instead.

If you fall into this subtle overflow bug, you are not alone. Even Jon Bentley’s own implementation of binary search had this overflow bug and remained undetected for over twenty years.

Attempt

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// Forward declaration of isBadVersion API.
bool isBadVersion(int version);

class Solution {
public:
int firstBadVersion(int n) {
int left = 1;
int right = n;
while (left < right) {
int mid = left + (right - left)/2;
if (isBadVersion(mid)) {
right = mid - 1;
} else {
left = mid + 1;
}
}
return left;
}
};

We should compare left <= right since otherwise we will end up missing one.

Accepted

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// Forward declaration of isBadVersion API.
bool isBadVersion(int version);

class Solution {
public:
int firstBadVersion(int n) {
int left = 1;
int right = n;
while (left <= right) {
int mid = left + (right - left)/2;
if (isBadVersion(mid)) {
right = mid - 1;
} else {
left = mid + 1;
}
}
return left;
}
};

Array Nesting

Problem

A zero-indexed array A of length N contains all integers from 0 to N-1. Find and return the longest length of set S, where S[i] = {A[i], A[A[i]], A[A[A[i]]], … } subjected to the rule below.

Suppose the first element in S starts with the selection of element A[i] of index = i, the next element in S should be A[A[i]], and then A[A[A[i]]]… By that analogy, we stop adding right before a duplicate element occurs in S.

Example 1:

Input: A = [5,4,0,3,1,6,2]
Output: 4
Explanation:
A[0] = 5, A[1] = 4, A[2] = 0, A[3] = 3, A[4] = 1, A[5] = 6, A[6] = 2.

One of the longest S[K]:
S[0] = {A[0], A[5], A[6], A[2]} = {5, 6, 2, 0}

Note:

  • N is an integer within the range [1, 20,000].
  • The elements of A are all distinct.
  • Each element of A is an integer within the range [0, N-1].

Accepted

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
class Solution {
public:
int arrayNesting(vector<int>& nums) {
int longest = 0;
int visited[nums.size()] = {0};
for (int i = 0; i < nums.size(); i++) {
// if we have alreayd visited this then starting from this index
// will always be shorter, so skip
if (visited[i]) continue;
int j = i, count = 0;
while (j >=0 && j < nums.size() && !visited[j]) {
count++;
visited[j] = 1;
if (j < 0 || j >= nums.size()) break;
j = nums[j];
}
longest = max(longest, count);
}
return longest;
}
};