What matters is working on a project that your manager thinks has impact.

- Working on a project that will soon be moved to a different team is not impact.
- A project your manager's manager does not care about means no impact. Helping junior team members without making it visible and part of your development goals is not impact either.
- Fixing bugs, dealing with production issues, and handling incidents are not impact; they are the baseline.
- So aligning with your manager on what impact is and how they measure it is a critically important part of this.

There are complicating factors, though:

- Frequently changing managers: what each new manager considers impact is clearly different. Committees and companies want this not to matter, but the real world is messier than that. If promotion were unbiased and based only on quantitative assessment, we would have robots, not managers.
- Changing projects: experiments fail and product directions change. If you are unlucky enough to be stuck on a project that spent a year on failed experiments, you have no impact on paper, even though you clearly helped the company.
- All of this is moot when considering a promotion to staff or principal, because dealing with uncertainty is part of those job descriptions. But from Eng2 to Senior, the limitation is the scope and the opportunities to influence things like what gets into next quarter's delivery.

So here is what an engineer at the lower levels can do, or at least the things I think would have helped me:

- Start writing the promotion packet when you join the company.
- Review it with your manager every second week.
- If you cannot write down something impactful that contributes to the promotion for two weeks, there is a problem.
- Account for the fact that the system is not perfect; making corrections along the way is important.

If I am ever a manager, I think I am going to make this an agenda item for every 1:1.

Let us take an example recursion tree,
*C being the function called recursively, and T(1) the base case.*

A code interpretation can look like this:

```
function C(param) {
    if (param == some condition)  // base case — a leaf, T(1)
        return 1
    // do something
    C(paramNew1)  // next computation — branch 1 of the recursion tree
    C(paramNew2)  // next computation — branch 2 of the recursion tree
    ...
}
```

Notice two things:

- **Tree depth** is the number of calls along the longest path from the root of the tree down to a base case, i.e. to the most distant leaf. In the diagram above it is n-1.
- **Tree breadth** (branching factor) is the number of recursive calls made from a single invocation at a given time. In the diagram above it is 2: at each step, a function splits into two more calls.

- Draw the recursion tree.
- For arbitrary `n`, find the **depth** `d` of the tree as `f(n)`.
- Find the average **branching factor** `b`, i.e. how many children are present per node on average.
- To visit every node in a tree of depth `d` with branching factor `b`, we take at least **b^d** operations, so the time complexity is ~ O(b^d).
- Memory complexity is determined by the number of recursive calls on the program stack at once, so it is ~ *O(recursion depth)*.
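To make the O(b^d) estimate concrete, here is a small sketch (the `count_calls` helper is my own, not from the post) that counts the calls made by a toy recursion with branching factor 2:

```python
def count_calls(n):
    """Toy recursion with branching factor b = 2 and depth d = n."""
    calls = 1  # count this invocation
    if n == 0:
        return calls  # base case T(1): a leaf
    calls += count_calls(n - 1)  # branch 1
    calls += count_calls(n - 1)  # branch 2
    return calls

# Total calls grow as O(b^d) = O(2^n); here exactly 2^(n+1) - 1.
for n in range(5):
    print(n, count_calls(n))  # 1, 3, 7, 15, 31
```

Doubling the depth squares the work, which is the signature of an exponential-time recursion.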

Let us take the example from the previous post:

```
public int minDistance(String word1, String word2) {
    return ED(word1, word2, 0, 0);
}

private int ED(String s1, String s2, int i, int j) {
    if (i == s1.length())
        return s2.length() - j; // remaining size difference
    if (j == s2.length())
        return s1.length() - i; // remaining size difference
    if (s1.charAt(i) == s2.charAt(j))
        return ED(s1, s2, i + 1, j + 1); // cost = 0
    else
        return Math.min(Math.min(
            ED(s1, s2, i, j + 1),     // insertion
            ED(s1, s2, i + 1, j)),    // deletion
            ED(s1, s2, i + 1, j + 1)) // substitution
            + 1;                      // cost = 1 for Levenshtein
}
```

Using the method above to determine complexity from the recursion tree, let's say:

- `s1` is the first string and `s2` is the second string.
- `s1` is of length `m` and `s2` is of length `n`.
- The depth `d` of the tree will be `min(m,n)`, as we terminate early once we have finished iterating over the smaller string.
- The branching factor `b` will be `3`, as we call the function `ED` recursively for insertion, deletion, and substitution.

So the time complexity is at least **O(3^min(m,n))** and at worst **O(3^n)**, which occurs when m = n. The space complexity is **O(n)** in the worst case, proportional to the recursion depth.

The time complexity is exponential in nature. For what kind of test case will this be *inefficient* for Edit Distance? Why?

Hint:

```
"dinitrophenylhydrazine"
"acetylphenylhydrazine"
```

Try to draw the recursion tree for this example and find out when it becomes *inefficient*.
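One way to see the blow-up without drawing the whole tree is to instrument the recursion and count the calls. Here is a Python port of the same brute force (the `ed_calls` wrapper is my own, not the post's code):

```python
def ed_calls(s1, s2):
    """Brute-force edit distance, returning (distance, number of calls made)."""
    count = 0

    def ed(i, j):
        nonlocal count
        count += 1
        if i == len(s1):
            return len(s2) - j  # remaining size difference
        if j == len(s2):
            return len(s1) - i  # remaining size difference
        if s1[i] == s2[j]:
            return ed(i + 1, j + 1)  # cost = 0
        return 1 + min(ed(i, j + 1),      # insertion
                       ed(i + 1, j),      # deletion
                       ed(i + 1, j + 1))  # substitution

    distance = ed(0, 0)
    return distance, count

# Matching prefixes keep the tree narrow; differing characters fan out 3-way.
print(ed_calls("abc", "abc"))  # → (0, 4)
print(ed_calls("abc", "xyz"))
```

Comparing the call counts for mostly-matching versus fully-differing strings shows where the exponential behaviour comes from.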

Stay tuned for the solution in the next blog to understand why it is *inefficient*. We will also explore approaches to make the Edit Distance algorithm *efficient*.

Below is a map of the things we have written so far. We would also like to hang out with people who are working through these algorithms. Join us on Discord or follow us on Twitter/Instagram.

Edit distance between two words is the minimum number of operations, i.e. character insertions, deletions, or substitutions, needed to transform one word into the other. Each operation has a cost associated with it.

Levenshtein distance is edit distance with `cost = 1` for each operation; it comes from the family of distance metrics.

Levenshtein distance - Wikipedia

- The edit distance between EXECUTE and EXPIRE is 4:
  EXECUTE → EXPCUTE → EXPIUTE → EXPIRTE → EXPIRE
- A better representation uses the "gap representation", where an empty space `_` denotes an insertion or deletion:

```
E X E C U T E
E X P I R _ E
0 0 1 1 1 1 0
```

**Sum of the costs = 4.**

```
ED(s1, s2, i, j) where i indexes s1[0...m-1] and j indexes s2[0...n-1]
Insertion:     recur for i and j+1
Deletion:      recur for i+1 and j
Substitution:  recur for i+1 and j+1
```

The brute-force approach is to **enumerate all possible sequences** of edit operations until the two strings match. While comparing characters, when the characters match we do not have to perform any operation, so the cost is 0. If they do not match, we try each of the insertion, deletion, and substitution operations, each with cost 1.

```
public int minDistance(String word1, String word2) {
    return ED(word1, word2, 0, 0);
}

private int ED(String s1, String s2, int i, int j) {
    if (i == s1.length())
        return s2.length() - j; // remaining size difference
    if (j == s2.length())
        return s1.length() - i; // remaining size difference
    if (s1.charAt(i) == s2.charAt(j))
        return ED(s1, s2, i + 1, j + 1); // cost = 0
    else
        return Math.min(Math.min(
            ED(s1, s2, i, j + 1),     // insertion
            ED(s1, s2, i + 1, j)),    // deletion
            ED(s1, s2, i + 1, j + 1)) // substitution
            + 1;                      // cost = 1 for Levenshtein
}
```

**Notice**:

- When there is a **prefix match** between the two strings, the tree progresses in only one direction, e.g. `ED(0,0)` and `ED(1,1)`.
- Otherwise three children (one for each operation) are generated, e.g. `ED(2,2)`.
- Also notice the repeating sub-problems, e.g. `ED(3,3)`.

During recursion, we are breaking the original problem down into sub-problems, and we keep recursing until we hit an empty string. *This hints that it is a dynamic programming problem.*
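We can confirm the repeating sub-problems empirically by counting how often each `(i, j)` state is visited — a hypothetical instrumentation sketch, not the post's code:

```python
from collections import Counter

def ed_state_visits(s1, s2):
    """Run the brute-force recursion and tally visits per (i, j) sub-problem."""
    visits = Counter()

    def ed(i, j):
        visits[(i, j)] += 1
        if i == len(s1):
            return len(s2) - j
        if j == len(s2):
            return len(s1) - i
        if s1[i] == s2[j]:
            return ed(i + 1, j + 1)
        return 1 + min(ed(i, j + 1), ed(i + 1, j), ed(i + 1, j + 1))

    ed(0, 0)
    return visits

v = ed_state_visits("sunday", "saturday")
repeated = {state: c for state, c in v.items() if c > 1}
print(len(repeated))  # many states are solved more than once
```

Caching each state the first time it is solved is exactly what memoization does, and it is what collapses the exponential tree.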

At a high level, the dynamic programming approaches include the ones below, and we will walk through them in the upcoming blogs of this series.

- Recursion and memoization
- Tabulation, aka the Wagner–Fischer algorithm

Let us first find out the complexity of the code above.

Time complexity:

- at least: O(3^min(m,n))
- worst case: O(3^n), which occurs when m = n

Space complexity:

- O(n)

Edit distance shows up in many applications:

- diff (Unix)
- stemming (NLP)
- spelling correction
- DNA sequence alignment
The complexity of the brute-force approach came out **exponential**. How did we arrive at that analysis?


- Window
- Invariant: something that must be true for the window to be valid
- Decision: what to do when the invariant is met

*Window*: `dups`, a Set. *Decision*: is there a repeating element? *Invariant*: size(window) <= k.

```
from typing import List

class Solution:
    def containsNearbyDuplicate(self, nums: List[int], k: int) -> bool:
        dups = set()
        for idx, elm in enumerate(nums):
            if elm in dups:
                return True
            dups.add(elm)
            if len(dups) > k:
                dups.remove(nums[idx - k])
        return False
```

*Window*: a window/queue. *Decision*: maximum sum. *Invariant*: size(window) <= k.

```
from typing import List

def maximumSubArrayOfSize(arr: List[int], k: int) -> int:
    window = []
    max_sum = 0
    for elm in arr:
        window.append(elm)
        while len(window) > k:
            window.pop(0)
        max_sum = max(max_sum, sum(window))
    return max_sum
```
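The `sum(window)` call above re-adds the whole window on every step, making each step O(k). A running-sum variant keeps each step O(1) — a sketch with assumed names, not the post's code:

```python
from typing import List

def max_sum_subarray_of_size(arr: List[int], k: int) -> int:
    """O(n) variant: maintain a running sum instead of re-summing the window."""
    window_sum = 0
    max_sum = float("-inf")
    for i, elm in enumerate(arr):
        window_sum += elm                # grow the window
        if i >= k:
            window_sum -= arr[i - k]     # shrink: drop the element leaving the window
        if i >= k - 1:
            max_sum = max(max_sum, window_sum)  # decide once the window has size k
    return max_sum if max_sum != float("-inf") else 0

print(max_sum_subarray_of_size([2, 1, 5, 1, 3, 2], 3))  # → 9
```

The decision is only taken once the invariant (window of size exactly k) holds, which also makes the code correct for arrays with negative numbers.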

*Window*: basket/window. *Decision*: maximum number of elements in the basket. *Invariant*: number of unique elements in the window <= 2.

```
import collections
from typing import List

class Solution:
    def totalFruit(self, fruits: List[int]) -> int:
        basket = collections.defaultdict(int)
        max_count = 0
        window = []
        for fruit in fruits:
            basket[fruit] += 1
            window.append(fruit)
            while len(basket) > 2:
                elm = window.pop(0)
                basket[elm] -= 1
                if basket[elm] == 0:
                    del basket[elm]
            max_count = max(max_count, sum(basket.values()))
        return max_count
```

Though this last problem looks somewhat complicated, the complexity is really about maintaining and updating three variables: `basket`, `max_count`, and `window`. The extra variable is `basket`; once the common code to grow the window and make the decision is in place, the invariant becomes easy to reason about.


**This time we will explore how to decompose an apparent problem into parts**, where each part admits multiple solutions worth optimizing. As we decompose the problem, we will understand the **trade-offs** between optimizing the different parts.

In the process, we will connect the concepts involved to **build a concept map** that can be reused as a pattern for related problems.

Design a class to find the kth largest element in a stream. Note that it is the kth largest element in the sorted order, not the kth distinct element.

Problem Description in Detail here.

Since it is a stream of elements, we start from some state of the collection, and whenever we add a new element, we look for the Kth largest element in that state.

We will break this implementation down into two functions:

- init(): initializes the collection and the variable K
- add(): takes an element as input and returns the Kth largest element

- `init()` is called only once, for object initialization; assume an Array/List. That takes `O(c) ~ O(1)` time and `O(c)` space for the fixed-size collection.
- `add()` is called for every element in the stream.
    - To find the Kth largest, we iterate over the array k times comparing elements, which is `O(k*n)` time in the worst case. When k == n, that is `O(n^2)` time.
    - The space complexity is `O(n)` [original init(), c == n] + `O(n)` [k == n].

- `init()` is called only once, for object initialization; assume an Array/List. Complexity is the same as above.
- `add()` is called for every element in the stream.
    - To find the Kth largest, we first sort the array of n elements at that point, which takes `O(nlogn)` time and `O(n)` space.
    - Once the array is sorted, finding the Kth largest is `O(1)` time.

In the previous approach, init() and the find-kth step are optimized to constant time. Only the sorting now takes longer, `O(nlogn)`; is there a solution where the sorting can be optimized?

- The best general-purpose comparison sort out there is `O(nlogn)`.
- Since this is a stream, at any given point we add only a single new element to the collection. Is there a way to know where this new element fits into the collection?

Right! After sorting once, maybe in init() or add(), we can keep the array sorted by finding the position of each new element from the stream using search methods.

- Linear search and comparison has worst-case time complexity of `O(n)`, but we can improve the search using two pointers `O(n)` (*some comparisons are saved versus plain linear search in the average case*) or even binary search `O(logn)`.
- Since the problem only asks for the Kth largest, we do not need to keep every element arriving on the stream, *which could be millions or billions of elements*. We can maintain only the top k elements, reducing the space complexity to `O(k)` and further speeding up the search methods, as shown in the diagram above.
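As a sketch of this search-based idea (the class name is my own, not the post's code), Python's `bisect` can maintain a sorted top-k window:

```python
import bisect

class KthLargestSorted:
    """Keep only the top k elements in a sorted list; window[0] is the kth largest."""
    def __init__(self, k, nums):
        self.k = k
        self.window = []
        for num in nums:
            self.add(num)

    def add(self, val):
        bisect.insort(self.window, val)  # binary search for the insertion point
        if len(self.window) > self.k:
            self.window.pop(0)           # discard the smallest
        return self.window[0]

kl = KthLargestSorted(3, [4, 5, 8, 2])
print(kl.add(3))  # → 4
```

Note that `bisect.insort` finds the position in O(log k), but the underlying list shift is still O(k), which motivates the heap-based data structure discussed next.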

So far, we have only tried to optimize `add()`, keeping `init()` at constant time and linear space complexity.

Can we make some trade-off here?

- `init()` is called only once, for object initialization.
- `add()` is called for every element in the stream.

Is there a solution that makes `init()` a little slower but `add()` faster than `O(nlogn)`?

- Instead of using an *array/list* in `init()`, we can explore a data structure that arranges the incoming stream elements in such a way that finding the kth element becomes faster.

What data structure keeps elements ordered from largest to smallest and makes lookup (finding the kth in order) better than an array/list? Answer: a `Priority Queue`.

There are two types of `Priority Queue`: max heap and min heap.

We could build a max heap, which keeps the largest element at the root. To find the Kth largest, we would have to deep-copy the PQ and then pop the root `k` times.

We could instead build a min heap, which keeps the smallest element at the root. Since we do not care about the smallest elements, we need no extra copy: we simply pop elements until the `tree size == k`. At that point, the root is the Kth largest element.

We can optimize further by maintaining only `k` elements in the PQ: whenever adding a new element pushes the PQ size above k, we discard the smallest element at the root, an `O(logk)` operation. The size of the PQ stays at k elements.

Putting everything together, we have now built the concept map.

*For problems like finding Kth Largest/Smallest in sorted, unsorted, or almost sorted array, now we can see a pattern and use this concept map of problem solving!*

```
class KthLargest {
    private PriorityQueue<Integer> minHeap = new PriorityQueue<>((a, b) -> a - b);
    private int capacity;

    public KthLargest(int k, int[] nums) {
        capacity = k;
        for (int num : nums) {
            minHeap.offer(num);
            if (minHeap.size() > capacity) {
                minHeap.poll();
            }
        }
    }

    public int add(int val) {
        minHeap.add(val);
        if (minHeap.size() > capacity) {
            minHeap.poll();
        }
        return minHeap.peek();
    }
}
```
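For reference, here is a rough Python equivalent of the class above — a sketch using `heapq`, which is a min-heap and so matches the choice made here:

```python
import heapq

class KthLargest:
    """Maintain a min-heap of the top k elements; the root is the kth largest."""
    def __init__(self, k, nums):
        self.k = k
        self.heap = []
        for num in nums:
            self.add(num)

    def add(self, val):
        heapq.heappush(self.heap, val)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)  # drop the smallest; the top k survive
        return self.heap[0]           # root of the min-heap = kth largest

kl = KthLargest(3, [4, 5, 8, 2])
print([kl.add(v) for v in [3, 5, 10, 9, 4]])  # → [4, 5, 5, 8, 8]
```

Each `add` is O(log k) and the heap never holds more than k elements, matching the complexity argument above.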

- The constraints of this problem guarantee that there will be at least `k` elements in the array when you search for the kth largest element.
- If that constraint were not guaranteed, what would you have to consider in your implementation?

Author: Rasika.Warade

Given an integer array nums, return the maximum difference between two successive elements in its sorted form. If the array contains less than two elements, return 0.

You must write an algorithm that runs in linear time and uses linear extra space.

We tried two approaches before:

- Simple sorting and iterating through the array (max array length = 10^5) with `O(nlogn)` time complexity.
- Building a boolean array up to the max element (max element = 10^9) and iterating through this new array. Although this is linear time complexity and uses extra space, `n` here is 10^9. Iterating a billion-element array is expensive; we should come up with a more efficient solution.

The problem here is that we have such a **big search space**, ~10^9. *How can we reduce it further?*

Instead of building the array from the `0th` element up to the `max element`, **what if we consider only the range from the min to the max value?** We can build the array mapping from the min element to the max.

This approach reduces the search space somewhat, but not drastically. Take the example below; we would still have an array of length ~10^7.

```
e.g. [100, 500, 300, 9999999, 9999] MaxGap = 9999999 - 9999 = 9990000
```

**What do we care about in the range?**

- Having all elements in a contiguous sequence between the range's min and max
- If all the elements in this range are present, then **MaxGap = 1**

**When will the max gap be greater than 1 ?**

- If some element or run of elements is missing from the range, we have **MaxGap > 1**

*How to identify the missing ranges?*

If you take a look at the diagram above, you will see that the **array works like buckets**, where each bucket contains exactly one element.

n buckets = n elements

- Can we bucket sequences with more than one element? *It's possible!*
- Perhaps we can store some meta-information about each bucket to help find the gaps? *It's possible — but what kind of information?*

To explore further, the first question that comes to mind is:

How do we determine the number of elements to store in each bucket?

Let us, for example, transform our original array into buckets of size = 3 (a random choice) and visualize it.

In other words, with **equal-sized buckets**, a bucket with no elements contributes the most to the gap between contiguous sequences. The gap across it can be calculated by looking at the previous and next non-empty buckets.

We can bucket the range of elements in such a way that we can find **at least one empty bucket**, and this is the gap that actually contributes to **MaxGap**.

If we store the min and max element of each bucket, then for such an empty bucket,

MaxGap = (min element of the next non-empty bucket) - (max element of the previous non-empty bucket)

Therefore, finding all the missing elements becomes unnecessary.

Although the diagram above shows this intuitively, *how do we know it will always be true?*

The Pigeonhole Principle states that if `n` items are put into `m` containers, with `n > m`, then at least one container must contain more than one item.

Consider the same principle's alternate formulations for non-missing and missing items, which are relevant to this problem:

If n objects are distributed over n places in such a way that no place receives more than one object, then each place receives exactly one object.

**This is like our array from min to max range, no missing, with MaxGap = 1.**

If n objects are distributed over m places, and if n < m, then some place receives no object.

**This is like our array from min to max range, some missing, with MaxGap > 1.**
In other words, if n-1 numbers are divided into n buckets, there is at least one empty bucket.

It follows that these empty buckets contribute to the MaxGap, which reduces the problem to **identifying all the empty buckets** and computing the min and max of the non-empty buckets adjacent to them.
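A quick sketch of the principle in action (the helper name and signature are my own, not the post's):

```python
def has_empty_bucket(numbers, n_buckets, lo, hi):
    """Distribute numbers into equal-width buckets over (lo, hi); report empties."""
    size = (hi - lo) / n_buckets
    occupied = set()
    for x in numbers:
        if lo < x < hi:  # min and max are kept out, as in the algorithm
            occupied.add(min(int((x - lo) / size), n_buckets - 1))
    return len(occupied) < n_buckets

# n = 5 elements → n - 2 = 3 interior numbers into n - 1 = 4 buckets:
print(has_empty_bucket([100, 500, 300, 9999999, 9999], 4, 100, 9999999))  # → True
```

With fewer interior numbers than buckets, the pigeonhole principle guarantees the function always returns `True`.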

To solidify this, let us do an example walkthrough:

- Given n elements in the input array, first build (n-1) buckets [so that we have at least one empty bucket].
- Decide the bucket size from the range between the **min** and **max** values of the input array. [**Note**: both the min and max elements are excluded when adding to the buckets, so `(n-2)` numbers divided over `(n-1)` buckets produces at least one empty bucket.]
- As we are storing the max and min values for each bucket, initialize those variables for each bucket.
- Iterate over the input array, and for each element find the bucket it belongs to using the formula below. **Skip the min and max elements as explained in Step 2.**

  Bucket_Index = Math.floor((element - minElem) / bucketSize)

- Once the bucket index is identified, compare the element with that bucket's max and min values and update them accordingly.
- Now that the buckets hold the meta-information, iterate over the buckets to identify the empty ones.
- When an empty bucket is found, calculate the maxGap.

```
class Solution {
    static class Bucket {
        int max;
        int min;
        Bucket(int max, int min) {
            this.max = max;
            this.min = min;
        }
    }

    public int maximumGap(int[] nums) {
        if (nums.length == 1) return 0;
        // find the range of elements: max and min
        int maxElem = Arrays.stream(nums).max().getAsInt();
        int minElem = Arrays.stream(nums).min().getAsInt();
        int n = nums.length;
        // build (n-1) buckets
        Bucket[] bucketArr = new Bucket[n - 1];
        for (int i = 0; i < n - 1; i++) {
            bucketArr[i] = new Bucket(Integer.MIN_VALUE, Integer.MAX_VALUE);
        }
        // divide the range into equal-sized buckets
        // make sure both numerator and denominator are floats!!!
        float bucketSize = (float) (maxElem - minElem) / (float) (n - 1);
        // add the elements to the buckets,
        // tracking min and max for each bucket
        for (int elem : nums) {
            // skip the min and max elements
            if (elem == minElem || elem == maxElem) {
                continue;
            }
            int bucketIndex = (int) Math.floor((elem - minElem) / bucketSize);
            bucketArr[bucketIndex].max = Math.max(bucketArr[bucketIndex].max, elem);
            bucketArr[bucketIndex].min = Math.min(bucketArr[bucketIndex].min, elem);
        }
        // identify all the empty buckets and calculate maxGap
        int maxGap = 0;
        int maxOfPrevNonEmpty = minElem;
        for (int i = 0; i < n - 1; i++) {
            // empty bucket - skip
            if (bucketArr[i].min == Integer.MAX_VALUE) {
                continue;
            }
            int minOfNextNonEmpty = bucketArr[i].min;
            maxGap = Math.max(maxGap, minOfNextNonEmpty - maxOfPrevNonEmpty);
            maxOfPrevNonEmpty = bucketArr[i].max;
        }
        maxGap = Math.max(maxGap, maxElem - maxOfPrevNonEmpty);
        return maxGap;
    }
}
```

Time Complexity : O(n)

Space Complexity : O(n)
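The same bucketing idea, ported to Python as a compact sketch (assumed names, not the post's code):

```python
def maximum_gap(nums):
    """Bucket approach: the max gap must straddle a bucket boundary."""
    n = len(nums)
    if n < 2:
        return 0
    lo, hi = min(nums), max(nums)
    if lo == hi:
        return 0
    # n - 2 interior elements into n - 1 buckets → at least one empty bucket
    size = (hi - lo) / (n - 1)
    buckets = [[None, None] for _ in range(n - 1)]  # [min, max] per bucket
    for x in nums:
        if x == lo or x == hi:
            continue  # min and max are handled outside the buckets
        i = int((x - lo) / size)
        bmin, bmax = buckets[i]
        buckets[i][0] = x if bmin is None else min(bmin, x)
        buckets[i][1] = x if bmax is None else max(bmax, x)
    gap, prev = 0, lo
    for bmin, bmax in buckets:
        if bmin is None:
            continue  # empty bucket: the gap spans across it
        gap = max(gap, bmin - prev)
        prev = bmax
    return max(gap, hi - prev)

print(maximum_gap([3, 6, 9, 1]))  # → 3
```

It mirrors the Java version above: a single pass to fill bucket min/max, then a single pass over the buckets comparing each bucket's min against the previous non-empty bucket's max.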

Author: Rasika.Warade

Given an integer array nums, return the maximum difference between two successive elements in its sorted form. If the array contains less than two elements, return 0.

You must write an algorithm that runs in linear time and uses linear extra space.

```
Example 1:
Input: nums = [3,6,9,1]
Output: 3
Explanation: The sorted form of the array is [1,3,6,9], either (3,6) or (6,9) has the maximum difference 3.
Example 2:
Input: nums = [10]
Output: 0
Explanation: The array contains less than 2 elements, therefore return 0.
```

Let us do the brute force way!

```
class Solution {
    public int maximumGap(int[] nums) {
        Arrays.sort(nums);
        int maxDiff = 0;
        for (int ptr1 = 0, ptr2 = ptr1 + 1; ptr2 < nums.length; ptr1++, ptr2++) {
            int diff = nums[ptr2] - nums[ptr1];
            maxDiff = Math.max(diff, maxDiff);
        }
        return maxDiff;
    }
}
```
```

So we have established that the brute force is simple! What makes the problem hard is finding a linear-time solution. Let's explore our options and thoughts.

**Step-By-Step Approach**

```
class Solution {
    public int maximumGap(int[] nums) {
        int maxElem = Arrays.stream(nums).max().getAsInt();
        Boolean[] boolArray = new Boolean[maxElem + 1];
        Arrays.fill(boolArray, Boolean.FALSE);
        for (int i = 0; i < nums.length; i++) {
            boolArray[nums[i]] = Boolean.TRUE;
        }
        int maxDiff = 0;
        int ptr1 = -1, ptr2 = 0;
        while (ptr2 < boolArray.length) {
            if (boolArray[ptr2]) {
                if (ptr1 != -1) {
                    maxDiff = Math.max(maxDiff, ptr2 - ptr1);
                }
                ptr1 = ptr2;
            }
            ptr2++;
        }
        return maxDiff;
    }
}
```
```

- We were asked for linear time and linear space, and we have found a solution whose time and space are linear in the maximum element rather than the array length. So what's the catch?
- What happens when the max element is very large? Shouldn't we be spending a lot of time iterating through this boolean array, and wasting a lot of storage space when the total number of elements in `nums[]` is small?

Let us check the problem's constraints before we mark this as a final solution:

```
Constraints:
1 <= nums.length <= 10^5
0 <= nums[i] <= 10^9
```

Right! 10^9 is a huge number; a linear-time pass over an array of that size will cause **Time Limit Exceeded**, and a billion-entry boolean array is also a huge amount of memory.

Well! The start was simple and the problem got progressively harder. We will explore the thought process of finding an optimal solution in the next blog.

Author: Rasika.Warade

```
class StreamObserver:
    def observe(self, x): ...  # consume the next element of the stream
    def sample(self): ...      # return a random element seen so far
```

```
import random

class StreamObserver:
    def __init__(self):
        self.n = 0
        self.elements = []

    def observe(self, x):
        self.n += 1
        self.elements.append(x)

    def sample(self):
        r = random.randint(0, self.n - 1)
        return self.elements[r]
```

```
import random

class StreamObserverV1:
    """
    Optimize for memory: use a fixed-size queue and discard the oldest
    element once the maximum size is reached.
    """
    def __init__(self):
        self.elements = []
        self.max_size = 10

    def observe(self, x):
        if len(self.elements) >= self.max_size:
            self.elements.pop(0)
        self.elements.append(x)

    def sample(self):
        r = random.randint(0, len(self.elements) - 1)
        return self.elements[r]
```

```
import random

class StreamObserverV2:
    def __init__(self) -> None:
        self.elm = None
        self.n = 0

    def observe(self, x):
        self.n += 1
        prob = 1 / self.n
        randomProb = random.uniform(0, 1)
        if randomProb < prob:
            self.elm = x

    def sample(self):
        return self.elm
```

- Sanity check: push in the numbers 1 to 100 and sample 10 times — shouldn't at least one sample be less than 10?
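The question above can be checked empirically. Here is a sketch that re-instantiates the V2 observer and measures the frequency (it uses `random.random()`, which draws from [0, 1) and serves the same purpose as `random.uniform(0, 1)` above):

```python
import random

class StreamObserverV2:
    """Reservoir sampling with a reservoir of size 1, as above."""
    def __init__(self):
        self.elm = None
        self.n = 0

    def observe(self, x):
        self.n += 1
        if random.random() < 1 / self.n:  # keep x with probability 1/n
            self.elm = x

    def sample(self):
        return self.elm

# Push 1..100 into a fresh observer many times; each element should be kept
# with probability 1/100, so P(sample < 10) should come out close to 9/100.
random.seed(0)
trials = 2000
hits = 0
for _ in range(trials):
    obs = StreamObserverV2()
    for x in range(1, 101):
        obs.observe(x)
    if obs.sample() < 10:
        hits += 1
print(hits / trials)  # roughly 0.09
```

So in any single batch of 10 samples, seeing a value below 10 is likely but not guaranteed: each sample lands there with probability about 9/100.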