Table of Contents
Definition - Greedy algorithms focus on finding the local optima. They pick the option that best satisfy the criteria of problem. Example - Best First Search.
Types of Problems
-
Optimization Problems
- Requires a deterministic min/max value for a given criteria.
- Requires an objective function.
- Has optimal solutions.
- Example – Knapsack, Shortest Path.
-
Planning / Decision Problems
- Result is always either True / False.
- Doesn’t need an objective function.
- Doesn’t have optimal solutions, has feasible solutions.
- Example – Searching, Sorting, N-Queens.
Knapsack Problem
Problem Statement
You are given a set of objects, where each object iii has:
- a weight ,
- a profit (or value) .
- a knapsack with a maximum capacity of units of weight
Objective
Select a subset of the objects such that:
- the total weight of the selected objects does not exceed the knapsack capacity mmm,
- each object is either included or excluded (for 1 knapsack) or partially selected (for fractional knapsack),
- the total profit of the selected objects is maximized.
Objective:
Types of Knapsack -
- Fractional Knapsack - Partial objects are allowed, Solved using Greedy methods.
- Binary Knapsack (0/1 Knapsack) - No partial objects are allowed, Solved using Dynamic Programming.
How to solve -
- A greedy algorithm can always provide an optimal solution for a Fractional Knapsack problem.
- Calculate the profit/weight ratio.
- Assign priority to objects in descending order of ratio.
- If the weight constraint satisfies, pick a whole/fraction of the object.
- When the weight constraint fails, stop picking.
- A greedy algorithm can’t always provide an optimal solution for a 0/1 Knapsack problem.
Job Scheduling with Deadline (JSD)
Problem Statement
You are given a set of jobs, where each job has:
- a arrival time (each job is available from the start)
- a burst time (unit execution time (each job takes exactly 1 unit of time))
- a deadline (latest time by which it must be completed),
- a profit earned if the job is completed on or before its deadline,
Objective
Schedule a subset of jobs such that:
- no two jobs overlap (only one job can be done at a time),
- each job finishes on or before its deadline,
- the total profit is maximized.
How to solve -
Suppose we have -
| Jobs | Deadline | Profit |
|---|---|---|
| J1 | 2 | 200 |
| J2 | 1 | 30 |
| J3 | 2 | 50 |
| J4 | 1 | 80 |
Brute Force -
Choose all subsets of the above jobs and calculate the total profit. The subset which provides the maximum profit is the answer.
Greedy Algorithm -
- Assign priority to jobs based upon the descending order of Profit.
- Identify the max deadline and store it.
- Pick jobs in order of this priority and place each at its respective deadline . If the deadline is not available, then try to place it at .
- In the above schedule the priority will be J1, J4, J3, J2.
- Pick J1. As , check 2. As 2 is empty, J1 is assigned to 2.
- Pick J4. As , check 1. As 1 is empty, J4 is assigned to 1. Thus total profit is .
Optimal Merge Patterns
This is an application of the merging algorithm.
Problem Statement
- You are given files.
- It is required to merge them using 2-way merging.
Objective
In which sequence should the files be merged such that:
- Total number of record movements is minimal.
How to solve -
Given sorted files and having and records respectively. To merge them into a single sorted file, we’d require record movements.
This is different from the merging in merge sort as it has element comparisons, while here we have record movements.
Example -
Suppose , , and have , , and records respectively. If we were to merge these files in the order , what would be the total number of record movements?
Here we first merge and into one file, and then merge this resultant file with . So -
- Record movements in merging and - .
- Resultant file now has records.
- Record movements in merging and - .
- Resultant file now has records.
- Using and , total number of record movements .
However, if we were to merge in the sequence , the total number of record movements will be .
Brute Force
Try all permutations of the files and calculate the total number of record movements. The permutation which provides the minimum record movements is the answer.
Greedy Algorithm
- Assign priority to files in an ascending order of record count.
- Pick two files with the highest priorities. Merge them and add the resulting file again into the list and recalculate priorities.
- Keep merging until all files are merged and only one file remains in the queue.
In the above example, the priority would be thus would provide the minimum record movements of .

- Total number of record movements -
- Total number of element-wise comparison -
Merge Tree

We can use a merge tree too where the files with smaller size are at the bottom of the tree while the files with larger size are closer to the root.
Formally we do where is the depth of a file in the merge tree (excluding the root node) and is the size of that file. From the tree above we can see -
-
F1 is at depth 2, thus
-
Similarly,
-
-
-
Thus record movements
-
At any node of the tree, the smaller child is to the left and the larger child is to the right.
-
Smaller files are kept at the bottom because the cost of repeatedly merging them would be smaller.
Encoding & Huffman Encoding
Definition - Encoding is the process of converting information from one format to another for efficient transmission or storage.
Types of Encoding -
- Uniform Encoding - Each character is encoded using equal number of bits.
- Non-uniform Encoding - The number of bits used to encode a character depend upon some criteria.
A problem with Uniform Encoding techniques like Binary Encoding is the wastage of space. For example - To encode 10 characters using binary encoding we’d require 4 bits. But using 4 bits we can encode a total of 16 characters, so the extra 6 binary sequences are being wasted.
Huffman Encoding
A non-uniform encoding technique where the encoding/code is based upon the frequency of that character or elements.
- Assign more bits to elements with a low frequency.
- Assign less bits to elements with a high frequency.
- No two elements have the same encoding, encoding is unique.
Suppose we have 5 characters with the following frequencies -
The 2-way Optimal Binary merge tree would look like -

Here the smaller child of every node is the left child node and the larger child of every node is the right child node. Let -
- Left Branch - 0
- Right Branch - 1
So the encodings for the characters are like -
To get the average number of bits/character, sum the intermediate nodes of the merge tree similar to how we sum to get the number of record movements. Here the average number of bits/character is .
Minimum Cost Spanning Tree
Types of graphs -
- In a directed complete graph there are edges for all vertices. Thus the total number of edges is - .
- In an undirected complete graph the first vertex has edges, second vertex has vertex, and so on. Thus the total number of edges is .
Types of graph representations -
- Adjacency Matrix - .
- Adjacency List - .
When to use which representation -
- For dense graphs we prefer Adjacency Matrix instead of Adjacency List as for them. An Adjacency list would be of while an Adjacency Matrix would be of .
- For sparse graphs we prefer Adjacency Lists because the number of edges in them is low.
Spanning Tree
A subgraph of a given graph where is a spanning tree if is a tree (acyclic).
- A spanning tree with vertices will always have edges.
- Number of edges not used in the spanning tree -
- Number of null links in the spanning tree - There are links in the tree out of which are used. Thus .
Prim’s Vs Kruskal’s Algorithm
- Prim’s - Pick the minimum cost edges extending from the nodes currently in the tree.
- Kruskal’s - Using a Min Heap + Set, greedily pick smallest cost edges.
- Prim’s Algorithm
- Time Complexity:
- (without heap)
- (with heap)
- If , then
- Always maintains a tree structure
- Time Complexity:
- Kruskal’s Algorithm
- Time Complexity: (uses heap)
- May or may not maintain a tree structure during execution
- The cost of the MCST obtained by both algorithms is the same.
- The tree structure obtained by both approaches:
- If all edges have unique/distinct weights, both algorithms produce the same tree
- If there are duplicate edge weights, the tree structure may or may not be the same
Djikstra’s MCST Algorithm
- Consider all edges of the graph.
- If a cycle forms, remove the maximum cost edge in that cycle.
Djikstra’s Single Source Shortest Path is also a Greedy Algorithm, but discussed under Dynamic Programming.
Questions
Q1) We have 6 files of the following sizes, how can we merge them to get the minimum number of record movements?
Sol -
Here the priority would be - . The merge tree would look something like this -
Here the priority queue looked like -
- . Pick and and merge. Resultant = .
- . Pick and and merge. Resultant = .
- . Pick and and merge. Resultant = .
- . Pick and and merge. Resultant = .
- . Pick and and merge. Resultant = .
- . Only one file remains, thus merging terminates.
Total record movement =
Q2) What is the worst case no. of element wise comparison in the previous question?
Sol - The worst case number of comparisons are . So in each file merge the worst case number of comparisons is -
- Upon merging
- Upon merging
- Upon merging
- Upon merging
- Upon merging
Total =
Just do .
Q3) What is the bet case no. of element wise comparison in the previous question?
Sol - The best case number of comparisons are . So in each file merge the best case number of comparisons is -
- Upon merging
- Upon merging
- Upon merging
- Upon merging
- Upon merging
Total =
Q4)

Sol - The merge tree for this is -

Due to this the encodings turn out to be like -
- H - 0
- G - 10
- F - 110
- E - 1110
- D - 11110
- C - 111110
- B - 111111
- A - 111110 So the encoding can be decoded like -
Thus is the answer.
Q5)

Sol - Here the merge tree is like -

The average number of bits/characters . Thus for 100 characters, the expected length of the encoded message would be .
Q6)

Sol -
Here according to the second rule - . Thus should have a higher priority, then and lastly . Using this information and frequency from the string we can make the merge tree -

Using this we can find the encoding for each character and then find the total length of the encoded string. It would come out to be .
Q7)

Sol - This can be solved using [[Greedy Algorithms#|Djikstra’s MCST Algorithm]]. Here there are 3 cycles. If some edge was not selected from a cycle for the MCST, then it had the maximum cost among all edges in that cycle.
- In cycle , was not selected. As and , , thus .
- In cycle , , thus .
- In cycle , |CD|>16$.
Thus total cost = .
Q8)

Sol - The minimum cost edges in such a graph would be between two adjacent nodes and of cost . As there are a total edges, cost of the MCST would be .
Q9)

- The graph will be the XY coordinate plane with each coordinate being a vertex.
- Thus in total the graph will have vertices.
- The edges of such a graph will be of cost or .
- As the graph forms a grid (draw one if needed), one can simply follow the gridlines and traverse all vertices. Thus for such a path using edges of cost will be cheaper.
- The MCST would have edges ( No. of vertices) and as each edge will be of cost 1, the total cost would be .