Optimal Merge Patterns (Two way merge pattern):
- Two sorted files containing n and m records respectively could be merged together to obtain one sorted file in time O(m+n).
- When more than two sorted flies are to be merged together the merge can be accomplished by repeatedly merging sorted files in pairs.
- If files Xl, X2, X3 and X4 are to be merged we could first merge Xl and X2 to get a file Yl. Then we could merge Yl and X3 to get Y2.Finally, Y2 and X4 could be merged to obtain the desired sorted file.
- Alternatively, we could first merge Xl and X2 getting Yl, then merge X3 and X 4 getting Y2 and finally Y1 and Y2 getting the desired sorted file.
- Given n sorted files there are many ways in which to pairwise merge them into a single sorted file.
- Different pairings require differing amounts of computing time.
- The problem we shall address ourselves to now is that of determining an optimal (i.e. one requiring the fewest comparisons) way to pairwise merge n sorted files together.
Optimal Merge Patterns Example :
Xl, X2 and X3 are three sorted files of length 30, 20 and 10 records each. Merging Xl and X2 requires 50 record moves. Merging the result with X3 requires another 60 moves. The total number of record moves required to merge the three files this way is 110.
Y1 = X1+X2 = 30+20=50 moves
Y2 = Y1+ X3 = 50 + 10=60 moves
Total number of record moved = 50+60 = 110
If instead, we first merge X2 and X3 (taking 30 moves) and then Xl (taking 60 moves), the total record moves made is only 90. Hence, the second merge pattern is faster than the first.
Z1 = X2+X3=20+10=30 moves
Z2=Z1+X1=30+30=60 moves
Total number of record moved = 30+60=90 moves
Representing a merge pattern:
- The merge pattern will be referred to as a two-way merge pattern (each merge step involves the merging of two files.)
- The two way merge patterns can be represented by binary merge tree.
- A binary merge tree representing the optimal merge pattern obtained for the above six files.
- The leaf nodes are drawn as squares and represent the given files. These nodes will be called external nodes.
- The remaining nodes are drawn circular and are called internal nodes.
- Each internal node has exactly two children and it represents the file obtained by merging the files represented by its two children. The number in each node is the length (i.e., the number of records) of the file represented by that node.
Example: (a, b, c, d, e, f) = (46, 13, 20, 10, 5, 6)
Topic : Optimal Merge Patterns
I proud you