|
6 | 6 |
|
7 | 7 | Implementation of k-way merge.
|
8 | 8 |
|
9 |
| -This package implements the `KWayMerger` type. |
10 |
| -It is a stateful, lazy iterator of the elements in an iterator of iterators. |
11 |
| -The elements of the inner iterators will be yielded in an order given by a predicate optionally passed to `KWayMerger` (default: `isless`). |
12 |
| -Therefore, if the inner iterators are sorted by the predicate, the output of the `KWayMerger` is also guaranteed to be sorted. |
| 9 | +This package exports the function `kway_merge`. |
| 10 | +It constructs a `KWayMerger` - a stateful, lazy iterator of the elements in an iterator of iterators. |
| 11 | +The elements of the inner iterators will be yielded in order, as specified by the optional ordering (default: `Forward`). |
| 12 | +Therefore, if the inner iterators are sorted by the order, the elements yielded by the `KWayMerger` are also guaranteed to be sorted. |
13 | 13 |
|
14 |
| -The primary purpose of `KWayMerger` is to efficiently merge N sorted iterables into one sorted stream. |
| 14 | +The primary purpose of `kway_merge` is to efficiently merge N sorted iterables into one sorted stream. |
15 | 15 |
|
16 |
| -The iterator yields `(i::Int, x)` tuples, where `x` is the next element of one of the iterators, and `i` is the 1-based index of the iterator that yielded `x`: |
| 16 | +The iterator yields `@NamedTuple{from_iter::Int, value::T}`, where the value field has the next element of one of the iterators, and the from_iter field contains the 1-based index of the iterator that yielded the value: |
17 | 17 |
|
18 | 18 | ```julia
|
19 |
| -julia> it = KWayMerger([[2, 3], [1, 4]]); |
| 19 | +julia> it = kway_merge([[2, 3], [1, 4]]); |
20 | 20 |
|
21 | 21 | julia> first(it)
|
22 |
| -(2, 1) |
| 22 | +(from_iter = 2, value = 1) |
23 | 23 |
|
24 |
| -julia> println(collect(it)) |
| 24 | +julia> println(map(Tuple, it)) |
25 | 25 | [(1, 2), (1, 3), (2, 4)]
|
26 | 26 | ```
|
27 | 27 |
|
28 | 28 | The function `peek` can be used to check the next element without advancing the iterator:
|
29 | 29 |
|
30 | 30 | ```julia
|
31 |
| -julia> it = KWayMerger([1]); |
| 31 | +julia> it = kway_merge([1]); |
32 | 32 |
|
33 | 33 | julia> peek(it)
|
34 |
| -(1, 1) |
| 34 | +(from_iter = 1, value = 1) |
35 | 35 |
|
36 | 36 | julia> first(it)
|
37 |
| -(1, 1) |
| 37 | +(from_iter = 1, value = 1) |
38 | 38 |
|
39 | 39 | julia> peek(it) === nothing
|
40 | 40 | true
|
41 | 41 | ```
|
42 | 42 |
|
43 | 43 | ## Documentation
|
44 |
| -This package's public functionality are the `KWayMerger` type, and its `Base.peek` method. |
| 44 | +This package's public functionality consists of the `kway_merge` function, the (unexported) `KWayMerger` type, and its `Base.peek` method. |
45 | 45 | See their docstrings for more details.
|
46 | 46 |
|
47 | 47 | ## Performance
|
48 |
| -When merging I iterables with a total length of N: |
| 48 | +When merging I iterables: |
49 | 49 | * A `KWayMerger` allocates O(I) space upon construction
|
50 | 50 | * Producing each element takes O(log(I)) time
|
51 | 51 |
|
52 |
| -Therefore, merging I sorted iterables with N total elements using a KWayMerger therefore takes O(N * log(I)) time. |
53 |
| -It is generally faster than flattening the iterators and sorting, when I << N. |
| 52 | +Therefore, merging I sorted iterables with N total elements using `kway_merge` takes O(N * log(I)) time. |
| 53 | +This is similar to the O(N * log(N)) time taken for comparison-based sorts. |
| 54 | +That's no coincidence: one can take a list with N elements, separate it into N 1-element lists, then merge them with a k-way merge. That is a variant of merge sort. |
| 55 | + |
| 56 | +However, compared to a comparison-based sort like quicksort, a k-way merge differs as follows: |
| 57 | +* When I << N, which is the usual case, a k-way merge is faster. |
| 58 | +* For large I, quicksort is faster in practice because its overhead per element is smaller. |
| 59 | + |
54 | 60 | Note that Julia uses radix sort for integers, which sorts in O(N), and therefore usually beats a k-way merge.
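
The complexity bounds above can be illustrated with a minimal, self-contained sketch of a heap-based k-way merge. This is not the package's implementation: `naive_kway_merge` and `sift_down!` are hypothetical names made up for this example, the sketch is eager rather than lazy, and it always orders by `isless`. It keeps one heap entry per input iterator (O(I) space) and does one sift-down per produced element (O(log(I)) time).

```julia
# Hypothetical helper: restore the min-heap property below position `i`.
# Heap entries are (value, from_iter, state) tuples, ordered by value.
function sift_down!(heap::Vector, i::Int)
    n = length(heap)
    while true
        left, right = 2i, 2i + 1
        smallest = i
        if left <= n && isless(heap[left][1], heap[smallest][1])
            smallest = left
        end
        if right <= n && isless(heap[right][1], heap[smallest][1])
            smallest = right
        end
        smallest == i && return heap
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest
    end
end

# Hypothetical, eager k-way merge returning (from_iter, value) tuples.
function naive_kway_merge(iters)
    iters = collect(iters)  # make the outer iterator indexable
    heap = Any[]
    # Seed the heap with the first element of every non-empty iterator.
    for (i, it) in enumerate(iters)
        y = iterate(it)
        y === nothing && continue
        push!(heap, (y[1], i, y[2]))
    end
    # Build the heap bottom-up in O(I).
    for i in div(length(heap), 2):-1:1
        sift_down!(heap, i)
    end
    out = Tuple{Int, Any}[]
    while !isempty(heap)
        (value, i, state) = heap[1]
        push!(out, (i, value))
        y = iterate(iters[i], state)
        if y === nothing
            # Source exhausted: move the last entry to the root and shrink.
            last_entry = pop!(heap)
            if !isempty(heap)
                heap[1] = last_entry
            end
        else
            # Replace the root with the next element from the same source.
            heap[1] = (y[1], i, y[2])
        end
        isempty(heap) || sift_down!(heap, 1)
    end
    return out
end

naive_kway_merge([[2, 3], [1, 4]])
```

For the input `[[2, 3], [1, 4]]`, this returns the pairs `(2, 1), (1, 2), (1, 3), (2, 4)`, matching the `(from_iter, value)` order used in the examples above. When two sources hold equal values, the sketch makes no promise about which source is reported first.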
|
55 | 61 |
|
56 | 62 | ## Contributing
|
57 | 63 | We appreciate contributions from users including reporting bugs, fixing
|
58 |
| -issues, improving performance and adding new features. |
| 64 | +issues, improving performance and adding new features. |
59 | 65 |
|
60 | 66 | Take a look at the [contributing files](https://github.com/BioJulia/Contributing)
|
61 | 67 | for detailed contributor and maintainer guidelines, and code of conduct.
|
|