Sorting Algorithms

Overview

  • Visualizations

    Due to the complexity of the sorting algorithms. Visualizations may greatly enhance the learning process. Below are two useful websites providing visualizations of common algorithms and data structures.

  • Categories

    All algorithms not mentioned in the book is for your information only. Only algorithm with bold names are covered in the book.

    • Comparison based

      • Selection

      • Insertion

      • Bubble

      • Shell

      • Quick

      • Merge

      • Heap

    • Non-comparison based

      • Bucket

      • Radix

      • Counting

    • Divide-and-conquer \(\Theta(n \log n)\)

      • Quick

      • Merge

    • simple/naive \(\Theta(n^{2})\)

      • Selection

      • Insertion

      • Bubble

    • Hybrid

      • Best in real-world scenarios

      • Adopted in many C++ standard library implementations

      • Tim sort: combination of merge sort and insertion sort

      • Intro sort: combination of quick sort, heap sort, and insertion sort

    • Stable sorting algorithm

      The order of duplicate elements is preserved after sorting.

      • Insertion

      • Merge

      • Radix

  • Time complexity

    • simple \(\Theta(n^{2})\): insert, selection, bubble

    • fast \(\Theta(n \log n)\): quick, merge, heap

    • linear: radix \(\Theta(nk) = \Theta(n)\), bucket \(\Theta(n + k) = \Theta(n)\)

  • Space complexity

    The following discussion is on array only.

    • in-place (swap based)

      • Iterative (loop based) \(O(1)\)

        • selection sort

        • bubble sort

        • insertion sort

        • shell sort

        • heap sort

      • Divide-and-conquer (recursive) \(O(\log n)\)

        • quick sort

        • selection sort

        • bubble sort

        • insertion sort

        • shell sort

        • heap sort

      • Divide-and-conquer (recursive)

        • quick sort: \(\Theta(\log n)\) to \(\Theta(n)\) for recursion

    • not-in-place (out-of-place) \(\Theta(n)\) for extra arrays

      • merge sort

      • bucket sort

      • radix sort

Array vs Linked-list

  • Sorting an array

    • Array property

      • fast access by index

      • slow insertion and deletion of elements except at the end

    • With the same time complexity, consider data movement overhead

    • Preferred general purpose sorting algorithms

      • Quick sort

      • Merge sort

      • Heap sort

      • Shell sort

  • Sorting a linked-list

    • Linked-list property

      • slow access by index

      • fast reconstruction of new list

      • fast access to neighboring nodes

    • With the same time complexity, consider data movement overhead

    • Preferred general purpose sorting algorithms

      • Merge sort

      • Insertion sort

    • Time Complexity

      • \(\Theta(n \log n)\) for merge sort

      • \(\Theta(n^2)\) for insertion sort

    • Space Complexity depends on the implementation

Real-World Choices

  • This section is OPTIONAL!

  • In real-world scenarios, there are more considerations.

    • Data size is not always large

    • Performance boost from cache is often significant

    • We can always make more complicated sorting algorithms

  • For small arrays

    • Insertion sort

      • Fast

      • Simple

      • Stable (original order of equal elements is preserved)

  • For large arrays

    • Quick sort > Merge sort > Heap sort

      • Quick sort is faster than merge sort and heap sort in practice, inconsistent complexity.

      • Merge sort is stable, has consistent complexity, but requires extra space

      • Heap sort is in-place but is not stable and cannot make good use of cache

    • In parallel environment

      • Merge sort

      • Quick sort

  • The overall winner for array is

    • Hybrid

      • Use divide-and-conquer to divide the array into small subarrays and then use insertion sort to sort the small subarrays

      • Quick sort + Insertion sort

      • Merge sort + Insertion sort

  • For linked list

    • Out-of-place algorithms are no longer requiring \(\Theta(n)\) space

    • Cache is not helping at all

    • Direct sort

      • Merge sort for large linked list (can be parallelized)

      • Insertion sort for small linked list

    • Sort as array

      • Make an array from the linked list

      • Sort the array

      • Create the new linked list according to the sorted array

    • Sort as array of pointers

      • Make an array of pointers

      • Sort the array of pointers

      • Generate the new linked list according to the sorted array