Specification for void sort(int *arr, int nelt):
Input constraints:
arr is an array of integers
nelt is the length of the array arr
Output constraints (denote by arr the original array contents and by arr' what the array contains after the code runs):
The output must hold the same elements as the input, so some function must relate the indices of arr and arr'. This type of function is a permutation:
Revised output constraints for output array arr':
arr'[phi(i)] = arr[i]
Here phi is a permutation of nelt objects, labeled 0, 1, 2, ..., nelt-1. We define our domain and range set as S = {0, 1, 2, ..., nelt-1}, and require that phi is a bijection on S. A fact that we will use is that the composition of two permutations is itself a permutation.
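To make the definition concrete, here is a minimal sketch in C, assuming phi is stored as an array of nelt ints so that phi[i] holds phi(i). The names is_permutation and compose are ours, not part of the specification. The first function checks that phi is a bijection on S; the second builds the composition of two such maps.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if phi[0..n-1] is a bijection on S = {0, ..., n-1}.
 * On a finite set it is enough to check that every value is in range
 * and that no value is hit twice. */
bool is_permutation(const int *phi, int n) {
    bool ok = true;
    bool *seen;

    if (n == 0) return true;               /* the empty map is a bijection */
    seen = calloc((size_t)n, sizeof *seen);
    if (seen == NULL) return false;        /* allocation failed: cannot confirm */

    for (int i = 0; i < n && ok; i++) {
        if (phi[i] < 0 || phi[i] >= n || seen[phi[i]])
            ok = false;                    /* out of range, or hit twice */
        else
            seen[phi[i]] = true;
    }
    free(seen);
    return ok;
}

/* out[i] = f[g[i]], i.e. the composition "f after g". */
void compose(const int *f, const int *g, int *out, int n) {
    for (int i = 0; i < n; i++)
        out[i] = f[g[i]];
}
```

For example, composing the rotation {1, 2, 0} with itself yields {2, 0, 1}, which is_permutation also accepts.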
Then we need to prove the correctness of the sort routine by starting with the identity permutation phi_0 and showing that at each step s the function maintains a permutation phi_s relating the current state of the array to its initial state.
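One way to see why each step preserves such a permutation is to track it explicitly while swapping. The sketch below is an illustration only, with hypothetical helper names (tracked_swap, relation_holds). It stores the inverse of phi, since that is what a swap updates most directly: inv[p] is the original index of the element now at position p. It starts as the identity, and every swap of two array positions swaps the same two entries of inv.

```c
#include <stdbool.h>

/* Hypothetical helper: swap cur[a] and cur[b], and update inv to match.
 * inv[p] is the original index of the element now at position p,
 * so inv is the inverse of phi: phi(inv[p]) = p. */
void tracked_swap(int *cur, int *inv, int a, int b) {
    int t;
    t = cur[a]; cur[a] = cur[b]; cur[b] = t;
    t = inv[a]; inv[a] = inv[b]; inv[b] = t;
}

/* Checks cur[p] == orig[inv[p]] for every position p, which says the
 * same thing as cur[phi(i)] == orig[i] for every original index i. */
bool relation_holds(const int *orig, const int *cur, const int *inv, int n) {
    for (int p = 0; p < n; p++)
        if (cur[p] != orig[inv[p]])
            return false;
    return true;
}
```

Because a swap is itself a permutation (a transposition), each call composes one more permutation onto the running one, which is exactly where the composition fact is used.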
Now here is the proposed code that will implement this specification:
void sort(int *arr, int nelt) {
    int mid, smallix, bigix, t;
    if (nelt <= 1) return;          /* 0 or 1 elements: already sorted */

    mid = arr[0];                   /* pivot value */
    smallix = 0;
    bigix = nelt;

    while (smallix < bigix) {
        /* skip elements that already belong on the left */
        while (smallix < nelt && arr[smallix] <= mid)
            ++smallix;
        /* skip elements that already belong on the right */
        while (bigix > 0 && arr[bigix - 1] > mid)
            bigix--;
        /* exchange the out-of-place pair, if there is one */
        if (!(smallix == nelt || 0 == bigix || smallix == bigix)) {
            t = arr[smallix];
            arr[smallix] = arr[bigix - 1];
            arr[bigix - 1] = t;
        }
    }
    /* move the pivot between the two parts */
    t = arr[0]; arr[0] = arr[smallix - 1]; arr[smallix - 1] = t;
    sort(arr, smallix - 1);
    sort(arr + smallix, nelt - smallix);
}
There are two key strategies to proving code correctness:
Notice that the main while loop divides the array into two parts, based on a pivot point (which is based on the value of variable mid). Now we begin with our inductive proof:
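Proof by induction on nelt:
Base case: sort is correct for nelt <= 1, since it returns immediately and such an array is already sorted.
Inductive hypothesis: assume sort is correct whenever nelt < k.
Inductive step: show that sort is correct when nelt = k.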
We begin the proof of the third step by noticing a loop invariant that can help us. We know that smallix <= bigix for the entire loop. Why? Because it begins that way (before the loop), and the two small while loops ensure that smallix is only incremented when arr[smallix] <= mid, and that bigix is only decremented when arr[bigix-1] > mid. Since every element at or beyond bigix is > mid and every element before smallix is <= mid, neither index can step past the other.
There are three regions of the array: [0,smallix), [smallix,bigix), and [bigix,nelt). The region [smallix,bigix) gets smaller due to the inner while loops, until it eventually reaches size 0 when smallix = bigix.
For each of these regions, we know the following loop invariants are true:
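Invariants of this kind can be checked at run time with assertions. The sketch below assumes, from our reading of the code, that the invariants are: every element of [0,smallix) is <= mid, every element of [bigix,nelt) is > mid, and nothing is yet known about [smallix,bigix). The names check_regions and partition_checked are ours; the loop body is the partitioning loop of sort, with a check at the top and bottom of each iteration.

```c
#include <assert.h>

/* Assumed invariants: [0, smallix) holds only elements <= mid and
 * [bigix, nelt) holds only elements > mid. */
static void check_regions(const int *arr, int nelt, int mid,
                          int smallix, int bigix) {
    assert(0 <= smallix && smallix <= bigix && bigix <= nelt);
    for (int i = 0; i < smallix; i++)
        assert(arr[i] <= mid);
    for (int i = bigix; i < nelt; i++)
        assert(arr[i] > mid);
}

/* The partitioning loop of sort, instrumented with the checks above.
 * Requires nelt >= 1. Returns the final value of smallix. */
int partition_checked(int *arr, int nelt) {
    int mid = arr[0], smallix = 0, bigix = nelt, t;

    while (smallix < bigix) {
        check_regions(arr, nelt, mid, smallix, bigix);
        while (smallix < nelt && arr[smallix] <= mid)
            ++smallix;
        while (bigix > 0 && arr[bigix - 1] > mid)
            bigix--;
        if (!(smallix == nelt || 0 == bigix || smallix == bigix)) {
            t = arr[smallix];
            arr[smallix] = arr[bigix - 1];
            arr[bigix - 1] = t;
        }
        check_regions(arr, nelt, mid, smallix, bigix);
    }
    return smallix;
}
```

If an assertion fires on some input, that input is a counterexample to the claimed invariant; if none fires, that is evidence but not a proof.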
We will continue this proof next time in class.
You can download this annotated code and play with it yourself:
Your task is to fix his code. In order to keep your job -- your pointy-haired boss will undoubtedly read your changes -- you must keep the random pivot selection idea. Use what you know about testing and proofs of correctness to make this code work again.
You should hand in: the fixed (and properly annotated) code, the testing scaffolding that you used when debugging your implementation, the test inputs (especially those that showed bugs in the pointy-haired boss's implementation or your own initial fixes) that will serve as test cases for regression testing subsequent versions, and a README.txt file containing a description of what you did, how you decided to do what you did, and why you believe it to be the correct fix(es). You should also include in your writeup a discussion of whether the worst-case performance of the original code might actually be a problem in practice.
Clarification: by test scaffolding I mean what code you write to test your fixes to the sort function. I expect you to have some sort of testing driver which allows you to at least semi-automate the process of feeding in the test cases from a test suite, and such a driver program will be part of the testing tools for the project, to be handed over to sustaining engineering along with the regression testing test suite. You may wish to generate your test cases via a program, or just have data files -- your design decision should be part of the writeup.
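As a rough illustration of what the core of such a driver might look like (one possible design, not a required one), the sketch below runs a sort function on a copy of a test input and compares the result with the C library's qsort. All names here (sort_fn, check_sort, known_good_sort, broken_sort) are hypothetical. The two stand-in sort functions exist only to show a passing and a failing case; a real driver would pass the function under test and read its inputs from the test suite.

```c
#include <stdlib.h>
#include <string.h>

typedef void (*sort_fn)(int *arr, int nelt);

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Runs sortfn on a copy of input and compares against qsort's result.
 * Matching qsort's output means the result is both in order and a
 * permutation of the input. input must be non-NULL.
 * Returns 1 on pass, 0 on fail, -1 on allocation failure. */
int check_sort(sort_fn sortfn, const int *input, int nelt) {
    size_t bytes = (size_t)nelt * sizeof *input;
    int *got = malloc(bytes ? bytes : 1);
    int *want = malloc(bytes ? bytes : 1);
    int result = -1;

    if (got != NULL && want != NULL) {
        memcpy(got, input, bytes);
        memcpy(want, input, bytes);
        sortfn(got, nelt);
        qsort(want, (size_t)nelt, sizeof *want, cmp_int);
        result = (memcmp(got, want, bytes) == 0);
    }
    free(got);
    free(want);
    return result;
}

/* Stand-ins for the function under test (both hypothetical). */
void known_good_sort(int *arr, int nelt) {
    qsort(arr, (size_t)nelt, sizeof *arr, cmp_int);
}

void broken_sort(int *arr, int nelt) {
    (void)arr; (void)nelt;   /* does nothing, so unsorted input fails */
}
```

Comparing against a trusted sort keeps the check short. Checking order and permutation separately would instead tell you which half of the specification was violated.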
This assignment is due at 2359 on February 15, 2002.
bsy+cse127w02@cs.ucsd.edu, last updated