Testing Algorithm Robustness

Do you know what your heuristics are doing?

http://bit.ly/testing-algorithmic-robustness

Problem

  • We have some algorithm that works on data.
  • We want to test that it's correct.
  • How? We feed it data and assert it got the correct answer!
  • How do we know what the correct answer is...?

Solution

We don't test that the answer is right only that it is robust.

What does that mean?

  • We test the behaviour of our algorithms under changes to the data.
  • If it shouldn't improve the answer, it doesn't.
  • If it should improve the answer, it does.

How?

  • Using Hypothesis!
  • You write tests that something should be true for all inputs.
  • Hypothesis runs your test against a wide range of input.

Knapsack Packing Problem

Given \(n\) items with weights \(w_i\) and values \(v_i\), and \(C \geq 0\), find a set of items with total weight \(\leq C\) that maximizes the total value.

NP-hard, so we'll use a \(O(n \log(n))\) heuristic solution instead.


def solve_knapsack(items, capacity):
    """
    items: List of (value, weight) pairs
    capacity: Non-negative number.

    Returns a list of items from the list with weight summing
    to <= capacity that tries to maximize sum(value).
    """
    items.sort(key=lambda x: (x[0] / x[1], x[1]), reverse=True)
    result = []
    for item in items:
        if item[1] + current_fill <= capacity:
            result.append(item)
            current_fill += item[1]
    return result

  

from hypothesis import given, assume
import hypothesis.strategies as st

Item = st.tuples(
  st.integers(min_value=1), st.integers(min_value=1))
ItemSet = st.lists(Item)
Capacity = st.integers(min_value=0)
  

@given(ItemSet, Capacity)
def test_increasing_score_improves_things(
    items, capacity
):
    result = solve_knapsack(items, capacity)
    original_score = score_solution(result)
    assert result
    for item in result:
        new_items = list(items)
        new_items.append((item[0] + 1, item[1]))
        new_result = solve_knapsack(new_items, capacity)
        assert score_solution(new_result) > original_score
  

Falsifying example:

test_increasing_score_improves_things(
  items=[(6, 3), (5, 2)],
  capacity=5
)
  

If we add a (6, 2) item in, our chosen solution is [(6, 2), (5, 2)], scoring 11. We could have picked [(6, 2), (6, 3)], scoring 12.


@given(ItemSet, Capacity)
def test_increasing_weight_does_not_improve_things(
    items, capacity
):
    result = solve_knapsack(items, capacity)
    original_score = score_solution(result)
    for item in result:
        new_items = list(items)
        new_items.remove(item)
        new_items.append((item[0], item[1] + 1))
        new_result = solve_knapsack(new_items, capacity)
        assert score_solution(new_result) <= original_score

  

Falsifying example:

test_increasing_weight_of_chosen_item_does_not_improve_things(
  items=[(66, 33), (65, 32)],
  capacity=33
)
  

The chosen example is (65, 32), which has a lower score but higher score per unit weight. Raising its weight causes us to choose the other one which has a higher score.

In Summary

  • Not knowing the right answer shouldn't stop you from testing your code.
  • By testing your algorithms on random data, you'll get a much better handle on how it behaves in the wild.
  • By defining how your algorithm responds to changing data, you will get more robust heuristics.

Obligatory Sales Pitch

I'm available for hire for consulting and training.

drmaciver.com / @DRMacIver

https://hypothesis.readthedocs.org/

http://bit.ly/testing-algorithmic-robustness