xdev.algo module¶
- xdev.algo.edit_distance(string1, string2)[source]¶
Edit distance algorithm. String1 and string2 can be either strings or lists of strings
- Parameters:
string1 (str | List[str])
string2 (str | List[str])
- Requirements:
pip install python-Levenshtein
- Returns:
float | List[float] | List[List[float]]
Example
>>> # xdoctest: +REQUIRES(module:Levenshtein) >>> string1 = 'hello world' >>> string2 = ['goodbye world', 'rofl', 'hello', 'world', 'lowo'] >>> edit_distance(['hello', 'one'], ['goodbye', 'two']) >>> edit_distance('hello', ['goodbye', 'two']) >>> edit_distance(['hello', 'one'], 'goodbye') >>> edit_distance('hello', 'goodbye') >>> distmat = edit_distance(string1, string2) >>> result = ('distmat = %s' % (ub.repr2(distmat),)) >>> print(result) >>> [7, 9, 6, 6, 7]
- xdev.algo.knapsack(items, maxweight, method='iterative')[source]¶
Solve the knapsack problem by finding the most valuable subsequence of items subject that weighs no more than maxweight.
- Parameters:
items (tuple) – is a sequence of tuples (value, weight, id_), where value is a number and weight is a non-negative integer, and id_ is an item identifier.
maxweight (numbers.Real) – a non-negative number indicating the maximum weight of items we can take.
- Returns:
- (total_value, items_subset) - a pair whose first element is the
sum of values in the most valuable subsequence, and whose second element is the subsequence. Subset may be different depending on implementation (ie top-odwn recusrive vs bottom-up iterative)
- Return type:
References
http://codereview.stackexchange.com/questions/20569/dynamic-programming-solution-to-knapsack-problem http://stackoverflow.com/questions/141779/solving-the-np-complete-problem-in-xkcd http://www.es.ele.tue.nl/education/5MC10/Solutions/knapsack.pdf
Example
>>> # ENABLE_DOCTEST >>> import ubelt as ub >>> items = [(4, 12, 0), (2, 1, 1), (6, 4, 2), (1, 1, 3), (2, 2, 4)] >>> maxweight = 15 >>> total_value1, items_subset1 = knapsack(items, maxweight, method='iterative') >>> print('items_subset1 = {}'.format(ub.repr2(items_subset1, nl=1))) >>> print('total_value1 = {}'.format(ub.repr2(total_value1, nl=1)))
Example
>>> # ENABLE_DOCTEST >>> # Solve https://xkcd.com/287/ >>> weights = [2.15, 2.75, 3.35, 3.55, 4.2, 5.8] * 2 >>> items = [(w, w, i) for i, w in enumerate(weights)] >>> maxweight = 15.05 >>> total_value1, items_subset1 = knapsack(items, maxweight, method='iterative') >>> total_weight = sum([t[1] for t in items_subset1]) >>> print('total_weight = %r' % (total_weight,)) >>> print('items_subset1 = %r' % (items_subset1,)) >>> #assert items_subset1 == items_subset, 'NOT EQ\n%r !=\n%r' % (items_subset1, items_subset)
- Timeit:
# >>> import ubelt as ub # >>> setup = ub.codeblock( # >>> ‘’’ # import ubelt as ut # weights = [215, 275, 335, 355, 42, 58] * 40 # items = [(w, w, i) for i, w in enumerate(weights)] # maxweight = 2505 # #import numba # #knapsack_numba = numba.autojit(knapsack_iterative) # #knapsack_numba = numba.autojit(knapsack_iterative_numpy) # ‘’’) # >>> # Test load time # >>> stmt_list1 = ub.codeblock( # >>> ‘’’ # knapsack_iterative(items, maxweight) # knapsack_ilp(items, maxweight) # #knapsack_numba(items, maxweight) # #knapsack_iterative_numpy(items, maxweight) # ‘’’).split(’n’)
# ut.util_dev.timeit_compare(stmt_list1, setup, int(5))
from xdev.algo import * # NOQA def knapsack_inputs(n):
rng = np.random items = [
(rng.randint(0, 10), rng.randint(0, 10), idx) for idx in range(n)]
return {‘items’: items, ‘maxweight’: 100}
- def input_wrapper(func):
import functools @functools.wraps(func) def _wrap(kwargs):
return func(**kwargs)
return _wrap
import perfplot perfplot.show(
setup=knapsack_inputs, kernels=list(map(input_wrapper, [knapsack_iterative, knapsack_ilp])), n_range=[2 ** k for k in range(15)], xlabel=”maxweight”, equality_check=None,
)
- xdev.algo.knapsack_ilp(items, maxweight, verbose=False)[source]¶
solves knapsack using an integer linear program
Example
>>> # DISABLE_DOCTEST >>> import ubelt as ub >>> # Solve https://xkcd.com/287/ >>> weights = [2.15, 2.75, 3.35, 3.55, 4.2, 5.8, 6.55] >>> values = [2.15, 2.75, 3.35, 3.55, 4.2, 5.8, 6.55] >>> indices = ['mixed fruit', 'french fries', 'side salad', >>> 'hot wings', 'mozzarella sticks', 'sampler plate', >>> 'barbecue'] >>> items = [(v, w, i) for v, w, i in zip(values, weights, indices)] >>> #items += [(3.95, 3.95, 'mystery plate')] >>> maxweight = 15.05 >>> verbose = True >>> total_value, items_subset = knapsack_ilp(items, maxweight, verbose) >>> print('items_subset = %s' % (ub.repr2(items_subset, nl=1),))
- xdev.algo.number_of_decimals(num)[source]¶
- Parameters:
num (float)
References
stackoverflow.com/questions/6189956/finding-decimal-places
Example
>>> # ENABLE_DOCTEST >>> num = 15.05 >>> result = number_of_decimals(num) >>> print(result) 2
- xdev.algo.knapsack_iterative_int(items, maxweight)[source]¶
Iterative knapsack method
- Math:
maximize sum_{i in T} v_i subject to sum_{i in T} w_i leq W
Notes
dpmat is the dynamic programming memoization matrix. dpmat[i, w] is the total value of the items with weight at most W T is idx_subset, the set of indicies in the optimal solution
Example
>>> # ENABLE_DOCTEST >>> weights = [1, 3, 3, 5, 2, 1] * 2 >>> items = [(w, w, i) for i, w in enumerate(weights)] >>> maxweight = 10 >>> items = [(.8, 700, 0)] >>> maxweight = 2000 >>> print('maxweight = %r' % (maxweight,)) >>> print('items = %r' % (items,)) >>> total_value, items_subset = knapsack_iterative_int(items, maxweight) >>> total_weight = sum([t[1] for t in items_subset]) >>> print('total_weight = %r' % (total_weight,)) >>> print('items_subset = %r' % (items_subset,)) >>> result = 'total_value = %.2f' % (total_value,) >>> print(result) total_value = 0.80
- xdev.algo.knapsack_iterative_numpy(items, maxweight)[source]¶
Iterative knapsack method
maximize sum_{i in T} v_i subject to sum_{i in T} w_i leq W
Notes
dpmat is the dynamic programming memoization matrix. dpmat[i, w] is the total value of the items with weight at most W T is the set of indicies in the optimal solution
- xdev.algo.knapsack_greedy(items, maxweight)[source]¶
non-optimal greedy version of knapsack algorithm does not sort input. Sort the input by largest value first if desired.
- Parameters:
`items` (tuple) – is a sequence of tuples (value, weight, id_), where value is a scalar and weight is a non-negative integer, and id_ is an item identifier.
`maxweight` (scalar) – is a non-negative integer.
Example
>>> # ENABLE_DOCTEST >>> items = [(4, 12, 0), (2, 1, 1), (6, 4, 2), (1, 1, 3), (2, 2, 4)] >>> maxweight = 15 >>> total_value, items_subset = knapsack_greedy(items, maxweight) >>> result = 'total_value = %r\n' % (total_value,) >>> result += 'items_subset = %r' % (items_subset,) >>> print(result) total_value = 7 items_subset = [(4, 12, 0), (2, 1, 1), (1, 1, 3)]