xdev.algo module

xdev.algo.edit_distance(string1, string2)[source]

Edit distance algorithm. String1 and string2 can be either strings or lists of strings

Parameters:
  • string1 (str | List[str])

  • string2 (str | List[str])

Requirements:

pip install python-Levenshtein

Returns:

float | List[float] | List[List[float]]

Example

>>> # xdoctest: +REQUIRES(module:Levenshtein)
>>> string1 = 'hello world'
>>> string2 = ['goodbye world', 'rofl', 'hello', 'world', 'lowo']
>>> edit_distance(['hello', 'one'], ['goodbye', 'two'])
>>> edit_distance('hello', ['goodbye', 'two'])
>>> edit_distance(['hello', 'one'], 'goodbye')
>>> edit_distance('hello', 'goodbye')
>>> distmat = edit_distance(string1, string2)
>>> result = ('distmat = %s' % (ub.repr2(distmat),))
>>> print(result)
>>> [7, 9, 6, 6, 7]
xdev.algo.knapsack(items, maxweight, method='iterative')[source]

Solve the knapsack problem by finding the most valuable subsequence of items subject that weighs no more than maxweight.

Parameters:
  • items (tuple) – is a sequence of tuples (value, weight, id_), where value is a number and weight is a non-negative integer, and id_ is an item identifier.

  • maxweight (numbers.Real) – a non-negative number indicating the maximum weight of items we can take.

Returns:

(total_value, items_subset) - a pair whose first element is the

sum of values in the most valuable subsequence, and whose second element is the subsequence. Subset may be different depending on implementation (ie top-odwn recusrive vs bottom-up iterative)

Return type:

tuple

References

http://codereview.stackexchange.com/questions/20569/dynamic-programming-solution-to-knapsack-problem http://stackoverflow.com/questions/141779/solving-the-np-complete-problem-in-xkcd http://www.es.ele.tue.nl/education/5MC10/Solutions/knapsack.pdf

Example

>>> # ENABLE_DOCTEST
>>> import ubelt as ub
>>> items = [(4, 12, 0), (2, 1, 1), (6, 4, 2), (1, 1, 3), (2, 2, 4)]
>>> maxweight = 15
>>> total_value1, items_subset1 = knapsack(items, maxweight, method='iterative')
>>> print('items_subset1 = {}'.format(ub.repr2(items_subset1, nl=1)))
>>> print('total_value1 = {}'.format(ub.repr2(total_value1, nl=1)))

Example

>>> # ENABLE_DOCTEST
>>> # Solve https://xkcd.com/287/
>>> weights = [2.15, 2.75, 3.35, 3.55, 4.2, 5.8] * 2
>>> items = [(w, w, i) for i, w in enumerate(weights)]
>>> maxweight = 15.05
>>> total_value1, items_subset1 = knapsack(items, maxweight, method='iterative')
>>> total_weight = sum([t[1] for t in items_subset1])
>>> print('total_weight = %r' % (total_weight,))
>>> print('items_subset1 = %r' % (items_subset1,))
>>> #assert items_subset1 == items_subset, 'NOT EQ\n%r !=\n%r' % (items_subset1, items_subset)
Timeit:

# >>> import ubelt as ub # >>> setup = ub.codeblock( # >>> ‘’’ # import ubelt as ut # weights = [215, 275, 335, 355, 42, 58] * 40 # items = [(w, w, i) for i, w in enumerate(weights)] # maxweight = 2505 # #import numba # #knapsack_numba = numba.autojit(knapsack_iterative) # #knapsack_numba = numba.autojit(knapsack_iterative_numpy) # ‘’’) # >>> # Test load time # >>> stmt_list1 = ub.codeblock( # >>> ‘’’ # knapsack_iterative(items, maxweight) # knapsack_ilp(items, maxweight) # #knapsack_numba(items, maxweight) # #knapsack_iterative_numpy(items, maxweight) # ‘’’).split(’n’)

# ut.util_dev.timeit_compare(stmt_list1, setup, int(5))

from xdev.algo import * # NOQA def knapsack_inputs(n):

rng = np.random items = [

(rng.randint(0, 10), rng.randint(0, 10), idx) for idx in range(n)]

return {‘items’: items, ‘maxweight’: 100}

def input_wrapper(func):

import functools @functools.wraps(func) def _wrap(kwargs):

return func(**kwargs)

return _wrap

import perfplot perfplot.show(

setup=knapsack_inputs, kernels=list(map(input_wrapper, [knapsack_iterative, knapsack_ilp])), n_range=[2 ** k for k in range(15)], xlabel=”maxweight”, equality_check=None,

)

xdev.algo.knapsack_ilp(items, maxweight, verbose=False)[source]

solves knapsack using an integer linear program

Example

>>> # DISABLE_DOCTEST
>>> import ubelt as ub
>>> # Solve https://xkcd.com/287/
>>> weights = [2.15, 2.75, 3.35, 3.55, 4.2, 5.8, 6.55]
>>> values  = [2.15, 2.75, 3.35, 3.55, 4.2, 5.8, 6.55]
>>> indices = ['mixed fruit', 'french fries', 'side salad',
>>>            'hot wings', 'mozzarella sticks', 'sampler plate',
>>>            'barbecue']
>>> items = [(v, w, i) for v, w, i in zip(values, weights, indices)]
>>> #items += [(3.95, 3.95, 'mystery plate')]
>>> maxweight = 15.05
>>> verbose = True
>>> total_value, items_subset = knapsack_ilp(items, maxweight, verbose)
>>> print('items_subset = %s' % (ub.repr2(items_subset, nl=1),))
xdev.algo.number_of_decimals(num)[source]
Parameters:

num (float)

References

stackoverflow.com/questions/6189956/finding-decimal-places

Example

>>> # ENABLE_DOCTEST
>>> num = 15.05
>>> result = number_of_decimals(num)
>>> print(result)
2
xdev.algo.knapsack_iterative(items, maxweight)[source]
xdev.algo.knapsack_iterative_int(items, maxweight)[source]

Iterative knapsack method

Math:

maximize sum_{i in T} v_i subject to sum_{i in T} w_i leq W

Notes

dpmat is the dynamic programming memoization matrix. dpmat[i, w] is the total value of the items with weight at most W T is idx_subset, the set of indicies in the optimal solution

Example

>>> # ENABLE_DOCTEST
>>> weights = [1, 3, 3, 5, 2, 1] * 2
>>> items = [(w, w, i) for i, w in enumerate(weights)]
>>> maxweight = 10
>>> items = [(.8, 700, 0)]
>>> maxweight = 2000
>>> print('maxweight = %r' % (maxweight,))
>>> print('items = %r' % (items,))
>>> total_value, items_subset = knapsack_iterative_int(items, maxweight)
>>> total_weight = sum([t[1] for t in items_subset])
>>> print('total_weight = %r' % (total_weight,))
>>> print('items_subset = %r' % (items_subset,))
>>> result =  'total_value = %.2f' % (total_value,)
>>> print(result)
total_value = 0.80
xdev.algo.knapsack_iterative_numpy(items, maxweight)[source]

Iterative knapsack method

maximize sum_{i in T} v_i subject to sum_{i in T} w_i leq W

Notes

dpmat is the dynamic programming memoization matrix. dpmat[i, w] is the total value of the items with weight at most W T is the set of indicies in the optimal solution

xdev.algo.knapsack_greedy(items, maxweight)[source]

non-optimal greedy version of knapsack algorithm does not sort input. Sort the input by largest value first if desired.

Parameters:
  • `items` (tuple) – is a sequence of tuples (value, weight, id_), where value is a scalar and weight is a non-negative integer, and id_ is an item identifier.

  • `maxweight` (scalar) – is a non-negative integer.

Example

>>> # ENABLE_DOCTEST
>>> items = [(4, 12, 0), (2, 1, 1), (6, 4, 2), (1, 1, 3), (2, 2, 4)]
>>> maxweight = 15
>>> total_value, items_subset = knapsack_greedy(items, maxweight)
>>> result =  'total_value = %r\n' % (total_value,)
>>> result += 'items_subset = %r' % (items_subset,)
>>> print(result)
total_value = 7
items_subset = [(4, 12, 0), (2, 1, 1), (1, 1, 3)]