xdev.cli.dirstats module

class xdev.cli.dirstats.DirectoryStatsCLI(*args, **kwargs)[source]

Bases: DataConfig

Analysis for code in a repository

CommandLine

python ~/code/xdev/xdev/cli/repo_stats.py .

Valid options: []

Parameters:
  • *args – positional arguments for this data config

  • **kwargs – keyword arguments for this data config

classmethod _register_main(func)[source]
default = {'block_dnames': <Value(None)>, 'block_fnames': <Value(None)>, 'dpath': <Value('.')>, 'ignore_dotprefix': <Value(True)>, 'include_dnames': <Value(None)>, 'include_fnames': <Value(None)>, 'max_display_depth': <Value(None)>, 'max_files': <Value(None)>, 'max_walk_depth': <Value(None)>, 'parse_content': <Value(True)>, 'python': <Value(False)>, 'verbose': <Value(0)>, 'version': <Value(False)>}
main(**kwargs)

Example

>>> # xdoctest: +SKIP
>>> cmdline = 0
>>> kwargs = dict(dpath='module:watch')
>>> main(cmdline=cmdline, **kwargs)
normalize()
xdev.cli.dirstats.main(cmdline=1, **kwargs)[source]

Example

>>> # xdoctest: +SKIP
>>> cmdline = 0
>>> kwargs = dict(dpath='module:watch')
>>> main(cmdline=cmdline, **kwargs)
xdev.cli.dirstats._null_coerce(cls, arg, **kwargs)[source]
class xdev.cli.dirstats.DirectoryWalker(dpath, block_dnames=None, block_fnames=None, include_dnames=None, include_fnames=None, max_walk_depth=None, max_files=None, parse_content=False, show_progress=True, ignore_empty_dirs=False, **kwargs)[source]

Bases: object

Configurable directory walker that can explore a directory and report information about its contents in a concise manner.

The options chosen affect how long the walk takes, depending on how much data and metadata must be parsed out of the filesystem.

Parameters:
  • dpath (str | PathLike) – the path to walk

  • block_dnames (Coercable[MultiPattern]) – blocks directory names matching this pattern

  • block_fnames (Coercable[MultiPattern]) – blocks file names matching this pattern

  • include_dnames (Coercable[MultiPattern]) – if specified, excludes directories that do NOT match this pattern.

  • include_fnames (Coercable[MultiPattern]) – if specified, excludes files that do NOT match this pattern.

  • max_files (None | int) – ignore all files in any directory that contains more than this number of files.

  • max_walk_depth (None | int) – how far to recurse

  • parse_content (bool) – if True, include content analysis

  • **kwargs – passed to label options
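The block / include options above can be pictured with a small standalone sketch. This is not the `DirectoryWalker` implementation: `walk_filtered` and `_matches` are hypothetical names, patterns are assumed to be plain glob strings handled by `fnmatch` (the real `MultiPattern` type may accept more), and only a subset of the parameters is modeled.

```python
import fnmatch
import os


def _matches(name, patterns):
    """Return True if name matches any glob pattern (hypothetical helper)."""
    return any(fnmatch.fnmatch(name, pat) for pat in patterns)


def walk_filtered(dpath, block_dnames=(), block_fnames=(),
                  include_fnames=None, max_files=None):
    """Yield file paths under dpath that survive the filters (illustrative only)."""
    root = os.path.abspath(dpath)
    for current, dnames, fnames in os.walk(root):
        # Prune in place so os.walk never descends into blocked directories.
        dnames[:] = sorted(d for d in dnames if not _matches(d, block_dnames))
        # Assumed semantics: an over-full directory contributes no files,
        # but its subdirectories are still visited.
        if max_files is not None and len(fnames) > max_files:
            continue
        for fname in sorted(fnames):
            if _matches(fname, block_fnames):
                continue
            if include_fnames is not None and not _matches(fname, include_fnames):
                continue
            yield os.path.join(current, fname)
```

For example, `list(walk_filtered('.', block_dnames=['.*'], include_fnames=['*.py']))` would list Python files while skipping dot-prefixed directories such as `.git`.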

write_network_text(**kwargs)[source]
write_report(**nxtxt_kwargs)[source]
build()[source]
_inplace_filter_dnames(dnames)[source]
_inplace_filter_fnames(fnames)[source]
_walk()[source]
property file_paths
property dir_paths
_accum_stats()[source]
_update_stats()[source]
_humanize_stats(stats, node_type, reduce_prefix=False)[source]
_find_duplicate_files()[source]
_update_path_metadata()[source]
_update_labels()[source]

Update how each node will be displayed

_sort()[source]
xdev.cli.dirstats.parse_file_stats(fpath, parse_content=True)[source]

Get information about a file, such as the number of code lines and documentation lines, when that information is available.

xdev.cli.dirstats.strip_comments_and_newlines(source)[source]

Removes hash (#) comments from the underlying source

Parameters:
  • source (str | List[str])

Todo

It would be better if this were some sort of configurable minify API.

Example

>>> from xdev.cli.dirstats import strip_comments_and_newlines
>>> import ubelt as ub
>>> fmtkw = dict(sss=chr(39) * 3, ddd=chr(34) * 3)
>>> source = ub.codeblock(
>>>    '''
       # comment 1
       a = '# not a comment'  # comment 2

       multiline_string = {ddd}

       one

       {ddd}
       b = [
           1, # foo

           # bar
           3,
       ]
       c = 3
       ''').format(**fmtkw)

>>> non_comments = strip_comments_and_newlines(source)
>>> print(non_comments)
>>> assert non_comments.count(chr(10)) == 10
>>> assert non_comments.count('#') == 1
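As a rough illustration of why a `#` inside a string literal survives in the example above, here is a minimal sketch that removes comments using the standard `tokenize` module. It is not the library's implementation: `drop_hash_comments` is a hypothetical name, and the sketch only strips comments, leaving blank lines untouched.

```python
import io
import tokenize


def drop_hash_comments(source):
    """Remove '#' comments from Python source text (illustrative sketch)."""
    # Split on '\n' only, so indices line up with tokenize's 1-based rows.
    lines = source.split('\n')
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start
            # A comment always runs to the end of its line, so truncate there.
            lines[row - 1] = lines[row - 1][:col].rstrip()
    return '\n'.join(lines)
```

Because the tokenizer classifies `'# not a comment'` as a STRING token rather than a COMMENT, the text inside it is never touched.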
xdev.cli.dirstats.strip_docstrings(tokens)[source]

Replace docstring tokens with NL tokens in a tokenize stream.

Any STRING token not part of an expression is deemed a docstring. Indented docstrings are not yet recognised.
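The rule "a STRING token not part of an expression" can be sketched with the standard `tokenize` module. This is only an illustration under stated assumptions: `find_standalone_strings` is a hypothetical helper that reports such tokens instead of replacing them with NL tokens, and unlike the documented function it also flags indented strings.

```python
import io
import tokenize


def find_standalone_strings(source):
    """Return STRING tokens that form a whole statement (illustrative sketch)."""
    ignorable = {tokenize.COMMENT, tokenize.NL}
    tokens = [
        tok for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type not in ignorable
    ]
    # Token types that can immediately precede the start of a new statement.
    line_starts = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}
    found = []
    for idx, tok in enumerate(tokens):
        if tok.type != tokenize.STRING:
            continue
        starts_statement = idx == 0 or tokens[idx - 1].type in line_starts
        # The stream always ends with ENDMARKER, so idx + 1 is in range.
        ends_statement = tokens[idx + 1].type == tokenize.NEWLINE
        if starts_statement and ends_statement:
            found.append(tok.string)
    return found
```

A string on the right-hand side of an assignment is preceded by an operator token, so it is not reported.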

xdev.cli.dirstats.byte_str(num, unit='auto', precision=2)[source]

Automatically chooses a relevant unit (KB, MB, GB, or TB) for displaying some number of bytes.

Parameters:
  • num (int) – number of bytes

  • unit (str) – which unit to use, can be auto, B, KB, MB, GB, or TB

Returns:

string representing the number of bytes with appropriate units

Return type:

str

Example

>>> import ubelt as ub
>>> num_list = [1, 100, 1024,  1048576, 1073741824, 1099511627776]
>>> result = ub.urepr(list(map(byte_str, num_list)), nl=0)
>>> print(result)
['0.00 KB', '0.10 KB', '1.00 KB', '1.00 MB', '1.00 GB', '1.00 TB']
>>> byte_str(10, unit='B')
'10.00 B'
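The unit selection can be sketched as follows. This is a minimal stand-in written to reproduce the doctest outputs shown above, not the library's code: `format_bytes` is a hypothetical name, and the binary (1024-based) thresholds and the KB floor for small values are inferred from the example.

```python
def format_bytes(num, unit='auto', precision=2):
    """Format a byte count with a unit suffix (illustrative sketch)."""
    # Assumption: units are powers of 1024, as the example output suggests.
    scale = {'B': 1, 'KB': 2 ** 10, 'MB': 2 ** 20, 'GB': 2 ** 30, 'TB': 2 ** 40}
    if unit == 'auto':
        # Pick the largest unit that keeps the value at or above 1,
        # but never go below KB (1 byte is shown as '0.00 KB').
        if abs(num) < 2 ** 20:
            unit = 'KB'
        elif abs(num) < 2 ** 30:
            unit = 'MB'
        elif abs(num) < 2 ** 40:
            unit = 'GB'
        else:
            unit = 'TB'
    if unit not in scale:
        raise ValueError(f'unknown unit: {unit!r}')
    return f'{num / scale[unit]:.{precision}f} {unit}'
```

With these assumptions, `format_bytes(1048576)` gives `'1.00 MB'` and `format_bytes(10, unit='B')` gives `'10.00 B'`, matching the example.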