xdev.cli.dirstats module

class xdev.cli.dirstats.DirectoryStatsCLI(*args, **kwargs)

Bases: DataConfig

Analysis for code in a repository

CommandLine

python ~/code/xdev/xdev/cli/repo_stats.py .

Valid options: []

Parameters:

*args – positional arguments for this data config
**kwargs – keyword arguments for this data config

classmethod _register_main(func)

default = {'block_dnames': <Value(None)>, 'block_fnames': <Value(None)>, 'dpath': <Value('.')>, 'ignore_dotprefix': <Value(True)>, 'include_dnames': <Value(None)>, 'include_fnames': <Value(None)>, 'max_display_depth': <Value(None)>, 'max_files': <Value(None)>, 'max_walk_depth': <Value(None)>, 'parse_content': <Value(True)>, 'python': <Value(False)>, 'verbose': <Value(0)>, 'version': <Value(False)>}

main(**kwargs)

Example

>>> # xdoctest: +SKIP
>>> cmdline = 0
>>> kwargs = dict(dpath='module:watch')
>>> main(cmdline=cmdline, **kwargs)

normalize()

xdev.cli.dirstats.main(cmdline=1, **kwargs)

Example

>>> # xdoctest: +SKIP
>>> cmdline = 0
>>> kwargs = dict(dpath='module:watch')
>>> main(cmdline=cmdline, **kwargs)

xdev.cli.dirstats._null_coerce(cls, arg, **kwargs)

class xdev.cli.dirstats.DirectoryWalker(dpath, block_dnames=None, block_fnames=None, include_dnames=None, include_fnames=None, max_walk_depth=None, max_files=None, parse_content=False, show_progress=True, ignore_empty_dirs=False, **kwargs)

Bases: object

Configurable directory walker that can explore a directory and report information about its contents in a concise manner.

The chosen options affect how long the walk takes, because they determine how much data and metadata must be read from the filesystem.

Parameters:

dpath (str | PathLike) – the path to walk
block_dnames (Coercable[MultiPattern]) – blocks directory names matching this pattern
block_fnames (Coercable[MultiPattern]) – blocks file names matching this pattern
include_dnames (Coercable[MultiPattern]) – if specified, excludes directories that do NOT match this pattern.
include_fnames (Coercable[MultiPattern]) – if specified, excludes files that do NOT match this pattern.
max_files (None | int) – ignore all files in directories that contain more than this number of files.
max_walk_depth (None | int) – maximum directory depth to recurse into
parse_content (bool) – if True, include content analysis
**kwargs – passed to label options
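
The block and include parameters above can be illustrated with a small standalone sketch. This is not xdev's implementation: `walk_filtered` is a hypothetical helper built on `os.walk` and `fnmatch`, glob patterns stand in for `MultiPattern`, and treating depth 0 as "the root directory only" is an assumption about how a depth limit might behave.

```python
import fnmatch
import os


def walk_filtered(dpath, block_dnames=(), include_fnames=None, max_walk_depth=None):
    """Yield file paths under ``dpath`` (hypothetical helper, not xdev's API)."""
    dpath = os.fspath(dpath)
    root_depth = dpath.rstrip(os.sep).count(os.sep)
    for root, dnames, fnames in os.walk(dpath):
        depth = root.rstrip(os.sep).count(os.sep) - root_depth
        # Prune blocked directories in place so os.walk never descends into them.
        dnames[:] = sorted(
            d for d in dnames
            if not any(fnmatch.fnmatch(d, pat) for pat in block_dnames)
        )
        # Assumption: depth 0 is the root itself, so a limit of 0 stops here.
        if max_walk_depth is not None and depth >= max_walk_depth:
            dnames[:] = []
        for fname in sorted(fnames):
            # With no include patterns every file passes; otherwise one must match.
            if include_fnames is None or any(
                fnmatch.fnmatch(fname, pat) for pat in include_fnames
            ):
                yield os.path.join(root, fname)
```

For instance, `list(walk_filtered('.', block_dnames=['.*'], include_fnames=['*.py']))` would list Python files while skipping dot-prefixed directories. Pruning the directory list in place is what keeps blocked subtrees from being visited at all, which is where most of the time savings come from.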

write_network_text(**kwargs)

write_report(**nxtxt_kwargs)

build()

_inplace_filter_dnames(dnames)

_inplace_filter_fnames(fnames)

_walk()

property file_paths

property dir_paths

_accum_stats()

_update_stats()

_humanize_stats(stats, node_type, reduce_prefix=False)

_find_duplicate_files()

_update_path_metadata()

_update_labels()

Update how each node will be displayed.

_sort()

xdev.cli.dirstats.parse_file_stats(fpath, parse_content=True)

Get information about a file, including the number of code lines and documentation lines when that information is available.

xdev.cli.dirstats.strip_comments_and_newlines(source)

Removes hash (#) comments from the given source.

Parameters:

source (str | List[str])

Todo

It would be better if this were a configurable minify API.

Example

>>> from xdev.cli.dirstats import strip_comments_and_newlines
>>> import ubelt as ub
>>> fmtkw = dict(sss=chr(39) * 3, ddd=chr(34) * 3)
>>> source = ub.codeblock(
>>>    '''
       # comment 1
       a = '# not a comment'  # comment 2

       multiline_string = {ddd}

       one

       {ddd}
       b = [
           1,  # foo

           # bar
           3,
       ]
       c = 3
       ''').format(**fmtkw)

>>> non_comments = strip_comments_and_newlines(source)
>>> print(non_comments)
>>> assert non_comments.count(chr(10)) == 10
>>> assert non_comments.count('#') == 1
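
One way to tell real comments apart from `#` characters inside string literals is to let the standard `tokenize` module find them. The sketch below is an illustration under that assumption, not xdev's implementation; `strip_hash_comments` is a hypothetical name, it handles only the comment-removal half of the job, and it leaves blank lines in place.

```python
import io
import tokenize


def strip_hash_comments(source):
    """Remove ``#`` comments while leaving string contents untouched.

    Hypothetical helper for illustration; blank lines are kept as they are.
    """
    lines = source.split("\n")
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start  # rows are 1-based, columns 0-based
            # Cut the line at the comment start and drop trailing spaces.
            lines[row - 1] = lines[row - 1][:col].rstrip()
    return "\n".join(lines)
```

Because the tokenizer reports `'# not a comment'` as a STRING token rather than a COMMENT, the hash inside the literal survives, which is the same property the doctest above checks with `count('#') == 1`.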

xdev.cli.dirstats.strip_docstrings(tokens)

Replace docstring tokens with NL tokens in a tokenize stream.

Any STRING token not part of an expression is deemed a docstring. Indented docstrings are not yet recognised.
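
The rule "a STRING token that is not part of an expression" can be made concrete with a short `tokenize` sketch. This is an illustration of the rule only, not the function documented here: `find_bare_string_statements` is a hypothetical helper that reports line numbers instead of rewriting the token stream, and because it skips INDENT and DEDENT tokens it also flags indented strings, which the documented function is said not to recognise yet.

```python
import io
import tokenize

# Tokens that carry no statement content and can be skipped when looking
# for the first and next meaningful token of a logical line.
_SKIP = {tokenize.NL, tokenize.COMMENT, tokenize.INDENT, tokenize.DEDENT}


def find_bare_string_statements(source):
    """Return start lines of STRING tokens that form a whole statement."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    found = []
    at_statement_start = True
    for idx, tok in enumerate(tokens):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NEWLINE:
            # A NEWLINE token ends a logical line; the next token starts one.
            at_statement_start = True
            continue
        if tok.type == tokenize.STRING and at_statement_start:
            following = next(
                (t for t in tokens[idx + 1:] if t.type not in _SKIP), None
            )
            # The string is a bare statement only if the line ends right after it.
            if following is None or following.type in (
                tokenize.NEWLINE, tokenize.ENDMARKER
            ):
                found.append(tok.start[0])
        at_statement_start = False
    return found
```

A string on the right-hand side of an assignment, or one followed by an operator, is never at the start of a logical line followed directly by NEWLINE, so it is left alone.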

xdev.cli.dirstats.byte_str(num, unit='auto', precision=2)

Automatically chooses a relevant unit (KB, MB, GB, or TB) for displaying some number of bytes.

Parameters:

num (int) – number of bytes
unit (str) – which unit to use, can be auto, B, KB, MB, GB, or TB
precision (int) – number of decimal places to display

References

[WikiOrdersOfMag]

https://en.wikipedia.org/wiki/Orders_of_magnitude_(data)

Returns: string representing the number of bytes with appropriate units
Return type: str

Example

>>> import ubelt as ub
>>> num_list = [1, 100, 1024,  1048576, 1073741824, 1099511627776]
>>> result = ub.urepr(list(map(byte_str, num_list)), nl=0)
>>> print(result)
['0.00 KB', '0.10 KB', '1.00 KB', '1.00 MB', '1.00 GB', '1.00 TB']
>>> byte_str(10, unit='B')
'10.00 B'
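
The example output suggests binary (1024-based) units, with automatic selection never going below KB. The sketch below reproduces that behaviour under those two assumptions, both inferred from the output above rather than from the library's source; `byte_str_sketch` is a hypothetical stand-in, not the documented function.

```python
def byte_str_sketch(num, unit="auto", precision=2):
    """Format a byte count with 1024-based units (illustrative sketch only)."""
    factors = {
        "B": 1,
        "KB": 2 ** 10,
        "MB": 2 ** 20,
        "GB": 2 ** 30,
        "TB": 2 ** 40,
    }
    if unit == "auto":
        # Assumption from the documented output: "B" is never chosen
        # automatically, so 1 byte prints as a fraction of a KB.
        unit = "KB"
        for candidate in ("MB", "GB", "TB"):
            if abs(num) >= factors[candidate]:
                unit = candidate
    if unit not in factors:
        raise ValueError(f"unknown unit: {unit!r}")
    return f"{num / factors[unit]:.{precision}f} {unit}"
```

Under these assumptions 100 bytes is 100 / 1024 ≈ 0.0977 KB, which rounds to `'0.10 KB'` at the default precision, and passing `unit='B'` is the only way to get a plain byte count.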