cdxcore.util#

Basic utilities for Python such as type management, formatting, and some trivial timers.

Import#

import cdxcore.util as util

Documentation#

Module Attributes

DEF_FILE_NAME_MAP

Default map from characters which cannot be used for filenames under either Windows or Linux to valid characters.

Functions

`fmt_big_byte_number`(byte_cnt[, str_B])	Return a formatted big byte string, e.g. 12.35MB.
`fmt_big_number`(number)	Return a formatted big number string, e.g. 12.35M instead of all digits.
`fmt_date`(dt)	Returns string representation for a date of the form "YYYY-MM-DD".
`fmt_datetime`(dt, *[, sep, ignore_ms, ignore_tz])	Convert `datetime.datetime` to a string of the form "YYYY-MM-DD HH:MM:SS".
`fmt_dict`(dct, *[, sort, none, link])	Return a readable representation of a dictionary.
`fmt_digits`(integer[, sep])	String representation of an integer with 1000 separators: 10000 becomes "10,000".
`fmt_filename`(filename[, by])	Replaces invalid filename characters such as '\', ':', or '/' by a different character. The returned string is technically a valid file name under both Windows and Linux.
`fmt_list`(lst, *[, none, link, sort])	Returns a formatted string of a list, its elements separated by commas and (by default) a final 'and'.
`fmt_now`()	Returns the `cdxcore.util.fmt_datetime()` applied to `datetime.datetime.now()`
`fmt_seconds`(seconds, *[, eps])	Generate format string for seconds, e.g. "23s" for `seconds=23`, or "1:10" for `seconds=70`.
`fmt_time`(dt, *[, sep, ignore_ms])	Converts a time to a string with format "HH:MM:SS".
`fmt_timedelta`(dt, *[, sep])	Returns string representation for a time delta in the form "DD:HH:MM:SS,MS".
`getsizeof`(obj)	Approximates the size of an object.
`is_atomic`(o)	Whether an element is atomic.
`is_filename`(filename[, by])	Tests whether a filename is indeed a valid filename.
`is_float`(o)	Checks whether a type is a `float` which includes numpy floating types
`is_function`(f)	Checks whether `f` is a function in an extended sense.
`is_jupyter`()	Whether we operate in a Jupyter session.
`plain`(inn, *[, sorted_dicts, native_np, ...])	Converts a python structure into a simple atomic/list/dictionary collection such that it can be read without the specific imports used inside this program.
`qualified_name`(x[, module])	Return qualified name including module name of some Python element.
`types_functions`()	Returns a set of all `types` considered functions

Classes

`CRMan`()	Carriage Return ("\r") manager.
`Timer`()	Micro utility to measure passage of time.
`TrackTiming`()	Simplistic class to track the time it takes to run sequential tasks.

class cdxcore.util.CRMan[source]#

Bases: object

Carriage Return (”\r”) manager.

This class is meant to enable efficient per-line updates using “\r” for text output with a focus on making it work with both Jupyter and the command shell. In particular, Jupyter does not support the ANSI \33[2K ‘clear line’ code. To simulate clearing lines, CRMan keeps track of the length of the current line, and clears it by appending spaces to a message following “\r” accordingly.

This functionality does not quite work across all terminal types that were tested. The main focus for now is to make it work in Jupyter. Any feedback on how to make this more generically operational is welcome.

crman = CRMan()
print( crman("\rmessage 111111"), end='' )
print( crman("\rmessage 2222"), end='' )
print( crman("\rmessage 33"), end='' )
print( crman("\rmessage 1\n"), end='' )

prints:

message 1

While

print( crman("\rmessage 111111"), end='' )
print( crman("\rmessage 2222"), end='' )
print( crman("\rmessage 33"), end='' )
print( crman("\rmessage 1"), end='' )
print( crman("... and more.") )

prints

message 1... and more.
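
The line-clearing idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `LineTracker` is a hypothetical name, it is not the cdxcore implementation, and it blanks the old line before rewriting it, which is a variation on appending spaces after the new message.

```python
import re


class LineTracker:
    """Hypothetical stand-in for the idea behind CRMan; not the cdxcore code."""

    def __init__(self):
        self.current = ""  # text currently visible on the open line

    def __call__(self, message: str) -> str:
        out = []
        # Split into plain text and the control characters "\r" and "\n".
        for part in re.split(r"(\r|\n)", message):
            if part == "\r":
                # Blank out the visible text with spaces, then return to column 0.
                pad = " " * len(self.current)
                out.append(("\r" + pad + "\r") if pad else "\r")
                self.current = ""
            elif part == "\n":
                # A new line ends the tracked text.
                out.append("\n")
                self.current = ""
            else:
                out.append(part)
                self.current += part
        return "".join(out)


tracker = LineTracker()
print(tracker("\rmessage 111111"), end="")
print(tracker("\rmessage 33"), end="")
print(tracker("\n"), end="")
```

Because the tracker remembers the length of the visible text, the shorter second message does not leave trailing characters of the first one behind, even where the ANSI clear-line code is unavailable.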

Attributes:

current: Return current string.

Methods

`__call__`(message)	Convert message containing "\r" and "\n" into a printable string which ensures that a "\r" string does not lead to printed artifacts.
`reset`()	Reset object.
`write`(text[, end, flush, channel])	Write to a `channel`.

__call__(message)[source]#

Convert message containing “\r” and “\n” into a printable string which ensures that a “\r” string does not lead to printed artifacts. Afterwards, the object will retain any text not terminated by “\n”.

Parameters:

message (str): Message containing “\r” and “\n”.

Returns:

Message (str): Printable string.

property current: str#

Return current string.

This is the string that is currently visible to the user, i.e. the text written since the last time a new line was printed.

reset()[source]#: Reset object.

write(text, end='', flush=True, channel=None)[source]#

Write to a channel.

Writes text to channel taking into account any current lines and any “\r” and “\n” contained in text. The end and flush parameters mirror those of print().

Parameters:

text (str): Text to print, containing “\r” and “\n”.
end, flush (optional): The end and flush parameters mirror those of print().
channel (Callable): Callable to output the residual text. If None, the default, use print() to write to stdout.

cdxcore.util.DEF_FILE_NAME_MAP = {'*': '@', '/': '_', ':': ';', '<': '(', '>': ')', '?': '!', '\\': '_', '|': '_'}#: Default map from characters which cannot be used for filenames under either Windows or Linux to valid characters.

class cdxcore.util.Timer[source]#

Bases: object

Micro utility to measure passage of time.

Example:

from cdxcore.util import Timer
with Timer() as t:
    ...  # do something
    print(f"This took {t}.")

Attributes:

fmt_seconds: Seconds elapsed since construction or cdxcore.util.Timer.reset(), formatted using cdxcore.util.fmt_seconds()
hours: Hours passed since construction or cdxcore.util.Timer.reset()
minutes: Minutes passed since construction or cdxcore.util.Timer.reset()
seconds: Seconds elapsed since construction or cdxcore.util.Timer.reset()

Methods

`interval_test`(interval)	Tests if `interval` seconds have passed.
`reset`()	Resets the timer.

property fmt_seconds#: Seconds elapsed since construction or cdxcore.util.Timer.reset(), formatted using cdxcore.util.fmt_seconds()

property hours: float#: Hours passed since construction or cdxcore.util.Timer.reset()

interval_test(interval)[source]#

Tests if interval seconds have passed. If yes, reset timer and return True. Otherwise return False.

Usage:

from cdxcore.util import Timer
tme = Timer()
for i in range(n):
    if tme.interval_test(2.):
        print(f"\r{i+1}/{n} done. Time taken so far {tme}.", end='', flush=True)
print(f"\rDone. This took {tme}.")
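
The test-and-reset behaviour can be illustrated with the standard library alone. The sketch below is an assumption about how such a method might work, not the cdxcore source; `IntervalTimer` is a hypothetical name, and whether the real comparison is strict is not stated in this documentation.

```python
import time


class IntervalTimer:
    """Hypothetical sketch of the test-and-reset pattern; not cdxcore.util.Timer."""

    def __init__(self):
        self._start = time.perf_counter()

    @property
    def seconds(self) -> float:
        # Seconds elapsed since construction or the last reset.
        return time.perf_counter() - self._start

    def interval_test(self, interval: float) -> bool:
        # If enough time has passed, restart the clock and report True.
        if self.seconds >= interval:
            self._start = time.perf_counter()
            return True
        return False


tme = IntervalTimer()
print(tme.interval_test(1000.0))  # far in the future: no reset
print(tme.interval_test(0.0))     # zero interval: always due
```

Resetting inside the test is what makes the pattern convenient for progress reporting: the caller asks one question per iteration and never has to manage the clock.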

property minutes: float#: Minutes passed since construction or cdxcore.util.Timer.reset()

reset()[source]#: Resets the timer.

property seconds: float#: Seconds elapsed since construction or cdxcore.util.Timer.reset()

class cdxcore.util.TrackTiming[source]#

Bases: object

Simplistic class to track the time it takes to run sequential tasks.

Usage:

from cdxcore.util import TrackTiming
timer = TrackTiming()   # clock starts

# do job 1
timer += "Job 1 done"

# do job 2
timer += "Job 2 done"

print( timer.summary() )

Attributes:

tracked: Returns dictionary of tracked texts

Methods

`reset_all`()	Reset timer, and clear all tracked items
`reset_timer`()	Reset the timer to current time
`summary`([fmat, jn_fmt])	Generate summary string by applying some formatting
`track`(text, *args, **kwargs)	Track 'text', formatted with 'args' and 'kwargs'

reset_all()[source]#: Reset timer, and clear all tracked items

reset_timer()[source]#: Reset the timer to current time

summary(fmat='%(text)s: %(fmt_seconds)s', jn_fmt=', ')[source]#

Generate summary string by applying some formatting

Parameters:

fmat (str, optional): Format string using %(). Arguments are text, seconds (as int) and fmt_seconds (a string). Default is "%(text)s: %(fmt_seconds)s".
jn_fmt (str, optional): String to be used between two texts. Default is ", ".

Returns:

Summary (str): The combined summary string.
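
To show how the two format parameters interact, the following sketch applies Python's %()-style formatting to each tracked item and joins the results. The sample records are made up for illustration, and the internal storage of TrackTiming is not assumed to look like this.

```python
# Default format strings as documented for summary().
fmat = "%(text)s: %(fmt_seconds)s"
jn_fmt = ", "

# Made-up sample data; each record offers the three documented arguments.
tracked = [
    {"text": "Job 1 done", "seconds": 12, "fmt_seconds": "12s"},
    {"text": "Job 2 done", "seconds": 70, "fmt_seconds": "1:10"},
]

# Format each record with the mapping form of the % operator, then join.
summary = jn_fmt.join(fmat % item for item in tracked)
print(summary)  # Job 1 done: 12s, Job 2 done: 1:10
```

A custom fmat such as "%(text)s took %(seconds)ds" would pick the integer seconds instead of the preformatted string.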

track(text, *args, **kwargs)[source]#: Track ‘text’, formatted with ‘args’ and ‘kwargs’

property tracked: list#: Returns dictionary of tracked texts

cdxcore.util.fmt_big_byte_number(byte_cnt, str_B=True)[source]#

Return a formatted big byte string, e.g. 12.35MB. Uses 1024 as base for KB.

Use cdxcore.util.fmt_big_number() for converting general numbers using 1000 blocks instead.

Parameters:

byte_cnt (int): Number of bytes.
str_B (bool): If True, return "GB", "MB" and "KB" units. Moreover, if byte_cnt is less than 10KB, then this will add "bytes", e.g. "1024 bytes". If False, return "G", "M" and "K" only, and do not add "bytes" to smaller byte_cnt.

Returns:

Text (str): String.
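
A rough sketch of base-1024 formatting along the lines described above. The helper name `big_bytes`, the two-decimal rounding, and the exact thresholds are assumptions made for illustration; the real function may differ in all of them.

```python
def big_bytes(byte_cnt: int, str_B: bool = True) -> str:
    """Hypothetical sketch; rounding and thresholds are assumptions."""
    # Small counts are printed in full, optionally with the word "bytes".
    if byte_cnt < 10 * 1024:
        return f"{byte_cnt} bytes" if str_B else str(byte_cnt)
    value = float(byte_cnt)
    unit = "K"
    for unit in ("K", "M", "G"):
        value /= 1024.0  # move up one unit, using 1024 as base
        if value < 1024.0:
            break
    return f"{value:.2f}{unit}" + ("B" if str_B else "")


print(big_bytes(1024))
print(big_bytes(12949913))
print(big_bytes(12949913, str_B=False))
```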

cdxcore.util.fmt_big_number(number)[source]#

Return a formatted big number string, e.g. 12.35M instead of all digits.

Uses decimal system and “B” for billions. Use cdxcore.util.fmt_big_byte_number() for byte sizes i.e. 1024 units.

Parameters:

number (int): Number to format.

Returns:

Text (str): String.

cdxcore.util.fmt_date(dt)[source]#

Returns string representation for a date of the form “YYYY-MM-DD”.

If passed a datetime.datetime, it will format its datetime.datetime.date().

cdxcore.util.fmt_datetime(dt, *, sep=':', ignore_ms=False, ignore_tz=True)[source]#

Convert datetime.datetime to a string of the form “YYYY-MM-DD HH:MM:SS”.

If present, microseconds are added as digits:

YYYY-MM-DD HH:MM:SS,MICROSECONDS

Optionally a time zone is added via:

YYYY-MM-DD HH:MM:SS+HH
YYYY-MM-DD HH:MM:SS+HH:MM

Output is reduced accordingly if dt is a datetime.time or datetime.date.

Parameters:

dt (datetime.datetime, datetime.date, or datetime.time): Input.
sep (str, optional): Separator for hours, minutes, seconds. The default ':' is most appropriate for visualization but is not suitable for filenames.
ignore_ms (bool, optional): Whether to ignore microseconds. Default False.
ignore_tz (bool, optional): Whether to ignore the time zone. Default True.

Returns:

Text (str): String.

cdxcore.util.fmt_dict(dct, *, sort=False, none='-', link='and')[source]#

Return a readable representation of a dictionary.

This assumes that the elements of the dictionary itself can be formatted well with str().

For a dictionary dict(a=1,b=2,c=3) this function will return "a: 1, b: 2, and c: 3".

Parameters:

dct (dict): The dictionary to format.
sort (bool, optional): Whether to sort the keys. Default is False.
none (str, optional): String to be used if dictionary is empty. Default is "-".
link (str, optional): String to be used to link the last element to the previous string. Default is "and".

Returns:

Text (str): String.
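
The documented example output can be reproduced with a short sketch. `dict_str` is a hypothetical helper, not the library function; in particular, how the real function treats one or two entries is an assumption here.

```python
def dict_str(dct, *, sort=False, none="-", link="and"):
    """Hypothetical sketch mirroring the documented example output."""
    keys = sorted(dct) if sort else list(dct)
    if not keys:
        return none  # placeholder for an empty dictionary
    items = [f"{k}: {dct[k]}" for k in keys]
    if len(items) == 1:
        return items[0]
    # Comma-separate all entries and link the last one, e.g. ", and c: 3".
    return ", ".join(items[:-1]) + f", {link} " + items[-1]


print(dict_str(dict(a=1, b=2, c=3)))  # a: 1, b: 2, and c: 3
```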

cdxcore.util.fmt_digits(integer, sep=',')[source]#

String representation of an integer with 1000 separators: 10000 becomes “10,000”.

Parameters:

integer (int): The number. The function will int() the input which allows for processing of a number of inputs (such as strings) but might cut off floating point numbers.
sep (str): Separator; "," by default.

Returns:

Text (str): String.
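
The effect of applying int() before formatting can be seen with Python's built-in "," format specifier. The helper `digits` is a hypothetical stand-in used only for this illustration.

```python
def digits(integer, sep=","):
    """Hypothetical sketch using Python's built-in thousands formatting."""
    text = f"{int(integer):,}"  # int() accepts strings and truncates floats
    return text if sep == "," else text.replace(",", sep)


print(digits(10000))                 # 10,000
print(digits("1234567", sep="'"))    # 1'234'567
print(digits(1234.9))                # 1,234
```

The last call shows the caveat mentioned above: the fractional part of a float is cut off rather than rounded.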

cdxcore.util.fmt_filename(filename, by='default')[source]#

Replaces invalid filename characters such as ‘\’, ‘:’, or ‘/’ by a different character. The returned string is technically a valid file name under both Windows and Linux.

However, that does not prevent the filename from being a reserved name, for example “.” or “..”.

Parameters:

filename (str): Input string.
by (str | Mapping, optional): A dictionary of characters and their replacement. The default value "default" leads to using cdxcore.util.DEF_FILE_NAME_MAP.

Returns:

Text (str): Filename.
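
A character map like the documented default can be applied with str.translate. The sketch below copies the map shown for cdxcore.util.DEF_FILE_NAME_MAP on this page; the helper names `clean_filename` and `has_only_valid_chars` are hypothetical and the library may implement the replacement differently.

```python
# Copy of the default map as documented on this page.
DEF_FILE_NAME_MAP = {'*': '@', '/': '_', ':': ';', '<': '(', '>': ')',
                     '?': '!', '\\': '_', '|': '_'}


def clean_filename(filename: str, by=DEF_FILE_NAME_MAP) -> str:
    """Hypothetical sketch: replace each mapped character by its substitute."""
    return filename.translate(str.maketrans(dict(by)))


def has_only_valid_chars(filename: str, by=DEF_FILE_NAME_MAP) -> bool:
    """Hypothetical sketch: True if no character of `by` occurs in filename."""
    return not any(ch in by for ch in filename)


name = clean_filename("report: 2024/05 <draft>?.txt")
print(name)  # report; 2024_05 (draft)!.txt
print(has_only_valid_chars(name))
```

As noted above, passing this character test does not rule out reserved names such as "." or "..".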

cdxcore.util.fmt_list(lst, *, none='-', link='and', sort=False)[source]#

Returns a formatted string of a list, its elements separated by commas and (by default) a final ‘and’.

If the list is [1,2,3] then the function will return "1, 2 and 3".

Parameters:

lst (list): The list() operator is applied to lst, so it will resolve dictionaries and generators.
none (str, optional): String to be used when list is empty. Default is "-".
link (str, optional): String to be used to connect the last item. Default is "and".
sort (bool, optional): Whether to sort the list. Default is False.

Returns:

Text (str): String.
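
A minimal sketch that reproduces the documented example. `list_str` is a hypothetical helper written for this illustration; it is not the library code.

```python
def list_str(lst, *, none="-", link="and", sort=False):
    """Hypothetical sketch mirroring the documented example output."""
    # list()/sorted() resolve generators and dictionaries (keys only).
    items = [str(x) for x in (sorted(lst) if sort else list(lst))]
    if not items:
        return none
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + f" {link} " + items[-1]


print(list_str([1, 2, 3]))                  # 1, 2 and 3
print(list_str(x * x for x in range(3)))    # 0, 1 and 4
```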

cdxcore.util.fmt_now()[source]#: Returns the cdxcore.util.fmt_datetime() applied to datetime.datetime.now()

cdxcore.util.fmt_seconds(seconds, *, eps=1e-08)[source]#

Generate format string for seconds, e.g. “23s” for seconds=23, or “1:10” for seconds=70.

Parameters:

seconds (float): Seconds as a float.
eps (float): Anything below eps is considered zero. Default 1E-8.

Returns:

Seconds (string)
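
The two documented examples can be reproduced with the sketch below. Everything beyond those two cases is an assumption: the hypothetical `seconds_str` rounds to whole seconds, prints "0s" below eps, and switches to "H:MM:SS" after an hour, none of which is confirmed by this documentation.

```python
def seconds_str(seconds: float, eps: float = 1e-8) -> str:
    """Hypothetical sketch; only the "23s" and "1:10" cases are documented."""
    if abs(seconds) < eps:
        return "0s"  # assumption: values below eps are shown as zero
    total = int(round(seconds))
    if total < 60:
        return f"{total}s"
    minutes, secs = divmod(total, 60)
    if minutes < 60:
        return f"{minutes}:{secs:02d}"
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"  # assumption for long durations


print(seconds_str(23))  # 23s
print(seconds_str(70))  # 1:10
```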

cdxcore.util.fmt_time(dt, *, sep=':', ignore_ms=False)[source]#

Converts a time to a string with format “HH:MM:SS”.

Microseconds are added as digits:

HH:MM:SS,MICROSECONDS

If passed a datetime.datetime, then this function will format only its datetime.datetime.time() part.

Time Zones

Note that while datetime.time objects may carry a tzinfo time zone object, the corresponding datetime.time.utcoffset() function returns None if we do not provide a dt parameter, see tzinfo documentation. That means datetime.time.utcoffset() is only useful if we have a datetime.datetime object at hand. That makes sense as a time zone can change the date as well.

We therefore here do not allow dt to contain a time zone.

Use cdxcore.util.fmt_datetime() for time zone support.
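
The following standard-library sketch illustrates the point about date-dependent offsets. `SeasonalZone` is a made-up time zone defined only for this example; fixed-offset zones such as datetime.timezone.utc behave differently and do report an offset for a bare time.

```python
from datetime import date, datetime, time, timedelta, tzinfo


class SeasonalZone(tzinfo):
    """Made-up zone: UTC+2 from April to September, UTC+1 otherwise."""

    def utcoffset(self, dt):
        if dt is None:
            return None  # without a date the offset cannot be determined
        return timedelta(hours=2 if 4 <= dt.month <= 9 else 1)

    def dst(self, dt):
        return None

    def tzname(self, dt):
        return "SEASONAL"


t = time(12, 0, tzinfo=SeasonalZone())
print(t.utcoffset())   # None: a bare time passes dt=None to the tzinfo

# Once a date is attached, the same zone yields a definite offset.
dt = datetime.combine(date(2024, 7, 1), t)
print(dt.utcoffset())  # 2:00:00
```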

Parameters:

dt (datetime.time): Input.
sep (str, optional): Separator for hours, minutes, seconds. The default ':' is most appropriate for visualization but is not suitable for filenames.
ignore_ms (bool): Whether to ignore microseconds. Default is False.

Returns:

Text (str): String.

cdxcore.util.fmt_timedelta(dt, *, sep='')[source]#

Returns string representation for a time delta in the form “DD:HH:MM:SS,MS”.

Parameters:

dt (datetime.timedelta): Timedelta.
sep: Identifies the three separators: between days and hours (0), between hours, minutes and seconds (1), and before the microseconds (2):

DD*HH*MM*SS*MS
  0  1  1  2

sep can be a string, in which case:
- If it is an empty string, all separators are ''.
- A single character will be reused for all separators.
- If the string has length 2, then the last character is used for '2'.
- If the string has length 3, then the characters are used accordingly.
sep can also be a collection, i.e. a tuple or list. In this case each element is used accordingly.
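
The separator rules above can be sketched as a small resolver that returns the separators for positions 0, 1 and 2. `resolve_separators` is a hypothetical helper; that a two-character string uses its first character for both 0 and 1, and that a collection must have exactly three elements, are assumptions.

```python
def resolve_separators(sep):
    """Hypothetical sketch: return the separators for positions (0, 1, 2)."""
    if isinstance(sep, str):
        if len(sep) == 0:
            return ("", "", "")
        if len(sep) == 1:
            return (sep, sep, sep)
        if len(sep) == 2:
            # Assumption: the first character serves positions 0 and 1.
            return (sep[0], sep[0], sep[1])
        if len(sep) == 3:
            return (sep[0], sep[1], sep[2])
        raise ValueError("a separator string must have at most 3 characters")
    parts = tuple(sep)
    if len(parts) != 3:
        raise ValueError("a separator collection must have 3 elements")
    return parts


print(resolve_separators(":,"))            # (':', ':', ',')
print(resolve_separators(["d ", ":", ","]))
```

A collection allows multi-character separators such as "d ", which a plain string cannot express.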

Returns:

Text (str): String with leading sign. Returns “” if timedelta is 0.

cdxcore.util.getsizeof(obj)[source]#

Approximates the size of an object.

In addition to calling sys.getsizeof() this function also iterates embedded containers, numpy arrays, and pandas DataFrames.
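
The need to iterate containers comes from sys.getsizeof() reporting only the container itself, not the objects it references. The sketch below shows the recursive idea for built-in containers only; `deep_sizeof` is a hypothetical helper, it ignores numpy and pandas objects, and it is not the cdxcore implementation.

```python
import sys


def deep_sizeof(obj, _seen=None) -> int:
    """Hypothetical sketch: sys.getsizeof() plus the contents of containers."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # count shared or self-referencing objects only once
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(x, seen) for x in obj)
    return size


data = {"a": [1, 2, 3], "b": "text"}
print(sys.getsizeof(data), deep_sizeof(data))
```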

cdxcore.util.is_atomic(o)[source]#

Whether an element is atomic.

Returns True if o is a string, int, float, datetime.date, bool, or a numpy.generic.

cdxcore.util.is_filename(filename, by='default')[source]#

Tests whether a filename is indeed a valid filename.

Parameters:

filename (str): Supposed filename.
by (str | Collection, optional): A collection of invalid characters. The default value "default" leads to using the keys of cdxcore.util.DEF_FILE_NAME_MAP.

Returns:

Validity (bool): True if filename does not contain any invalid characters contained in by.

cdxcore.util.is_float(o)[source]#: Checks whether a type is a float which includes numpy floating types

cdxcore.util.is_function(f)[source]#

Checks whether f is a function in an extended sense.

Check cdxcore.util.types_functions() for what is tested against. In particular is_function does not test positive for properties.

cdxcore.util.plain(inn, *, sorted_dicts=False, native_np=False, dt_to_str=False)[source]#

Converts a python structure into a simple atomic/list/dictionary collection such that it can be read without the specific imports used inside this program.

For example, objects are converted into dictionaries of their data fields.

Parameters:

inn: some object.
sorted_dicts (bool, optional): Use SortedDicts instead of dicts. Note that since Python 3.7 all dictionaries preserve insertion order.
native_np (bool, optional): Convert numpy to Python natives.
dt_to_str (bool, optional): Convert dates, times, and datetimes to strings.

Returns:

Plain: The converted structure, made up of atomic values, lists, and dictionaries.
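
The conversion idea can be sketched with the standard library. `to_plain` is a hypothetical helper that covers only built-in types and simple objects; using isoformat() for dates and skipping attributes with a leading underscore are assumptions, not documented behaviour of plain().

```python
import datetime


def to_plain(obj, dt_to_str=False):
    """Hypothetical sketch: reduce a structure to atomics, lists and dicts."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
        return obj.isoformat() if dt_to_str else obj
    if isinstance(obj, dict):
        return {k: to_plain(v, dt_to_str) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_plain(x, dt_to_str) for x in obj]
    if hasattr(obj, "__dict__"):
        # Objects become dictionaries of their public data fields.
        return {k: to_plain(v, dt_to_str)
                for k, v in vars(obj).items() if not k.startswith("_")}
    return obj


class Point:
    def __init__(self):
        self.x = 1
        self.tags = ("a", "b")
        self.when = datetime.date(2024, 1, 2)


print(to_plain(Point(), dt_to_str=True))
```

The result contains no reference to the Point class, so it can be serialized or read back without importing the code that defined it.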

cdxcore.util.qualified_name(x, module=False)[source]#

Return qualified name including module name of some Python element.

For the most part, this function will try to getattr() the __qualname__ and __name__ of x or its type. If all of these fail, an attempt is made to convert type(x) into a string.

Class Properties

When reporting qualified names for a property(), there is a nuance: at class level, a property will be identified by its underlying function name. Once an object is created, though, the property will be identified by the return type of the property:

class A(object):
    @property
    def p(self):
        return 1

qualified_name(A.p)    # -> "A.p"
qualified_name(A().p)  # -> "int"

Parameters:

x (any): Some Python element.
module (bool, optional): Whether to also return the containing module if available.

Returns:

qualified name (str): The name, if module is False.
(qualified name, module) (tuple): The name and the module name, if module is True. Note that the module name returned might be "" if no module name could be determined.

Raises:

RuntimeError if no qualified name for x or its type could be found.

cdxcore.util.types_functions()[source]#: Returns a set of all types considered functions
