cdxcore.util#
Basic utilities for Python such as type management, formatting, some trivial timers.
Import#
import cdxcore.util as util
Documentation#
Module Attributes
Default map from characters which cannot be used for filenames under either Windows or Linux to valid characters. |
Functions
|
Return a formatted big byte string, e.g. 12.35MB. |
|
Return a formatted big number string, e.g. 12.35M instead of all digits. |
|
Returns string representation for a date of the form "YYYY-MM-DD". |
|
Convert datetime.datetime to a string of the form "YYYY-MM-DD HH:MM:SS". |
|
Return a readable representation of a dictionary. |
|
String representation of an integer with 1000 separators: 10000 becomes "10,000". |
|
Replaces invalid filename characters such as '\', ':', or '/' by a different character. The returned string is technically a valid file name under both Windows and Linux. |
|
Returns a formatted string of a list, its elements separated by commas and (by default) a final 'and'. |
|
Returns cdxcore.util.fmt_datetime() applied to datetime.datetime.now(). |
|
Generate format string for seconds, e.g. "23s" for seconds=23, or "1:10" for seconds=70. |
|
Converts a time to a string with format "HH:MM:SS". |
|
Returns string representation for a time delta in the form "DD:HH:MM:SS,MS". |
|
Approximates the size of an object. |
|
Whether an element is atomic. |
|
Tests whether a filename is indeed a valid filename. |
|
Checks whether a type is a float, which includes numpy floating types. |
|
Checks whether f is a function in an extended sense. |
|
Whether we operate in a Jupyter session. |
|
Converts a python structure into a simple atomic/list/dictionary collection such that it can be read without the specific imports used inside this program. |
|
Return qualified name including module name of some Python element. |
Returns a set of all |
Classes
|
Carriage Return ("\r") manager. |
|
Micro utility to measure passage of time. |
Simplistic class to track the time it takes to run sequential tasks. |
- class cdxcore.util.CRMan[source]#
Bases:
object
Carriage Return (”\r”) manager.
This class is meant to enable efficient per-line updates using “\r” for text output with a focus on making it work with both Jupyter and the command shell. In particular, Jupyter does not support the ANSI \33[2K ‘clear line’ code. To simulate clearing lines, CRMan keeps track of the length of the current line, and clears it by appending spaces to a message following “\r” accordingly.
This functionality does not quite work across all terminal types which were tested. The main focus for now is to make it work for Jupyter. Any feedback on how to make this more generically operational is welcome.
    crman = CRMan()
    print( crman("\rmessage 111111"), end='' )
    print( crman("\rmessage 2222"), end='' )
    print( crman("\rmessage 33"), end='' )
    print( crman("\rmessage 1\n"), end='' )
prints:
message 1
While
    print( crman("\rmessage 111111"), end='' )
    print( crman("\rmessage 2222"), end='' )
    print( crman("\rmessage 33"), end='' )
    print( crman("\rmessage 1"), end='' )
    print( crman("... and more.") )
prints:
message 1... and more.
- Attributes:
current
Return current string.
Methods
__call__(message)
Convert message containing "\r" and "\n" into a printable string which ensures that a "\r" string does not lead to printed artifacts.
reset()
Reset object.
write(text[, end, flush, channel])
Write to a channel,
- __call__(message)[source]#
Convert message containing “\r” and “\n” into a printable string which ensures that a “\r” string does not lead to printed artifacts. Afterwards, the object will retain any text not terminated by “\n”.
- Parameters:
- message: str
message containing “\r” and “\n”.
- Returns:
- Message: str
Printable string.
- property current: str#
Return current string.
This is the string that is currently visible to the user, i.e. the text CRMan has written since the last time a new line was printed.
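The line-clearing idea described for CRMan can be sketched without ANSI codes. The class `LineTracker` below is a hypothetical, simplified stand-in and not the library's implementation: it assumes that "\r" only ever appears at the start of a message, and it blanks the visible line with spaces before writing the new text.

```python
class LineTracker:
    """Hypothetical, simplified illustration of the idea behind CRMan."""

    def __init__(self) -> None:
        self.current = ""  # text currently visible on the active line

    def __call__(self, message: str) -> str:
        out = ""
        if message.startswith("\r"):
            # Overwrite the visible line with spaces, then return to column 0.
            out = "\r" + " " * len(self.current) + "\r"
            self.current = ""
            message = message[1:]
        out += message
        if "\n" in message:
            # Only the text after the last newline stays on the active line.
            self.current = message.rsplit("\n", 1)[1]
        else:
            self.current += message
        return out


tracker = LineTracker()
print(tracker("\rmessage 111111"), end="")
print(tracker("\rmessage 33"), end="")
print(tracker("... and more.\n"), end="")
```

Because the object remembers the length of the visible text, a shorter message following a longer one leaves no trailing characters behind.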
- cdxcore.util.DEF_FILE_NAME_MAP = {'*': '@', '/': '_', ':': ';', '<': '(', '>': ')', '?': '!', '\\': '_', '|': '_'}#
Default map from characters which cannot be used for filenames under either Windows or Linux to valid characters.
- class cdxcore.util.Timer[source]#
Bases:
object
Micro utility to measure passage of time.
Example:
    from cdxcore.util import Timer
    with Timer() as t:
        ...  # do something
    print(f"This took {t}.")
- Attributes:
fmt_seconds
Seconds elapsed since construction or cdxcore.util.Timer.reset(), formatted using cdxcore.util.fmt_seconds().
hours
Hours passed since construction or cdxcore.util.Timer.reset().
minutes
Minutes passed since construction or cdxcore.util.Timer.reset().
seconds
Seconds elapsed since construction or cdxcore.util.Timer.reset().
Methods
interval_test(interval)
Tests if interval seconds have passed.
reset()
Resets the timer.
- property fmt_seconds#
Seconds elapsed since construction or cdxcore.util.Timer.reset(), formatted using cdxcore.util.fmt_seconds().
- property hours: float#
Hours passed since construction or cdxcore.util.Timer.reset().
- interval_test(interval)[source]#
Tests if interval seconds have passed. If yes, reset timer and return True. Otherwise return False.
Usage:
    from cdxcore.util import Timer
    tme = Timer()
    for i in range(n):
        if tme.interval_test(2.):
            print(f"\r{i+1}/{n} done. Time taken so far {tme}.", end='', flush=True)
    print(f"\rDone. This took {tme}.")
- property minutes: float#
Minutes passed since construction or cdxcore.util.Timer.reset().
- property seconds: float#
Seconds elapsed since construction or cdxcore.util.Timer.reset().
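To illustrate how a timer with these properties can work, here is a minimal sketch built on time.perf_counter(). The class name `SketchTimer` is hypothetical, and the choice of clock and the context-manager behaviour are assumptions rather than a description of how cdxcore.util.Timer is implemented.

```python
import time


class SketchTimer:
    """Hypothetical minimal timer; not the cdxcore implementation."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        # Restart the clock.
        self._start = time.perf_counter()

    @property
    def seconds(self) -> float:
        return time.perf_counter() - self._start

    @property
    def minutes(self) -> float:
        return self.seconds / 60.0

    def interval_test(self, interval: float) -> bool:
        """Return True and restart the clock once `interval` seconds have passed."""
        if self.seconds >= interval:
            self.reset()
            return True
        return False

    def __enter__(self) -> "SketchTimer":
        self.reset()
        return self

    def __exit__(self, *exc) -> bool:
        return False  # do not swallow exceptions


with SketchTimer() as t:
    sum(range(100_000))
print(f"This took {t.seconds:.6f}s.")
```

Resetting inside interval_test is what makes it convenient for throttled progress output: the test fires at most once per interval.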
- class cdxcore.util.TrackTiming[source]#
Bases:
object
Simplistic class to track the time it takes to run sequential tasks.
Usage:
    from cdxcore.util import TrackTiming
    timer = TrackTiming()  # clock starts
    # do job 1
    timer += "Job 1 done"
    # do job 2
    timer += "Job 2 done"
    print( timer.summary() )
- Attributes:
tracked
Returns dictionary of tracked texts
Methods
Reset timer, and clear all tracked items
Reset the timer to current time
summary([fmat, jn_fmt])
Generate summary string by applying some formatting.
track(text, *args, **kwargs)
Track 'text', formatted with 'args' and 'kwargs'.
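The `timer += "label"` usage shown above relies on Python's in-place addition hook. The following sketch shows one way such an interface can be wired up; the class `SketchTracker`, its summary format, and the choice to record the time since the previous label are all assumptions for illustration and may differ from TrackTiming.

```python
import time


class SketchTracker:
    """Hypothetical illustration of label-based timing via `+=`."""

    def __init__(self) -> None:
        self._last = time.perf_counter()  # clock starts
        self.tracked = {}                 # label -> seconds since previous label

    def track(self, text: str) -> "SketchTracker":
        now = time.perf_counter()
        self.tracked[text] = now - self._last
        self._last = now
        return self

    def __iadd__(self, text: str) -> "SketchTracker":
        # `tracker += "label"` is routed to track(); returning self keeps
        # the name bound to the same object.
        return self.track(text)

    def summary(self) -> str:
        return ", ".join(f"{k}: {v:.3f}s" for k, v in self.tracked.items())


tracker = SketchTracker()
sum(range(50_000))
tracker += "Job 1 done"
sum(range(50_000))
tracker += "Job 2 done"
print(tracker.summary())
```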
- cdxcore.util.fmt_big_byte_number(byte_cnt, str_B=True)[source]#
Return a formatted big byte string, e.g. 12.35MB. Uses 1024 as base for KB.
Use cdxcore.util.fmt_big_number() for converting general numbers using 1000 blocks instead.
- Parameters:
- byte_cnt: int
Number of bytes.
- str_B: bool
If True, return "GB", "MB" and "KB" units. Moreover, if byte_cnt is less than 10KB, then this will add "bytes", e.g. "1024 bytes".
If False, return "G", "M" and "K" only, and do not add "bytes" to smaller byte_cnt.
- Returns:
- Text: str
String.
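As a rough illustration of base-1024 formatting with the str_B switch, consider the sketch below. The function name `sketch_fmt_bytes` is hypothetical, and the rounding to two decimals and the handling of values beyond gigabytes are assumptions; the actual output of fmt_big_byte_number may differ in such details.

```python
def sketch_fmt_bytes(byte_cnt: int, str_B: bool = True) -> str:
    """Illustrative base-1024 formatter; rounding and limits are assumptions."""
    if byte_cnt < 10 * 1024:
        # Small counts are shown as plain integers.
        return f"{byte_cnt} bytes" if str_B else str(byte_cnt)
    value = float(byte_cnt)
    unit = ""
    for unit in ("K", "M", "G"):
        value /= 1024.0
        if value < 1024.0:
            break
    return f"{value:.2f}{unit}" + ("B" if str_B else "")


print(sketch_fmt_bytes(1024))
print(sketch_fmt_bytes(20 * 1024))
print(sketch_fmt_bytes(3 * 1024**3, str_B=False))
```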
- cdxcore.util.fmt_big_number(number)[source]#
Return a formatted big number string, e.g. 12.35M instead of all digits.
Uses decimal system and “B” for billions. Use cdxcore.util.fmt_big_byte_number() for byte sizes, i.e. 1024 units.
- Parameters:
- number: int
Number to format.
- Returns:
- Text: str
String.
- cdxcore.util.fmt_date(dt)[source]#
Returns string representation for a date of the form “YYYY-MM-DD”.
If passed a datetime.datetime, it will format its datetime.datetime.date().
- cdxcore.util.fmt_datetime(dt, *, sep=':', ignore_ms=False, ignore_tz=True)[source]#
Convert datetime.datetime to a string of the form “YYYY-MM-DD HH:MM:SS”.
If present, microseconds are added as digits:
    YYYY-MM-DD HH:MM:SS,MICROSECONDS
Optionally a time zone is added via:
    YYYY-MM-DD HH:MM:SS+HH
    YYYY-MM-DD HH:MM:SS+HH:MM
Output is reduced accordingly if dt is a datetime.time or datetime.date.
- Parameters:
- dt: datetime.datetime, datetime.date, or datetime.time
Input.
- sep: str, optional
Separator for hours, minutes, seconds. The default ':' is most appropriate for visualization but is not suitable for filenames.
- ignore_ms: bool, optional
Whether to ignore microseconds. Default False.
- ignore_tz: bool, optional
Whether to ignore the time zone. Default True.
- Returns:
- Text: str
String.
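The role of sep can be illustrated with a small strftime-based sketch. The function `sketch_fmt_datetime` is hypothetical and covers only datetime.datetime inputs without time zones; in particular, appending the microseconds unpadded is an assumption, not a statement about the library's output.

```python
import datetime


def sketch_fmt_datetime(dt: datetime.datetime, sep: str = ":", ignore_ms: bool = False) -> str:
    """Illustrative only: format date and time with a configurable separator."""
    text = dt.strftime(f"%Y-%m-%d %H{sep}%M{sep}%S")
    if not ignore_ms and dt.microsecond:
        # Assumption: microseconds are appended as plain digits after a comma.
        text += f",{dt.microsecond}"
    return text


stamp = datetime.datetime(2024, 1, 2, 3, 4, 5)
print(sketch_fmt_datetime(stamp))            # for display
print(sketch_fmt_datetime(stamp, sep="-"))   # no ':' in the result, usable in filenames
```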
- cdxcore.util.fmt_dict(dct, *, sort=False, none='-', link='and')[source]#
Return a readable representation of a dictionary.
This assumes that the elements of the dictionary itself can be formatted well with str().
For a dictionary dict(a=1,b=2,c=3) this function will return "a: 1, b: 2, and c: 3".
- Parameters:
- dct: dict
The dictionary to format.
- sort: bool, optional
Whether to sort the keys. Default is False.
- none: str, optional
String to be used if dictionary is empty. Default is "-".
- link: str, optional
String to be used to link the last element to the previous string. Default is "and".
- Returns:
- Text: str
String.
- cdxcore.util.fmt_digits(integer, sep=',')[source]#
String representation of an integer with 1000 separators: 10000 becomes “10,000”.
- Parameters:
- integer: int
The number. The function applies int() to the input, which allows for processing a range of inputs (such as strings) but truncates floating point numbers.
- sep: str
Separator; "," by default.
- Returns:
- Text: str
String.
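The behaviour described here, including the truncation of floats by int(), can be reproduced with Python's built-in thousands formatting. The function `sketch_fmt_digits` is a hypothetical equivalent written for illustration, not the library's code.

```python
def sketch_fmt_digits(integer, sep: str = ",") -> str:
    """Illustrative: thousands separators via the ',' format specifier."""
    # int() accepts strings such as "1234567" and truncates floats such as 1234.9.
    return f"{int(integer):,}".replace(",", sep)


print(sketch_fmt_digits(10000))
print(sketch_fmt_digits("1234567", sep="'"))
print(sketch_fmt_digits(1234.9))
```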
- cdxcore.util.fmt_filename(filename, by='default')[source]#
Replaces invalid filename characters such as '\', ':', or '/' by a different character. The returned string is technically a valid file name under both Windows and Linux.
However, that does not prevent the filename from being a reserved name, for example “.” or “..”.
- Parameters:
- filename: str
Input string.
- by: str | Mapping, optional
A dictionary of characters and their replacement. The default value "default" leads to using cdxcore.util.DEF_FILE_NAME_MAP.
- Returns:
- Text: str
Filename.
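A character-by-character replacement of this kind can be sketched with str.translate. The map below is copied from the DEF_FILE_NAME_MAP value listed on this page; the helper names `sketch_fmt_filename` and `sketch_is_filename` are hypothetical, and the "default" string handling of the by parameter is left out.

```python
# Copied from the documented value of cdxcore.util.DEF_FILE_NAME_MAP.
DEF_FILE_NAME_MAP = {'*': '@', '/': '_', ':': ';', '<': '(', '>': ')',
                     '?': '!', '\\': '_', '|': '_'}


def sketch_fmt_filename(filename: str, by=DEF_FILE_NAME_MAP) -> str:
    """Illustrative: replace each invalid character by its mapped substitute."""
    return filename.translate(str.maketrans(dict(by)))


def sketch_is_filename(filename: str, by=DEF_FILE_NAME_MAP) -> bool:
    """Illustrative: True if no character of `filename` is listed in `by`."""
    return not any(ch in by for ch in filename)


raw = "report 12:30 a/b?.txt"
safe = sketch_fmt_filename(raw)
print(safe)
print(sketch_is_filename(raw), sketch_is_filename(safe))
```

As noted above, passing such a check does not rule out reserved names like "." or "..".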
- cdxcore.util.fmt_list(lst, *, none='-', link='and', sort=False)[source]#
Returns a formatted string of a list, its elements separated by commas and (by default) a final ‘and’.
If the list is [1,2,3] then the function will return "1, 2 and 3".
- Parameters:
- lst: list
The list() operator is applied to lst, so it will resolve dictionaries and generators.
- none: str, optional
String to be used when list is empty. Default is "-".
- link: str, optional
String to be used to connect the last item. Default is "and".
- sort: bool, optional
Whether to sort the list. Default is False.
- Returns:
- Text: str
String.
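The joining rule can be sketched in a few lines. The function `sketch_fmt_list` is hypothetical; sorting the items before converting them to strings is an assumption, and the sketch only aims to reproduce the "1, 2 and 3" example given above.

```python
def sketch_fmt_list(lst, none: str = "-", link: str = "and", sort: bool = False) -> str:
    """Illustrative: comma-separated items with a linking word before the last."""
    items = sorted(lst) if sort else list(lst)   # list() also resolves generators and dict keys
    items = [str(x) for x in items]
    if not items:
        return none
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + f" {link} " + items[-1]


print(sketch_fmt_list([1, 2, 3]))
print(sketch_fmt_list(["b", "a"], link="or", sort=True))
print(sketch_fmt_list([]))
```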
- cdxcore.util.fmt_now()[source]#
Returns cdxcore.util.fmt_datetime() applied to datetime.datetime.now().
- cdxcore.util.fmt_seconds(seconds, *, eps=1e-08)[source]#
Generate format string for seconds, e.g. “23s” for seconds=23, or “1:10” for seconds=70.
- Parameters:
- seconds: float
Seconds as a float.
- eps: float
Anything below eps is considered zero. Default 1E-8.
- Returns:
- Seconds: string
- cdxcore.util.fmt_time(dt, *, sep=':', ignore_ms=False)[source]#
Converts a time to a string with format “HH:MM:SS”.
Microseconds are added as digits:
    HH:MM:SS,MICROSECONDS
If passed a datetime.datetime, then this function will format only its datetime.datetime.time() part.
Time Zones
Note that while datetime.time objects may carry a tzinfo time zone object, the corresponding datetime.time.utcoffset() function returns None if we do not provide a dt parameter, see tzinfo documentation. That means datetime.time.utcoffset() is only useful if we have a datetime.datetime object at hand. That makes sense as a time zone can change the date as well.
We therefore do not allow dt to contain a time zone here.
Use cdxcore.util.fmt_datetime() for time zone support.
- Parameters:
- dt: datetime.time
Input.
- sep: str, optional
Separator for hours, minutes, seconds. The default ':' is most appropriate for visualization but is not suitable for filenames.
- ignore_ms: bool
Whether to ignore microseconds. Default is False.
- Returns:
- Text: str
String.
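The time zone remark can be checked with the standard library alone. The class `SketchZone` below is a hypothetical tzinfo whose offset depends on the date, which is how zones with daylight saving behave; zones with a fixed offset, such as datetime.timezone.utc, do return an offset for datetime.time objects as well.

```python
import datetime


class SketchZone(datetime.tzinfo):
    """Hypothetical zone whose offset depends on the date."""

    def utcoffset(self, dt):
        if dt is None:
            return None  # without a date the offset cannot be determined
        # Toy rule: +1h from January to June, +2h otherwise.
        return datetime.timedelta(hours=1 if dt.month <= 6 else 2)

    def dst(self, dt):
        return None

    def tzname(self, dt):
        return "Sketch"


zone = SketchZone()
t = datetime.time(12, 0, tzinfo=zone)
d = datetime.datetime(2024, 7, 1, 12, 0, tzinfo=zone)
print(t.utcoffset())  # None: a bare time passes dt=None to the tzinfo
print(d.utcoffset())  # 2:00:00
```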
- cdxcore.util.fmt_timedelta(dt, *, sep='')[source]#
Returns string representation for a time delta in the form “DD:HH:MM:SS,MS”.
- Parameters:
- dt: datetime.timedelta
Timedelta.
- sep
Identifies the three separators: between days and hours, between hours, minutes and seconds, and before the microseconds:
    DD*HH*MM*SS*MS
      0  1  1  2
sep can be a string, in which case:
If it is an empty string, all separators are ''.
A single character will be reused for all separators.
If the string has length 2, then the last character is used for '2'.
If the string has length 3, then the characters are used accordingly.
sep can also be a collection, i.e. a tuple or list. In this case each element is used accordingly.
- Returns:
- Text: str
String with leading sign. Returns “” if timedelta is 0.
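The rules for sep can be summarised as a small resolution step that always yields three separators. The helpers `sketch_expand_sep` and `sketch_join` are hypothetical; for a two-character string the sketch assumes that the first character serves positions 0 and 1, which the text above does not state explicitly.

```python
def sketch_expand_sep(sep) -> tuple:
    """Illustrative: resolve `sep` into the separators for positions 0, 1 and 2."""
    if isinstance(sep, str):
        if len(sep) == 0:
            return ("", "", "")
        if len(sep) == 1:
            return (sep, sep, sep)
        if len(sep) == 2:
            # Assumption: the first character covers positions 0 and 1.
            return (sep[0], sep[0], sep[1])
        if len(sep) == 3:
            return tuple(sep)
        raise ValueError("a string 'sep' must have at most 3 characters")
    seps = tuple(sep)
    if len(seps) != 3:
        raise ValueError("a collection 'sep' must have exactly 3 elements")
    return seps


def sketch_join(fields, sep) -> str:
    """Join already formatted (DD, HH, MM, SS, MS) strings using the layout above."""
    dd, hh, mm, ss, ms = fields
    s0, s1, s2 = sketch_expand_sep(sep)
    return f"{dd}{s0}{hh}{s1}{mm}{s1}{ss}{s2}{ms}"


fields = ("01", "02", "03", "04", "005")
print(sketch_join(fields, "d:,"))
print(sketch_join(fields, ["d ", ":", "."]))
```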
- cdxcore.util.getsizeof(obj)[source]#
Approximates the size of an object.
In addition to calling sys.getsizeof() this function also iterates embedded containers, numpy arrays, and pandas dataframes.
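The container traversal can be sketched with the standard library as follows. The function `sketch_getsizeof` is hypothetical and handles only built-in containers; support for numpy arrays and pandas dataframes, which the function above mentions, is deliberately omitted here.

```python
import sys


def sketch_getsizeof(obj, _seen=None) -> int:
    """Illustrative: sys.getsizeof() plus the sizes of contained objects."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # count shared or self-referencing objects only once
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(sketch_getsizeof(k, seen) + sketch_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(sketch_getsizeof(x, seen) for x in obj)
    return size


data = {"a": [1, 2, 3], "b": "text"}
print(sys.getsizeof(data), sketch_getsizeof(data))
```

sys.getsizeof() alone reports only the container itself, which is why the recursive total is larger for nested data.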
- cdxcore.util.is_atomic(o)[source]#
Whether an element is atomic.
Returns True if o is a string, int, float, datetime.date, bool, or a numpy.generic.
- cdxcore.util.is_filename(filename, by='default')[source]#
Tests whether a filename is indeed a valid filename.
- Parameters:
- filename: str
Supposed filename.
- by: str | Collection, optional
A collection of invalid characters. The default value "default" leads to using the keys of cdxcore.util.DEF_FILE_NAME_MAP.
- Returns:
- Validity: bool
True if filename does not contain any invalid characters contained in by.
- cdxcore.util.is_float(o)[source]#
Checks whether a type is a float, which includes numpy floating types.
- cdxcore.util.is_function(f)[source]#
Checks whether f is a function in an extended sense.
Check cdxcore.util.types_functions() for what is tested against. In particular is_function does not test positive for properties.
- cdxcore.util.plain(inn, *, sorted_dicts=False, native_np=False, dt_to_str=False)[source]#
Converts a python structure into a simple atomic/list/dictionary collection such that it can be read without the specific imports used inside this program.
For example, objects are converted into dictionaries of their data fields.
- Parameters:
- inn
Some object.
- sorted_dicts: bool, optional
Use SortedDicts instead of dicts. Since Python 3.7 all dictionaries preserve insertion order anyway.
- native_np: bool, optional
Convert numpy to Python natives.
- dt_to_str: bool, optional
Convert dates, times, and datetimes to strings.
- Returns:
- Textstr
Filename
- cdxcore.util.qualified_name(x, module=False)[source]#
Return qualified name including module name of some Python element.
For the most part, this function will try to getattr() the __qualname__ and __name__ of x or its type. If all of these fail, an attempt is made to convert type(x) into a string.
Class Properties
When reporting qualified names for a property(), there is a nuance: at class level, a property will be identified by its underlying function name. Once an object is created, though, the property will be identified by the return type of the property:
    class A(object):
        @property
        def p(self):
            return 1
    qualified_name(A.p)    # -> "A.p"
    qualified_name(A().p)  # -> "int"
- Parameters:
- x: any
Some Python element.
- module: bool, optional
Whether to also return the containing module if available.
- Returns:
- qualified name: str
The name, if module is False.
- (qualified name, module): tuple
The name and the module name, if module is True. Note that the module name returned might be "" if no module name could be determined.
- Raises:
RuntimeError if no qualified name for x or its type could be found.
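The property nuance described above follows from how Python evaluates attribute access, and can be reproduced with a short sketch. The function `sketch_qualified_name` is hypothetical: its lookup order (getter of a property first, then the object, then its type) is an assumption and only approximates the behaviour described for qualified_name.

```python
class A:
    @property
    def p(self) -> int:
        return 1


def sketch_qualified_name(x) -> str:
    """Hypothetical lookup order; the actual function may differ."""
    if isinstance(x, property):
        x = x.fget  # at class level, name the property after its getter
    for candidate in (x, type(x)):
        name = getattr(candidate, "__qualname__", None) or getattr(candidate, "__name__", None)
        if name is not None:
            return name
    raise RuntimeError(f"No qualified name found for {type(x)!r}")


print(sketch_qualified_name(A.p))    # the property object itself, named via its getter
print(sketch_qualified_name(A().p))  # the property has been evaluated: the value is an int
print(sketch_qualified_name(len))
```

On an instance, A().p is simply the returned value, so nothing of the property remains to be named; only the value's type is available.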