cdxcore.config#
Tooling for setting up program-wide configuration hierarchies. Aimed at machine learning programs to ensure consistency of code across experimentation.
Overview#
Basic config construction:
from cdxcore.config import Config, Int
config = Config()
config.num_batches = 1000 # object-like assignment of config values
config.network.depth = 3 # on-the-fly hierarchy generation: here `network` becomes a sub-config
config.network.width = 100
...
def train(config):
    num_batches = config("num_batches", 10, Int>=2, "Number of batches. Must be at least 2")
    ...
Key features#
Detect misspelled parameters by checking that all parameters provided via a config by a user have been read.
Provide summary of all parameters used, including summary help for what they were for.
Nicer object attribute syntax than dictionary notation, in particular for nested configurations.
Automatic conversion including simple value validation to ensure user-provided values are within a given range or from a list of options.
Creating Configs#
Set data with both dictionary and member notation:
config = Config()
config['features'] = [ 'time', 'spot' ] # example array-type assignment
config.scaling = [ 1., 1000. ] # example object-type assignment
Reading a Config#
When reading the value for a key from a config, cdxcore.config.Config.__call__() expects a key, a default value, a cast type, and a brief help text.
The function first attempts to find key in the provided Config:
If key is found, it casts the value provided for key using the cast type and returns it.
If key is not found, then the default value will be returned (after also being cast using cast).
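The lookup-then-cast rule above can be mimicked in a few lines of plain Python. This is only a toy sketch: the helper `read_value` and the sentinel `_MISSING` are hypothetical names and are not part of cdxcore.

```python
# Toy stand-in for the lookup-then-cast rule; not the cdxcore implementation.
_MISSING = object()  # hypothetical sentinel meaning "no default given"

def read_value(data, key, default=_MISSING, cast=None):
    """Return data[key] if present, else default; apply cast in both cases."""
    if key in data:
        value = data[key]
    elif default is not _MISSING:
        value = default
    else:
        raise KeyError(key)
    return value if cast is None else cast(value)

user_input = {"num_batches": "1000"}
print(read_value(user_input, "num_batches", 10, int))  # key found, value cast to int
print(read_value(user_input, "depth", 3.0, int))       # key missing, default cast to int
```

Casting the default as well means the caller always receives the same type, whether or not the user supplied the key.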
Example:
from cdxcore.config import Config
import numpy as np
class Model(object):
    def __init__( self, config ):
        # read top level parameters
        self.features = config("features", [], list, "Features for the agent" )
        self.scaling = config("scaling", [], np.asarray, "Scaling for the features", help_default="no scaling")
model = Model( config )
Most of the example is self-explanatory, but note that the numpy.asarray provided as cast parameter for scaling means that any values passed by the user will be automatically converted to numpy.ndarray objects.
The help text parameter allows providing information on what variables are read from the config. The latter can be displayed using the function cdxcore.config.Config.usage_report(). (There are a number of further parameters to cdxcore.config.Config.__call__() to fine-tune this report, such as the help_default parameter used above.)
In the above example, print( config.usage_report() ) will print:
config['features'] = ['time', 'spot'] # Features for the agent; default: []
config['scaling'] = [ 1. 1000.] # Scaling for the features; default: no scaling
Sub-Configs#
You can write and read sub-configurations directly with member notation, without having to explicitly create an entry for the sub-config:
Assume as before:
config = Config()
config['features'] = [ 'time', 'spot' ]
config.scaling = [ 1., 1000. ]
Then create a network sub-configuration with member notation on the fly:
config.network.depth = 10
config.network.width = 100
config.network.activation = 'relu'
This is equivalent to:
config.network = Config()
config.network.depth = 10
config.network.width = 100
config.network.activation = 'relu'
Now use naturally as follows:
from cdxcore.config import Config, Int
import numpy as np
class Network(object):
    def __init__( self, config ):
        self.depth = config("depth", 1, Int>0, "Depth of the network")
        self.width = config("width", 1, Int>0, "Width of the network")
        self.activation = config("activation", "selu", str, "Activation function")
        config.done() # see below
class Model(object):
    def __init__( self, config ):
        # read top level parameters
        self.features = config("features", [], list, "Features for the agent" )
        self.scaling = config("scaling", [], np.asarray, "Scaling for the features", help_default="no scaling")
        self.networks = Network( config.network )
        config.done() # see below
model = Model( config )
Imposing Simple Restrictions on Values#
The cast parameter to cdxcore.config.Config.__call__() is a callable; this allows imposing simple restrictions on any values read from a config.
To this end, import the respective type operators:
from cdxcore.config import Int, Float
Implement a one-sided restriction:
# example enforcing simple conditions
self.width = network('width', 100, Int>3, "Width for the network")
Restrictions on both sides of a scalar:
# example enforcing two-sided conditions
self.percentage = network('percentage', 0.5, ( Float >= 0. ) & ( Float <= 1.), "A percentage")
Enforce the value being a member of a list:
# example ensuring a returned type is from a list
self.ntype = network('ntype', 'fastforward', ['fastforward','recurrent','lstm'], "Type of network")
We can allow a returned value to be one of several casting types by using tuples. The most common use case is that None is a valid value, too.
For example, assume that the name of the network model should be a string or None. This is implemented as:
# example allowing either None or a string
self.keras_name = network('name', None, (None, str), "Keras name of the network model")
We can combine conditional expressions with the tuple notation:
# example allowing either None or a positive int
self.batch_size = network('batch_size', None, (None, Int>0), "Batch size or None for TensorFlow's default 32", help_cast="Positive integer, or None")
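To illustrate how an expression such as `Int>3` can evaluate to a callable that both casts and validates, here is a minimal sketch based on operator overloading. The names `Condition`, `ToyInt` and `ToyFloat` are hypothetical, and the sketch makes no claim about how cdxcore.config implements Int and Float internally.

```python
# Toy sketch of range conditions built by operator overloading.
# All names are hypothetical; this is not the cdxcore implementation.
import operator

class Condition:
    """A callable that casts a value and then checks a list of (op, bound) tests."""
    def __init__(self, cast, tests=()):
        self.cast = cast
        self.tests = tuple(tests)

    def _with(self, op, bound):
        # Return a new condition with one more test appended.
        return Condition(self.cast, self.tests + ((op, bound),))

    def __gt__(self, bound): return self._with(operator.gt, bound)
    def __ge__(self, bound): return self._with(operator.ge, bound)
    def __lt__(self, bound): return self._with(operator.lt, bound)
    def __le__(self, bound): return self._with(operator.le, bound)

    def __and__(self, other):
        # Combine the tests of two conditions on the same type.
        return Condition(self.cast, self.tests + other.tests)

    def __call__(self, value):
        value = self.cast(value)
        for op, bound in self.tests:
            if not op(value, bound):
                raise ValueError(f"{value!r} violates {op.__name__} {bound!r}")
        return value

ToyInt = Condition(int)
ToyFloat = Condition(float)

positive = ToyInt > 0
percentage = (ToyFloat >= 0.) & (ToyFloat <= 1.)

print(positive("5"))     # cast to int, passes the test
print(percentage(0.25))  # lies within [0, 1]
```

Because the comparison returns a new callable rather than a boolean, the result can be passed wherever a cast function is expected.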
Ensuring that we had no Typos & that all provided Data is meaningful#
A common issue when using dictionary-based code configuration is that we might misspell one of the parameters. Unless this is a mandatory parameter we might not notice that we have not actually changed its value.
To check that all values of a config were read use cdxcore.config.Config.done(). It will alert you if there are keywords or children which have not been read.
Most likely, those will be typos. Consider the following example where width is misspelled in our config:
class Network(object):
    def __init__( self, config ):
        # read top level parameters
        self.depth = config("depth", 1, Int>=1, "Depth of the network")
        self.width = config("width", 3, Int>=1, "Width of the network")
        self.activation = config("activation", "relu", help="Activation function", help_cast="String with the function name, or function")
        config.done() # <-- test that all members of config were read
config = Config()
config.features = ['time', 'spot']
config.network.depth = 10
config.network.activation = 'relu'
config.network.widht = 100 # (intentional typo)
n = Network(config.network)
Since width was misspelled in setting up the config, a cdxcore.config.NotDoneError exception is raised:
NotDoneError: Error closing Config 'config.network': the following config arguments were not read: widht
Summary of all variables read from this object:
config.network['activation'] = relu # Activation function; default: relu
config.network['depth'] = 10 # Depth of the network; default: 1
config.network['width'] = 3 # Width of the network; default: 3
Note that you can also call cdxcore.config.Config.done() at top level:
class Network(object):
    def __init__( self, config ):
        # read top level parameters
        self.depth = config("depth", 1, Int>=1, "Depth of the network")
        self.width = config("width", 3, Int>=1, "Width of the network")
        self.activation = config("activation", "relu", help="Activation function", help_cast="String with the function name, or function")
config = Config()
config.features = ['time', 'spot']
config.network.depth = 10
config.network.activation = 'relu'
config.network.widht = 100 # (intentional typo)
n = Network(config.network)
test_features = config("features", [], list, "Features for my network")
config.done()
produces:
NotDoneError: Error closing Config 'config.network': the following config arguments were not read: widht
Summary of all variables read from this object:
config.network['activation'] = relu # Activation function; default: relu
config.network['depth'] = 10 # Depth of the network; default: 1
config.network['width'] = 3 # Width of the network; default: 3
#
config['features'] = ['time', 'spot'] # Features for my network; default: []
You can check the status of the use of the config by using the cdxcore.config.Config.not_done property.
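The idea behind this check can be shown with a small stand-alone sketch: remember every key that was requested, and compare against the keys the user supplied. The class `TrackedParams` is a hypothetical toy and not cdxcore code; it raises a plain RuntimeError rather than NotDoneError.

```python
# Toy sketch of read tracking; names are hypothetical and this is not cdxcore code.
class TrackedParams:
    def __init__(self, **values):
        self._values = dict(values)
        self._read = set()

    def __call__(self, key, default):
        self._read.add(key)  # remember that this key was requested
        return self._values.get(key, default)

    @property
    def not_done(self):
        # Keys the user supplied which no code ever asked for.
        return sorted(set(self._values) - self._read)

    def done(self):
        if self.not_done:
            raise RuntimeError(f"arguments not read: {', '.join(self.not_done)}")

params = TrackedParams(depth=10, widht=100)  # 'widht' is a typo
depth = params("depth", 1)
width = params("width", 3)                   # silently falls back to the default
print(params.not_done)                       # the misspelled key shows up here
```

Without the final check, the misspelled key would go unnoticed and the default width would be used.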
Detaching Child Configs#
You can also detach a child config, which allows you to store it for later use without triggering cdxcore.config.Config.done() errors:
def read_config( self, config ):
    ...
    self.config_training = config.training.detach()
    config.done()
The function cdxcore.config.Config.detach() will mark the original child, but not the detached child itself, as ‘done’.
Therefore, we will need to call cdxcore.config.Config.done() for the detached child when we have finished processing it:
def training(self):
    epochs = self.config_training("epochs", 100, int, "Epochs for training")
    batch_size = self.config_training("batch_size", None, help="Batch size. Use None for default of 32" )
    self.config_training.done()
Various Copy Operations#
When making a copy of a config we will need to decide about the semantics of the operation.
A cdxcore.config.Config object contains:
Inputs: the user’s input hierarchy. This is accessible via cdxcore.config.Config.children and cdxcore.config.Config.keys(). All copy operations share (and do not modify) the user’s input. See also cdxcore.config.Config.input_report().
Done Status: to check whether all parameters provided by the user are read by some code, the config keeps track of which parameters were read with cdxcore.config.Config.__call__(). This list is checked when cdxcore.config.Config.done() is called. The elements not yet read can be obtained using cdxcore.config.Config.not_done.
Consistency: a cdxcore.config.Config object makes sure that if a parameter is requested twice with cdxcore.config.Config.__call__(), then the respective default and help values are consistent between function calls. This avoids the typical divergence of code where one part of the code assumes a different default value than another. Recorded consistency information is accessible via cdxcore.config.Config.recorder.
Note that you can read a parameter “quietly” without recording any usage by using the [] operator.
Accordingly, when making a copy of self we need to determine the relationship of the copy to the three items above.
cdxcore.config.Config.detach(): use case is deferring usage of a config to a later point.
    Done status: self is marked as “done”; the copy is used to keep track of usage of the remaining parameters.
    Consistency: both self and the copy share the same consistency recorder.
cdxcore.config.Config.copy(): make an independent copy of the current status of self.
    Done status: the copy has an independent copy of the “done” status of self.
    Consistency: the copy has an independent copy of the consistency recorder of self.
cdxcore.config.Config.clean_copy(): make an independent copy of self, and reset all usage information.
    Done status: the copy has an empty “done” status.
    Consistency: the copy has an empty consistency recorder.
cdxcore.config.Config.shallow_copy(): make a shallow copy which shares all future usage tracking with self. The copy acts as a view on self. This is the semantic of the copy constructor.
    Done status: the copy and self share all “done” status; if a parameter is read with one, it is considered “done” by both.
    Consistency: the copy and self share all consistency handling. If a parameter is read with one with a given default and help, the other must use the same values when accessing the same parameter.
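The difference between the four operations comes down to whether the usage-tracking state is shared, copied, or reset. The toy model below captures only the “done” tracking (not the consistency recorder); `ToyConfig` and its attributes are hypothetical and do not reflect the internals of cdxcore.

```python
# Toy model of the four copy flavours; hypothetical names, not cdxcore code.
class ToyConfig:
    def __init__(self, inputs, read=None):
        self.inputs = inputs                             # shared by all copies, never modified
        self.read = set() if read is None else read      # keys read so far

    def __call__(self, key, default=None):
        self.read.add(key)
        return self.inputs.get(key, default)

    def shallow_copy(self):
        return ToyConfig(self.inputs, self.read)         # same set object: shared tracking

    def copy(self):
        return ToyConfig(self.inputs, set(self.read))    # independent snapshot of tracking

    def clean_copy(self):
        return ToyConfig(self.inputs)                    # tracking starts empty

    def detach(self):
        other = self.copy()                              # copy keeps the status at this moment
        self.read.update(self.inputs)                    # mark self as fully "done"
        return other

base = ToyConfig({"a": 1, "b": 2})
base("a")
view, snap, clean = base.shallow_copy(), base.copy(), base.clean_copy()
view("b")                                     # read 'b' through the shallow copy
print(sorted(base.read))                      # the read is visible on base as well
print(sorted(snap.read), sorted(clean.read))  # snapshot knows 'a' only; clean copy knows nothing
```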
Self-Recording All Available Configuration Parameters#
Once your program has run, you can read the summary of all values read, their defaults, and their help texts:
print( config.usage_report( with_cast=True ) )
Prints:
config.network['activation'] = relu # (str) Activation function for the network; default: relu
config.network['depth'] = 10 # (int) Depth for the network; default: 10000
config.network['width'] = 100 # (int>3) Width for the network; default: 100
config.network['percentage'] = 0.5 # (float>=0. and float<=1.) A percentage; default: 0.5
config.network['ntype'] = 'fastforward' # (['fastforward','recurrent','lstm']) Type of network; default: 'fastforward'
config.training['batch_size'] = None # () Batch size. Use None for default of 32; default: None
config.training['epochs'] = 100 # (int) Epochs for training; default: 100
config['features'] = ['time', 'spot'] # (list) Features for the agent; default: []
config['weights'] = [1 2 3] # (asarray) Weights for the agent; default: no initial weights
Unique Hash#
Another common use case is that we wish to cache the result of some complex operation. Assuming that the config describes all relevant parameters, and is therefore a valid ID for the data we wish to cache, we can use cdxcore.config.Config.unique_hash() to obtain a unique hash ID for the given config.
cdxcore.config.Config also implements the custom hashing protocol __unique_hash__ defined by cdxcore.uniquehash.UniqueHash, which means that if a Config is used during a hashing function from cdxcore.uniquehash the config will be hashed correctly.
A fully transparent caching framework which supports code versioning and transparent hashing of function parameters is implemented with cdxcore.subdir.SubDir.cache().
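As a generic illustration of why a content hash makes a usable cache ID, the sketch below hashes a nested parameter dictionary with the standard library. It is explicitly not the algorithm used by cdxcore.uniquehash; the helper `params_hash` is a hypothetical name and the approach assumes all values are JSON-serializable.

```python
# Generic sketch of a content hash for nested parameters.
# This is NOT the algorithm used by cdxcore.uniquehash; it only illustrates the idea.
import hashlib
import json

def params_hash(params: dict) -> str:
    # sort_keys makes the digest independent of key insertion order
    payload = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

a = {"network": {"depth": 10, "width": 100}, "features": ["time", "spot"]}
b = {"features": ["time", "spot"], "network": {"width": 100, "depth": 10}}
c = {"features": ["time", "spot"], "network": {"width": 101, "depth": 10}}

print(params_hash(a) == params_hash(b))  # same content, different key order
print(params_hash(a) == params_hash(c))  # one value changed
```

Two configurations with the same content map to the same ID, while any change in a parameter yields a different one, which is the property a cache key needs.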
Consistent **kwargs Handling#
The Config class can be used to improve **kwargs handling.
Assume we have:
def f(**kwargs):
    a = kwargs.get("difficult_name", 10)
    b = kwargs.get("b", 20)
We run the usual risk of a user misspelling a parameter name without us ever noticing. Therefore we may improve upon the above with:
def f(**kwargs):
    kwargs = Config(kwargs)
    a = kwargs("difficult_name", 10)
    b = kwargs("b", 20)
    kwargs.done()
If a user now calls f with, say, f(difficlt_name=5), an error will be raised.
A more advanced pattern is to allow both config and kwargs function parameters. In this case, the user can provide a config, specify its parameters directly, or do both:
def f( config=None, **kwargs):
    config = Config.config_kwargs(config,kwargs)
    a = config("difficult_name", 10, int)
    b = config("b", 20, int)
    config.done()
Any of the following function calls is now valid:
f( Config(difficult_name=11, b=21) ) # use a Config
f( difficult_name=12, b=22 ) # use kwargs
f( Config(difficult_name=11, b=21), b=22 ) # use both; kwargs overwrite config values
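The precedence rule (keyword arguments overwrite config values) and the leftover check can be reproduced with plain dictionaries. The helper `merge_params` is hypothetical and only mimics the behaviour described above; it does not use cdxcore.

```python
# Toy sketch of merging an optional mapping with **kwargs; hypothetical helper, not cdxcore code.
def merge_params(config=None, **kwargs):
    merged = dict(config or {})  # copy so the caller's mapping is left untouched
    merged.update(kwargs)        # keyword arguments overwrite config values
    return merged

def f(config=None, **kwargs):
    params = merge_params(config, **kwargs)
    a = params.pop("difficult_name", 10)
    b = params.pop("b", 20)
    if params:                   # anything left over was never read
        raise TypeError(f"unknown parameters: {sorted(params)}")
    return a, b

print(f({"difficult_name": 11, "b": 21}))        # mapping only
print(f(difficult_name=12, b=22))                # keyword arguments only
print(f({"difficult_name": 11, "b": 21}, b=22))  # both; the keyword wins for b
```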
Dataclasses#
dataclasses rely on default values of any member being “frozen” objects, which most user-defined objects and cdxcore.config.Config objects are not. This limitation applies as well to flax modules.
To use non-frozen default values, use the cdxcore.config.Config.as_field() function:
from cdxcore.config import Config, Int
from dataclasses import dataclass
@dataclass
class Data:
    data : Config = Config().as_field()
    def f(self):
        return self.data("x", 1, Int>0, "A positive integer")
d = Data() # default constructor used.
d.f()
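The underlying restriction can be reproduced with the standard library alone. In the sketch below, `Settings` is a hypothetical dict subclass standing in for a mutable config object; the second class uses dataclasses.field with a default_factory, which is the standard-library mechanism that a helper like as_field() builds on.

```python
# Stand-alone illustration using only the standard library; Settings is a hypothetical stand-in.
from dataclasses import dataclass, field

class Settings(dict):
    """Hypothetical mutable settings object standing in for a Config."""

try:
    @dataclass
    class Broken:
        data: Settings = Settings()  # mutable (unhashable) default is rejected
except ValueError as err:
    print("rejected:", err)

shared = Settings(x=2)

@dataclass
class Works:
    # The factory hands out a pre-built object instead of a literal default.
    data: Settings = field(default_factory=lambda: shared)

print(Works().data["x"])                    # value from the default object
print(Works(data=Settings(x=3)).data["x"])  # explicit argument overrides the default
```

Note that a factory which returns one pre-built object makes all default-constructed instances share that object.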
Import#
from cdxcore.config import Config
Documentation#
Module Attributes
Value indicating no default is available for a given parameter.
Allows to apply basic range conditions to …
Allows to apply basic range conditions to …
Functions
Default implementation for a usage pattern where the user can provide both a cdxcore.config.Config parameter and **kwargs.
Assess whether a parameter is a cdxcore.config.Config, and otherwise tries to convert it into one.
Classes
A simple Config class for hierarchical dictionary-like configurations but with type checking, detection of misspelled parameters, and simple built-in help.
Exceptions
Raised when …
Raised when …
Raised when …
- exception cdxcore.config.CastError(key, config_name, exception)[source]#
Bases: RuntimeError
Raised when cdxcore.config.Config.__call__() could not cast a value provided by the user to the specified type.
- Parameters:
- key : str
Key name of the parameter which failed to cast.
- config_name : str
Name of the Config.
- exception : Exception
Original exception raised by the cast.
- config_name#
Hierarchical name of the config.
- key#
Key of the parameter which failed to cast.
- class cdxcore.config.Config(*args, config_name=None, **kwargs)[source]#
Bases: OrderedDict
A simple Config class for hierarchical dictionary-like configurations but with type checking, detection of misspelled parameters, and simple built-in help.
See cdxcore.config for an extensive discussion of features.
- Parameters:
- *args : list
List of Mapping to iteratively create a new config with.
If the first element is a Config, and no other parameters are passed, then this object will be a shallow copy of that Config. It then shares all usage recording. See cdxcore.config.Config.shallow_copy().
- config_name : str, optional
Name of the configuration used in usage reports. Default is "config".
- **kwargs : dict
Additional key/value pairs to initialize the config with, e.g. Config(a=1, b=2).
- Attributes:
children: Dictionary of the child configs of self.
config_name: Qualified name of this config.
is_empty: Whether any parameters have been set, at parent level or at any child level.
not_done: Returns a dictionary of keys which were not read yet.
recorder: Returns the “recorder”, a sortedcontainers.SortedDict which contains key, default, cast, help, and all other function parameters for all calls of cdxcore.config.Config.__call__().
Methods
__call__(key[, default, cast, help, ...]): Reads a parameter key from the config subject to casting with cast.
as_dict([mark_done]): Convert self into a dictionary of dictionaries.
as_field(): This function provides support for dataclasses.dataclass fields with Config default values.
clean_copy(): Make a copy of self, and reset it to the original input state from the user.
clear(/)
config_kwargs(config, kwargs[, config_name]): Default implementation for a usage pattern where the user can provide both a cdxcore.config.Config parameter and **kwargs.
copy(): Return a fully independent copy of self.
delete_children(names): Delete one or several children from self.
detach(): Returns a copy of self, and sets self to "done".
done([include_children, mark_done]): Closes the config and checks that no unread parameters remain.
fromkeys(/, iterable[, value]): Create a new ordered dictionary with keys from iterable and values set to value.
get(*kargs, **kwargs): Returns cdxcore.config.Config.__call__(*kargs, **kwargs).
get_default(*kargs, **kwargs): Returns cdxcore.config.Config.__call__(*kargs, **kwargs).
get_raw(key[, default]): Reads the raw value for key without any casting, nor marking the element as read, nor recording access to the element.
get_recorded(key): Returns the cast value returned for key previously.
input_dict([ignore_underscore]): Returns a cdxcore.pretty.PrettyObject of all inputs into this config.
input_report([max_value_len]): Returns a report of all inputs in a readable format.
items(/): Return a set-like object providing a view on the dict's items.
keys(): Returns the keys for the immediate parameters of this config.
mark_done([include_children]): Mark all members as "done" (having been used).
move_to_end(/, key[, last]): Move an existing element to the end (or beginning if last is false).
pop(/, key[, default]): If the key is not found, return the default if given; otherwise, raise a KeyError.
popitem(/[, last]): Remove and return a (key, value) pair from the dictionary.
record_key(key): Returns the fully qualified string key for key.
reset(): Reset all usage information.
reset_done(): Reset the internal list of which parameters are "done" (used).
setdefault(/, key[, default]): Insert key with a value of default if key is not in the dictionary.
shallow_copy(): Return a shallow copy of self which shares all usage tracking with self going forward.
to_config(kwargs[, config_name]): Assess whether a parameter is a cdxcore.config.Config, and otherwise tries to convert it into one.
unique_hash(*[, unique_hash, debug_trace, ...]): Returns a unique hash key for this object - based on its provided inputs and not based on its usage.
update([other]): Overwrite values of self with new values. Accepts two main formats.
usage_report([with_values, with_help, ...]): Generate a human readable report of all variables read from this config.
Returns a string representation of current usage, calling repr() for each value.
Return a flat sorted dictionary of both "used" and, where not used, "input" values.
used_info(key): Returns the usage stats for a given key in the form of a tuple (done, record).
values(/): Return an object providing a view on the dict's values.
- __call__(key, default=<cdxcore.config._ID object>, cast=None, help=None, help_default=None, help_cast=None, mark_done=True, record=True)[source]#
Reads a parameter key from the config subject to casting with cast. If not found, returns default.
Examples:
config("key") # returns the value for 'key' or if not found raises an exception
config("key", 1) # returns the value for 'key' or if not found returns 1
config("key", 1, int) # if 'key' is not found, return 1. If it is found cast the result with int().
config("key", 1, int, "A number") # also stores an optional help text.
# Call usage_report() after the config has been read to get a full
# summary of all data requested from this config.
Use cdxcore.config.Int and cdxcore.config.Float to ensure a number is within a given range:
config("positive_int", 1, Int>=1, "A positive integer")
config("ranged_int", 1, (Int>=0)&(Int<=10), "An integer between 0 and 10, inclusive")
config("positive_float", 1, Float>0., "A positive float")
Choices are implemented with lists:
config("difficulty", 'easy', ['easy','medium','hard'], "Choose one")
Alternative types are implemented with tuples:
config("difficulty", None, (None, ['easy','medium','hard']), "None or a level of difficulty")
config("level", None, (None, Int>=0), "None or a non-negative level")
- Parameters:
- key : string
Keyword to read.
- default : optional
Default value. Set to cdxcore.config.Config.no_default for mandatory parameters without default. If key then cannot be found, a KeyError is raised.
- cast : Callable, optional
If None, any value provided by the user will be acceptable.
If not None, the function will attempt to cast the value provided by the user with cast(). For example, if cast = int, then the function will apply int(x) to the user’s input x.
This function also allows passing the following complex arguments:
A list, in which case it is assumed that the value for key must be from this list. The type of the first element of the list will be used to cast() values to the target type.
cdxcore.config.Int and cdxcore.config.Float allow defining constrained integers and floating point numbers, respectively.
A tuple of types, in which case any of the types is acceptable. A None here means that the value None is acceptable (it does not mean that any value is acceptable).
Any callable to validate a parameter.
- help : str, optional
If provided, adds a help text when self-documentation is used.
- help_default : str, optional
If provided, specifies the default value in plain text. If not provided, help_default is equal to the string representation of the default value, if any. Use this for complex default values which are hard to read.
- help_cast : str, optional
If provided, specifies a description of the cast type. If not provided, help_cast is set to the string representation of cast, or None if cast is None. Complex casts are supported. Use this for cast types which are hard to read.
- mark_done : bool, optional
If True, marks the respective element as read once the function has returned successfully.
- record : bool, optional
If True, records consistency usage of the key and validates that previous usage of the key is consistent with the current usage, e.g. that the default values are consistent and that if help was provided it is the same.
- Returns:
- Parameter value.
- Raises:
KeyError: If key could not be found.
ValueError: For input errors.
cdxcore.config.InconsistencyError: If key was previously accessed with different default, help, help_default or help_cast values. For all the help texts empty strings are not compared, i.e. __call__("x", default=1) will succeed even if a previous call was __call__("x", default=1, help="value for x"). Note that cast is not validated.
cdxcore.config.CastError: If an error occurs casting a provided value.
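The consistency rule for repeated reads (same default required, empty help texts not compared) can be sketched as follows. `Recorder` and `InconsistentUse` are hypothetical names; the sketch tracks only default and help, and is not the cdxcore recorder.

```python
# Toy consistency recorder; hypothetical names, not cdxcore code.
class InconsistentUse(RuntimeError):
    pass

class Recorder:
    def __init__(self):
        self._seen = {}  # key -> (default, help)

    def check(self, key, default, help=""):
        if key in self._seen:
            old_default, old_help = self._seen[key]
            if old_default != default:
                raise InconsistentUse(f"{key}: default {default!r} != {old_default!r}")
            # Empty help texts are not compared, mirroring the rule described above.
            if help and old_help and help != old_help:
                raise InconsistentUse(f"{key}: help text differs")
            help = help or old_help
        self._seen[key] = (default, help)

rec = Recorder()
rec.check("x", 1, "value for x")
rec.check("x", 1)      # fine: same default, help omitted
try:
    rec.check("x", 2)  # different default for the same key
except InconsistentUse as err:
    print("inconsistent:", err)
```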
- as_dict(mark_done=True)[source]#
Convert self into a dictionary of dictionaries.
- Parameters:
- mark_done : bool
If True, then all members of this config will be considered “done” upon return of this function.
- Returns:
- Dict : dict
Dictionary of dictionaries.
- as_field()[source]#
This function provides support for dataclasses.dataclass fields with Config default values.
When adding a field with a non-frozen default value to a @dataclass class, a default_factory has to be provided. The function as_field returns the corresponding dataclasses.Field element by returning simply:
def factory():
    return self
return dataclasses.field( default_factory=factory )
Usage is as follows:
from dataclasses import dataclass
@dataclass
class A:
    data : Config = Config(x=2).as_field()
a = A()
print(a.data['x']) # -> "2"
a = A(data=Config(x=3))
print(a.data['x']) # -> "3"
- property children: OrderedDict#
Dictionary of the child configs of self.
- clean_copy()[source]#
Make a copy of self, and reset it to the original input state from the user.
As an example, the following allows using different default values for config members of the same name:
base = Config()
_ = base('a', 1) # read a with default 1
copy = base.clean_copy() # copy has no record of 'a' having been used
# 'b' was not used yet
_ = base('b', 111) # read 'b' with default 111
_ = copy('b', 222) # read 'b' with default 222 -> ok
_ = copy('a', 2) # use 'a' with default 2 -> ok
Use cdxcore.config.Config.copy() for making a copy which tracks prior usage information.
See also the summary on various copy operations in cdxcore.config.
- static config_kwargs(config, kwargs, config_name='kwargs')[source]#
Default implementation for a usage pattern where the user can provide both a cdxcore.config.Config parameter and **kwargs.
Example:
def f(config=None, **kwargs):
    config = Config.config_kwargs( config, kwargs )
    ...
    x = config("x", 1, ...)
    config.done() # <-- important to do this here. Remember that config_kwargs() calls 'detach'
and then one can use either of the following:
f(Config(x=1))
f(x=1)
Important: config_kwargs calls cdxcore.config.Config.detach() to obtain a copy of config. This means cdxcore.config.Config.done() must be called explicitly for the returned object even if done() will be called elsewhere for the source config.
- Parameters:
- config : Config
A Config object or None.
- kwargs : Mapping
If config is provided, the function will call cdxcore.config.Config.update() with kwargs.
- config_name : str
A declarative name for the config if config is not provided.
- Returns:
- config : Config
A new config object. Please note that if config was provided, then this is a copy obtained from calling cdxcore.config.Config.detach(), which means that cdxcore.config.Config.done() must be called explicitly for this object to ensure no parameters were misspelled (it is not sufficient if cdxcore.config.Config.done() is called for config).
- copy()[source]#
Return a fully independent copy of self.
The copy has an independent copy of the “done” status of self.
The copy has an independent usage consistency status.
self will remain untouched. In particular, in contrast to cdxcore.config.Config.detach() it will not be set to “done”.
As an example, the following allows using different default values for config members of the same name:
base = Config()
_ = base('a', 1) # read a with default 1
copy = base.copy() # copy will know 'a' as used with default 1
# 'b' was not used yet
_ = base('b', 111) # read 'b' with default 111
_ = copy('b', 222) # read 'b' with default 222 -> ok
_ = copy('a', 2) # use 'a' with default 2 -> will fail
Use cdxcore.config.Config.clean_copy() for making a copy which discards any prior usage information.
See also the summary on various copy operations in cdxcore.config.
- delete_children(names)[source]#
Delete one or several children from self.
This function does not delete recorded consistency information (default and help recorded from prior uses of cdxcore.config.Config.__call__()).
- detach()[source]#
Returns a copy of self, and sets self to “done”.
The purpose of this function is to defer using a config (often a sub-config) to a later point, while maintaining consistency of usage.
The copy has the same “done” status as self at the time of calling detach().
The copy shares usage consistency checks with self, i.e. if the same parameter is read with different default or help values an error is raised.
The function flags self as “done” using cdxcore.config.Config.mark_done().
For example:
class Example(object):
    def __init__( self, config ):
        self.a = config('a', 1, Int>=0, "'a' value")
        self.later = config.later.detach() # detach sub-config
        self._cache = None
        config.done()
    def function(self):
        if self._cache is None:
            self._cache = Cache(self.later) # deferred use of the self.later config. Cache() calls done() on self.later
        return self._cache
See also the summary on various copy operations in cdxcore.config.
- Returns:
- copy : Config
A copy of self.
- done(include_children=True, mark_done=True)[source]#
Closes the config and checks that no unread parameters remain. This is used to detect typos in configuration files.
Raises a cdxcore.config.NotDoneError if there are unused parameters in self.
Consider this example:
config = Config()
config.a = 1
config.child.b = 2
_ = config.a # read a
child = config.child
config.done() # error because config.child.b has not been read yet
print( child.b )
This example raises an error because config.child.b was not read. If you wish to process the sub-config config.child later, use cdxcore.config.Config.detach():
config = Config()
config.a = 1
config.child.b = 2
_ = config.a # read a
child = config.child.detach()
config.done() # no error, even though config.child.b has not been read yet
print( child.b )
child.done() # need to call done() for the child
By default this function also validates that all child configs were “done”.
See Also
cdxcore.config.Config.mark_done() marks all parameters as “done” (used).
cdxcore.config.Config.reset_done() marks all parameters as “not done”.
cdxcore.config.Config.clean_copy() makes a copy of self without any usage information.
Introduction to the various copy operations in cdxcore.config.
- Parameters:
- include_children : bool
Validate child configs, too. Strongly recommended default.
- mark_done : bool
Upon completion mark this config as ‘done’. This stops it being modified; that also means subsequent calls to done() will be successful.
- Raises:
cdxcore.config.NotDoneError
If not all elements were read.
- get(*kargs, **kwargs)[source]#
Returns cdxcore.config.Config.__call__(*kargs, **kwargs).
- get_default(*kargs, **kwargs)[source]#
Returns cdxcore.config.Config.__call__(*kargs, **kwargs).
- get_raw(key, default=<cdxcore.config._ID object>)[source]#
Reads the raw value for key without any casting, nor marking the element as read, nor recording access to the element.
Equivalent to using cdxcore.config.Config.__call__(key, default, mark_done=False, record=False), which, without default, is in turn equivalent to self[key].
- get_recorded(key)[source]#
Returns the cast value returned for key previously.
If the parameter key was provided as part of the input data, this value is returned, subject to casting.
If key was not part of the input data, and a default was provided when the parameter was read with cdxcore.config.Config.__call__(), then return this default value, subject to casting.
- Raises:
KeyError: If the key was not previously read successfully.
- input_dict(ignore_underscore=True)[source]#
Returns a cdxcore.pretty.PrettyObject of all inputs into this config.
- input_report(max_value_len=100)[source]#
Returns a report of all inputs in a readable format. Assumes that str() converts all values into some readable format.
- Parameters:
- max_value_len : int
Limits the length of str() for each value to max_value_len characters. Set to None to not limit the length.
- Returns:
- Report : str
- property is_empty: bool#
Whether the config is empty, i.e. whether no parameters have been set at parent level or at any child level.
- keys()[source]#
Returns the keys for the immediate parameters of this config. This call will not return the names of child configs; use
cdxcore.config.Config.children
for those.
Use
cdxcore.config.Config.input_dict()
to obtain the full hierarchy of input parameters.
- no_default = <cdxcore.config._ID object>#
- property not_done: dict#
Returns a dictionary of keys which have not been read yet.
- Returns:
- not_done: dict
Dictionary of dictionaries: for value parameters, the respective entry is their
key
and
False
; for children the
key
is followed by their
not_done
dictionary.
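The nested layout described above can be illustrated with plain dictionaries. The helper `not_done_dict` below is a hypothetical sketch, not the cdxcore implementation; in particular, omitting children that were read completely is an assumption made here for brevity.

```python
def not_done_dict(inputs, read):
    """
    Build a nested 'not done' dictionary (illustrative only).

    `inputs` is a nested dict of user inputs; `read` is a nested dict with
    the same layout holding True for each key that has been read.
    """
    result = {}
    for key, value in inputs.items():
        if isinstance(value, dict):
            # child config: recurse, keep the child only if something is unread
            child = not_done_dict(value, read.get(key, {}))
            if child:
                result[key] = child
        elif not read.get(key, False):
            # value parameter which was not read: key -> False
            result[key] = False
    return result


inputs = {"a": 1, "child": {"b": 2, "c": 3}}
read   = {"a": True, "child": {"b": True}}
report = not_done_dict(inputs, read)
```

Here `report` contains a single entry for `child`, whose own dictionary flags `c` as unread.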
- record_key(key)[source]#
Returns the fully qualified string key for
key
.
It has the form
config1.config['entry']
.
- property recorder: SortedDict#
Returns the “recorder”, a
sortedcontainers.SortedDict
which contains
key
,
default
,
cast
,
help
, and all other function parameters for all calls of
cdxcore.config.Config.__call__()
. It is used to ensure consistency of parameter calls.
Use for debugging only.
- reset()[source]#
Reset all usage information.
Use
cdxcore.config.Config.reset_done()
to only reset the information whether a key was used, but to keep consistency information on previously used default and/or help values.
- reset_done()[source]#
Resets the internal record of which parameters are “done” (used).
Typically “done” means that a parameter has been read using
cdxcore.config.Config.__call__()
.
This function does not reset the consistency recording of previous uses of each key. This ensures consistency of default values between uses of keys. Use
cdxcore.config.Config.reset()
to reset both the “done” status and all usage records.
See also the summary of the various copy operations in
cdxcore.config
.
- shallow_copy()[source]#
Returns a shallow copy of
self
which shares all usage tracking with
self
going forward.
The copy shares the “done” status of
self
.
The copy shares all consistency usage status of
self
.
self
will not be flagged as ‘done’.
- static to_config(kwargs, config_name='kwargs')[source]#
Assesses whether a parameter is a
cdxcore.config.Config
, and otherwise tries to convert it into one. The classic use case is to transform
**kwargs
into a
cdxcore.config.Config
to allow type checking and prevent spelling errors.
- Returns:
- config: Config
If
kwargs
is already a
cdxcore.config.Config
it is returned. Otherwise, a new
cdxcore.config.Config
is created from
kwargs
and named using
config_name
.
- unique_hash(*, unique_hash=None, debug_trace=None, input_only=True, **unique_hash_parameters)[source]#
Returns a unique hash key for this object - based on its provided inputs and not based on its usage.
This function accepts either an existing
unique_hash
function, or parameters to construct one on the fly via
unique_hash_parameters
. That means instead of:
from cdxcore.uniquehash import UniqueHash
self.unique_hash( unique_hash=UniqueHash(**p) )
we can directly call:
self.unique_hash( **p )
The purpose of this function is to allow indexing results of heavy computations which were configured with
Config
with a simple hash key. A typical application is caching of results based on the relevant user-configuration.
An example for a simplistic cache:
from cdxcore.config import Config
import tempfile
import pickle

def big_function( cache_dir : str, config : Config = None, **kwargs ):
    assert cache_dir[-1] not in ["/","\\"], cache_dir
    config = Config.config_kwargs( config, kwargs )
    uid    = config.unique_hash(length=8)
    cfile  = f"{cache_dir}/{uid}.pck"

    # attempt to read cache
    try:
        with open(cfile, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        pass

    # do something big...
    result = config("a", 0, int, "Value 'a'") * 1000

    # write cache
    with open(cfile, "wb") as f:
        pickle.dump(result,f)
    return result

cache_dir = tempfile.mkdtemp()   # for real applications, use a permanent cache_dir.
_ = big_function( cache_dir = cache_dir, a=1 )
print(_)
A more sophisticated framework which includes code versioning via
cdxcore.version.version()
is implemented with
cdxcore.subdir.SubDir.cache()
.
Unique Hash Default Semantics
Please consult the documentation for
cdxcore.uniquehash.UniqueHash
before using this functionality; in particular note that by default this function ignores config keys or children with leading underscores; set
parse_underscore
to
"protected"
or
"private"
to change this behaviour.
Why is “Usage” not Considered when Computing the Hash (by Default)?
When using
Config
to configure our environment, we have not only the user’s input values but also the realized values in the form of defaults for those values the user has not provided. In most cases, these are the majority of values.
By only considering actual input values when computing a hash, we stipulate that defaults are not part of the current unique characteristic of the environment.
That seems inconsistent: consider a program which reads a parameter
activation
with default
relu
. The hash key will be different for the case where the user does not provide a value for
activation
, and the case where its value is set to
relu
by the user. The effective
activation
value in both cases is
relu
. Why would we not want this to be identified as the same environment configuration?
The following illustrates this dilemma:
def big_function( config ):
    _ = config("activation", "relu", str, "Activation function")
    config.done()

config = Config()
big_function( config )
print( config.unique_hash(length=8) )   # -> 36e9d246

config = Config(activation="relu")
big_function( config )
print( config.unique_hash(length=8) )   # -> d715e29c
Robustness
The key driver of using only input values for hashing is the prevalence of reading (child) configs close to the use of their parameters. That means that often config parameters are only read (and therefore their usage registered) if the respective computation is actually executed. Even the
big_function
example above shows this issue: the call
config("a", 0, int, "Value 'a'")
will only be executed if the cache could not be found.
This can be rectified if it is ensured that all config parameters are read regardless of actual executed code. In this case, set the parameter
input_only
for
unique_hash()
to
False
. Note that when using
cdxcore.config.Config.detach()
you must make sure to have processed all detached configurations before calling
unique_hash()
.
- Parameters:
- unique_hash_parameters: dict
If
unique_hash
is
None
these parameters are passed to
cdxcore.uniquehash.UniqueHash.__call__()
to obtain the corresponding hashing function.
- unique_hash: Callable
A function to return unique hashes, usually generated using
cdxcore.uniquehash.UniqueHash
.
- debug_trace: cdxcore.uniquehash.DebugTrace
Allows tracing of hashing activity for debugging purposes. Two implementations of
DebugTrace
are currently available:
cdxcore.uniquehash.DebugTraceVerbose
simply prints out hashing activity to stdout.
cdxcore.uniquehash.DebugTraceCollect
collects an array of tracing information. The object itself is an iterable which contains the respective tracing information once the hash function has returned.
- input_only: bool
Expert use only.
If True (the default) only user-provided inputs are used to compute the unique hash. If False, then the result of
cdxcore.config.Config.usage_value_dict()
is used to generate the hash. Make sure you read and understand the discussion above on the topic.
- Returns:
- Unique hash, str
A unique hash of at most the length specified via either
unique_hash
or
unique_hash_parameters
.
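The trade-off between hashing inputs only and hashing effective (“used”) values, as discussed for this function, can be reproduced with the standard library alone. This is a minimal sketch under stated assumptions: `short_hash` is a hypothetical helper built on `hashlib` and `json`, not the hashing used by `cdxcore.uniquehash`, and the config is reduced to flat dictionaries.

```python
import hashlib
import json


def short_hash(values, length=8):
    """Hash a flat dict of JSON-serialisable values (illustrative stand-in only)."""
    payload = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:length]


defaults = {"activation": "relu"}      # defaults the program applies when reading

inputs_a = {}                          # user provides nothing
inputs_b = {"activation": "relu"}      # user spells out the default explicitly

# input-only hashing: the two configurations hash differently
hash_inputs_a = short_hash(inputs_a)
hash_inputs_b = short_hash(inputs_b)

# hashing effective values (defaults overlaid with inputs): both hash the same
hash_used_a = short_hash({**defaults, **inputs_a})
hash_used_b = short_hash({**defaults, **inputs_b})
```

The second variant only works because `defaults` is known up front. In a real program the defaults become known only once every parameter has actually been read, which is the robustness concern that motivates input-only hashing as the default.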
- update(other=None, **kwargs)[source]#
Overwrites values of
self
with new values. Accepts the following formats:
update( dictionary )
update( config )
update( a=1, b=2 )
update( {'x.a':1 } )   # hierarchical assignment self.x.a = 1
- Parameters:
- other: dict, Config
Copy all content of
other
into
self
.
If
other
is a config: elements will be clean_copy()ed;
other
will not be marked as “read”.
If
other
is a dictionary, then ‘.’ notation can be used for hierarchical assignments.
- **kwargs
Allows assigning specific values.
- Returns:
- self: Config
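The ‘.’ notation for hierarchical assignment can be pictured as expanding dotted keys into nested levels. The function `expand_dotted` below is a hypothetical illustration using plain dictionaries; it does not reproduce how cdxcore stores child configs, and it does not handle a key that is used both as a value and as a parent.

```python
def expand_dotted(flat):
    """Turn {'x.a': 1, 'b': 2} into {'x': {'a': 1}, 'b': 2} (illustrative only)."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for name in parents:
            # create intermediate levels on the fly, similar to sub-configs
            node = node.setdefault(name, {})
        node[leaf] = value
    return nested


nested = expand_dotted({"x.a": 1, "x.y.b": 2, "c": 3})
```

After this call `nested["x"]["a"]` is `1` and `nested["x"]["y"]["b"]` is `2`, mirroring the `self.x.a = 1` form of assignment.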
- usage_report(with_values=True, with_help=True, with_defaults=True, with_cast=False, filter_path=None)[source]#
Generates a human-readable report of all variables read from this config.
- Parameters:
- with_values: bool, optional
Whether to also print values. This can be hard to read if values are complex objects.
- with_help: bool, optional
Whether to print help texts.
- with_defaults: bool, optional
Whether to print default values.
- with_cast: bool, optional
Whether to print types.
- filter_path: str, optional
If provided, will match the beginning of the fully qualified path of all children vs this string. Most useful with
filter_path = self.config_name
which ensures only children of this (child) config are shown.
- Returns:
- Report: str
- usage_reproducer()[source]#
Returns a string representation of current usage, calling
repr()
for each value.
- usage_value_dict()[source]#
Returns a flat sorted dictionary of both “used” and, where not used, “input” values.
A “used” value has either been read from user input or was provided as a default. In both cases, it will have been subject to casting.
This function will raise a
RuntimeError
in either of the following two cases:
A key was marked as “done” (read), but no “value” was recorded at that time. A simple example is when
cdxcore.config.Config.detach()
was called to create a child config, but that config has not yet been read.
A key has not been read yet, but there is a record of a value being returned. An example of this happening is if
cdxcore.config.Config.reset_done()
is called.
- cdxcore.config.Float = <cdxcore.config._CastCond object>#
Allows applying basic range conditions to
float
parameters.
For example:
timeout = config("timeout", 0.5, Float>=0., "Timeout")
In combination with
&
we can limit a float to a range:
probability = config("probability", 0.5, (Float>=0.) & (Float <= 1.), "Probability")
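Expressions such as `Float>=0.` work because Python lets an object overload its comparison and `&` operators. The following is a minimal sketch of that technique; the class `Cond` and the name `MyFloat` are hypothetical and are not the cdxcore `_CastCond` implementation.

```python
class Cond:
    """Hypothetical cast-with-conditions object (illustrative only)."""
    def __init__(self, cast, checks=()):
        self.cast = cast
        self.checks = tuple(checks)     # tuples of (description, predicate)

    def _with(self, text, predicate):
        return Cond(self.cast, self.checks + ((text, predicate),))

    # comparison operators return a new condition object instead of a bool
    def __ge__(self, bound): return self._with(f">= {bound}", lambda x: x >= bound)
    def __gt__(self, bound): return self._with(f"> {bound}",  lambda x: x > bound)
    def __le__(self, bound): return self._with(f"<= {bound}", lambda x: x <= bound)
    def __lt__(self, bound): return self._with(f"< {bound}",  lambda x: x < bound)

    def __and__(self, other):
        # '&' merges the checks of both sides
        return Cond(self.cast, self.checks + other.checks)

    def __call__(self, value):
        # cast first, then validate every recorded condition
        value = self.cast(value)
        for text, predicate in self.checks:
            if not predicate(value):
                raise ValueError(f"value {value} violates condition '{text}'")
        return value


MyFloat = Cond(float)
probability = (MyFloat >= 0.) & (MyFloat <= 1.)

ok = probability("0.25")       # cast from str, then range check
try:
    probability(1.5)
    failed = False
except ValueError:
    failed = True
```

The parentheses around each comparison matter: in Python `&` binds more tightly than `>=` and `<=`, so without them the expression would be parsed as a chained comparison around `0. & MyFloat`.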
- exception cdxcore.config.InconsistencyError(key, config_name, message)[source]#
Bases:
RuntimeError
Raised when
cdxcore.config.Config.__call__()
is used inconsistently between function calls for a given parameter.
The
Config
semantics require that parameters are accessed with consistent default and help values between
cdxcore.config.Config.__call__()
calls.
For raw access to any parameters, use
[]
.
- config_name#
Hierarchical name of the config.
- key#
The offending parameter key.
- cdxcore.config.Int = <cdxcore.config._CastCond object>#
Allows applying basic range conditions to
int
parameters.
For example:
num_steps = config("num_steps", 1, Int>0, "Number of steps")
In combination with
&
we can limit an int to a range:
bus_days_per_year = config("bus_days_per_year", 255, (Int > 0) & (Int < 365), "Business days per year")
- exception cdxcore.config.NotDoneError(not_done, config_name, message)[source]#
Bases:
RuntimeError
Raised when
cdxcore.config.Config.done()
finds that some config parameters have not been read.
The set of those arguments is accessible via
cdxcore.config.NotDoneError.not_done
.
- config_name#
Hierarchical name of the config.
- not_done#
The parameter keys which were not read when
cdxcore.config.Config.done()
was called.
- cdxcore.config.config_kwargs(config, kwargs, config_name='kwargs')#
Default implementation for a usage pattern where the user can provide both a
cdxcore.config.Config
parameter and
**kwargs
.
Example:
def f(config=None, **kwargs):
    config = Config.config_kwargs( config, kwargs )
    ...
    x = config("x", 1, ...)
    config.done()   # <-- important to do this here. Remember that config_kwargs() calls 'detach'
and then one can use either of the following:
f(Config(x=1))
f(x=1)
Important:
config_kwargs
calls
cdxcore.config.Config.detach()
to obtain a copy of
config
. This means
cdxcore.config.Config.done()
must be called explicitly for the returned object even if
done()
will be called elsewhere for the source
config
.
- Parameters:
- config: Config
A
Config
object or
None
.
- kwargs: Mapping
If
config
is provided, the function will call
cdxcore.config.Config.update()
with
kwargs
.
- config_name: str
A declarative name for the config if
config
is not provided.
- Returns:
- config: Config
A new config object. Please note that if
config
was provided, then this is a copy obtained from calling
cdxcore.config.Config.detach()
, which means that
cdxcore.config.Config.done()
must be called explicitly for this object to ensure no parameters were misspelled (it is not sufficient if
cdxcore.config.Config.done()
is called for
config
).
- cdxcore.config.no_default = <cdxcore.config._ID object>#
Value indicating no default is available for a given parameter.
- cdxcore.config.to_config(kwargs, config_name='kwargs')#
Assesses whether a parameter is a
cdxcore.config.Config
, and otherwise tries to convert it into one. The classic use case is to transform
**kwargs
into a
cdxcore.config.Config
to allow type checking and prevent spelling errors.
- Returns:
- config: Config
If
kwargs
is already a
cdxcore.config.Config
it is returned. Otherwise, a new
cdxcore.config.Config
is created from
kwargs
and named using
config_name
.