cdxcore.config#
Tooling for setting up program-wide configuration hierarchies. Aimed at machine learning programs to ensure consistency of code across experimentation.
Overview#
Basic config construction:
from cdxcore.config import Config, Int
config = Config()
config.num_batches = 1000 # object-like assignment of config values
config.network.depth = 3 # on-the-fly hierarchy generation: here `network` becomes a sub-config
config.network.width = 100
...
def train(config):
    num_batches = config("num_batches", 10, Int>=2, "Number of batches. Must be at least 2")
    ...
Key features#
Detect misspelled parameters by checking that all parameters provided via a config by a user have been read.
Provide summary of all parameters used, including summary help for what they were for.
Nicer object attribute syntax than dictionary notation, in particular for nested configurations.
Automatic conversion including simple value validation to ensure user-provided values are within a given range or from a list of options.
Creating Configs#
Set data with both dictionary and member notation:
config = Config()
config['features'] = [ 'time', 'spot' ] # example dictionary-type assignment
config.scaling = [ 1., 1000. ] # example object-type assignment
Reading a Config#
When reading the value for a key from a config, cdxcore.config.Config.__call__()
expects a key, a default value, a cast type, and a brief help text.
The function first attempts to find key in the provided Config:
If key is found, it casts the value provided for key using the cast type and returns it.
If key is not found, then the default value will be returned (after also being cast using cast).
Example:
from cdxcore.config import Config
import numpy as np
class Model(object):
    def __init__( self, config ):
        # read top level parameters
        self.features = config("features", [], list, "Features for the agent" )
        self.scaling = config("scaling", [], np.asarray, "Scaling for the features", help_default="no scaling")
model = Model( config )
Most of the example is self-explanatory, but note that
numpy.asarray provided as the cast parameter for
scaling means that any values passed by the user will be automatically
converted to numpy.ndarray objects.
The help text parameter allows providing information on what variables
are read from the config. The latter can be displayed using the function
cdxcore.config.Config.usage_report(). (There are a number of further parameters to
cdxcore.config.Config.__call__() to fine-tune this report, such as the help_default
parameter used above.)
In the above example, print( config.usage_report() ) will print:
config['features'] = ['time', 'spot'] # Features for the agent; default: []
config['scaling'] = [ 1. 1000.] # Scaling for the features; default: no scaling
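The lookup rule described in this section (cast the user's value if the key is present, otherwise cast the default) can be sketched in plain Python. This is only an illustration under stated assumptions, not the cdxcore implementation: the helper `read_param` and the sentinel `_MISSING` are hypothetical names, and the real `Config.__call__()` additionally records help texts and usage.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default supplied"

def read_param(data, key, default=_MISSING, cast=None):
    """Return data[key], or the default if key is absent, after applying cast."""
    if key in data:
        value = data[key]
    elif default is _MISSING:
        # mandatory parameter without a default
        raise KeyError(key)
    else:
        value = default
    # both user values and defaults go through the same cast
    return value if cast is None else cast(value)

user_input = {"features": ("time", "spot"), "epochs": "20"}

features = read_param(user_input, "features", [], list)  # found: tuple is cast to list
epochs = read_param(user_input, "epochs", 10, int)       # found: "20" is cast to int
depth = read_param(user_input, "depth", "3", int)        # not found: default "3" is cast to int
```

Casting the default as well as the user's value means the caller always receives the same type, whichever branch was taken.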
Sub-Configs#
You can write and read sub-configurations directly with member notation, without having to explicitly create an entry for the sub-config:
Assume as before:
config = Config()
config['features'] = [ 'time', 'spot' ]
config.scaling = [ 1., 1000. ]
Then create a network sub configuration with member notation on the fly:
config.network.depth = 10
config.network.width = 100
config.network.activation = 'relu'
This is equivalent to:
config.network = Config()
config.network.depth = 10
config.network.width = 100
config.network.activation = 'relu'
Now use naturally as follows:
from cdxcore.config import Config, Int
import numpy as np
class Network(object):
    def __init__( self, config ):
        self.depth = config("depth", 1, Int>0, "Depth of the network")
        self.width = config("width", 1, Int>0, "Width of the network")
        self.activation = config("activation", "selu", str, "Activation function")
        config.done() # see below
class Model(object):
    def __init__( self, config ):
        # read top level parameters
        self.features = config("features", [], list, "Features for the agent" )
        self.scaling = config("scaling", [], np.asarray, "Scaling for the features", help_default="no scaling")
        self.networks = Network( config.network )
        config.done() # see below
model = Model( config )
Imposing Simple Restrictions on Values#
The cast parameter to cdxcore.config.Config.__call__() is a callable; this allows imposing
simple restrictions to any values read from a config.
To this end, import the respective type operators:
from cdxcore.config import Int, Float
Implement a one-sided restriction:
# example enforcing simple conditions
self.width = network('width', 100, Int>3, "Width for the network")
Restrictions on both sides of a scalar:
# example enforcing two-sided conditions
self.percentage = network('percentage', 0.5, ( Float >= 0. ) & ( Float <= 1.), "A percentage")
Enforce the value being a member of a list:
# example ensuring a returned type is from a list
self.ntype = network('ntype', 'fastforward', ['fastforward','recurrent','lstm'], "Type of network")
We can allow a returned value to be one of several casting types by using tuples.
The most common use case is that None is a valid value, too.
For example, assume that the name of the network model should be a string or None.
This is implemented as:
# example allowing either None or a string
self.keras_name = network('name', None, (None, str), "Keras name of the network model")
We can combine conditional expressions with the tuple notation:
# example allowing either None or a positive int
self.batch_size = network('batch_size', None, (None, Int>0), "Batch size or None for TensorFlow's default 32", help_cast="Positive integer, or None")
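To make the operator notation less magical, here is a minimal sketch of how expressions such as `Int>3` or `(Float >= 0.) & (Float <= 1.)` could produce a validating callable by overloading comparison operators. The class `Constraint` and the names `MyInt` and `MyFloat` are hypothetical stand-ins chosen for this illustration; the actual cdxcore classes may be built quite differently.

```python
import operator

class Constraint:
    """Hypothetical callable cast: converts to a type, then checks bounds."""

    def __init__(self, typ, checks=()):
        self.typ = typ
        self.checks = tuple(checks)

    def _add(self, symbol, compare, bound):
        # comparisons return a NEW constraint instead of a bool
        return Constraint(self.typ, self.checks + ((symbol, compare, bound),))

    def __gt__(self, bound): return self._add(">", operator.gt, bound)
    def __ge__(self, bound): return self._add(">=", operator.ge, bound)
    def __lt__(self, bound): return self._add("<", operator.lt, bound)
    def __le__(self, bound): return self._add("<=", operator.le, bound)

    def __and__(self, other):
        # combine the checks of two constraints on the same type
        return Constraint(self.typ, self.checks + other.checks)

    def __call__(self, value):
        x = self.typ(value)
        for symbol, compare, bound in self.checks:
            if not compare(x, bound):
                raise ValueError(f"{x!r} violates {self.typ.__name__}{symbol}{bound}")
        return x

MyInt = Constraint(int)
MyFloat = Constraint(float)

width_cast = MyInt > 3                             # one-sided restriction
percent_cast = (MyFloat >= 0.) & (MyFloat <= 1.)   # two-sided restriction

width = width_cast("100")       # the string is cast to int, then checked
percentage = percent_cast(0.5)
```

Because the result of the comparison is itself a callable, it can be passed anywhere a plain cast such as `int` would be accepted.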
Ensuring that we had no Typos & that all provided Data is meaningful#
A common issue when using dictionary-based code configuration is that we might misspell one of the parameters. Unless this is a mandatory parameter we might not notice that we have not actually changed its value.
To check that all values of a config were read use cdxcore.config.Config.done().
It will alert you if there are keywords or children which have not been read.
Most likely, those will be typos. Consider the following example where width is misspelled in our config:
class Network(object):
    def __init__( self, config ):
        # read top level parameters
        self.depth = config("depth", 1, Int>=1, "Depth of the network")
        self.width = config("width", 3, Int>=1, "Width of the network")
        self.activation = config("activation", "relu", help="Activation function", help_cast="String with the function name, or function")
        config.done() # <-- test that all members of config were read
config = Config()
config.features = ['time', 'spot']
config.network.depth = 10
config.network.activation = 'relu'
config.network.widht = 100 # (intentional typo)
n = Network(config.network)
Since width was misspelled in setting up the config,
a cdxcore.config.NotDoneError exception is raised:
NotDoneError: Error closing Config 'config.network': the following config arguments were not read: widht
Summary of all variables read from this object:
config.network['activation'] = relu # Activation function; default: relu
config.network['depth'] = 10 # Depth of the network; default: 1
config.network['width'] = 3 # Width of the network; default: 3
Note that you can also call cdxcore.config.Config.done() at top level:
class Network(object):
    def __init__( self, config ):
        # read top level parameters
        self.depth = config("depth", 1, Int>=1, "Depth of the network")
        self.width = config("width", 3, Int>=1, "Width of the network")
        self.activation = config("activation", "relu", help="Activation function", help_cast="String with the function name, or function")
config = Config()
config.features = ['time', 'spot']
config.network.depth = 10
config.network.activation = 'relu'
config.network.widht = 100 # (intentional typo)
n = Network(config.network)
test_features = config("features", [], list, "Features for my network")
config.done()
produces:
NotDoneError: Error closing Config 'config.network': the following config arguments were not read: widht
Summary of all variables read from this object:
config.network['activation'] = relu # Activation function; default: relu
config.network['depth'] = 10 # Depth of the network; default: 1
config.network['width'] = 3 # Width of the network; default: 3
#
config['features'] = ['time', 'spot'] # Features for my network; default: []
You can check the status of the use of the config by using the cdxcore.config.Config.not_done property.
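The mechanism behind this typo detection is simply bookkeeping of which keys were read. The following is a rough sketch of that idea in plain Python, assuming a hypothetical class `TrackedParams` and a hypothetical error type `UnreadKeysError`; it is not how cdxcore implements `done()` or `not_done` (for instance, the real `not_done` is documented as a dictionary, while this sketch returns a sorted list).

```python
class UnreadKeysError(RuntimeError):
    """Hypothetical stand-in for an error such as NotDoneError."""

class TrackedParams:
    def __init__(self, **values):
        self._values = dict(values)
        self._read = set()          # names requested so far

    def __call__(self, key, default):
        self._read.add(key)
        return self._values.get(key, default)

    @property
    def not_done(self):
        # keys provided by the user which nobody asked for
        return sorted(set(self._values) - self._read)

    def done(self):
        if self.not_done:
            raise UnreadKeysError("not read: " + ", ".join(self.not_done))

params = TrackedParams(depth=10, widht=100)   # 'widht' is an intentional typo
depth = params("depth", 1)
width = params("width", 3)    # silently falls back to the default

try:
    params.done()
    message = None
except UnreadKeysError as e:
    message = str(e)
```

Without the final `done()` call the program would run with `width == 3` and give no hint that the user's value of 100 was ignored.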
Detaching Child Configs#
You can also detach a child config,
which allows you to store it for later use without triggering cdxcore.config.Config.done() errors:
def read_config( self, config ):
    ...
    self.config_training = config.training.detach()
    config.done()
The function cdxcore.config.Config.detach() will mark the original child, but not the detached
child itself, as ‘done’.
Therefore, we will need to call cdxcore.config.Config.done() for the detached child
when we have finished processing it:
def training(self):
    epochs = self.config_training("epochs", 100, int, "Epochs for training")
    batch_size = self.config_training("batch_size", None, help="Batch size. Use None for default of 32" )
    self.config_training.done()
Various Copy Operations#
When making a copy of a config we need to decide on the semantics of the operation.
A cdxcore.config.Config object contains:
Inputs: the user’s input hierarchy. This is accessible via cdxcore.config.Config.children and cdxcore.config.Config.keys(). All copy operations share (and do not modify) the user’s input. See also cdxcore.config.Config.input_report().
Done Status: to check whether all parameters provided by the user are read by some code, the config keeps track of which parameters were read with cdxcore.config.Config.__call__(). This list is checked when cdxcore.config.Config.done() is called. The elements not yet read can be obtained using cdxcore.config.Config.not_done.
Consistency: a cdxcore.config.Config object makes sure that if a parameter is requested twice with cdxcore.config.Config.__call__() then the respective default and help values are consistent between function calls. This avoids the typical divergence of code where one part of the code assumes a different default value than another. Recorded consistency information is accessible via cdxcore.config.Config.recorder. Note that you can read a parameter “quietly” without recording any usage by using the [] operator.
Accordingly, when making a copy of self we need to determine the relationship of the copy with the
above.
cdxcore.config.Config.detach(): use case is deferring usage of a config to a later point.
    Done status: self is marked as “done”; the copy is used to keep track of usage of the remaining parameters.
    Consistency: both self and the copy share the same consistency recorder.
cdxcore.config.Config.copy(): make an independent copy of the current status of self.
    Done status: the copy has an independent copy of the “done” status of self.
    Consistency: the copy has an independent copy of the consistency recorder of self.
cdxcore.config.Config.clean_copy(): make an independent copy of self, and reset all usage information.
    Done status: the copy has an empty “done” status.
    Consistency: the copy has an empty consistency recorder.
cdxcore.config.Config.shallow_copy(): make a shallow copy which shares all future usage tracking with self. The copy acts as a view on self. This is the semantic of the copy constructor.
    Done status: the copy and self share all “done” status; if a parameter is read with one, it is considered “done” by both.
    Consistency: the copy and self share all consistency handling. If a parameter is read with one with a given default and help, the other must use the same values when accessing the same parameter.
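The difference between shared, independent, and empty usage tracking can be illustrated with a toy tracker. This sketch assumes a hypothetical class `Usage` which only remembers the set of keys read; it models the “done” status of `copy()`, `clean_copy()` and `shallow_copy()` as described above and deliberately leaves out `detach()` and the consistency recorder.

```python
class Usage:
    """Hypothetical tracker which remembers which keys were read."""

    def __init__(self, read=None):
        self.read = set() if read is None else read

    def use(self, key):
        self.read.add(key)

    def shallow_copy(self):
        return Usage(self.read)         # same set object: tracking is shared

    def copy(self):
        return Usage(set(self.read))    # snapshot: tracking diverges from here on

    def clean_copy(self):
        return Usage()                  # empty: prior usage is forgotten

base = Usage()
base.use("a")

shallow = base.shallow_copy()
independent = base.copy()
clean = base.clean_copy()

shallow.use("b")        # also visible through base
independent.use("c")    # visible only through the independent copy
```

After these calls `base.read` contains `a` and `b`, `independent.read` contains `a` and `c`, and `clean.read` is empty.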
Self-Recording All Available Configuration Parameters#
Once your program has run, you can read the summary of all values read, their defaults, and their help texts:
print( config.usage_report( with_cast=True ) )
Prints:
config.network['activation'] = relu # (str) Activation function for the network; default: relu
config.network['depth'] = 10 # (int) Depth for the network; default: 10000
config.network['width'] = 100 # (int>3) Width for the network; default: 100
config.network['percentage'] = 0.5 # (float>=0. and float<=1.) A percentage; default: 0.5
config.network['ntype'] = 'fastforward' # (['fastforward','recurrent','lstm']) Type of network; default: 'fastforward'
config.training['batch_size'] = None # () Batch size. Use None for default of 32; default: None
config.training['epochs'] = 100 # (int) Epochs for training; default: 100
config['features'] = ['time', 'spot'] # (list) Features for the agent; default: []
config['weights'] = [1 2 3] # (asarray) Weights for the agent; default: no initial weights
Unique Hash#
Another common use case is that we wish to cache the result of some complex operation.
Assuming that the config describes all relevant parameters, and is therefore a valid ID for
the data we wish to cache, we can use cdxcore.config.Config.unique_hash()
to obtain a unique hash ID for the given config.
cdxcore.config.Config also implements
the custom hashing protocol __unique_hash__ defined by cdxcore.uniquehash.UniqueHash,
which means that if a Config is used during a hashing function from cdxcore.uniquehash
the config will be hashed correctly.
A fully transparent caching framework which supports code versioning and transparent
hashing of function parameters is implemented with cdxcore.subdir.SubDir.cache().
Consistent **kwargs Handling#
The Config class can be used to improve **kwargs handling.
Assume we have:
def f(**kwargs):
    a = kwargs.get("difficult_name", 10)
    b = kwargs.get("b", 20)
We run the usual risk of a user misspelling a parameter name, which we would never notice. Therefore we may improve upon the above with:
def f(**kwargs):
    kwargs = Config(kwargs)
    a = kwargs("difficult_name", 10)
    b = kwargs("b", 20)
    kwargs.done()
If a user now calls f with, say, f(difficlt_name=5), an error will be raised.
A more advanced pattern is to allow both config and kwargs function parameters. In this case, the user
can either provide a config or specify its parameters directly:
def f( config=None, **kwargs):
    config = Config.config_kwargs(config,kwargs)
    a = config("difficult_name", 10, int)
    b = config("b", 20, int)
    config.done()
Any of the following function calls are now valid:
f( Config(difficult_name=11, b=21) ) # use a Config
f( difficult_name=12, b=22 ) # use a kwargs
f( Config(difficult_name=11, b=21), b=22 ) # use both; kwargs overwrite config values
Dataclasses#
dataclasses rely on default values of any member being “frozen” objects, which most user-defined objects and
cdxcore.config.Config objects are not.
This limitation applies as well to flax modules.
To use non-frozen default values, use the
cdxcore.config.Config.as_field() function:
from cdxcore.config import Config, Int
from dataclasses import dataclass
@dataclass
class Data:
    data : Config = Config().as_field()
    def f(self):
        return self.data("x", 1, Int>0, "A positive integer")
d = Data() # default constructor used.
d.f()
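The underlying restriction comes from the standard library and can be reproduced without cdxcore. The sketch below uses a plain `dict` in place of a `Config`; the names `Broken`, `Data` and `shared` are illustrative only. It shows that a mutable default is rejected by `@dataclass`, and that `dataclasses.field(default_factory=...)` with a factory returning one fixed object (the pattern `as_field()` is documented to use) makes all default-constructed instances share that object.

```python
from dataclasses import dataclass, field

# A mutable default such as a dict is rejected when the class is created.
try:
    @dataclass
    class Broken:
        data: dict = {}
    error = None
except ValueError as e:
    error = e

shared = {"x": 1}   # stands in for a Config instance

@dataclass
class Data:
    # the factory returns the same object on every call
    data: dict = field(default_factory=lambda: shared)

a = Data()
b = Data()
c = Data(data={"x": 3})   # an explicit value bypasses the factory
```

Note the consequence of returning a fixed object from the factory: `a.data` and `b.data` are the same object, so changes made through one instance are visible through the other.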
Import#
from cdxcore.config import Config
Documentation#
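Module Attributes
| no_default | Value indicating no default is available for a given parameter. |
| Int | Allows to apply basic range conditions to integers. |
| Float | Allows to apply basic range conditions to floating point numbers. |
Functions
| config_kwargs | Default implementation for a usage pattern where the user can provide both a cdxcore.config.Config parameter and **kwargs. |
| to_config | Assess whether a parameter is a cdxcore.config.Config, and otherwise try to convert it into one. |
Classes
| Config | A simple Config class for hierarchical dictionary-like configurations but with type checking, detecting misspelled parameters, and simple built-in help. |
Exceptions
| CastError | Raised when cdxcore.config.Config.__call__() could not cast a value provided by the user to the specified type. |
| InconsistencyError | Raised when a key was previously accessed with different default, help, help_default or help_cast values. |
| NotDoneError | Raised when cdxcore.config.Config.done() finds parameters which were not read. |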
- exception cdxcore.config.CastError(key, config_name, exception)[source]#
Bases: RuntimeError
Raised when cdxcore.config.Config.__call__() could not cast a value provided by the user to the specified type.
- Parameters:
- key : str
Key name of the parameter which failed to cast.
- config_name : str
Name of the Config.
- exception : Exception
Original exception raised by the cast.
- config_name#
Hierarchical name of the config.
- key#
Key of the parameter which failed to cast.
- class cdxcore.config.Config(*args, config_name=None, **kwargs)[source]#
Bases: OrderedDict
A simple Config class for hierarchical dictionary-like configurations but with type checking, detecting misspelled parameters, and simple built-in help.
See cdxcore.config for an extensive discussion of features.
- Parameters:
- *args : list
List of Mapping objects to iteratively create a new config with.
If the first element is a Config, and no other parameters are passed, then this object will be a shallow copy of that Config. It then shares all usage recording. See cdxcore.config.Config.shallow_copy().
- config_name : str, optional
Name of the configuration for usage_report(). Default is "config".
- **kwargs : dict
Additional key/value pairs to initialize the config with, e.g. Config(a=1, b=2).
- __call__(key, default=<cdxcore.config._ID object>, cast=None, help=None, help_default=None, help_cast=None, mark_done=True, record=True)[source]#
Reads a parameter key from the config subject to casting with cast. If not found, returns default.
Examples:
config("key") # returns the value for 'key' or if not found raises an exception
config("key", 1) # returns the value for 'key' or if not found returns 1
config("key", 1, int) # if 'key' is not found, return 1. If it is found cast the result with int().
config("key", 1, int, "A number") # also stores an optional help text.
# Call usage_report() after the config has been read to get a full
# summary of all data requested from this config.
Use cdxcore.config.Int and cdxcore.config.Float to ensure a number is within a given range:
config("positive_int", 1, Int>=1, "A positive integer")
config("ranged_int", 1, (Int>=0)&(Int<=10), "An integer between 0 and 10, inclusive")
config("positive_float", 1, Float>0., "A positive float")
Choices are implemented with lists:
config("difficulty", 'easy', ['easy','medium','hard'], "Choose one")
Alternative types are implemented with tuples:
config("difficulty", None, (None, ['easy','medium','hard']), "None or a level of difficulty")
config("level", None, (None, Int>=0), "None or a non-negative level")
- Parameters:
- key : string
Keyword to read.
- default : optional
Default value. Set to cdxcore.config.Config.no_default for mandatory parameters without default. If 'key' then cannot be found, a KeyError is raised.
- cast : Callable, optional
If None, any value provided by the user will be acceptable.
If not None, the function will attempt to cast the value provided by the user with cast(). For example, if cast = int, then the function will apply int(x) to the user’s input x.
This function also allows passing the following complex arguments:
A list, in which case it is assumed that the key must be from this list. The type of the first element of the list will be used to cast() values to the target type.
cdxcore.config.Int and cdxcore.config.Float allow defining constrained integers and floating point numbers, respectively.
A tuple of types, in which case any of the types is acceptable. A None here means that the value None is acceptable (it does not mean that any value is acceptable).
Any callable to validate a parameter.
- help : str, optional
If provided, adds a help text when self-documentation is used.
- help_default : str, optional
If provided, specifies the default value in plain text. If not provided, help_default is equal to the string representation of the default value, if any. Use this for complex default values which are hard to read.
- help_cast : str, optional
If provided, specifies a description of the cast type. If not provided, help_cast is set to the string representation of cast, or None if cast is None. Complex casts are supported. Use this for cast types which are hard to read.
- mark_done : bool, optional
If True, marks the respective element as read once the function has returned successfully.
- record : bool, optional
If True, records consistency usage of the key and validates that previous usage of the key is consistent with the current usage, e.g. that the default values are consistent and that if help was provided it is the same.
- Returns:
- Parameter value.
- Raises:
KeyError: If key could not be found.
ValueError: For input errors.
cdxcore.config.InconsistencyError: If key was previously accessed with different default, help, help_default or help_cast values. For all the help texts empty strings are not compared, i.e. __call__("x", default=1) will succeed even if a previous call was __call__("x", default=1, help="value for x").
Note that cast is not validated.
cdxcore.config.CastError: If an error occurs casting a provided value.
- as_dict(mark_done=True)[source]#
Convert self into a dictionary of dictionaries.
- Parameters:
- mark_done : bool
If True, then all members of this config will be considered “done” upon return of this function.
- Returns:
- Dict : dict
Dictionary of dictionaries.
- as_field()[source]#
This function provides support for dataclasses.dataclass fields with Config default values.
When adding a field with a non-frozen default value to a @dataclass class, a default_factory has to be provided. The function as_field returns the corresponding dataclasses.Field element by simply returning:
def factory():
    return self
return dataclasses.field( default_factory=factory )
Usage is as follows:
from dataclasses import dataclass
@dataclass
class A:
    data : Config = Config(x=2).as_field()
a = A()
print(a.data['x']) # -> "2"
a = A(data=Config(x=3))
print(a.data['x']) # -> "3"
- property children: OrderedDict#
Dictionary of the child configs of self.
- clean_copy()[source]#
Make a copy of self, and reset it to the original input state from the user.
As an example, the following allows using different default values for config members of the same name:
base = Config()
_ = base('a', 1) # read 'a' with default 1
copy = base.clean_copy() # copy has no record of 'a' having been used
# 'b' was not used yet
_ = base('b', 111) # read 'b' with default 111
_ = copy('b', 222) # read 'b' with default 222 -> ok
_ = copy('a', 2) # use 'a' with default 2 -> ok
Use cdxcore.config.Config.copy() for making a copy which tracks prior usage information.
See also the summary on various copy operations in cdxcore.config.
- static config_kwargs(config, kwargs, config_name='kwargs')[source]#
Default implementation for a usage pattern where the user can provide both a cdxcore.config.Config parameter and **kwargs.
Example:
def f(config, **kwargs):
    config = Config.config_kwargs( config, kwargs )
    ...
    x = config("x", 1, ...)
    config.done() # <-- important to do this here. Remember that config_kwargs() calls 'detach'
and then one can use either of the following:
f(Config(x=1))
f(x=1)
Important: config_kwargs calls cdxcore.config.Config.detach() to obtain a copy of config. This means cdxcore.config.Config.done() must be called explicitly for the returned object even if done() will be called elsewhere for the source config.
- Parameters:
- config : Config
A Config object or None.
- kwargs : Mapping
If config is provided, the function will call cdxcore.config.Config.update() with kwargs.
- config_name : str
A declarative name for the config if config is not provided.
- Returns:
- config : Config
A new config object. Please note that if config was provided, then this is a copy obtained from calling cdxcore.config.Config.detach(), which means that cdxcore.config.Config.done() must be called explicitly for this object to ensure no parameters were misspelled (it is not sufficient if cdxcore.config.Config.done() is called for config).
- copy()[source]#
Return a fully independent copy of self.
The copy has an independent copy of the “done” status of self.
The copy has an independent usage consistency status.
self will remain untouched. In particular, in contrast to cdxcore.config.Config.detach() it will not be set to “done”.
As an example, the following shows that a copy retains the usage information recorded before the copy was made:
base = Config()
_ = base('a', 1) # read 'a' with default 1
copy = base.copy() # copy will know 'a' as used with default 1
# 'b' was not used yet
_ = base('b', 111) # read 'b' with default 111
_ = copy('b', 222) # read 'b' with default 222 -> ok
_ = copy('a', 2) # use 'a' with default 2 -> will fail
Use cdxcore.config.Config.clean_copy() for making a copy which discards any prior usage information.
See also the summary on various copy operations in cdxcore.config.
- delete_children(names)[source]#
Delete one or several children from self.
This function does not delete recorded consistency information (defaults and help recorded from prior uses of cdxcore.config.Config.__call__()).
- detach()[source]#
Returns a copy of self, and sets self to “done”.
The purpose of this function is to defer using a config (often a sub-config) to a later point, while maintaining consistency of usage.
The copy has the same “done” status as self at the time of calling detach().
The copy shares usage consistency checks with self, i.e. if the same parameter is read with different default or help values an error is raised.
The function flags self as “done” using cdxcore.config.Config.mark_done().
For example:
class Example(object):
    def __init__( self, config ):
        self.a = config('a', 1, Int>=0, "'a' value")
        self.later = config.later.detach() # detach sub-config
        self._cache = None
        config.done()
    def function(self):
        if self._cache is None:
            self._cache = Cache(self.later) # deferred use of the self.later config. Cache() calls done() on self.later
        return self._cache
See also the summary on various copy operations in cdxcore.config.
- Returns:
- copy : Config
A copy of self.
- done(include_children=True, mark_done=True)[source]#
Closes the config and checks that no unread parameters remain. This is used to detect typos in configuration files.
Raises a cdxcore.config.NotDoneError if there are unused parameters in self.
Consider this example:
config = Config()
config.a = 1
config.child.b = 2
_ = config.a # read a
child = config.child
config.done() # error because config.child.b has not been read yet
print( child.b )
This example raises an error because config.child.b was not read. If you wish to process the sub-config config.child later, use cdxcore.config.Config.detach():
config = Config()
config.a = 1
config.child.b = 2
_ = config.a # read a
child = config.child.detach()
config.done() # no error, even though config.child.b has not been read yet
print( child.b )
child.done() # need to call done() for the child
By default this function also validates that all child configs were “done”.
See Also
cdxcore.config.Config.mark_done() marks all parameters as “done” (used).
cdxcore.config.Config.reset_done() marks all parameters as “not done”.
cdxcore.config.Config.clean_copy() makes a copy of self without any usage information.
Introduction to the various copy operations in cdxcore.config.
- Parameters:
- include_children : bool
Validate child configs, too. Strongly recommended default.
- mark_done : bool
Upon completion mark this config as ‘done’. This stops it being modified; that also means subsequent calls to done() will be successful.
- Raises:
cdxcore.config.NotDoneError: If not all elements were read.
- get(*kargs, **kwargs)[source]#
Returns
cdxcore.config.Config.__call__()(*kargs, **kwargs).
- get_default(*kargs, **kwargs)[source]#
Returns
cdxcore.config.Config.__call__()(*kargs, **kwargs).
- get_raw(key, default=<cdxcore.config._ID object>)[source]#
Reads the raw value for key without any casting, without marking the element as read, and without recording access to the element.
Equivalent to using cdxcore.config.Config.__call__()(key, default, mark_done=False, record=False ) which, without default, is in turn equivalent to self[key].
- get_recorded(key)[source]#
Returns the cast value previously returned for key.
If the parameter key was provided as part of the input data, this value is returned, subject to casting.
If key was not part of the input data, and a default was provided when the parameter was read with cdxcore.config.Config.__call__(), then return this default value, subject to casting.
- Raises:
KeyError: If the key was not previously read successfully.
- input_dict(ignore_underscore=True)[source]#
Returns a cdxcore.pretty.PrettyObject of all inputs into this config.
- input_report(max_value_len=100)[source]#
Returns a report of all inputs in a readable format. Assumes that str() converts all values into some readable format.
- Parameters:
- max_value_len : int
Limits the length of str() for each value to max_value_len characters. Set to None to not limit the length.
- Returns:
- Report : str
- property is_empty: bool#
Whether any parameters have been set, at parent level or at any child level.
- keys()[source]#
Returns the keys for the immediate parameters of this config. This call will not return the names of child configs; use cdxcore.config.Config.children for those.
Use cdxcore.config.Config.input_dict() to obtain the full hierarchy of input parameters.
- no_default = <cdxcore.config._ID object>#
- property not_done: dict#
Returns a dictionary of keys which were not read yet.
- Returns:
- not_done : dict
Dictionary of dictionaries: for value parameters, the respective entry is their key and False; for children the key is followed by their not_done dictionary.
- record_key(key)[source]#
Returns the fully qualified string key for key.
It has the form config1.config['entry'].
- property recorder: SortedDict#
Returns the “recorder”, a sortedcontainers.SortedDict which contains key, default, cast, help, and all other function parameters for all calls of cdxcore.config.Config.__call__(). It is used to ensure consistency of parameter calls.
Use for debugging only.
- reset()[source]#
Reset all usage information.
Use cdxcore.config.Config.reset_done() to only reset the information whether a key was used, but to keep consistency information on previously used default and/or help values.
- reset_done()[source]#
Reset the internal list of which parameters are “done” (used).
Typically “done” means that a parameter has been read using cdxcore.config.Config.__call__().
This function does not reset the consistency recording of previous uses of each key. This ensures consistency of default values between uses of keys. Use cdxcore.config.Config.reset() to reset all “done” status and all usage records.
See also the summary on various copy operations in cdxcore.config.
- shallow_copy()[source]#
Return a shallow copy of self which shares all usage tracking with self going forward.
The copy shares the “done” status of self.
The copy shares all consistency usage status of self.
self will not be flagged as ‘done’.
- static to_config(kwargs, config_name='kwargs')[source]#
Assess whether a parameter is a cdxcore.config.Config, and otherwise try to convert it into one. The classic use case is to transform **kwargs to a cdxcore.config.Config to allow type checking and prevent spelling errors.
- Returns:
- config : Config
If kwargs is already a cdxcore.config.Config it is returned. Otherwise, create a new cdxcore.config.Config from kwargs named using config_name.
- unique_hash(*, unique_hash=None, debug_trace=None, input_only=True, **unique_hash_parameters)[source]#
Returns a unique hash key for this object - based on its provided inputs and not based on its usage.
This function allows both provision of an existing
unique_hashfunction or to specify one on the fly usingunique_hash_parameters. That means instead of:from cdxcore.uniquehash import UniqueHash self.unique_hash( unique_hash=UniqueHash(**p) )
we can directly call:
self.unique_hash( **p )
The purpose of this function is to allow indexing results of heavy computations which were configured with
Configwith a simple hash key. A typical application is caching of results based on the relevant user-configuration.An example for a simplistic cache:
from cdxcore.config import Config import tempfile as tempfile import pickle as pickle def big_function( cache_dir : str, config : Config = None, **kwargs ): assert not cache_dir[-1] in ["/","\\"], cache_dir config = Config.config_kwargs( config, kwargs ) uid = config.unique_hash(length=8) cfile = f"{cache_dir}/{uid}.pck" # attempt to read cache try: with open(cfile, "rb") as f: return pickle.load(f) except FileNotFoundError: pass # do something big... result = config("a", 0, int, "Value 'a'") * 1000 # write cache with open(cfile, "wb") as f: pickle.dump(result,f) return result cache_dir = tempfile.mkdtemp() # for real applications, use a permanent cache_dir. _ = big_function( cache_dir = cache_dir, a=1 ) print(_)
A more sophisticated framework which includes code versioning via cdxcore.version.version() is implemented with cdxcore.subdir.SubDir.cache().
Unique Hash Default Semantics
Please consult the documentation for cdxcore.uniquehash.UniqueHash before using this functionality. In particular, note that by default this function ignores config keys or children with leading underscores; set parse_underscore to "protected" or "private" to change this behaviour.
Why is “Usage” not Considered when Computing the Hash (by Default)
When using Config to configure our environment, we have not only the user’s input values but also the realized values in the form of defaults for those values the user has not provided. In most cases, these are the majority of values.
By only considering actual input values when computing a hash, we stipulate that defaults are not part of the current unique characteristic of the environment.
That seems inconsistent: consider a program which reads a parameter activation with default relu. The hash key will be different for the case where the user does not provide a value for activation, and the case where its value is set to relu by the user. The effective activation value in both cases is relu, so why would we not want this to be identified as the same environment configuration?
The following illustrates this dilemma:
def big_function( config ):
    _ = config("activation", "relu", str, "Activation function")
    config.done()
config = Config()
big_function( config )
print( config.unique_hash(length=8) ) # -> 36e9d246
config = Config(activation="relu")
big_function( config )
print( config.unique_hash(length=8) ) # -> d715e29c
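The same effect can be reproduced without the library. The following is a minimal sketch using only hashlib and json; the helper short_hash and the dictionaries are hypothetical stand-ins and do not reflect how cdxcore.uniquehash computes its hashes. It shows that hashing inputs only separates the two runs, while hashing realized values (defaults overlaid with inputs) identifies them.

```python
import hashlib
import json

def short_hash(values: dict, length: int = 8) -> str:
    """Hypothetical helper: hash a flat dict of JSON-serialisable values."""
    payload = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:length]

defaults = {"activation": "relu"}    # what the program falls back to

inputs_a = {}                        # user provided nothing
inputs_b = {"activation": "relu"}    # user spelled out the default

# Hashing inputs only: the two runs are treated as different.
same_by_inputs = short_hash(inputs_a) == short_hash(inputs_b)

# Hashing realized values: the two runs are treated as identical.
realized_a = {**defaults, **inputs_a}
realized_b = {**defaults, **inputs_b}
same_by_usage = short_hash(realized_a) == short_hash(realized_b)

print(same_by_inputs, same_by_usage)
```

The second variant only works if every default is known at hashing time, which is the robustness concern for hashing realized values.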
Robustness
The key driver of using only input values for hashing is the prevalence of reading (child) configs close to the use of their parameters. That means that config parameters are often only read (and therefore their usage registered) if the respective computation is actually executed. Even the big_function example above shows this issue: the call config("a", 0, int, "Value 'a'") will only be executed if the cache could not be found.
This can be rectified if it is ensured that all config parameters are read regardless of which code is actually executed. In this case, set the parameter input_only for unique_hash() to False. Note that when using cdxcore.config.Config.detach() you must make sure to have processed all detached configurations before calling unique_hash().
- Parameters:
- unique_hash_parameters : dict
If unique_hash is None these parameters are passed to cdxcore.uniquehash.UniqueHash.__call__() to obtain the corresponding hashing function.
- unique_hash : Callable
A function to return unique hashes, usually generated using cdxcore.uniquehash.UniqueHash.
- debug_trace : cdxcore.uniquehash.DebugTrace
Allows tracing of hashing activity for debugging purposes. Two implementations of DebugTrace are currently available:
cdxcore.uniquehash.DebugTraceVerbose simply prints out hashing activity to stdout.
cdxcore.uniquehash.DebugTraceCollect collects an array of tracing information. The object itself is an iterable which contains the respective tracing information once the hash function has returned.
- input_only : bool
Expert use only.
If True (the default) only user-provided inputs are used to compute the unique hash. If False, then the result of cdxcore.config.Config.usage_value_dict() is used to generate the hash. Make sure you read and understand the discussion above on the topic.
- Returns:
- Unique hash, str
A unique hash of at most the length specified via either unique_hash or unique_hash_parameters.
- update(other=None, **kwargs)#
Overwrite values of self with new values. Accepts the following call formats:
update( dictionary )
update( config )
update( a=1, b=2 )
update( {'x.a':1 } ) # hierarchical assignment self.x.a = 1
- Parameters:
- other : dict, Config
Copy all content of other into self.
If other is a config: elements will be clean_copy()ed; other will not be marked as “read”.
If other is a dictionary, then ‘.’ notation can be used for hierarchical assignments.
- **kwargs
Allows assigning specific values.
- Returns:
- self : Config
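To make the ‘.’ notation concrete, here is a minimal sketch of hierarchical assignment on plain nested dictionaries. The function assign_dotted is a hypothetical illustration of the idea, not the library's implementation of update().

```python
def assign_dotted(target: dict, dotted_key: str, value) -> dict:
    """Hypothetical helper: set target['x']['a'] from the key 'x.a'."""
    *parents, leaf = dotted_key.split(".")
    node = target
    for name in parents:
        # create intermediate levels on the fly, as a sub-config would be
        node = node.setdefault(name, {})
    node[leaf] = value
    return target

settings = {}
assign_dotted(settings, "x.a", 1)
assign_dotted(settings, "x.b", 2)
assign_dotted(settings, "depth", 3)
print(settings)  # {'x': {'a': 1, 'b': 2}, 'depth': 3}
```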
- usage_report(with_values=True, with_help=True, with_defaults=True, with_cast=False, filter_path=None)#
Generate a human readable report of all variables read from this config.
- Parameters:
- with_values : bool, optional
Whether to also print values. This can be hard to read if values are complex objects.
- with_help : bool, optional
Whether to print help.
- with_defaults : bool, optional
Whether to print default values.
- with_cast : bool, optional
Whether to print types.
- filter_path : str, optional
If provided, the beginning of the fully qualified path of each child is matched against this string. Most useful with filter_path = self.config_name which ensures only children of this (child) config are shown.
- Returns:
- Report : str
- usage_reproducer()#
Returns a string representation of current usage, calling repr() for each value.
- usage_value_dict()#
Return a flat sorted dictionary of both “used” and, where not used, “input” values.
A “used” value has either been read from user input or was provided as a default. In both cases, it will have been subject to casting.
This function will raise a RuntimeError in either of the following two cases:
A key was marked as “done” (read), but no “value” was recorded at that time. A simple example is when cdxcore.config.Config.detach() was called to create a child config, but that config has not yet been read.
A key has not been read yet, but there is a record of a value being returned. An example of this happening is if cdxcore.config.Config.reset_done() is called.
- cdxcore.config.Float = <cdxcore.config._CastCond object>#
Allows applying basic range conditions to float parameters.
For example:
timeout = config("timeout", 0.5, Float>=0., "Timeout")
In combination with & we can limit a float to a range:
probability = config("probability", 0.5, (Float>=0.) & (Float <= 1.), "Probability")
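The comparison syntax above relies on operator overloading. The sketch below shows one way such an object could be built; the class RangeCond and the name FloatCond are hypothetical stand-ins, and nothing here is taken from the actual _CastCond implementation.

```python
class RangeCond:
    """Hypothetical stand-in: cast a value, then check accumulated conditions."""
    def __init__(self, cast, checks=()):
        self.cast = cast
        self.checks = tuple(checks)

    def _with(self, label, test):
        return RangeCond(self.cast, self.checks + ((label, test),))

    # each comparison returns a new condition object instead of a bool
    def __ge__(self, bound): return self._with(f">= {bound}", lambda x: x >= bound)
    def __gt__(self, bound): return self._with(f"> {bound}", lambda x: x > bound)
    def __le__(self, bound): return self._with(f"<= {bound}", lambda x: x <= bound)
    def __lt__(self, bound): return self._with(f"< {bound}", lambda x: x < bound)

    def __and__(self, other):
        # '&' concatenates the checks of both sides
        return RangeCond(self.cast, self.checks + other.checks)

    def __call__(self, value):
        x = self.cast(value)
        for label, test in self.checks:
            if not test(x):
                raise ValueError(f"{x!r} violates condition '{label}'")
        return x

FloatCond = RangeCond(float)
unit_interval = (FloatCond >= 0.) & (FloatCond <= 1.)
print(unit_interval("0.25"))  # cast from str, then range-checked
```

The parentheses around each comparison are required because & binds more tightly than >= and <= in Python.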
- exception cdxcore.config.InconsistencyError(key, config_name, message)#
Bases: RuntimeError
Raised when cdxcore.config.Config.__call__() is used inconsistently between function calls for a given parameter.
The Config semantics require that parameters are accessed with consistent default and help values between cdxcore.config.Config.__call__() calls.
For raw access to any parameters, use [].
- config_name#
Hierarchical name of the config.
- key#
The offending parameter key.
- cdxcore.config.Int = <cdxcore.config._CastCond object>#
Allows applying basic range conditions to int parameters.
For example:
num_steps = config("num_steps", 1, Int>0, "Number of steps")
In combination with & we can limit an int to a range:
bus_days_per_year = config("bus_days_per_year", 255, (Int > 0) & (Int < 365), "Business days per year")
- exception cdxcore.config.NotDoneError(not_done, config_name, message)#
Bases: RuntimeError
Raised when cdxcore.config.Config.done() finds that some config parameters have not been read.
The set of those arguments is accessible via cdxcore.config.NotDoneError.not_done.
- config_name#
Hierarchical name of the config.
- not_done#
The parameter keys which were not read when cdxcore.config.Config.done() was called.
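The idea behind this check can be sketched in a few lines: record which keys were read, and compare against the keys that were provided. The class TrackedParams below is a hypothetical stand-in written for illustration; it raises a plain RuntimeError and is not the library's Config.

```python
class TrackedParams:
    """Hypothetical stand-in: remembers which provided keys were read."""
    def __init__(self, **inputs):
        self._inputs = dict(inputs)
        self._read = set()

    def __call__(self, key, default):
        self._read.add(key)
        return self._inputs.get(key, default)

    def done(self):
        not_done = sorted(set(self._inputs) - self._read)
        if not_done:
            raise RuntimeError(f"Parameters provided but never read: {not_done}")

# 'num_epochz' is a deliberate misspelling of 'num_epochs'
params = TrackedParams(num_batches=1000, num_epochz=5)
num_batches = params("num_batches", 10)
num_epochs = params("num_epochs", 3)   # silently falls back to the default

try:
    params.done()
    message = ""
except RuntimeError as e:
    message = str(e)
print(message)
```

Without the final check, the misspelled key would go unnoticed and the program would run with the default value.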
- cdxcore.config.config_kwargs(config, kwargs, config_name='kwargs')#
Default implementation for a usage pattern where the user can provide both a cdxcore.config.Config parameter and **kwargs.
Example:
def f(config=None, **kwargs):
    config = Config.config_kwargs( config, kwargs )
    ...
    x = config("x", 1, ...)
    config.done() # <-- important to do this here. Remember that config_kwargs() calls 'detach'
and then one can use either of the following:
f(Config(x=1))
f(x=1)
Important:
config_kwargs calls cdxcore.config.Config.detach() to obtain a copy of config. This means cdxcore.config.Config.done() must be called explicitly for the returned object even if done() will be called elsewhere for the source config.
- Parameters:
- config : Config
A Config object or None.
- kwargs : Mapping
If config is provided, the function will call cdxcore.config.Config.update() with kwargs.
- config_name : str
A declarative name for the config if config is not provided.
- Returns:
- config : Config
A new config object. Please note that if config was provided, then this is a copy obtained from calling cdxcore.config.Config.detach(), which means that cdxcore.config.Config.done() must be called explicitly for this object to ensure no parameters were misspelled (it is not sufficient if cdxcore.config.Config.done() is called for config).
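As a rough analogy for the copy-then-update behaviour described above, the sketch below uses plain dictionaries. The function merged_copy is hypothetical; the real function works on Config objects via detach() and update(), which carry usage tracking that a dict does not have.

```python
def merged_copy(config, kwargs):
    """Hypothetical dict analogy: copy the caller's settings, then apply kwargs."""
    merged = dict(config) if config is not None else {}
    merged.update(kwargs)
    return merged

base = {"x": 1}
local = merged_copy(base, {"y": 2})
print(local)   # {'x': 1, 'y': 2}
print(base)    # {'x': 1}
```

Because the function works on a copy, any bookkeeping done on the copy is invisible to the source object, which is why done() has to be called on the returned config itself.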
- cdxcore.config.no_default = <cdxcore.config._ID object>#
Value indicating no default is available for a given parameter.
- cdxcore.config.to_config(kwargs, config_name='kwargs')#
Assesses whether a parameter is a cdxcore.config.Config, and otherwise tries to convert it into one. The classic use case is to transform **kwargs into a cdxcore.config.Config to allow type checking and prevent spelling errors.
- Returns:
- config : Config
If kwargs is already a cdxcore.config.Config it is returned. Otherwise, a new cdxcore.config.Config is created from kwargs and named using config_name.