cdxcore documentation

This module contains a number of lightweight tools developed for managing data analytics and machine learning projects.

Install using:

pip install -U cdxcore

Documentation can be found here: https://quantitative-research.de/docs/cdxcore.

Main Functionality

  • cdxcore.dynaplot is a framework for simple dynamic graphs with matplotlib. It supports animated updates of graphs (e.g. during training runs) and generates plot layouts without knowing the number of plots upfront (e.g. when plotting a list of features).

    [Image: _images/dynaplot3D.gif]
  • cdxcore.config allows robust management of configurations. It automates help text and validation checks, and detects misspelled configuration arguments.

    from cdxcore.config import Config, Int, Float
    class Network(object):
        def __init__( self, config ):
            self.depth      = config("depth", 1, Int>0, "Depth of the network")
            self.width      = config("width", 1, Int>0, "Width of the network")
            self.activation = config("activation", "selu", str, "Activation function")
            config.done() # flags any supplied arguments that were never read, e.g. misspelled ones
    
    config = Config()
    config.network.depth         = 10
    config.network.width         = 100
    config.network.activation    = 'relu'
    
    network = Network(config.network)
    config.done()
    
  • cdxcore.subdir wraps various file and directory functions into convenient objects. Useful if files have common extensions.

    from cdxcore.subdir import SubDir
    import numpy as np
    root   = SubDir("!")   # current temp directory
    subdir = root("test")  # sub-directory 'test'
    subdir.write("data", np.zeros((10,2)))
    data   = subdir.read("data")
    

    Caching: SubDir supports code-versioned file I/O, which cdxcore.subdir.SubDir.cache() uses to provide an efficient code-versioned caching protocol for functions and objects:

    from cdxcore.subdir import SubDir
    cache   = SubDir("!/.cache;*.bin")
    
    @cache.cache("0.1")
    def f(x,y):
        return x*y
    
    _ = f(1,2)    # function gets computed and the result cached
    _ = f(1,2)    # restore result from cache
    _ = f(2,2)    # different parameters: compute and store result
    
  • Code versioning is implemented in cdxcore.version.

    from cdxcore.version import version
    
    @version("0.0.1")
    def f(x):
        return x
    
    print( f.version.full )   # -> 0.0.1
    
  • Hashing (which is used for caching above) is implemented in cdxcore.uniquehash:

    class A(object):
        def __init__(self, x):
            self.x = x
            self._y = x*2  # protected member will not be hashed by default
    
    from cdxcore.uniquehash import UniqueHash
    uniqueHash = UniqueHash(length=12)
    a = A(2)
    print( uniqueHash(a) ) # --> "2d1dc3767730"
    
  • cdxcore.pretty provides a PrettyObject class whose objects operate like dictionaries. It is intended for users who prefer attribute (dot) notation over item access when building structured output.

    from cdxcore.pretty import PrettyObject
    pdct = PrettyObject(z=1)
    
    pdct.num_samples = 1000
    pdct.num_batches = 100
    pdct.method = "signature"
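
The misspelling detection described for cdxcore.config can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class SketchConfig, the validator positive_int, and the deliberately misspelled key widht are hypothetical names used only for illustration. The assumption is that the configuration object records which keys were read, so that done() can report any supplied key that was never requested.

```python
class SketchConfig:
    """Hypothetical stand-in for illustration; not cdxcore.config.Config."""

    def __init__(self, **values):
        self._values = dict(values)   # values supplied by the user
        self._read = set()            # keys the consuming code has asked for

    def __call__(self, key, default, cast, help_text=""):
        """Read `key`, fall back to `default`, and validate via `cast`."""
        self._read.add(key)
        value = self._values.get(key, default)
        try:
            return cast(value)
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"invalid value {value!r} for '{key}' ({help_text})"
            ) from exc

    def done(self):
        """Raise if a supplied key was never read, e.g. because it is misspelled."""
        unread = sorted(set(self._values) - self._read)
        if unread:
            raise ValueError(f"unrecognized configuration keys: {unread}")


def positive_int(value):
    """Hypothetical validator: convert to int and require value > 0."""
    value = int(value)
    if value <= 0:
        raise ValueError("must be positive")
    return value


config = SketchConfig(depth=10, widht=100)   # 'widht' is a deliberate typo
depth = config("depth", 1, positive_int, "Depth of the network")
width = config("width", 1, positive_int, "Width of the network")  # uses the default

try:
    config.done()
    message = None
except ValueError as exc:
    message = str(exc)   # names the unread key 'widht'
```

Because the consuming code asked for width while the user supplied widht, the default is silently used until done() is called, at which point the unread key is reported.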
    

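The idea behind code-versioned caching, keyed by a hash of the call arguments, can be sketched with the standard library alone. This is an illustration under stated assumptions, not how cdxcore.subdir or cdxcore.uniquehash work internally: the decorator versioned_cache, the helper make_f, the file naming scheme, and the use of pickle and SHA-256 are all hypothetical choices.

```python
import hashlib
import pickle
import tempfile
from functools import wraps
from pathlib import Path


def versioned_cache(directory, version):
    """Hypothetical decorator: cache results on disk, tagged with a code version."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a short file name from a hash of the call arguments.
            key_bytes = pickle.dumps((args, sorted(kwargs.items())))
            key = hashlib.sha256(key_bytes).hexdigest()[:16]
            path = directory / f"{func.__name__}_{key}.bin"

            if path.exists():
                with path.open("rb") as fh:
                    stored_version, value = pickle.load(fh)
                if stored_version == version:
                    return value          # cache hit for this code version

            # Cache miss or stale version: compute and (over)write the file.
            value = func(*args, **kwargs)
            with path.open("wb") as fh:
                pickle.dump((version, value), fh)
            return value

        return wrapper

    return decorator


calls = []                       # records every real computation
cache_dir = tempfile.mkdtemp()   # throwaway directory for the demonstration


def make_f(version):
    """Build the same function under a given code version."""
    @versioned_cache(cache_dir, version)
    def f(x, y):
        calls.append((x, y))
        return x * y
    return f


f_v1 = make_f("0.1")
f_v1(1, 2)   # computed and stored
f_v1(1, 2)   # restored from the cache
f_v1(2, 2)   # different arguments: computed and stored

f_v2 = make_f("0.2")
f_v2(1, 2)   # same arguments, new version: recomputed
```

Storing the version next to the value means that changing the version string is enough to invalidate earlier results, without deleting any files by hand.
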
General purpose utilities

  • cdxcore.verbose provides user-controllable, context-aware output for reporting progress to users.

  • cdxcore.util offers a number of utility functions, such as standard formatting for dates, big numbers, lists, and dictionaries.

  • cdxcore.npio provides a low-level binary I/O interface for numpy files.

  • cdxcore.npshm provides shared memory numpy arrays.
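
As background for shared memory numpy arrays, the following sketch shows the general technique using Python's standard multiprocessing.shared_memory module together with numpy. It does not use or describe the cdxcore.npshm API: the helpers create_shared_array and attach_shared_array are hypothetical, and for brevity both views are opened in the same process rather than in separate ones.

```python
from multiprocessing import shared_memory

import numpy as np


def create_shared_array(shape, dtype=np.float64):
    """Hypothetical helper: allocate a shared block and view it as an array."""
    nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    array = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    array[...] = 0                      # start from a defined state
    return shm, array


def attach_shared_array(name, shape, dtype=np.float64):
    """Hypothetical helper: open an existing block by name, without copying."""
    shm = shared_memory.SharedMemory(name=name)
    array = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    return shm, array


owner_shm, owner = create_shared_array((3, 2))
peer_shm, peer = attach_shared_array(owner_shm.name, (3, 2))

owner[1, 0] = 42.0                      # write through one view ...
seen_by_peer = float(peer[1, 0])        # ... and read it through the other

# Drop the array views before closing, then free the block exactly once.
del owner, peer
peer_shm.close()
owner_shm.close()
owner_shm.unlink()
```

In a real multi-process setting, only the block's name, shape, and dtype need to be passed to the other process; the array data itself is never copied.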

Contents