Reslib Config (reslib.config)

This module reads project configuration from a file, and provides some defaults based on the ‘best practices’ of Cookiecutter Data Science.

The Config object contains all the configuration values for a given project. The values are accessible in Python both as dictionary items and as attributes, so the following short example works:

from reslib.config import Config

config = Config()
assert config['DATA_DIR'] == config.DATA_DIR
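
One common way to get this kind of dual access is a dict subclass that falls back to key lookup when attribute lookup fails. The sketch below is only an illustration: AttrDict is a hypothetical name, not part of reslib, and nothing here is claimed about how Config itself is implemented.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so dict methods such as .keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


settings = AttrDict(DATA_DIR="/tmp/data")
assert settings["DATA_DIR"] == settings.DATA_DIR
```

Raising AttributeError for unknown keys keeps helpers such as hasattr and getattr with a default behaving as expected.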

NOTE: The config object is a Singleton, and is only ever initialized once. Any later re-initialization must be triggered manually, but will propagate across all instances of the config object. See below for a solution if you do not want this behavior.

Finding the config file:

The config file is located by starting in the current working directory and looking for file_name in each directory up the tree until the root is reached. If no file_name is found, the Config object falls back to some sane default values. These defaults can be seen in the code.

The file name of the config file defaults to reslib.config.[py|json].
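
The upward search described above can be sketched with pathlib. This is a minimal illustration, not the reslib implementation: find_config_upwards and its start parameter are hypothetical names, and the sketch looks for one exact file name without trying the .json and .py extensions.

```python
from pathlib import Path


def find_config_upwards(file_name, start=None):
    """Return the first match for file_name in start or any parent, else None."""
    start_dir = Path(start) if start is not None else Path.cwd()
    start_dir = start_dir.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start_dir, *start_dir.parents):
        candidate = directory / file_name
        if candidate.is_file():
            return candidate
    return None
```

Calling find_config_upwards('reslib.config.json') from a notebook directory would return the nearest matching file at or above that directory, or None when there is no match.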

Loading the config file:

The config can be loaded in two ways (three if you count accepting defaults):

  1. JSON

    The config file can be written as a JSON file, whereby all CAPITAL keys will be imported into the Config object.

  2. Python

    The config file can be written as a Python file, which will be eval-ed and then all CAPITAL keys will be imported into the Config object.
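
As a rough illustration of the "CAPITAL keys only" rule for the JSON case, the sketch below filters parsed JSON with str.isupper(). The function name is hypothetical, and using str.isupper() as the test for a capital key is an assumption; the reslib source may decide this differently.

```python
import json


def uppercase_keys_from_json(text):
    """Parse JSON text and keep only the keys written entirely in capitals."""
    data = json.loads(text)
    # Assumption: a key counts as CAPITAL when str.isupper() is true for it.
    return {key: value for key, value in data.items() if key.isupper()}


example = '{"DATA_DIR": "data", "debug": true, "Mixed_Case": 1}'
assert uppercase_keys_from_json(example) == {"DATA_DIR": "data"}
```

Under this assumption, lower-case and mixed-case keys are dropped without an error, so helper values can sit in the same file as the real settings.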

Example config file:

An example setup for config is the following directory structure:

├── project_python_library
│   ├── globals.py
│   └── __init__.py
└── notebooks
    └── 0_imports.ipynb

Where the globals.py file contains:

import os
try:
    # If importing from reslib.config, config_path is in scope,
    # and points to this file
    __moduledir = os.path.dirname(os.path.abspath(config_path))
except NameError:
    # If config_path is missing, then this file is being imported directly,
    # so __file__ will exist.
    __moduledir = os.path.dirname(os.path.abspath(__file__))

ROOT_DIR = os.path.abspath(os.path.join(__moduledir, '../'*2))

DATA_DIR = os.path.join(ROOT_DIR, 'data')
DATA_DIR_EXTERNAL = os.path.join(DATA_DIR, 'external')
DATA_DIR_INTERIM = os.path.join(DATA_DIR, 'interim')
DATA_DIR_FINAL = os.path.join(DATA_DIR, 'final')

And the __init__.py file contains:

from reslib import config as __config
config = __config.Config('project_python_library/globals.py')

NOTE: Without the project_python_library/ prefix in the config path, reslib won’t find the globals.py file when it lives inside the library. If you put the config file outside the library, then you just need Config('globals.py').

Removing the Singleton functionality:

The Config object is a singleton, meaning there is effectively only one copy of its data in memory. Below is an example of what this means:

# fileA.py -- runs first
config = Config('myproject.json')
config['NEW_VAR'] = 'new value'

# fileB.py -- runs second
config = Config('other_name_ignored.py')
print(config['NEW_VAR'])
# --> 'new value'

# other_name_ignored.py -- doesn't get read
NEW_VAR = 'ignored var'
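
The shared-state behavior shown above resembles what is often called the Borg pattern, where every instance points its attribute dictionary at one class-level dictionary. The sketch below is a generic illustration with hypothetical names (SharedState, _shared), not the reslib code. Unlike Config as described above, it does not skip re-initialization on later instantiations.

```python
class SharedState:
    """Hypothetical Borg-style class: every instance shares one attribute dict."""

    _shared = {}  # class-level storage, shared by all instances

    def __init__(self, **values):
        # Point this instance's attribute dict at the shared one.
        self.__dict__ = self._shared
        self.__dict__.update(values)


first = SharedState(NEW_VAR="new value")
second = SharedState()
assert first is not second            # two distinct objects...
assert second.NEW_VAR == "new value"  # ...backed by the same data
```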

If you wish to have multiple configs for multiple parts of your program, I suggest two solutions:

  1. Manual prefixing:

    PARTA_ROOT_PATH = 'folder for part A/data/'

  2. Subclassing:

    Make your own config object by subclassing Config and overriding only __init__, leaving out the line self.__dict__ = self.__borg_data:

    class MultiConfig(reslib.config.Config):
        def __init__(self, config_name=None, config_path=None, **kwargs):
            dict.__init__(self, kwargs or {})
            self.config_path = config_path or self._get_config_path(config_name)
            self._populate_from_file(self.config_path, **kwargs)
    
copyright

(c) 2019 by Maclean Gaulin.

license

MIT, see LICENSE for more details.

class reslib.config.Config(config_name=None, config_path=None, **kwargs)[source]

Bases: object

The config object for a project, which exposes config values both as dictionary items and as attributes.

_get_config_path(config_name='reslib.config')[source]

Find the config file by name. The config file will be searched for by taking the current working directory, and looking for the config_name all the way up the directory tree until it hits the root.

If no config_name is provided, the name defaults to: ‘reslib.config’

Parameters

config_name – name of config file. Name can include or exclude .json or .py, both are tried (in that order) if config_name alone isn’t found.

Returns

Path of the found config file, or None.

_get_dict_from_file(config_path, **kwargs)[source]

Make a dictionary from a python or json file, based on extension. Only includes keys which are CAPITALIZED.

If using a Python config file, the config file path is added into global scope for the evaluation under the variable name config_path, meaning that inside the config.py file, the following will print the full path to the config.py file:

print(config_path)

NOTE: A python config file is eval-ed, so this is potentially an attack vector. Please don’t load a python config file you aren’t completely comfortable with.
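
A minimal sketch of this mechanism, assuming the file is run with exec (plain eval cannot execute assignment statements) and that config_path is injected into the execution namespace. The function name is hypothetical and the real reslib code may differ.

```python
def uppercase_keys_from_python(source, config_path):
    """Execute config source and return its all-capital top-level names."""
    # Make config_path visible to the executing code, as described above.
    namespace = {"config_path": config_path}
    # exec runs arbitrary code: only use it on files you trust.
    exec(compile(source, config_path, "exec"), namespace)
    # Assumption: a name counts as CAPITALIZED when str.isupper() is true.
    return {key: value for key, value in namespace.items() if key.isupper()}


source = "import os\nROOT_DIR = os.path.dirname(config_path)\n_private = 1\n"
result = uppercase_keys_from_python(source, "/project/lib/globals.py")
assert result == {"ROOT_DIR": "/project/lib"}
```

Because the source runs with full interpreter privileges, the warning above applies equally to this sketch.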

Parameters
  • config_path – Full path of the config file.

  • **kwargs – Optional read-arguments passed to open.

Returns

Dictionary of KEY:value pairs where KEY is all CAPITALIZED keys found in the Python/JSON file.

Return type

dict

Raises

ValueError – If the config file doesn’t have .json or .py extension.

_populate_from_dict(config_dict)[source]

Populate the Config from a dictionary.

NOTE: A python config file is eval-ed, so this is potentially an attack vector. Please don’t load a python config file you aren’t completely comfortable with.

Parameters
  • config_dict – Dictionary of values to add to the Config object.

Returns

Dictionary which was added to the Config object.

Return type

dict

_populate_from_file(config_path=None, silent=False, **kwargs)[source]

Populate the Config from a python or json file, based on extension. Only includes keys which are CAPITALIZED.

NOTE: A python config file is eval-ed, so this is potentially an attack vector. Please don’t load a python config file you aren’t completely comfortable with.

Parameters
  • config_path – Full path of the config file. Default: config_path from the Config object.

  • silent – Boolean flag. If True, the FileNotFoundError that is otherwise raised when the config_path doesn’t exist is suppressed.

  • **kwargs – Optional read-arguments passed to open.

Returns

Dictionary which was added to the Config object.

Return type

dict

Raises
  • FileNotFoundError – If file isn’t found, and silent is False.

  • ValueError – If the config file doesn’t have .json or .py extension.
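
The error contract documented above can be illustrated with a small stand-alone check. This is a hypothetical helper, not part of reslib. The order of the two checks and the None return value when silent is True are assumptions made for the sketch.

```python
import os


def check_config_path(config_path, silent=False):
    """Hypothetical pre-flight check mirroring the documented errors."""
    if not os.path.isfile(config_path):
        if silent:
            return None  # assumption: a missing file is tolerated quietly
        raise FileNotFoundError(config_path)
    extension = os.path.splitext(config_path)[1].lower()
    if extension not in (".json", ".py"):
        raise ValueError(f"Unsupported config extension: {extension!r}")
    return extension
```

A caller that wants to fall back to defaults when no file exists could pass silent=True and treat a None result as "use defaults".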