| Started by | Fabien <fabien.maussion@gmail.com> |
|---|---|
| First post | 2014-08-09 13:48 +0200 |
| Last post | 2014-08-10 10:33 +0200 |
| Articles | 10 — 5 participants |
The "right" way to use config files Fabien <fabien.maussion@gmail.com> - 2014-08-09 13:48 +0200
Re: The "right" way to use config files Ben Finney <ben+python@benfinney.id.au> - 2014-08-09 22:17 +1000
Re: The "right" way to use config files Fabien <fabien.maussion@gmail.com> - 2014-08-09 14:33 +0200
Re: The "right" way to use config files Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-08-09 12:16 -0400
Re: The "right" way to use config files Fabien <fabien.maussion@gmail.com> - 2014-08-09 19:17 +0200
Re: The "right" way to use config files Tim Chase <python.list@tim.thechases.com> - 2014-08-09 12:08 -0500
Re: The "right" way to use config files Terry Reedy <tjreedy@udel.edu> - 2014-08-09 13:29 -0400
Re: The "right" way to use config files Fabien <fabien.maussion@gmail.com> - 2014-08-09 20:14 +0200
Re: The "right" way to use config files Terry Reedy <tjreedy@udel.edu> - 2014-08-09 18:30 -0400
Re: The "right" way to use config files Fabien <fabien.maussion@gmail.com> - 2014-08-10 10:33 +0200
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2014-08-09 13:48 +0200 |
| Subject | The "right" way to use config files |
| Message-ID | <ls51q0$4ea$1@speranza.aioe.org> |
Folks,
I am not a computer scientist (just a scientist) and I'd like to ask
your opinion about a design problem I have. It's not that I can't get my
program to work, but rather that I have trouble finding an "elegant"
solution to the problem of passing information to my program's elements.
I find it hard to state my request clearly, so my apologies for the long
text...
The tool I am developing is a classical data-analysis workflow. Ideally,
all the program's configuration is located in a single .cfg file which
I parse with ConfigObj. The file contains I/O information
(path_to_input, path_to_output) as well as internal options
(use_this_function, dont_use_this_one, function1_paramx = y), etc.
Currently, my program is a "super-object" which is initialized once and
works schematically as follows:
    def main():
        obj = init_superobj(config_file)
        obj.preprocess()
        obj.process()
        obj.write()
The superobj init routine parses the config file and reads the input data,
and a processing step can be, for example:
    def process(self):
        if self.configfile.as_bool('do_function1'):
            params = config.parse_function1_params()
            call_function1(self.data, params)
        if self.configfile.as_bool('do_function2'):
            params = config.parse_function2_params()
            call_function2(self.data2, params)
The functions themselves do not know about the superobject or about the
configfile. They are "standalone" functions which take data and
parameters as input and produce output with it. I thought that the
standalone functions will be clearer and easier to maintain, since they
do not rely on some external data structure such as the configobj or
anything.
BUT, my "problem" is that several options really are "universal" options
to the program, such as the output directory for example. This
information (where to write their results) is given to most of the
functions as parameter.
So I had the idea to define a super-object which parses the config file
and input data and is given as a single parameter to the processing
functions, and the functions take the information they need from it.
This is tempting because there is no need for refactoring when I decide
to change something in the config, but I am afraid that the program may
become unmaintainable by someone else than myself. Another possibility
would be at least to give all the functions access to the configfile.
To get to the point: is it good practice to give all elements of a
program access to the configfile and if yes, how is it done "properly"?
I hope at least someone will understand what I mean ;-)
Cheers and thanks,
Fabien
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2014-08-09 22:17 +1000 |
| Message-ID | <mailman.12792.1407586666.18130.python-list@python.org> |
| In reply to | #75939 |
Fabien <fabien.maussion@gmail.com> writes:
> So I had the idea to define a super-object which parses the config
> file and input data and is given as a single parameter to the
> processing functions, and the functions take the information they need
> from it.
That's not a bad idea, you could do that without embarrassment.
A better technique, though, is to make use of modules as namespaces.
Have one module of your application be responsible for the configuration
of the application::
# app/config.py
import configparser
parser = configparser.ConfigParser()
parser.read("app.conf")
and import that module everywhere else that needs it::
# app/wibble.py
from . import config
def frobnicate():
    do_something_with(config.foo)
By using an imported module, the functions don't need to be
parameterised by application-wide configuration; they can simply access
the module from the global scope and thereby get access to that module's
attributes.
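A minimal sketch of what such a configuration module might expose, assuming Python 3's configparser. The section and option names (`paths`, `output_dir`, `func1`, `threshold`) and the inline INI text are hypothetical, used only so the sketch runs on its own; a real module would call `parser.read("app.conf")` as shown above.

```python
import configparser

# Hypothetical INI content standing in for app.conf, so the sketch runs as-is.
SAMPLE = """
[paths]
output_dir = /tmp/out

[func1]
enable = yes
threshold = 0.1
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Module-level names that other modules would reach as config.output_dir, etc.
output_dir = parser.get("paths", "output_dir")
func1_enabled = parser.getboolean("func1", "enable")
func1_threshold = parser.getfloat("func1", "threshold")
```

Because Python caches imported modules, every module that imports this one sees the same parsed values.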
> To get to the point: is it good practice to give all elements of a
> program access to the configfile and if yes, how is it done
> "properly"?
There should be an encapsulation of the responsibility for parsing and
organising the configuration options, and the rest of the application
should access it only via that encapsulation.
Putting that encapsulation in a module is an appropriately Pythonic
technique.
--
\ “Now Maggie, I’ll be watching you too, in case God is busy |
`\ creating tornadoes or not existing.” —Homer, _The Simpsons_ |
_o__) |
Ben Finney
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2014-08-09 14:33 +0200 |
| Message-ID | <ls54fh$aoe$1@speranza.aioe.org> |
| In reply to | #75944 |
Hi Ben,
On 09.08.2014 14:17, Ben Finney wrote:
> Have one module of your application be responsible for the configuration
> of the application::
>
> # app/config.py
>
> import configparser
>
> parser = configparser.ConfigParser()
> parser.read("app.conf")
Thanks for the suggestion. This approach is new to me, and I would not
have thought of it myself. It seems like a good way to do this. But how do
I pass an argument to this config namespace? I.e., I want "app.conf" to be
given as an argument.
Currently my program starts like this:
import os
import sys

def main():
    # See if the user gave a configfile
    if len(sys.argv) == 2:
        # file was given as argument
        cfg = str(sys.argv[1])
    else:
        # default file taken in the resource directory
        cfg = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                           os.pardir, 'res', 'default.cfg'))
    obj = superobj(cfg)
    obj.preprocess()
    obj.process()
    obj.write()
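As an aside, the same "optional config file argument" logic could be sketched with the standard argparse module. The default path and the function name `parse_args` are assumptions for illustration, not part of the program described here.

```python
import argparse
import os

# Hypothetical default location, loosely mirroring the res/default.cfg layout.
DEFAULT_CFG = os.path.join(os.pardir, "res", "default.cfg")


def parse_args(argv=None):
    """Return the parsed command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Run the analysis workflow.")
    parser.add_argument(
        "configfile",
        nargs="?",               # the positional argument is optional
        default=DEFAULT_CFG,
        help="path to the .cfg file (default: %(default)s)",
    )
    return parser.parse_args(argv)
```

Calling `parse_args([])` falls back to the default file, while `parse_args(["my.cfg"])` uses the given one; argparse also produces a usage message for free.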
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2014-08-09 12:16 -0400 |
| Message-ID | <mailman.12794.1407601013.18130.python-list@python.org> |
| In reply to | #75945 |
On Sat, 09 Aug 2014 14:33:54 +0200, Fabien <fabien.maussion@gmail.com>
declaimed the following:
>Hi Ben,
>
>On 09.08.2014 14:17, Ben Finney wrote:
>> Have one module of your application be responsible for the configuration
>> of the application::
>>
>> # app/config.py
>>
>> import configparser
>>
>> parser = configparser.ConfigParser()
>> parser.read("app.conf")
>
>Thanks for the suggestion. This way to do is new to me, and I didn't
>come to the idea myself. It seems like a good way to do this. But how to
>give an argument to this config namespace? i.e I want "app.conf" to be
>given as argument.
>
Well, you could let the module access the command line arguments
directly (though I'd recommend against that). In effect, the bottom of the
imported module would have all the
    sys.argv...
stuff, followed by parsing the provided file name (or falling back to a
default set of settings).
Better, in my view, is to have the imported module set up default values
for everything, AND have a function at the bottom of the form
    def initialize(fid=None):
        if fid:
            # parse file "fid" replacing the module level items
            # this may require making them all globals since
            # assignments inside this function would be locals
And then your main program
    import sys
    import myconfig
    ...
    myconfig.initialize(sys.argv[1])
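A minimal sketch of that pattern, assuming the standard configparser module rather than ConfigObj. The option names, the `[general]` section, and the default values are hypothetical; only the shape (module-level defaults plus an `initialize` function using `global`) follows the description above.

```python
import configparser
import os
import tempfile

# Module-level defaults, used when no config file is given.
output_dir = "."
threshold = 0.5


def initialize(fid=None):
    """Overwrite the module-level defaults from the file named `fid`, if any."""
    global output_dir, threshold  # without this, the assignments would be local
    if not fid:
        return
    parser = configparser.ConfigParser()
    with open(fid) as handle:     # open() raises if the file is missing
        parser.read_file(handle)
    output_dir = parser.get("general", "output_dir", fallback=output_dir)
    threshold = parser.getfloat("general", "threshold", fallback=threshold)


# Demo with a throwaway file; the [general] section name is an assumption.
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as tmp:
    tmp.write("[general]\nthreshold = 0.1\n")
initialize(tmp.name)
os.remove(tmp.name)
```

After the demo call, `threshold` comes from the file while `output_dir` keeps its default, since the file does not set it.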
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2014-08-09 19:17 +0200 |
| Message-ID | <ls5l38$ita$1@speranza.aioe.org> |
| In reply to | #75948 |
Hi,
On 09.08.2014 18:16, Dennis Lee Bieber wrote:
> Better, in my view, is to have the import module set up default values
> for everything, AND have a function at the bottom of the form
>
> def initialize(fid=None):
>     if fid:
>         # parse file "fid" replacing the module level items
>         # this may require making them all globals since
>         # assignments inside this function would be locals
>
> And then your main program
>
> import myconfig
> ...
> myconfig.initialize(sys.argv[1])
Yes, OK, I think I got it. Thanks! I like the idea and will implement it;
this will avoid the useless superobject and make the configfile
available to everyone.
Fabien
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2014-08-09 12:08 -0500 |
| Message-ID | <mailman.12796.1407604196.18130.python-list@python.org> |
| In reply to | #75939 |
On 2014-08-09 13:48, Fabien wrote:
> So I had the idea to define a super-object which parses the config
> file and input data and is given as a single parameter to the
> processing functions, and the functions take the information they
> need from it. This is tempting because there is no need for
> refactoring when I decide to change something in the config, but I
> am afraid that the program may become unmaintainable by someone
> else than myself. Another possibility would be at least to give all
> the functions access to the configfile.
>
> To get to the point: is it good practice to give all elements of a
> program access to the configfile and if yes, how is it done
> "properly"?
Though I don't like how it looks/feels to pass around the config in
just about any function-call that needs it, I've found that doing so
allows me to test more readily. The alternative (putting it in a
global or some module) usually means that it's harder for me to test
in isolation.
-tkc
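The testing benefit of passing the config explicitly can be sketched as follows. The function `summarize` and the keys `threshold` and `output_dir` are hypothetical; the point is only that a plain dict can stand in for the parsed config file in a test.

```python
def summarize(values, config):
    """Keep the values above config['threshold'] and report where they go."""
    kept = [v for v in values if v > config["threshold"]]
    return {"kept": kept, "output_dir": config["output_dir"]}


# In a test, a plain dict stands in for the parsed config file;
# no file on disk and no global state are involved.
fake_config = {"threshold": 0.5, "output_dir": "/tmp/test-out"}
result = summarize([0.1, 0.7, 0.9], fake_config)
```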
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-08-09 13:29 -0400 |
| Message-ID | <mailman.12797.1407605431.18130.python-list@python.org> |
| In reply to | #75939 |
On 8/9/2014 7:48 AM, Fabien wrote:
> BUT, my "problem" is that several options really are "universal" options
> to the program, such as the output directory for example. This
> information (where to write their results) is given to most of the
> functions as parameter.
If possible, functions should *return* their results, or yield their
results in chunks (as generators). Let the driver function decide where
to put results. Aside from separating concerns, this makes testing much
easier.
--
Terry Jan Reedy
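A small sketch of that separation, under the assumption that results are simple numbers written one per line. The names `process` and `write_results` and the file name `results.txt` are made up for illustration.

```python
import os
import tempfile


def process(values, threshold):
    """Yield results one at a time; this function knows nothing about files."""
    for value in values:
        if value > threshold:
            yield value * 2


def write_results(results, output_dir, name="results.txt"):
    """Driver-side helper: only this function touches the output directory."""
    path = os.path.join(output_dir, name)
    with open(path, "w") as handle:
        for item in results:
            handle.write(f"{item}\n")
    return path


# Demo: the driver chooses the destination (a temporary directory here).
with tempfile.TemporaryDirectory() as output_dir:
    path = write_results(process([1, 5, 10], threshold=3), output_dir)
    with open(path) as handle:
        lines = handle.read().split()
```

The processing function can then be tested with `list(process(...))` alone, without any output directory.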
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2014-08-09 20:14 +0200 |
| Message-ID | <ls5odq$qn9$1@speranza.aioe.org> |
| In reply to | #75952 |
On 09.08.2014 19:29, Terry Reedy wrote:
> If possible, functions should *return* their results, or yield their
> results in chunks (as generators). Let the driver function decide where
> to put results. Aside from separating concerns, this makes testing much
> easier.
I see. But then this is also true for parameters, right? And yet we
return to my original question ;-)
Let's say my configfile looks like this:
-----------------
### app/config.cfg
# General params
output_dir = '..'
input_file = '..'
# Func 1 params
[func1]
enable = True
threshold = 0.1
maxite = 1
-----------------
And I have a myconfig module which looks like:
-----------------
### app/myconfig.py
from configobj import ConfigObj
parser = obj() # parser will be instantiated by initialize
def initialize(cfgfile=None):
    global parser
    parser = ConfigObj(cfgfile, file_error=True)
-----------------
My main program could look like this:
-----------------
### app/mainprogram_1.py
import sys
import myconfig

def func1():
    # the params are in the cfg
    threshold = myconfig.parser['func1'].as_float('threshold')
    maxite = myconfig.parser['func1'].as_long('maxite')
    # dummy operations
    score = 100.
    ite = 1
    while (score > threshold) and (ite < maxite):
        score /= 10
        ite += 1
    # dummy return
    return score

def main():
    myconfig.initialize(sys.argv[1])
    if myconfig.parser['func1'].as_bool('enable'):
        results = func1()

if __name__ == '__main__':
    main()
-----------------
Or like this:
-----------------
### app/mainprogram_2.py
import sys
import myconfig

def func1(threshold=None, maxite=None):
    # dummy operations
    score = 100.
    ite = 1
    while (score > threshold) and (ite < maxite):
        score /= 10
        ite += 1
    # dummy return
    return score

def main():
    myconfig.initialize(sys.argv[1])
    if myconfig.parser['func1'].as_bool('enable'):
        # the params are in the cfg
        threshold = myconfig.parser['func1'].as_float('threshold')
        maxite = myconfig.parser['func1'].as_long('maxite')
        results = func1(threshold=threshold, maxite=maxite)

if __name__ == '__main__':
    main()
-----------------
In this case, program2 is easier to test/understand, but if the
parameters become numerous it could be a pain...
As always, I guess I'll have to decide on a case-by-case basis what is best.
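One possible way to keep the explicit-parameter style manageable when options multiply is to bundle them in a small parameter object. This is only a sketch: the class `Func1Params` and its `from_section` helper are hypothetical, and a plain dict of strings stands in for `myconfig.parser['func1']`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func1Params:
    threshold: float
    maxite: int

    @classmethod
    def from_section(cls, section):
        """Build the parameters from a dict-like config section of strings."""
        return cls(threshold=float(section["threshold"]),
                   maxite=int(section["maxite"]))


def func1(params):
    # Same dummy loop as above, but all options arrive in one object.
    score = 100.0
    ite = 1
    while score > params.threshold and ite < params.maxite:
        score /= 10
        ite += 1
    return score


# A plain dict stands in for myconfig.parser['func1'].
params = Func1Params.from_section({"threshold": "0.1", "maxite": "3"})
result = func1(params)
```

The function signature stays short, and a test can construct `Func1Params` directly without any config file.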
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-08-09 18:30 -0400 |
| Message-ID | <mailman.12802.1407623446.18130.python-list@python.org> |
| In reply to | #75955 |
On 8/9/2014 2:14 PM, Fabien wrote:
> On 09.08.2014 19:29, Terry Reedy wrote:
>> If possible, functions should *return* their results, or yield their
>> results in chunks (as generators). Let the driver function decide where
>> to put results. Aside from separating concerns, this makes testing much
>> easier.
>
> I see. But then this is also true for parameters, right? And yet we
> return to my original question ;-)
>
>
> Let's say my configfile looks like this:
>
> -----------------
> ### app/config.cfg
> # General params
> output_dir = '..'
> input_file = '..'
>
> # Func 1 params
> [func1]
> enable = True
> threshold = 0.1
> maxite = 1
> -----------------
>
> And I have a myconfig module which looks like:
>
> -----------------
> ### app/myconfig.py
>
> from configobj import ConfigObj
>
> parser = obj() # parser will be instantiated by initialize
Try parser = object() to actually run, but the line is not needed.
Instead put "parser: instantiated by initialize" in the docstring.
>
> def initialize(cfgfile=None):
>     global parser
>     parser = ConfigObj(cfgfile, file_error=True)
> -----------------
>
> My main program could look like this:
>
> -----------------
> ### app/mainprogram_1.py
>
> import sys
> import myconfig
>
> def func1():
>     # the params are in the cfg
>     threshold = myconfig.parser['func1'].as_float('threshold')
>     maxite = myconfig.parser['func1'].as_long('maxite')
>
>     # dummy operations
>     score = 100.
>     ite = 1
>     while (score > threshold) and (ite < maxite):
>         score /= 10
>         ite += 1
>
>     # dummy return
>     return score
>
> def main():
>     myconfig.initialize(sys.argv[1])
>
>     if myconfig.parser['func1'].as_bool('enable'):
>         results = func1()
>
> if __name__ == '__main__':
>     main()
> -----------------
The advantage of TDD is that it forces one to make code testable as you
do. Old code may not be designed to be so easily testable, as I have
learned trying to add tests to idlelib. For the above, I would consider
def func1_algo(threshold, maxite): # possibly in a separate file
    score = 100.
    ite = 1
    while (score > threshold) and (ite < maxite):
        score /= 10
        ite += 1
    return score

def func1(): # interface wrapper
    threshold = myconfig.parser['func1'].as_float('threshold')
    maxite = myconfig.parser['func1'].as_long('maxite')
    return func1_algo(threshold, maxite)
This is a slight bit of extra work, but now you can separately test (and
modify) the algorithm and the interfacing. Testing the algorithm is
easy, which encourages testing multiple i/o pairs.
for args, expected in iopairs:
    assert func1_algo(*args) == expected # or self.assertEqual, or ...
(or close enough for float outputs)
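A runnable sketch of such an input/output table, using `math.isclose` for the "close enough" comparison. The pairs below were worked out by hand from the dummy loop; the tolerance is an arbitrary choice.

```python
import math


def func1_algo(threshold, maxite):
    score = 100.0
    ite = 1
    while score > threshold and ite < maxite:
        score /= 10
        ite += 1
    return score


# (arguments, expected result) pairs; the values are worked out by hand.
iopairs = [
    ((0.1, 1), 100.0),   # maxite of 1: the loop never runs
    ((0.1, 3), 1.0),     # two divisions by 10
    ((50.0, 10), 10.0),  # stops once the score drops below the threshold
]

for args, expected in iopairs:
    assert math.isclose(func1_algo(*args), expected, rel_tol=1e-9)
```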
As for the interfacing: you can write and read multiple versions of
config.cfg (relatively slow), use something like unittest.mock to mock
the myconfig module, or write something fairly simple (py3 code).
class Entry(dict):
    def as_bool(self, name):
        s = self[name]
        return True if s == 'True' else False if s == 'False' else None
    def as_int(self, name):
        return int(self[name])
    as_long = as_int
    def as_float(self, name):
        return float(self[name])

class Config(object):
    def initialize(self, argv):
        pass

myconfig = Config() # a module is like a singleton class
myconfig.initialize('a') # test that does not raise
# In use for testing, uncomment the following two lines
# import mainprogram_1 as mp1
# mp1.myconfig = myconfig
f1_cfg = Entry({
    'enable': 'True',
    'threshold': '0.1',
    'maxite': '1',
    })
myconfig.parser = {'func1': f1_cfg}
print(myconfig.parser['func1'].as_float('threshold') == 0.1)
print(myconfig.parser['func1'].as_long('maxite') == 1)
print(myconfig.parser['func1'].as_bool('enable') == True)
f1_cfg['maxite'] = 5
print(myconfig.parser['func1'].as_int('maxite') == 5)
# prints True 4 times
Notice that you inject the mock myconfig into the tested module just
once. After that, you can change anything within parser or replace parser
with a new dict.
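The unittest.mock route mentioned above might look roughly like this. To keep the sketch self-contained, a `SimpleNamespace` stands in for the real myconfig module and `func1_params` is a hypothetical piece of interface code; in real use one would patch the attribute on the imported module instead.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for the real myconfig module, so the sketch is self-contained.
myconfig = SimpleNamespace(parser=None)


def func1_params():
    """Interface code under test: reads its parameters from myconfig.parser."""
    section = myconfig.parser["func1"]
    return section.as_float("threshold"), section.as_long("maxite")


fake_parser = mock.MagicMock()
# MagicMock supports indexing; every key returns the same child mock.
fake_parser["func1"].as_float.return_value = 0.1
fake_parser["func1"].as_long.return_value = 5

with mock.patch.object(myconfig, "parser", fake_parser):
    params = func1_params()

# patch.object restores the original attribute on exit.
restored = myconfig.parser
```

Compared with the hand-written `Entry` class, the mock needs less code but does not check that the option names passed to `as_float` or `as_long` are the expected ones unless assertions on the calls are added.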
> Or like this:
>
> -----------------
> ### app/mainprogram_2.py
>
> import sys
> import myconfig
>
> def func1(threshold=None, maxite=None):
These should not have defaults; avoid extra work!
>     # dummy operations
>     score = 100.
>     ite = 1
>     while (score > threshold) and (ite < maxite):
>         score /= 10
>         ite += 1
>
>     # dummy return
>     return score
>
> def main():
>     myconfig.initialize(sys.argv[1])
>
>     if myconfig.parser['func1'].as_bool('enable'):
>         # the params are in the cfg
>         threshold = myconfig.parser['func1'].as_float('threshold')
>         maxite = myconfig.parser['func1'].as_long('maxite')
>         results = func1(threshold=threshold, maxite=maxite)
>
> if __name__ == '__main__':
>     main()
> -----------------
>
> In this case, program2 is easier to test/understand, but if the
> parameters become numerous it could be a pain...
This is equivalent to what I wrote, except for putting the wrapper inline
in main(). Testing is the same for either.
--
Terry Jan Reedy
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2014-08-10 10:33 +0200 |
| Message-ID | <ls7aob$hq$1@speranza.aioe.org> |
| In reply to | #75958 |
On 10.08.2014 00:30, Terry Reedy wrote:
> The advantage of TDD is that it forces one to make code testable as you do.
Thanks a lot, Terry, for your comprehensive example!