| Started by | "Josh B." <jabronson@gmail.com> |
|---|---|
| First post | 2016-04-04 09:47 -0700 |
| Last post | 2016-04-06 11:46 +1000 |
| Articles | 9 — 5 participants |
Best Practices for Internal Package Structure "Josh B." <jabronson@gmail.com> - 2016-04-04 09:47 -0700
Re: Best Practices for Internal Package Structure "Sven R. Kunze" <srkunze@mail.de> - 2016-04-04 19:03 +0200
Re: Best Practices for Internal Package Structure Michael Selik <michael.selik@gmail.com> - 2016-04-04 18:45 +0000
Re: Best Practices for Internal Package Structure Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-04-04 21:15 +0100
Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-05 11:43 +1000
Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 09:29 +1000
Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 09:38 +1000
Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 12:09 +1000
Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 11:46 +1000
| From | "Josh B." <jabronson@gmail.com> |
|---|---|
| Date | 2016-04-04 09:47 -0700 |
| Subject | Best Practices for Internal Package Structure |
| Message-ID | <22a2f21d-7fd2-468a-9b6e-184837506157@googlegroups.com> |
My package, available at https://github.com/jab/bidict, is currently laid out like this:
bidict/
├── __init__.py
├── _bidict.py
├── _common.py
├── _frozen.py
├── _loose.py
├── _named.py
├── _ordered.py
├── compat.py
├── util.py
I'd like to get some more feedback on a question about this layout that I originally asked here: <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.
It does kind of bug me that you see the _foo modules in the output when you do things like this:
>>> import bidict
>>> bidict.bidict
<class 'bidict._bidict.bidict'>
>>> bidict.KeyExistsError
<class 'bidict._common.KeyExistsError'>
In https://github.com/jab/bidict/pull/33#issuecomment-205381351 a reviewer agrees:
"""
Me too, and it confuses people as to where you should be importing things from if you want to catch it, inviting code like
```
import bidict._common
try:
    ...
except bidict._common.KeyExistsError:
    ...
```
ie. becoming dependent on the package internal structure.
I would be tempted to monkey-patch .__module__ = __name__ on each imported class to get around this. Maybe there are downsides to doing magic of that kind, but dependencies on the internals of packages are such a problem for me in our very large codebase that I'd probably do it anyway in order to be really explicit about what the public API is.
"""
Curious what folks on this list recommend, or if there are best practices about this published somewhere.
Thanks,
Josh
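For readers following along, here is a minimal sketch of what the reviewer's `.__module__ = __name__` patch changes. It does not use bidict itself: `demo_pkg` and `demo_pkg._common` are hypothetical in-memory stand-ins for a package and one of its private submodules, built with `types.ModuleType` so the snippet runs on its own.

```python
import types

# Hypothetical stand-ins for a package and one of its private submodules.
pkg = types.ModuleType("demo_pkg")
private = types.ModuleType("demo_pkg._common")

# Define the class inside the private module's namespace, as a real
# _common.py would; __module__ is taken from that namespace's __name__.
exec("class KeyExistsError(Exception):\n    pass\n", private.__dict__)

# What __init__.py does with "from ._common import KeyExistsError".
pkg.KeyExistsError = private.KeyExistsError

# The repr shows the private module, which is the output that bugs Josh.
before = repr(pkg.KeyExistsError)

# The public and private names refer to the same object, so callers can
# always write "except demo_pkg.KeyExistsError" without touching _common.
same_object = pkg.KeyExistsError is private.KeyExistsError

# The reviewer's proposed patch: make the class report the package as home.
pkg.KeyExistsError.__module__ = pkg.__name__
after = repr(pkg.KeyExistsError)

print(before)  # <class 'demo_pkg._common.KeyExistsError'>
print(after)   # <class 'demo_pkg.KeyExistsError'>
```

The patch only changes what the class reports about itself; the class still lives in the private module. Anything that locates a class through `__module__` (pickle, for example) will then look it up in the package namespace, so the re-export in `__init__.py` has to stay in place.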
| From | "Sven R. Kunze" <srkunze@mail.de> |
|---|---|
| Date | 2016-04-04 19:03 +0200 |
| Message-ID | <mailman.28.1459789399.32530.python-list@python.org> |
| In reply to | #106456 |
Hi Josh,

good question.

On 04.04.2016 18:47, Josh B. wrote:
> My package, available at https://github.com/jab/bidict, is currently laid out like this:
>
> bidict/
> ├── __init__.py
> ├── _bidict.py
> ├── _common.py
> ├── _frozen.py
> ├── _loose.py
> ├── _named.py
> ├── _ordered.py
> ├── compat.py
> ├── util.py
>
> I'd like to get some more feedback on a question about this layout that I originally asked here: <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
>
> What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.
>
> It does kind of bug me that you see the _foo modules in the output when you do things like this:
> [code]

we had a similar discussion internally. We have various packages requiring each other but have some internals that should not be used outside of them. The _ signifies that actually clearly but it looks weird within the package itself.

We haven't found a solution so far. Maybe others do.

Best,
Sven
| From | Michael Selik <michael.selik@gmail.com> |
|---|---|
| Date | 2016-04-04 18:45 +0000 |
| Message-ID | <mailman.30.1459795553.32530.python-list@python.org> |
| In reply to | #106456 |
On Mon, Apr 4, 2016 at 6:04 PM Sven R. Kunze <srkunze@mail.de> wrote:
> Hi Josh,
>
> good question.
>
> On 04.04.2016 18:47, Josh B. wrote:
> > My package, available at https://github.com/jab/bidict, is currently laid out like this:
> >
> > bidict/
> > ├── __init__.py
> > ├── _bidict.py
> > ├── _common.py
> > ├── _frozen.py
> > ├── _loose.py
> > ├── _named.py
> > ├── _ordered.py
> > ├── compat.py
> > ├── util.py
> >
> > I'd like to get some more feedback on a question about this layout that I originally asked here: <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
> >
> > What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.

Using the _module.py convention for internals is fine, except that you have few enough lines of code that you could have far fewer files. Why create a package when you can just have a module, bidict.py?

I find it easier to find the right section of my code when I have just a few files open rather than a dozen or so in different windows and tabs.
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2016-04-04 21:15 +0100 |
| Message-ID | <mailman.35.1459800975.32530.python-list@python.org> |
| In reply to | #106456 |
On 04/04/2016 19:45, Michael Selik wrote:
> On Mon, Apr 4, 2016 at 6:04 PM Sven R. Kunze <srkunze@mail.de> wrote:
>
>> Hi Josh,
>>
>> good question.
>>
>> On 04.04.2016 18:47, Josh B. wrote:
>>> My package, available at https://github.com/jab/bidict, is currently laid out like this:
>>>
>>> bidict/
>>> ├── __init__.py
>>> ├── _bidict.py
>>> ├── _common.py
>>> ├── _frozen.py
>>> ├── _loose.py
>>> ├── _named.py
>>> ├── _ordered.py
>>> ├── compat.py
>>> ├── util.py
>>>
>>> I'd like to get some more feedback on a question about this layout that I originally asked here: <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
>>>
>>> What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.
>
> Using the _module.py convention for internals is fine, except that you have few enough lines of code that you could have far fewer files. Why create a package when you can just have a module, bidict.py?
>
> I find it easier to find the right section of my code when I have just a few files open rather than a dozen or so in different windows and tabs.

+1

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-05 11:43 +1000 |
| Message-ID | <5703185f$0$1611$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106456 |
On Tue, 5 Apr 2016 02:47 am, Josh B. wrote:

> My package, available at https://github.com/jab/bidict, is currently laid out like this:
>
> bidict/
> ├── __init__.py
> ├── _bidict.py
> ├── _common.py
> ├── _frozen.py
> ├── _loose.py
> ├── _named.py
> ├── _ordered.py
> ├── compat.py
> ├── util.py

The purpose of packages isn't to enable Java-style "one class per file" coding, especially since *everything* in the package except the top level "bidict" module itself is private. bidict.compat and bidict.util aren't flagged as private, but they should be, since there's nothing in either of them that the user of a bidict class should care about.

(util.py does export a couple of functions, but they should be in the main module, or possibly made into a method of BidirectionalMapping.)

Your package is currently under 500 lines. As it stands now, you could easily flatten it to a single module:

bidict.py

Unless you are getting some concrete benefit from a package structure, you shouldn't use a package just for the sake of it. Even if the code doubles in size, to 1000 lines, that's still *far* below the point at which I believe a single module becomes unwieldy just from size. At nearly 6500 lines, the decimal.py module is, in my opinion, *almost* at the point where just size alone suggests splitting the file into submodules. Your module is nowhere near that point.

It seems to me that you're paying the cost of the increased complexity needed to handle a package, but not (as far as I can tell) gaining any benefit from it. Certainly the *users* of your package aren't: all the public classes are pre-imported by the __init__.py file, so they don't even get the advantage of only needing to import classes that they actually use.

So my recommendation would be to collapse the package to a single file.
If you choose to reject that recommendation, or perhaps you are getting some benefit that I haven't spotted in a cursory look over the package, then my suggestion is to document that the *only* public interface is the top-level "import bidict", and that *all* submodules are private. Then drop the underscores, possibly re-arrange a bit:

bidict/
├── __init__.py  # the only public part of the package
├── bidict.py
├── common.py  # includes compat and util
├── frozen.py
├── loose.py
├── named.py
├── ordered.py

If you're worried about the lack of underscores for private submodules, then consider this alternate structure:

_bidict/
├── __init__.py
├── bidict.py
├── common.py
├── frozen.py
├── loose.py
├── named.py
├── ordered.py
bidict.py

where "bidict.py" is effectively little more than:

from _bidict import *

> What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.

It's very pretty, and well-organised, but I'm not sure you're actually gaining any advantage from it.

> It does kind of bug me that you see the _foo modules in the output when you do things like this:
>
> >>> import bidict
> >>> bidict.bidict
> <class 'bidict._bidict.bidict'>
> >>> bidict.KeyExistsError
> <class 'bidict._common.KeyExistsError'>
>
> In https://github.com/jab/bidict/pull/33#issuecomment-205381351 a reviewer agrees:
>
> """
> Me too, and it confuses people as to where you should be importing things from if you want to catch it, inviting code like
>
> ```
> import bidict._common

That, at least, should be an obvious no-no, as it's including a single underscore private name.
I wouldn't worry about that too much: if your users are so naive or obnoxious that they're ignoring your documentation and importing _private modules, there's nothing you can do: they'll find *some* way to shoot themselves in the foot, whatever you do.

[Aside: my devs at work had reason to look inside a script written by one of their now long-moved away former colleagues yesterday, as it recently broke. After reading the script, the *first* thing they did was change the way it was called from:

scriptname --opts steve  # yes, he really did use my name as the mandatory argument to the script

to

scriptname --opts forfucksakeandrew

where the name "andrew" has been changed to protect the guilty. Using my name as the script argument is, apparently, the *least* stupid thing the script does.]

> try:
>     ...
> except bidict._common.KeyExistsError:
>     ...
> ```
> ie. becoming dependent on the package internal structure.
>
> I would be tempted to monkey-patch .__module__ = __name__ on each imported class to get around this. Maybe there are downsides to doing magic of that kind, but dependencies on the internals of packages are such a problem for me in our very large codebase that I'd probably do it anyway in order to be really explicit about what the public API is.
> """

Does his team not do internal code reviews? I hate code that lies about where it comes from. I certainly wouldn't do it in a futile attempt to protect idiots from themselves.

> Curious what folks on this list recommend, or if there are best practices about this published somewhere.
>
> Thanks,
> Josh

--
Steven
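The private-package-plus-public-module layout suggested in this post can be sketched as a throwaway source tree built in a temporary directory. Every name here (`_demo_impl`, `demo_facade`, `DemoMapping`) is hypothetical and only mirrors the shape of `_bidict/` plus `bidict.py`. The `__all__` list is an addition for illustration, to show one way of controlling what the star-import exposes.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway source tree; all names here are hypothetical.
root = Path(tempfile.mkdtemp())
impl = root / "_demo_impl"
impl.mkdir()
(impl / "core.py").write_text("class DemoMapping(dict):\n    pass\n")
(impl / "__init__.py").write_text(
    "from .core import DemoMapping\n"
    "__all__ = ['DemoMapping']\n"
)

# The public module is little more than a star-import of the private package.
(root / "demo_facade.py").write_text("from _demo_impl import *\n")

# Make the tree importable and load the public module.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_facade = importlib.import_module("demo_facade")

# Only the names listed in __all__ end up in the public namespace.
public_names = sorted(n for n in vars(demo_facade) if not n.startswith("_"))
print(public_names)

# The class still reports the private module it was defined in.
print(demo_facade.DemoMapping.__module__)
```

With this layout users import only `demo_facade`, and the leading underscore sits on the package name rather than on every submodule. The class repr still names the private package, so it addresses the "where do I import from" question but not the cosmetic one.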
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-06 09:29 +1000 |
| Message-ID | <57044a48$0$1603$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106482 |
On Wed, 6 Apr 2016 03:38 am, Sven R. Kunze wrote:

> On 05.04.2016 03:43, Steven D'Aprano wrote:
>> The purpose of packages isn't to enable Java-style "one class per file" coding, especially since *everything* in the package except the top level "bidict" module itself is private. bidict.compat and bidict.util aren't flagged as private, but they should be, since there's nothing in either of them that the user of a bidict class should care about.
>>
>> (util.py does export a couple of functions, but they should be in the main module, or possibly made into a method of BidirectionalMapping.)
>>
>> Your package is currently under 500 lines. As it stands now, you could easily flatten it to a single module:
>>
>> bidict.py
>
> I don't recommend this.
>
> The line is blurry but 500 is definitely too much. Those will simply not fit on a 1 or 2 generous single screens anymore (which basically is our guideline).

Are you serious that you work under the restriction that the entire file must fit on a single screen? Or at most two pages?

So basically you've moved the burden of scrolling up and down a single file to find the code you want into searching through the file system to find the code that you want. I don't see this as an advantage.

> The intention here is to always have a bit more of a full screen of code (no wasted pixels) while benefiting from switching to another file (also seeing a full page of other code).

I don't understand this benefit. How does exceeding 1-2 pages of code in a file prevent you from opening a second file in a separate window?

As we speak, I have 28 text editor windows open. Not all of them are Python code, but among those which are, I have:

216 lines
157 lines
906 lines
687 lines
173 lines
373 lines
402 lines
6409 lines
509 lines
97 lines
163 lines
393 lines
1725 lines
25 lines

I have no problem with working with multiple files despite exceeding the limit, so I don't understand the purpose of this guideline.

--
Steven
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-06 09:38 +1000 |
| Message-ID | <57044c5c$0$1620$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106482 |
On Wed, 6 Apr 2016 04:40 am, Ethan Furman wrote:

> Well, there should be one more module:
>
> test.py
>
> So in total, two files
>
> bidict/
> |-- __init__.py
> |-- test.py

Your test code shouldn't necessarily be part of the package though. If I already have a package, then I will usually stick the test code inside it, but if I have a single module, I keep the test code in a separate file and don't bother installing it. It's there in the source repo for those who want it.

> will do the trick. Oh, and you want a README, LICENSE, a doc file. And that should do it. :)

None of which ought to be part of the package itself. Well, perhaps the README.

--
Steven
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-06 12:09 +1000 |
| Message-ID | <57046fc4$0$1604$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106550 |
On Wed, 6 Apr 2016 09:54 am, Ethan Furman wrote:

> On 04/05/2016 04:38 PM, Steven D'Aprano wrote:
>> On Wed, 6 Apr 2016 04:40 am, Ethan Furman wrote:
>>
>>> Well, there should be one more module:
>>>
>>> test.py
>>>
>>> So in total, two files
>>>
>>> bidict/
>>> |-- __init__.py
>>> |-- test.py
>>
>> Your test code shouldn't necessarily be part of the package though. If I already have a package, then I will usually stick the test code inside it, but if I have a single module, I keep the test code in a separate file and don't bother installing it. It's there in the source repo for those who want it.
>>
>>> will do the trick. Oh, and you want a README, LICENSE, a doc file. And that should do it. :)
>>
>> None of which ought to be part of the package itself. Well, perhaps the README.
>
> If it's not part of the package, how does it get installed? And where? (Honest question -- I find the packaging and distribution process confusing at best.)

Why do they need to be installed? They're part of the development environment, not the module.

(Actually, I take it back about the README, since conventionally that's used to give installation instructions. Once the module is installed, you don't usually need the README any more. But the doc file might be more useful.)

I don't know what life is like in the brave new world of wheels, eggs, pip and other new-fangled complications^W tools, but in the good old fashioned setuptools world, you have something like a zip file or tarball that contains your code, plus an installation script.
The contents of the zip file will be something like this, taken from PyParsing version 2.1.1:

build/
CHANGES
dist/
docs/
examples/
htmldoc/
HowToUsePyparsing.html
LICENSE
MANIFEST.in
PKG-INFO
pyparsingClassDiagram.JPG
pyparsingClassDiagram.PNG
README
pyparsing.egg-info/
pyparsing.py
robots.txt
setup.cfg
setup.py

You unzip the file, cd into that directory and run:

python setup.py install

which then magically installs the module somewhere that just works. In this specific case, it installs the module in a file

/usr/local/lib/python3.3/site-packages/pyparsing-2.1.1-py3.3.egg

which happens to be a vanilla zip file containing the pyparsing.py file, a byte-code compiled .pyc version, and a directory EGG-INFO that presumably allows pip or whatever to uninstall it again, or something, who the hell knows how this works.

But the installer could just as easily have installed the pyparsing.py file alone.

The point is, the tests, documentation, etc aren't necessarily part of the module, so they don't necessarily need to be installed into your site-packages. If the user wants to read the HTML docs, they can browse into the expanded source tarball and read them there, or drag the docs into their home directory, or whatever they feel like. If they want to run the tests, they run them from the source directory, not site-packages.

Of course, if you prefer to keep the tests inside the package, that's up to you. But you needn't feel that this is the best or only place for them.

--
Steven
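For anyone wondering where "somewhere that just works" is on their own interpreter, a small sketch using only the standard library follows. It is not part of the original post. `json` is used purely because it ships with every Python, and the paths printed will differ from the `/usr/local/lib/python3.3/...` path quoted above.

```python
import importlib.util
import sysconfig

# Directory where pure-Python third-party modules are installed
# for the interpreter running this snippet.
site_packages = sysconfig.get_paths()["purelib"]

# Locate an importable module without importing it.
spec = importlib.util.find_spec("json")

print(site_packages)
print(spec.origin)
```

Only what the installer copies under that directory is importable after installation; tests and docs left in the unpacked source tree are unaffected either way.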
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-06 11:46 +1000 |
| Message-ID | <57046a87$0$1598$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106482 |
On Wed, 6 Apr 2016 05:56 am, Michael Selik wrote:

[Sven R. Kunze]
>> If you work like in the 80's, maybe. Instead of scrolling, (un)setting jumppoints, or use splitview of the same file, it's just faster/easier to jump between separate files in todays IDEs if you need to jump between 4 places within 3000 lines of code.

[Michael]
> When you made that suggestion earlier, I immediately guessed that you were using PyCharm. I agree that the decision to split into multiple files or keep everything in just a few files seems to be based on your development tools. I use IPython and SublimeText, so my personal setup is more suited to one or a few files.

How does PyCharm make the use of many files easier?

--
Steven