

Groups > comp.lang.python > #106456 > unrolled thread

Best Practices for Internal Package Structure

Started by: "Josh B." <jabronson@gmail.com>
First post: 2016-04-04 09:47 -0700
Last post: 2016-04-06 11:46 +1000
Articles: 9 (5 participants)



Contents

  Best Practices for Internal Package Structure "Josh B." <jabronson@gmail.com> - 2016-04-04 09:47 -0700
    Re: Best Practices for Internal Package Structure "Sven R. Kunze" <srkunze@mail.de> - 2016-04-04 19:03 +0200
    Re: Best Practices for Internal Package Structure Michael Selik <michael.selik@gmail.com> - 2016-04-04 18:45 +0000
    Re: Best Practices for Internal Package Structure Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-04-04 21:15 +0100
    Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-05 11:43 +1000
      Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 09:29 +1000
      Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 09:38 +1000
        Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 12:09 +1000
      Re: Best Practices for Internal Package Structure Steven D'Aprano <steve@pearwood.info> - 2016-04-06 11:46 +1000

#106456 — Best Practices for Internal Package Structure

From: "Josh B." <jabronson@gmail.com>
Date: 2016-04-04 09:47 -0700
Subject: Best Practices for Internal Package Structure
Message-ID: <22a2f21d-7fd2-468a-9b6e-184837506157@googlegroups.com>

My package, available at https://github.com/jab/bidict, is currently laid out like this:

bidict/
├── __init__.py
├── _bidict.py
├── _common.py
├── _frozen.py
├── _loose.py
├── _named.py
├── _ordered.py
├── compat.py
├── util.py


I'd like to get some more feedback on a question about this layout that I originally asked here: <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:

What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.

It does kind of bug me that you see the _foo modules in the output when you do things like this:

>>> import bidict
>>> bidict.bidict
<class 'bidict._bidict.bidict'>
>>> bidict.KeyExistsError
<class 'bidict._common.KeyExistsError'>


In https://github.com/jab/bidict/pull/33#issuecomment-205381351 a reviewer agrees:

"""
Me too, and it confuses people as to where you should be importing things from if you want to catch it, inviting code like

```
import bidict._common

try:
    ...
except bidict._common.KeyExistsError:
    ...
```
ie. becoming dependent on the package internal structure.

I would be tempted to monkey-patch .__module__ = __name__ on each imported class to get around this. Maybe there are downsides to doing magic of that kind, but dependencies on the internals of packages are such a problem for me in our very large codebase that I'd probably do it anyway in order to be really explicit about what the public API is.
"""

Curious what folks on this list recommend, or if there are best practices about this published somewhere.

Thanks,
Josh



#106457

From: "Sven R. Kunze" <srkunze@mail.de>
Date: 2016-04-04 19:03 +0200
Message-ID: <mailman.28.1459789399.32530.python-list@python.org>
In reply to: #106456

Hi Josh,

good question.

On 04.04.2016 18:47, Josh B. wrote:
> My package, available at https://github.com/jab/bidict, is currently laid out like this:
>
> bidict/
> ├── __init__.py
> ├── _bidict.py
> ├── _common.py
> ├── _frozen.py
> ├── _loose.py
> ├── _named.py
> ├── _ordered.py
> ├── compat.py
> ├── util.py
>
>
> I'd like to get some more feedback on a question about this layout that I originally asked here: <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
>
> What do you think of the code layout, specifically the use of the _foo modules? It seems well-factored to me, but I haven't seen things laid out this way very often in other projects, and I'd like to do this as nicely as possible.
>
> It does kind of bug me that you see the _foo modules in the output when you do things like this:
> [code]

we had a similar discussion internally. We have various packages 
requiring each other but have some internals that should not be used 
outside of them.

The _ actually signifies that clearly, but it looks weird within the 
package itself.

We haven't found a solution so far. Maybe others have.


Best,
Sven



#106460

From: Michael Selik <michael.selik@gmail.com>
Date: 2016-04-04 18:45 +0000
Message-ID: <mailman.30.1459795553.32530.python-list@python.org>
In reply to: #106456

On Mon, Apr 4, 2016 at 6:04 PM Sven R. Kunze <srkunze@mail.de> wrote:

> Hi Josh,
>
> good question.
>
> On 04.04.2016 18:47, Josh B. wrote:
> > My package, available at https://github.com/jab/bidict, is currently
> laid out like this:
> >
> > bidict/
> > ├── __init__.py
> > ├── _bidict.py
> > ├── _common.py
> > ├── _frozen.py
> > ├── _loose.py
> > ├── _named.py
> > ├── _ordered.py
> > ├── compat.py
> > ├── util.py
> >
> >
> > I'd like to get some more feedback on a question about this layout that
> I originally asked here: <
> https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
> >
> > What do you think of the code layout, specifically the use of the _foo
> modules? It seems well-factored to me, but I haven't seen things laid out
> this way very often in other projects, and I'd like to do this as nicely as
> possible.
>

Using the _module.py convention for internals is fine, except that you have
few enough lines of code that you could have far fewer files. Why create a
package when you can just have a module, bidict.py?

I find it easier to find the right section of my code when I have just a
few files open rather than a dozen or so in different windows and tabs.



#106466

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2016-04-04 21:15 +0100
Message-ID: <mailman.35.1459800975.32530.python-list@python.org>
In reply to: #106456

On 04/04/2016 19:45, Michael Selik wrote:
> On Mon, Apr 4, 2016 at 6:04 PM Sven R. Kunze <srkunze@mail.de> wrote:
>
>> Hi Josh,
>>
>> good question.
>>
>> On 04.04.2016 18:47, Josh B. wrote:
>>> My package, available at https://github.com/jab/bidict, is currently
>> laid out like this:
>>>
>>> bidict/
>>> ├── __init__.py
>>> ├── _bidict.py
>>> ├── _common.py
>>> ├── _frozen.py
>>> ├── _loose.py
>>> ├── _named.py
>>> ├── _ordered.py
>>> ├── compat.py
>>> ├── util.py
>>>
>>>
>>> I'd like to get some more feedback on a question about this layout that
>> I originally asked here: <
>> https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
>>>
>>> What do you think of the code layout, specifically the use of the _foo
>> modules? It seems well-factored to me, but I haven't seen things laid out
>> this way very often in other projects, and I'd like to do this as nicely as
>> possible.
>>
>
> Using the _module.py convention for internals is fine, except that you have
> few enough lines of code that you could have far fewer files. Why create a
> package when you can just have a module, bidict.py?
>
> I find it easier to find the right section of my code when I have just a
> few files open rather than a dozen or so in different windows and tabs.
>

+1

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#106482

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-05 11:43 +1000
Message-ID: <5703185f$0$1611$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106456

On Tue, 5 Apr 2016 02:47 am, Josh B. wrote:

> My package, available at https://github.com/jab/bidict, is currently laid
> out like this:
> 
> bidict/
> ├── __init__.py
> ├── _bidict.py
> ├── _common.py
> ├── _frozen.py
> ├── _loose.py
> ├── _named.py
> ├── _ordered.py
> ├── compat.py
> ├── util.py


The purpose of packages isn't to enable Java-style "one class per file" coding,
especially since *everything* in the package except the top level "bidict"
module itself is private. bidict.compat and bidict.util aren't flagged as
private, but they should be, since there's nothing in either of them that
the user of a bidict class should care about.

(util.py does export a couple of functions, but they should be in the main
module, or possibly made into methods of BidirectionalMapping.)

Your package is currently under 500 lines. As it stands now, you could
easily flatten it to a single module:

bidict.py

Unless you are getting some concrete benefit from a package structure, you
shouldn't use a package just for the sake of it. Even if the code doubles
in size, to 1000 lines, that's still *far* below the point at which I
believe a single module becomes unwieldy just from size. At nearly 6500
lines, the decimal.py module is, in my opinion, *almost* at the point where
just size alone suggests splitting the file into submodules. Your module is
nowhere near that point.

It seems to me that you're paying the cost of the increased complexity
needed to handle a package, but not (as far as I can tell) gaining any
benefit from it. Certainly the *users* of your package aren't: all the
public classes are pre-imported by the __init__.py file, so they don't even
get the advantage of only needing to import classes that they actually use. 
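
To make the pre-import point concrete, here is a small sketch. It assumes a hypothetical package named `eagerpkg`, written to a temporary directory purely for illustration (it is not bidict), whose `__init__.py` imports one class from each private submodule.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk; every name here is hypothetical.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "eagerpkg"
pkg_dir.mkdir()

classes = {"_a": "A", "_b": "B", "_c": "C"}
for module_name, class_name in classes.items():
    source = "class " + class_name + ":\n    pass\n"
    (pkg_dir / (module_name + ".py")).write_text(source)

# The __init__.py pre-imports one public class from each private submodule.
init_lines = [
    "from ." + module_name + " import " + class_name
    for module_name, class_name in classes.items()
]
(pkg_dir / "__init__.py").write_text("\n".join(init_lines) + "\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
eagerpkg = importlib.import_module("eagerpkg")

# A plain import of the package has already loaded every submodule.
loaded = sorted(name for name in sys.modules if name.startswith("eagerpkg."))
print(loaded)
```

With this layout, a user who only wants `eagerpkg.A` still pays for importing `_b` and `_c`, so splitting the code into files saves nothing at import time.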

So my recommendation would be to collapse the package to a single file.

If you choose to reject that recommendation, or perhaps you are getting some
benefit that I haven't spotted in a cursory look over the package, then my
suggestion is to document that the *only* public interface is the
top-level "import bidict", and that *all* submodules are private. Then drop
the underscores, possibly re-arrange a bit:


bidict/
├── __init__.py  # the only public part of the package
├── bidict.py
├── common.py  # includes compat and util
├── frozen.py
├── loose.py
├── named.py
├── ordered.py


If you're worried about the lack of underscores for private submodules, then
consider this alternate structure:


_bidict/
├── __init__.py
├── bidict.py
├── common.py
├── frozen.py
├── loose.py
├── named.py
├── ordered.py
bidict.py


where "bidict.py" is effectively little more than:

from _bidict import *
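
A minimal sketch of that facade layout, under some assumptions: the names `_demo_impl`, `demo_public`, and `TwoWayMap` are hypothetical, the files are written to a temporary directory, and the private package defines `__all__`, which the suggestion above does not mention. Without `__all__`, a star-import copies every name that does not start with an underscore, including submodule names such as `core` that get bound on the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Private implementation package (hypothetical name).
impl_dir = root / "_demo_impl"
impl_dir.mkdir()
(impl_dir / "core.py").write_text(
    "class TwoWayMap(dict):\n"
    "    pass\n"
    "\n"
    "def helper():\n"
    "    pass\n"
)
(impl_dir / "__init__.py").write_text(
    "from .core import TwoWayMap, helper\n"
    "__all__ = ['TwoWayMap']\n"
)

# Public facade module: little more than a star-import.
(root / "demo_public.py").write_text("from _demo_impl import *\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_public = importlib.import_module("demo_public")

# Only the names listed in __all__ reach the facade.
public_names = sorted(n for n in vars(demo_public) if not n.startswith("_"))
print(public_names)
```

Here `helper` and the `core` submodule stay out of `demo_public`, so the facade's namespace doubles as the documented public API.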



> What do you think of the code layout, specifically the use of the _foo
> modules? It seems well-factored to me, but I haven't seen things laid out
> this way very often in other projects, and I'd like to do this as nicely
> as possible.

It's very pretty, and well-organised, but I'm not sure you're actually
gaining any advantage from it.


> It does kind of bug me that you see the _foo modules in the output when
> you do things like this:
> 
> >>> import bidict
> >>> bidict.bidict
> <class 'bidict._bidict.bidict'>
> >>> bidict.KeyExistsError
> <class 'bidict._common.KeyExistsError'>
> 
> 
> In https://github.com/jab/bidict/pull/33#issuecomment-205381351 a reviewer
> agrees:
> 
> """
> Me too, and it confuses people as to where you should be importing things
> from if you want to catch it, inviting code like
> 
> ```
> import bidict._common


That, at least, should be an obvious no-no, as it's including a single
underscore private name. I wouldn't worry about that too much: if your
users are so naive or obnoxious that they're ignoring your documentation
and importing _private modules, there's nothing you can do: they'll find
*some* way to shoot themselves in the foot, whatever you do.

[Aside: my devs at work had reason to look inside a script written by one of
their now long-moved away former colleagues yesterday, as it recently
broke. After reading the script, the *first* thing they did was change the
way it was called from:

scriptname --opts steve
# yes, he really did use my name as the mandatory argument to the script

to

scriptname --opts forfucksakeandrew

where the name "andrew" has been changed to protect the guilty. Using my
name as the script argument is, apparently, the *least* stupid thing the
script does.]


> try:
>     ...
> except bidict._common.KeyExistsError:
>     ...
> ```
> ie. becoming dependent on the package internal structure.
> 
> I would be tempted to monkey-patch .__module__ = __name__ on each imported
> class to get around this. Maybe there are downsides to doing magic of that
> kind, but dependencies on the internals of packages are such a problem for
> me in our very large codebase, that I'd probably do it anyway in order to
> be really explicit about what the public API is. """

Does his team not do internal code reviews? I hate code that lies about
where it comes from. I certainly wouldn't do it in a futile attempt to
protect idiots from themselves.
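
One concrete way the patched class "lies", sketched with hypothetical in-memory modules `demo_pkg` and `demo_pkg._impl` (not bidict's code): `inspect.getmodule` looks a class up through its `__module__` attribute, so after the patch it reports the package rather than the module that actually defines the class.

```python
import inspect
import sys
import types

# Hypothetical private module that really defines the class.
impl = types.ModuleType("demo_pkg._impl")
exec("class Thing:\n    pass\n", impl.__dict__)

# Hypothetical package namespace that re-exports it.
pkg = types.ModuleType("demo_pkg")
pkg.Thing = impl.Thing

# Register both so that lookups through sys.modules can find them.
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg._impl"] = impl

honest = inspect.getmodule(pkg.Thing)   # the defining module

pkg.Thing.__module__ = "demo_pkg"       # the monkey-patch under discussion
patched = inspect.getmodule(pkg.Thing)  # now the package instead

print(honest.__name__)
print(patched.__name__)
```

Anything that relies on `__module__` to locate the defining module is redirected in the same way, which is the cost to weigh against the tidier repr.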



> Curious what folks on this list recommend, or if there are best practices
> about this published somewhere.
> 
> Thanks,
> Josh

-- 
Steven



#106549

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-06 09:29 +1000
Message-ID: <57044a48$0$1603$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106482

On Wed, 6 Apr 2016 03:38 am, Sven R. Kunze wrote:

> On 05.04.2016 03:43, Steven D'Aprano wrote:
>> The purpose of packages isn't to enable Java-style "one class per file"
>> coding, especially since *everything* in the package except the top level
>> "bidict" module itself is private. bidict.compat and bidict.util aren't
>> flagged as private, but they should be, since there's nothing in either
>> of them that the user of a bidict class should care about.
>>
>> (util.py does export a couple of functions, but they should be in the
>> main module, or possibly made into a method of BidirectionalMapping.)
>>
>> Your package is currently under 500 lines. As it stands now, you could
>> easily flatten it to a single module:
>>
>> bidict.py
> 
> I don't recommend this.
> 
> The line is blurry but 500 is definitely too much. Those will simply not
> fit on one or two generous screens anymore (which basically is our
> guideline).

Are you serious that you work under the restriction that the entire file
must fit on a single screen? Or at most two pages?

So basically you've moved the burden of scrolling up and down a single file
to find the code you want into searching through the file system to find
the code that you want. I don't see this as an advantage. 


> The intention here is to always have a bit more of a full 
> screen of code (no wasted pixels) while benefiting from switching to
> another file (also seeing a full page of other code).

I don't understand this benefit. How does exceeding 1-2 pages of code in a
file prevent you from opening a second file in a separate window?

As we speak, I have 28 text editor windows open. Not all of them are Python
code, but among those which are, I have:

216 lines
157 lines
906 lines
687 lines
173 lines
373 lines
402 lines
6409 lines
509 lines
97 lines
163 lines
393 lines
1725 lines
25 lines

I have no problem working with multiple files despite exceeding the
limit, so I don't understand the purpose of this guideline.


-- 
Steven



#106550

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-06 09:38 +1000
Message-ID: <57044c5c$0$1620$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106482

On Wed, 6 Apr 2016 04:40 am, Ethan Furman wrote:

> Well, there should be one more module:
> 
>    test.py
> 
> So in total, two files
> 
> bidict/
> |-- __init__.py
> |-- test.py


Your test code shouldn't necessarily be part of the package though. If I
already have a package, then I will usually stick the test code inside it,
but if I have a single module, I keep the test code in a separate file and
don't bother installing it. It's there in the source repo for those who
want it.


> will do the trick.  Oh, and you want a README, LICENSE, a doc file.  And
> that should do it.  :)

None of which ought to be part of the package itself. Well, perhaps the
README.



-- 
Steven



#106557

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-06 12:09 +1000
Message-ID: <57046fc4$0$1604$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106550

On Wed, 6 Apr 2016 09:54 am, Ethan Furman wrote:

> On 04/05/2016 04:38 PM, Steven D'Aprano wrote:
>> On Wed, 6 Apr 2016 04:40 am, Ethan Furman wrote:
>>
>>> Well, there should be one more module:
>>>
>>>     test.py
>>>
>>> So in total, two files
>>>
>>> bidict/
>>> |-- __init__.py
>>> |-- test.py
>>
>>
>> Your test code shouldn't necessarily be part of the package though. If I
>> already have a package, then I will usually stick the test code inside
>> it, but if I have a single module, I keep the test code in a separate
>> file and don't bother installing it. It's there in the source repo for
>> those who want it.
>>
>>
>>> will do the trick.  Oh, and you want a README, LICENSE, a doc file.  And
>>> that should do it.  :)
>>
>> None of which ought to be part of the package itself. Well, perhaps the
>> README.
> 
> If it's not part of the package, how does it get installed?  And where?
>   (Honest question -- I find the packaging and distribution process
> confusing at best.)

Why do they need to be installed? They're part of the development
environment, not the module.

(Actually, I take it back about the README, since conventionally that's used
to give installation instructions. Once the module is installed, you don't
usually need the README any more. But the doc file might be more useful.)


I don't know what life is like in the brave new world of wheels, eggs, pip
and other new-fangled complications^W tools, but in the good old-fashioned
setuptools world, you have something like a zip file or tarball that
contains your code, plus an installation script. The contents of the zip
file will be something like this, taken from PyParsing version 2.1.1:


build/
CHANGES
dist/
docs/
examples/
htmldoc/
HowToUsePyparsing.html
LICENSE
MANIFEST.in
PKG-INFO
pyparsingClassDiagram.JPG
pyparsingClassDiagram.PNG
README
pyparsing.egg-info/
pyparsing.py
robots.txt
setup.cfg
setup.py

You unzip the file, cd into that directory and run:

python setup.py install

which then magically installs the module somewhere that just works. In this
specific case, it installs the module in a file

/usr/local/lib/python3.3/site-packages/pyparsing-2.1.1-py3.3.egg

which happens to be a vanilla zip file containing the pyparsing.py file, a
byte-code compiled .pyc version, and a directory EGG-INFO that presumably
allows pip or whatever to uninstall it again, or something, who the hell
knows how this works.

But the installer could just as easily have installed the
pyparsing.py file alone.

The point is, the tests, documentation, etc aren't necessarily part of the
module, so they don't necessarily need to be installed into your
site-packages. If the user wants to read the HTML docs, they can browse
into the expanded source tarball and read them there, or drag the docs into
their home directory, or whatever they feel like. If they want to run the
tests, they run them from the source directory, not site-packages.

Of course, if you prefer to keep the tests inside the package, that's up to
you. But you needn't feel that this is the best or only place for them.



-- 
Steven



#106555

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-06 11:46 +1000
Message-ID: <57046a87$0$1598$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106482

On Wed, 6 Apr 2016 05:56 am, Michael Selik wrote:

[Sven R. Kunze]
>> If you work like in the 80's, maybe. Instead of scrolling, (un)setting
>> jump points, or using a split view of the same file, it's just faster/easier
>> to jump between separate files in today's IDEs if you need to jump between 4
>> places within 3000 lines of code.

[Michael] 
> When you made that suggestion earlier, I immediately guessed that you were
> using PyCharm. I agree that the decision to split into multiple files or
> keep everything in just a few files seems to be based on your development
> tools. I use IPython and SublimeText, so my personal setup is more suited
> to one or a few files.

How does PyCharm make the use of many files easier?


-- 
Steven


