Groups > comp.lang.python > #78021 > unrolled thread
| Started by | chris.barker@noaa.gov |
|---|---|
| First post | 2014-09-18 09:19 -0700 |
| Last post | 2014-09-18 11:54 -0700 |
| Articles | 3 — 2 participants |
Best practice for opening files for newbies? chris.barker@noaa.gov - 2014-09-18 09:19 -0700
Re: Best practice for opening files for newbies? Chris Angelico <rosuav@gmail.com> - 2014-09-19 02:37 +1000
Re: Best practice for opening files for newbies? chris.barker@noaa.gov - 2014-09-18 11:54 -0700
| From | chris.barker@noaa.gov |
|---|---|
| Date | 2014-09-18 09:19 -0700 |
| Subject | Best practice for opening files for newbies? |
| Message-ID | <0081d786-d218-495a-b79f-8882e786d7d0@googlegroups.com> |
Folks,
I'm in the position of teaching Python to beginners (beginners to Python, anyway).
I'm teaching Python 2 -- because that is still what most of the code "in the wild" is in. I do think I'll transition to Python 3 fairly soon, as it's not too hard for folks to back-port their knowledge, but for now, it's Py2 -- and I'm hoping not to have that debate on this thread.
But I do want to keep the 2->3 transition in mind, so where it's not too hard, I want to teach approaches that will transition well to py3.
So: there are way too many ways to open a simple file to read or write a bit of text (or binary):
open()
file()
io.open()
codecs.open()
others???
I'm thinking that the way to go now with modern Py2 is:
from io import open
then use open() .....
IIUC, this will give the user an open() that behaves the same way as py3's open() (identical?).
The only issue (so far) I've run into is this:
In [51]: f = io.open("test_file.txt", 'w')
In [52]: f.write("some string")
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-52-f874778a72a1> in <module>()
----> 1 f.write("some string")
TypeError: must be unicode, not str
I'm OK with that -- I think it's better for folks learning py2 now to get used to Unicode up front anyway.
But any other issues? Is this a good way to go?
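For reference, a minimal sketch of the `from io import open` approach, assuming Python 2.7 or Python 3.3+. The scratch file path is hypothetical, and the encoding is passed explicitly rather than left to the platform default:

```python
from io import open  # on Python 3 this is the same object as the built-in open()
import os
import tempfile

# Hypothetical scratch file, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "test_file.txt")

# Text mode wants unicode text; the u"" prefix keeps this valid on Py2 and Py3.
with open(path, "w", encoding="utf-8") as f:
    f.write(u"some string\n")

# Reading back in text mode returns decoded text, not bytes.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

print(repr(text))
```

Writing a plain Py2 `str` here would raise the TypeError shown above; the `u""` prefix (or `unicode_literals`) avoids it.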
By the way: I note that the default encoding for io.open on my system (OS-X) is utf-8, despite:
In [54]: sys.getdefaultencoding()
Out[54]: 'ascii'
How is that determined?
-CHB
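A hedged note on the encoding question above: on the Python versions discussed in this thread, io.open() without an encoding argument is documented to fall back to whatever locale.getpreferredencoding() returns, while sys.getdefaultencoding() only names the codec used for implicit str/unicode conversions, so the two need not agree. A small sketch to compare them on a given machine (the result depends on platform and locale, so no output is shown):

```python
import locale
import sys

# Codec used for implicit str <-> unicode conversions
# ('ascii' on Py2, 'utf-8' on Py3).
implicit_codec = sys.getdefaultencoding()

# Locale-derived encoding that io.open() uses when encoding is omitted.
file_default = locale.getpreferredencoding(False)

print("sys.getdefaultencoding():      %s" % implicit_codec)
print("locale.getpreferredencoding(): %s" % file_default)
```

Passing encoding="utf-8" explicitly sidesteps the question and keeps behavior the same across machines.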
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-09-19 02:37 +1000 |
| Message-ID | <mailman.14110.1411058265.18130.python-list@python.org> |
| In reply to | #78021 |
On Fri, Sep 19, 2014 at 2:19 AM, <chris.barker@noaa.gov> wrote:
> So: there are way too many ways to open a simple file to read or write a bit of text (or binary):
>
> open()
Personally, I'd just use this, all the way through - and not importing from io, either. But others may disagree.
Be clear about what's text and what's bytes, everywhere. When you do make the jump to Py3, you'll have to worry about text files vs binary files, and if you need to support Windows as well as Unix, you need to get that right anyway, so just make sure you get the two straight. Going Py3 will actually make your job quite a bit easier, there; but even if you don't, save yourself a lot of trouble later on by keeping the difference very clear. And you can save yourself some more conversion trouble by tossing this at the top of every .py file you write:
from __future__ import print_function, division, unicode_literals
But mainly, just go with the simple open() call and do the job the easiest way you can. And go Py3 as soon as you can, because ...
> because that is still what most of the code "in the wild" is in.
... this statement isn't really an obvious truth any more (it's hard to say what "most" code is), and it's definitely not going to remain so for the long-term future. For people learning Python today, unless they plan on having a really short career in programming, more of their time will be after 2020 than before it, and Python 3 is the way to go. Plus, it's just way WAY easier to get Unicode right in Py3 than in Py2. Save yourself the hassle!
ChrisA
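To illustrate what the suggested `__future__` line changes, here is a small sketch, assuming Python 2.7 (on Python 3 the imports are accepted and have no effect). The variable names are made up for the example:

```python
from __future__ import print_function, division, unicode_literals

# division: "/" is true division and "//" is floor division, as on Py3.
half = 1 / 2
floored = 1 // 2

# unicode_literals: an unprefixed literal is text (unicode on Py2, str on Py3).
literal_is_text = type("abc") is type(u"abc")

# print_function: print is an ordinary function that takes keyword arguments.
print(half, floored, literal_is_text, sep=" | ")  # prints: 0.5 | 0 | True
```

The import has to be the first statement in the module (only comments and a docstring may precede it), which is why it goes at the top of every .py file.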
| From | chris.barker@noaa.gov |
|---|---|
| Date | 2014-09-18 11:54 -0700 |
| Message-ID | <6ff0bf76-73a9-44d1-9619-8226c52c8fb9@googlegroups.com> |
| In reply to | #78024 |
On Thursday, September 18, 2014 9:38:00 AM UTC-7, Chris Angelico wrote:
> On Fri, Sep 19, 2014 at 2:19 AM, <chris.barker@noaa.gov> wrote:
> > So: there are way too many ways to open a simple file to read or write a bit of text (or binary):
> > open()
>
> Personally, I'd just use this, all the way through - and not importing
> from io, either. But others may disagree.
Well, the catch there is that it's seriously tricky to work with non-ASCII-compatible text files if you do that...
> Be clear about what's text and what's bytes, everywhere. When you do
> make the jump to Py3, you'll have to worry about text files vs binary
> files, and if you need to support Windows as well as Unix, you need to
> get that right anyway, so just make sure you get the two straight.
Yup -- I've always emphasized that point. But from a py2 perspective (and with the built-in open() file object), what is a utf-8 encoded file: text or bytes? It's bytes -- and you need to do the decoding yourself. Why make people do that?
In the past, I started with open() and ignored unicode for a while; then, when I introduced unicode, I pointed them to codecs.open() (I hadn't discovered io.open yet). Maybe I should stick with this approach, but it feels like a bad idea.
> Save yourself a lot of trouble later on by keeping
> the difference very clear.
Exactly -- but it's equally clear, and easier and more robust, to have two types of files: binary and text, where text requires a known encoding. That beats three types: binary, ASCII text, and encoded text, which is really binary that you then decode to make text...
Think of something as simple and common as looping through the lines in a file!
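A rough sketch of that contrast, assuming Python 2.7 or 3.3+. The sample file and its contents are hypothetical, and binary mode via io.open stands in here for what Py2's built-in open() hands back (undecoded bytes):

```python
from io import open
import os
import tempfile

# Hypothetical sample file containing non-ASCII text, encoded as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "cities.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(u"Z\u00fcrich\nK\u00f8benhavn\n")

# Bytes approach: each line is raw bytes and has to be decoded by hand.
with open(path, "rb") as f:
    decoded_by_hand = [line.decode("utf-8").rstrip(u"\r\n") for line in f]

# Text approach: the file object decodes, so each line is already text.
with open(path, "r", encoding="utf-8") as f:
    decoded_by_file = [line.rstrip(u"\n") for line in f]

print(decoded_by_hand == decoded_by_file)
```

Both loops end up with the same text, but only the second keeps the decoding step out of the loop body.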
> And you can save yourself some more
> conversion trouble by tossing this at the top of every .py file you
> write:
>
> from __future__ import print_function, division, unicode_literals
Yup -- I've been thinking of recommending that to my students as well -- particularly unicode_literals.
> But mainly, just go with the simple open() call and do the job the
> easiest way you can. And go Py3 as soon as you can, because ...
A discussion for another thread....
Thanks,
-Chris