Mimick tac with python.

Started by: Hongyi Zhao <hongyi.zhao@gmail.com>
First post: 2016-01-30 04:46 +0000
Last post: 2016-01-30 15:56 +1100
Articles: 9, 7 participants

Contents

  Mimick tac with python. Hongyi Zhao <hongyi.zhao@gmail.com> - 2016-01-30 04:46 +0000
    Re: Mimick tac with python. Random832 <random832@fastmail.com> - 2016-01-29 23:58 -0500
      Re: Mimick tac with python. Christian Gollwitzer <auriocus@gmx.de> - 2016-01-30 07:03 +0100
        Re: Mimick tac with python. Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-01-30 09:56 +0200
          Re: Mimick tac with python. Christian Gollwitzer <auriocus@gmx.de> - 2016-01-30 10:23 +0100
        Re: Mimick tac with python. Peter Otten <__peter__@web.de> - 2016-01-30 09:21 +0100
        Re: Mimick tac with python. Terry Reedy <tjreedy@udel.edu> - 2016-01-30 04:38 -0500
      Re: Mimick tac with python. Hongyi Zhao <hongyi.zhao@gmail.com> - 2016-01-30 06:18 +0000
    Re: Mimick tac with python. Chris Angelico <rosuav@gmail.com> - 2016-01-30 15:56 +1100

#102276 — Mimick tac with python.

From: Hongyi Zhao <hongyi.zhao@gmail.com>
Date: 2016-01-30 04:46 +0000
Subject: Mimick tac with python.
Message-ID: <n8hf7c$lqe$1@aspen.stu.neva.ru>

Hi all,

I can use the following methods for mimicking the tac command below:

awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
perl -e 'print reverse<>' input_file

Is it possible to do the same thing with python?

Regards
-- 
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.


#102277

From: Random832 <random832@fastmail.com>
Date: 2016-01-29 23:58 -0500
Message-ID: <mailman.111.1454129921.2338.python-list@python.org>
In reply to: #102276

On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
> Hi all,
> 
> I can use the following methods for mimicking the tac command below:
> 
> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
> perl -e 'print reverse<>' input_file

Well, both of those read the whole file into memory - tac is sometimes
smarter than that, but that makes for a more complex program. And Python
doesn't really do "one-liners" like that, so it doesn't look quite as
nice. But combined with some shell constructs you can do:

python <(echo 'import sys;print("".join(reversed(list(sys.stdin))))')
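
The same whole-file approach can be written as a small script. This is only a minimal sketch: the function name tac_stream is hypothetical, and like the one-liner it holds every line in memory at once. It uses writelines instead of print because print appends one extra newline after the joined text.

```python
def tac_stream(infile, outfile):
    """Write the lines of infile to outfile in reverse order.

    tac_stream is a hypothetical name used for illustration only.
    The whole input is read into memory before anything is written.
    """
    # readlines() keeps each line's trailing newline, so the lines can be
    # written back unchanged, last line first.
    outfile.writelines(reversed(infile.readlines()))

# Typical use as a filter would be: tac_stream(sys.stdin, sys.stdout)
```

Because it only needs readlines() and writelines(), the function works with pipes as well as regular files.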


#102280

From: Christian Gollwitzer <auriocus@gmx.de>
Date: 2016-01-30 07:03 +0100
Message-ID: <n8hjis$gte$1@dont-email.me>
In reply to: #102277

Am 30.01.16 um 05:58 schrieb Random832:
> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>> perl -e 'print reverse<>' input_file
>
> Well, both of those read the whole file into memory - tac is sometimes
> smarter than that, but that makes for a more complex program.

Now I'm curious. How is it possible to output the first line as last 
again if not by remembering it from the very beginning? How could tac
be implemented other than sucking up everything into memory?

	Christian


#102283

From: Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Date: 2016-01-30 09:56 +0200
Message-ID: <lf5vb6blcu8.fsf@ling.helsinki.fi>
In reply to: #102280

Christian Gollwitzer writes:

> Am 30.01.16 um 05:58 schrieb Random832:
>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>> perl -e 'print reverse<>' input_file
>>
>> Well, both of those read the whole file into memory - tac is sometimes
>> smarter than that, but that makes for a more complex program.
>
> Now I'm curious. How is it possible to output the first line as last
> again if not by remembering it from the very beginning? How could tac
> be implemented other than sucking up everything into memory?

It may be possible to map the data into virtual memory so that the
program sees it as an array of bytes. The data is paged in when
accessed. The program just scans the array backwards, looking for
end-of-line characters. I believe they can be identified reliably, as
bytes, even in a backward scan of UTF-8-encoded data.

The data needs to be in a file. The keywords are something like "memory
mapping" and "mmap". I've only experimented with this briefly once in
Julia, so I don't really know more.

Oh. There's https://docs.python.org/3/library/mmap.html in Python.
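
The backward scan described above might look roughly like this with Python's mmap module. It is a sketch under assumptions, not how tac itself is implemented: the name tac_mmap is hypothetical, lines are assumed to end in a single b"\n" byte, and the input must be a regular, non-empty-mappable file. The byte 0x0A never occurs inside a multi-byte UTF-8 sequence, which is why searching for it backwards is safe.

```python
import mmap


def tac_mmap(path):
    """Yield the lines of the file at path, last line first, as bytes.

    tac_mmap is a hypothetical name used for illustration only.
    """
    with open(path, "rb") as f:
        # An empty file cannot be memory-mapped, so handle it up front.
        if f.seek(0, 2) == 0:
            return
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            end = len(m)
            while end > 0:
                # Search for the newline that ends the previous line. The
                # byte at end - 1 may be the current line's own newline,
                # so it is excluded from the search range.
                nl = m.rfind(b"\n", 0, end - 1)
                start = nl + 1  # rfind returns -1 if absent, giving 0
                yield m[start:end]
                end = start
```

Only the slices that are yielded get copied into Python objects; the operating system pages the rest of the file in and out as the scan moves toward the start.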


#102289

From: Christian Gollwitzer <auriocus@gmx.de>
Date: 2016-01-30 10:23 +0100
Message-ID: <n8hv9c$g2s$1@dont-email.me>
In reply to: #102283

Am 30.01.16 um 08:56 schrieb Jussi Piitulainen:
> Christian Gollwitzer writes:
>
>> Am 30.01.16 um 05:58 schrieb Random832:
>>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>>> perl -e 'print reverse<>' input_file
>>>
>>> Well, both of those read the whole file into memory - tac is sometimes
>>> smarter than that, but that makes for a more complex program.
>>
>> Now I'm curious. How is it possible to output the first line as last
>> again if not by remembering it from the very beginning? How could tac
>> be implemented other than sucking up everything into memory?
>
> It may be possible to map the data into virtual memory so that the
> program sees it as an array of bytes. The data is paged in when
> accessed. The program just scans the array backwards, looking for
> end-of-line characters. I believe they can be identified reliably, as
> bytes, even in a backward scan of UTF-8-encoded data.
>
> The data needs to be in a file.

If it's in a file, then I agree. I was thinking about the case where tac 
is used in a pipe - obviously here you can't reverse the file in 
constant memory.

	Christian


#102286

From: Peter Otten <__peter__@web.de>
Date: 2016-01-30 09:21 +0100
Message-ID: <mailman.116.1454142096.2338.python-list@python.org>
In reply to: #102280

Christian Gollwitzer wrote:

> Am 30.01.16 um 05:58 schrieb Random832:
>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>> perl -e 'print reverse<>' input_file
>>
>> Well, both of those read the whole file into memory - tac is sometimes
>> smarter than that, but that makes for a more complex program.
> 
> Now I'm curious. How is it possible to output the first line as last
> again if not by remembering it from the very beginning? How could tac
> be implemented other than sucking up everything into memory?

If the input file is seekable you can do blockwise reads:

import os
import sys


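def tac(f, blocksize=1024):
    """Yield the lines of the seekable binary file f in reverse order."""
    buf = b""
    f.seek(0, os.SEEK_END)
    size = f.tell()
    # Visit the block start offsets from the last block back to offset 0.
    for start in reversed(range(0, size, blocksize)):
        f.seek(start)
        buf = f.read(blocksize) + buf
        lines = buf.splitlines(True)
        # The first piece may be the tail of a line that began in an
        # earlier block, so hold it back until that block has been read.
        buf = lines.pop(0)
        yield from reversed(lines)
    yield buf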


if __name__ == "__main__":
    for filename in sys.argv[1:]:
        with open(filename, "rb") as infile:
            sys.stdout.buffer.writelines(tac(infile))

This way you need to keep one block plus one line in memory.


#102290

From: Terry Reedy <tjreedy@udel.edu>
Date: 2016-01-30 04:38 -0500
Message-ID: <mailman.119.1454146745.2338.python-list@python.org>
In reply to: #102280

On 1/30/2016 1:03 AM, Christian Gollwitzer wrote:
> Am 30.01.16 um 05:58 schrieb Random832:
>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>> perl -e 'print reverse<>' input_file
>>
>> Well, both of those read the whole file into memory - tac is sometimes
>> smarter than that, but that makes for a more complex program.
>
> Now I'm curious. How is it possible to output the first line as last
> again if not by remembering it from the very beginning? How could tac
> be implemented other than sucking up everything into memory?

One could read the file by lines and make a list of start-of-line 
positions.  Reverse the list. Read each line.  Details omitted.
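
One possible way to fill in those details is sketched below. The name tac_offsets is hypothetical, and the file is assumed to be seekable and opened in binary mode so that byte offsets can be tracked by hand. Memory use grows with the number of lines (one integer each) rather than with the size of the file.

```python
def tac_offsets(path):
    """Yield the lines of the file at path, last line first, as bytes.

    tac_offsets is a hypothetical name used for illustration only.
    """
    offsets = []
    with open(path, "rb") as f:
        # First pass: record the byte offset at which each line starts.
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)
        # Second pass: seek to each recorded start, last line first,
        # and read exactly one line from there.
        for start in reversed(offsets):
            f.seek(start)
            yield f.readline()
```

The file is read twice in total, and the second pass seeks once per line, so this trades extra I/O for a smaller memory footprint.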

-- 
Terry Jan Reedy


#102281

From: Hongyi Zhao <hongyi.zhao@gmail.com>
Date: 2016-01-30 06:18 +0000
Message-ID: <n8hkjs$lqe$2@aspen.stu.neva.ru>
In reply to: #102277

On Fri, 29 Jan 2016 23:58:38 -0500, Random832 wrote:

> python <(echo 'import sys;print("".join(reversed(list(sys.stdin))))')

Why not write it as follows?

cat input_file | python -c 'import sys;print("".join(reversed(list(sys.stdin))))'

Regards
-- 
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.


#102278

From: Chris Angelico <rosuav@gmail.com>
Date: 2016-01-30 15:56 +1100
Message-ID: <mailman.112.1454130103.2338.python-list@python.org>
In reply to: #102276

On Sat, Jan 30, 2016 at 3:46 PM, Hongyi Zhao <hongyi.zhao@gmail.com> wrote:
> I can use the following methods for mimicking the tac command below:
>
> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
> perl -e 'print reverse<>' input_file
>
> Is it possible to do the same thing with python?

python -c 'import sys; print("".join(list(open(sys.argv[1]))[::-1]))' input_file

Python doesn't have a shorthand for grabbing the file named as an
argument, but it's easy enough to reverse stuff.

ChrisA
