Groups > comp.lang.python > #96254 > unrolled thread
| Started by | Gerald <schweiger.gerald@gmail.com> |
|---|---|
| First post | 2015-09-10 04:18 -0700 |
| Last post | 2015-09-10 19:41 +0000 |
| Articles | 8 — 7 participants |
textfile: copy between 2 keywords Gerald <schweiger.gerald@gmail.com> - 2015-09-10 04:18 -0700
Re: textfile: copy between 2 keywords Steven D'Aprano <steve@pearwood.info> - 2015-09-10 22:10 +1000
Re: textfile: copy between 2 keywords Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-09-10 16:47 +0300
Re: textfile: copy between 2 keywords Vlastimil Brom <vlastimil.brom@gmail.com> - 2015-09-10 16:33 +0200
Re: textfile: copy between 2 keywords Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-09-10 18:48 +0300
Re: textfile: copy between 2 keywords Christian Gollwitzer <auriocus@gmx.de> - 2015-09-10 19:29 +0200
Re: textfile: copy between 2 keywords wxjmfauth@gmail.com - 2015-09-10 12:11 -0700
Re: textfile: copy between 2 keywords alister <alister.nospam.ware@ntlworld.com> - 2015-09-10 19:41 +0000
| From | Gerald <schweiger.gerald@gmail.com> |
|---|---|
| Date | 2015-09-10 04:18 -0700 |
| Subject | textfile: copy between 2 keywords |
| Message-ID | <13d875f0-ae8d-43de-85b4-c943a0e7f5e2@googlegroups.com> |
Hey,

is there a easy way to copy the content between 2 unique keywords in a .txt file?

example.txt

1, 2, 3, 4
#keyword1
3, 4, 5, 6
2, 3, 4, 5
#keyword2
4, 5, 6 ,7

Thank you very much
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-09-10 22:10 +1000 |
| Message-ID | <55f17326$0$1655$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #96254 |
On Thu, 10 Sep 2015 09:18 pm, Gerald wrote:
> Hey,
>
> is there a easy way to copy the content between 2 unique keywords in a
> .txt file?
>
> example.txt
>
> 1, 2, 3, 4
> #keyword1
> 3, 4, 5, 6
> 2, 3, 4, 5
> #keyword2
> 4, 5, 6 ,7
>
>
> Thank you very much
Copy in what sense? Write to another file, or just copy to memory?
Either way, your solution will look something like this:
* read each line from the input file, until you reach the first keyword;
* as soon as you see the first keyword, change to "copy mode" and start
copying lines in whatever way you feel is best;
* until you see the second keyword, then stop.
E.g.
with open("input.txt") as f:
    # Skip lines as fast as possible.
    for line in f:
        if line == "START\n":
            break
    # Instead of copying, I'll just print the lines. That's sort of a copy.
    for line in f:  # This will pick up where the previous for loop ended.
        if line == "STOP\n":
            break
        print(line)
    # If you like, you can just finish now.
    # Or, we can read the rest of the lines.
    for line in f:  # continue from just after the STOP keyword.
        pass  # This is a waste of time...
print("Done!")
--
Steven
| From | Jussi Piitulainen <harvesting@makes.email.invalid> |
|---|---|
| Date | 2015-09-10 16:47 +0300 |
| Message-ID | <lf57fnyo078.fsf@ling.helsinki.fi> |
| In reply to | #96254 |
Gerald writes:
> Hey,
>
> is there a easy way to copy the content between 2 unique keywords in a
> .txt file?
>
> example.txt
>
> 1, 2, 3, 4
> #keyword1
> 3, 4, 5, 6
> 2, 3, 4, 5
> #keyword2
> 4, 5, 6 ,7
Depending on your notion of easy, you may or may not like itertools.
The following code gets you the first keyword and the lines between but
consumes the second keyword. If I needed more control, I'd probably
write what Steven D'Aprano wrote but as a generator function, to get the
flexibility of deciding separately what kind of copy I want in the end.
And I'd be anxious about the possibility that the second keyword is not
there in the input at all. Steven's code and mine simply take every line
after the first keyword in that case. Worth a comment in the code, if
not an exception. Depends.
Code:
from itertools import dropwhile, takewhile
from sys import stdin
def notbeg(line): return line != '#keyword1\n'
def notend(line): return line != '#keyword2 \n' # sic!
if __name__ == '__main__':
    print(list(takewhile(notend, dropwhile(notbeg, stdin))))
Output with your original mail as input in stdin:
['#keyword1\n', '3, 4, 5, 6\n', '2, 3, 4, 5\n']
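The generator-function variant mentioned above might look roughly like the sketch below. The name `between`, the whitespace stripping, and the choice to raise `ValueError` when a keyword is missing are all assumptions made for illustration; none of them come from Steven's post.

```python
def between(lines, start, stop):
    """Yield the lines strictly between the start and stop keyword lines.

    Hypothetical helper: the name, the whitespace stripping and the
    ValueError on a missing keyword are illustrative choices.
    """
    lines = iter(lines)
    # Skip everything up to and including the start keyword.
    for line in lines:
        if line.strip() == start:
            break
    else:
        raise ValueError("start keyword not found: %r" % start)
    # Yield lines until the stop keyword shows up.
    for line in lines:
        if line.strip() == stop:
            return
        yield line
    # The input ran out before the stop keyword appeared.
    raise ValueError("stop keyword not found: %r" % stop)


sample = [
    "1, 2, 3, 4\n",
    "#keyword1\n",
    "3, 4, 5, 6\n",
    "2, 3, 4, 5\n",
    "#keyword2 \n",  # trailing space, as in the original mail
    "4, 5, 6 ,7\n",
]
copied = list(between(sample, "#keyword1", "#keyword2"))
print(copied)
```

Because it is a generator, the missing-stop error only surfaces after the earlier lines have been yielded, so a caller that writes lines out as they arrive would already have produced partial output by then.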
| From | Vlastimil Brom <vlastimil.brom@gmail.com> |
|---|---|
| Date | 2015-09-10 16:33 +0200 |
| Message-ID | <mailman.318.1441895625.8327.python-list@python.org> |
| In reply to | #96254 |
2015-09-10 13:18 GMT+02:00 Gerald <schweiger.gerald@gmail.com>:
> Hey,
>
> is there a easy way to copy the content between 2 unique keywords in a .txt file?
>
> example.txt
>
> 1, 2, 3, 4
> #keyword1
> 3, 4, 5, 6
> 2, 3, 4, 5
> #keyword2
> 4, 5, 6 ,7
>
>
> Thank you very much
Hi,
just to add another possible approach, you can use regular expression
search for this task, e.g.
(after you have read the text content to an input string):
>>> import re
>>> input_txt ="""1, 2, 3, 4
... #keyword1
... 3, 4, 5, 6
... 2, 3, 4, 5
... #keyword2
... 4, 5, 6 ,7"""
>>> re.findall(r"(?s)(#keyword1)(.*?)(#keyword2)", input_txt)
[('#keyword1', '\n3, 4, 5, 6\n2, 3, 4, 5\n', '#keyword2')]
>>>
As with the other approaches, you might need to specify the details
for specific cases: no keywords, only one of them, repeated keywords
(possibly in a different order, overlapping or "crossed"), handling of
newlines, etc.
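One way to cover the "no keywords" and "only one of them" cases is sketched below. The helper name `extract_between` and the decision to return `None` when no pair is found are assumptions for illustration. `re.escape` is used so that keywords containing regex metacharacters are matched literally.

```python
import re


def extract_between(text, start, stop):
    """Return the text between the first start/stop pair, or None.

    Hypothetical helper; returning None when no pair is found is an assumption.
    """
    # re.escape keeps metacharacters in the keywords from being read as regex syntax.
    pattern = re.escape(start) + r"(.*?)" + re.escape(stop)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


input_txt = """1, 2, 3, 4
#keyword1
3, 4, 5, 6
2, 3, 4, 5
#keyword2
4, 5, 6 ,7"""

print(repr(extract_between(input_txt, "#keyword1", "#keyword2")))
print(extract_between(input_txt, "#keyword1", "#missing"))
```

Repeated or crossed keywords are still not handled here; the non-greedy `.*?` simply takes the first start keyword and the nearest stop keyword after it.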
hth,
vbr
| From | Jussi Piitulainen <harvesting@makes.email.invalid> |
|---|---|
| Date | 2015-09-10 18:48 +0300 |
| Message-ID | <lf5wpvymg12.fsf@ling.helsinki.fi> |
| In reply to | #96261 |
Vlastimil Brom writes:
> just to add another possible approach, you can use regular expression

Now you have three problems: whatever the two problems are that you are
alleged to have whenever you decide to use regular expressions for
anything at all, plus all the people piling on you to tell you that a
Jamie Zawinski once said that whenever you decide to use regular
expressions to solve a problem, you end up with two problems. :)
| From | Christian Gollwitzer <auriocus@gmx.de> |
|---|---|
| Date | 2015-09-10 19:29 +0200 |
| Message-ID | <mssehl$pb8$1@dont-email.me> |
| In reply to | #96254 |
On 10.09.15 at 13:18, Gerald wrote:
> Hey,
>
> is there a easy way to copy the content between 2 unique keywords in a .txt file?
>
> example.txt
>
> 1, 2, 3, 4
> #keyword1
> 3, 4, 5, 6
> 2, 3, 4, 5
> #keyword2
> 4, 5, 6 ,7

If "copying" does mean copying it to another file, and you are not obliged to use Python, this is unmatched in awk:

Apfelkiste:Tests chris$ cat kw.txt
1, 2, 3, 4
#keyword1
3, 4, 5, 6
2, 3, 4, 5
#keyword2
4, 5, 6 ,7
Apfelkiste:Tests chris$ awk '/keyword1/,/keyword2/' kw.txt
#keyword1
3, 4, 5, 6
2, 3, 4, 5
#keyword2

Consequently,

awk '/keyword1/,/keyword2/' kw.txt > kw_copy.txt

would write it out to kw_copy.txt.

Beware that between the two slashes there are regexps, so if you have metacharacters in your keywords, you need to quote them.

Christian
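For anyone who does have to stay in Python, the awk range pattern above can be approximated with a small state flag. This is only a sketch: the name `copy_range` is made up, and keywords are matched as plain substrings rather than regexps, which sidesteps the metacharacter quoting issue. Both keyword lines are copied, as in the awk output.

```python
import io


def copy_range(src, dst, start, stop):
    """Copy lines from one containing `start` through one containing `stop`.

    Hypothetical helper. Keywords are matched as plain substrings, so
    metacharacters need no quoting. Both keyword lines are copied.
    """
    copying = False
    for line in src:
        if not copying and start in line:
            copying = True
        if copying:
            dst.write(line)
            if stop in line:
                copying = False


# In-memory stand-ins for kw.txt and kw_copy.txt.
src = io.StringIO(
    "1, 2, 3, 4\n#keyword1\n3, 4, 5, 6\n2, 3, 4, 5\n#keyword2\n4, 5, 6 ,7\n"
)
dst = io.StringIO()
copy_range(src, dst, "keyword1", "keyword2")
print(dst.getvalue(), end="")
```

With real files, the two `io.StringIO` objects would be replaced by `open("kw.txt")` and `open("kw_copy.txt", "w")`. If the stop keyword never appears, everything after the start keyword is copied.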
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2015-09-10 12:11 -0700 |
| Message-ID | <a41d94a3-5be0-4b95-98ad-f7a00520b984@googlegroups.com> |
| In reply to | #96273 |
>>> s = """1, 2, 3, 4
... #keyword1
... 3, 4, 5, 6
... 2, 3, 4, 5
... #keyword2
... 4, 5, 6 ,7"""
>>> s[s.find('keyword1') + len('keyword1'):s.find('keyword2') - 1]
'\n3, 4, 5, 6\n2, 3, 4, 5\n'
>>> #or
>>> s[s.find('keyword1') + len('keyword1') + 1:s.find('keyword2') - 2]
'3, 4, 5, 6\n2, 3, 4, 5'
>>>
| From | alister <alister.nospam.ware@ntlworld.com> |
|---|---|
| Date | 2015-09-10 19:41 +0000 |
| Message-ID | <mssmcl$hgu$1@speranza.aioe.org> |
| In reply to | #96292 |
On Thu, 10 Sep 2015 12:11:55 -0700, wxjmfauth wrote:
> >>> s = """1, 2, 3, 4
> ... #keyword1
> ... 3, 4, 5, 6
> ... 2, 3, 4, 5
> ... #keyword2
> ... 4, 5, 6 ,7"""
> >>> s[s.find('keyword1') + len('keyword1'):s.find('keyword2') - 1]
> '\n3, 4, 5, 6\n2, 3, 4, 5\n'
> >>> #or
> >>> s[s.find('keyword1') + len('keyword1') + 1:s.find('keyword2') - 2]
> '3, 4, 5, 6\n2, 3, 4, 5'
> >>>
split works well as a simple one-liner (well, two lines if you include the string setup):
>>> a = "crap word1 more crap word1 again word2 still more crap"
>>> a.split('word1', 1)[1].split('word2')[0]
' more crap word1 again '
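The split one-liner raises `IndexError` when the first keyword is absent and silently returns everything after it when the second one is absent. A variant using `str.partition` makes both cases explicit. This is a sketch, and the name `text_between` and the `None` return value are assumptions for illustration.

```python
def text_between(text, start, stop):
    """Return the text between the first `start` and the next `stop`, else None."""
    # partition returns (head, sep, tail); sep is empty when the separator is absent.
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_stop, _ = rest.partition(stop)
    if not found_stop:
        return None
    return middle


a = "crap word1 more crap word1 again word2 still more crap"
print(repr(text_between(a, "word1", "word2")))
print(text_between(a, "word1", "word3"))
```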
--
All bad precedents began as justifiable measures.
-- Gaius Julius Caesar, quoted in "The Conspiracy of
Catiline", by Sallust