Re: Joining Strings

From Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Newsgroups comp.lang.python
Subject Re: Joining Strings
Date 2016-04-07 08:01 +0300
Organization A noiseless patient Spider
Message-ID <lf5oa9mt3kv.fsf@ling.helsinki.fi> (permalink)
References <CAOypoo5YpPLNeobY3uo8o_KjXtfEWcmqffsEUXrqsJ0ACpkBWA@mail.gmail.com> <mailman.51.1459989021.1197.python-list@python.org>


Emeka writes:

> Hello All,
>
> import urllib.request
> import re
>
> url = 'https://www.everyday.com/'
>
>
>
> req = urllib.request.Request(url)
> resp = urllib.request.urlopen(req)
> respData = resp.read()
>
>
> paragraphs = re.findall(r'\[(.*?)\]',str(respData))
> for eachP in paragraphs:
>     print("".join(eachP.split(',')[1:-2]))
>     print("\n")
>
>
>
> I got the below:
> "Coke -  Yala Market Branch""NO. 113 IKU BAKR WAY YALA"""
> But what I need is
>
> 'Coke -  Yala Market Branch NO. 113 IKU BAKR WAY YALA'
>
> How do I achieve the above?

A couple of things you could do to understand your problem and work
around it: Change your code to print(eachP). Change your "".join to
"!".join to see where the commas were. Experiment with data of that form
in the REPL. Sometimes it's good to print repr(datum) instead of datum,
though not in this case.
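
Those experiments might look like the sketch below. The thread does not show the real response data, so the sample string is a guess at what one regex match could look like, chosen only because it reproduces the output you reported. The last step, stripping the quote characters and joining with a space, is one possible workaround and not the only one.

```python
# Hypothetical sample of one regex match (eachP); the real data is not
# shown in the thread, so this string is an assumption.
each_p = '1,"Coke -  Yala Market Branch","NO. 113 IKU BAKR WAY YALA","",6.5,3.4'

# Look at the raw match first.
print(each_p)

# Same slice as in the original code.
fields = each_p.split(',')[1:-2]

# Joining with "" glues the quoted fields together, as in the reported output.
joined_plain = "".join(fields)
print(joined_plain)

# Joining with "!" makes visible where the commas were.
joined_marked = "!".join(fields)
print(joined_marked)

# Possible workaround: strip the double quotes, drop empty fields,
# and join the rest with a single space.
stripped = [field.strip('"') for field in fields]
cleaned = " ".join(field for field in stripped if field)
print(cleaned)
```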

But are you trying to extract and parse paragraphs from a JSON response?
Do not use regex for that at all. Use json.load or json.loads to parse
it properly, and access the relevant data by indexing:

import json

x = json.loads('{"foo":[["Weather Forecast","It\'s Rain"],[]]}')

x ==> {'foo': [['Weather Forecast', "It's Rain"], []]}

x['foo'] ==> [['Weather Forecast', "It's Rain"], []]

x['foo'][0] ==> ['Weather Forecast', "It's Rain"]
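
Applied to your case, the same idea could look roughly like this. It is only a sketch: I have not seen the actual response, so the bytes literal stands in for resp.read(), and the key name "branches" and the row layout are made up for illustration. Adjust the indexing to whatever structure json.loads actually gives you.

```python
import json

# Stand-in for respData = resp.read(). The real response body and its
# key names are not shown in the thread; this payload is hypothetical.
resp_data = (
    b'{"branches": [[1, "Coke -  Yala Market Branch", '
    b'"NO. 113 IKU BAKR WAY YALA", "", 6.5, 3.4]]}'
)

# resp.read() returns bytes, so decode before parsing
# (str(respData) would give the "b'...'" representation instead).
data = json.loads(resp_data.decode("utf-8"))

lines = []
for row in data["branches"]:
    # Same slice as the original code, but on a parsed list: the fields
    # are already plain strings with no surrounding quote characters.
    parts = [field for field in row[1:-2] if field]
    lines.append(" ".join(parts))

for line in lines:
    print(line)
```

With the parsed list there are no commas to split on and no quotes to strip, so joining with a space is all that is left to do.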
