Re: regex puzzle

From "Peter J. Holzer" <hjp-usenet2@hjp.at>
Newsgroups comp.lang.java.help
Subject Re: regex puzzle
Date 2012-10-30 14:48 +0100
Organization LUGA
Message-ID <slrnk8vmks.svs.hjp-usenet2@hrunkner.hjp.at>
References <olqt88d9p21pf9nau0j4pke7kmhq08u5o4@4ax.com> <f7a2eec3-eca9-468a-8d8a-3d8bf360a530@googlegroups.com> <pgiv881g37e73fek318423bvrmtncgto4e@4ax.com>


On 2012-10-30 12:59, Roedy Green <see_website@mindprod.com.invalid> wrote:
> On Mon, 29 Oct 2012 14:48:28 -0700 (PDT), Lew <lewbloch@gmail.com>
> wrote, quoted or indirectly quoted someone who said :
>>including the quotation marks?
>  
> I am scanning postable HTML trying to convert things surrounded in
> &quot; to a style, no embedded space allowed, but embedded entity
> allowed to be left intact.
>
> e.g.
> &quot;cat&quot; (hex  2671756F743B6361742671756F743B  in  ASCII.)
> to <span class="quoted">cat</span>  From there the style will be
> refined manually.

Java regexps are largely Perl-compatible, so

s = s.replaceAll("&quot;(\\S*?)&quot;", "<span class=\"quoted\">$1</span>");

should do the trick. Note the doubled backslash: \S has to be written as
\\S inside a Java string literal.
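
For anyone who wants to try the call above, here is a minimal,
self-contained sketch. The class and method names (QuotedSpanDemo,
styleQuoted) are made up for illustration; only the pattern and the
replacement string come from this post.

```java
final class QuotedSpanDemo {
    // Hypothetical helper; pattern and replacement are the ones from the post.
    static String styleQuoted(String s) {
        // \\S*? is the shortest run of non-whitespace characters, so an
        // embedded entity such as &amp; is kept, while a phrase containing
        // a space is not matched at all.
        // replaceAll returns a new String; the argument is not modified.
        return s.replaceAll("&quot;(\\S*?)&quot;", "<span class=\"quoted\">$1</span>");
    }

    public static void main(String[] args) {
        // prints: a <span class="quoted">cat</span> sat
        System.out.println(styleQuoted("a &quot;cat&quot; sat"));
        // prints: <span class="quoted">cat&amp;dog</span>
        System.out.println(styleQuoted("&quot;cat&amp;dog&quot;"));
        // prints the input unchanged, because of the embedded space
        System.out.println(styleQuoted("&quot;two words&quot;"));
    }
}
```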

At least it does, unless you have HTML like this:

<img src="cat.jpg" alt="image of a &quot;cat&quot;">

This would be translated into 

<img src="cat.jpg" alt="image of a <span class="quoted">cat</span>">

which isn't valid HTML.

It is possible to handle that in a regexp, but this would be really
cumbersome. If you want to process HTML, use a proper HTML parser.
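
To give an idea of what a regexp-only approach involves, here is a rough
sketch that applies the replacement only to the text between tags and
copies the tags themselves unchanged. It is not an HTML parser: it assumes
every tag matches <[^>]*>, which breaks on comments, scripts, CDATA and a
literal > inside an attribute value, and it will not match a quoted phrase
that has a tag inside it. The names TagAwareQuoteDemo and styleOutsideTags
are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagAwareQuoteDemo {
    // Crude assumption: a tag is '<', anything except '>', then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern QUOTED = Pattern.compile("&quot;(\\S*?)&quot;");
    private static final String SPAN = "<span class=\"quoted\">$1</span>";

    static String styleOutsideTags(String html) {
        StringBuilder out = new StringBuilder();
        Matcher tag = TAG.matcher(html);
        int pos = 0;
        while (tag.find()) {
            // Text between the previous tag and this one: apply the replacement.
            String text = html.substring(pos, tag.start());
            out.append(QUOTED.matcher(text).replaceAll(SPAN));
            // The tag itself, attributes included, is copied unchanged.
            out.append(tag.group());
            pos = tag.end();
        }
        // Trailing text after the last tag.
        out.append(QUOTED.matcher(html.substring(pos)).replaceAll(SPAN));
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p>a &quot;cat&quot;</p>"
                + "<img src=\"cat.jpg\" alt=\"image of a &quot;cat&quot;\">";
        // The &quot;cat&quot; in the paragraph is wrapped in a span;
        // the one inside the alt attribute is left alone.
        System.out.println(styleOutsideTags(html));
    }
}
```

Even this much only covers the easy cases, which is the argument for
using a real parser instead.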

	hp


-- 
   _  | Peter J. Holzer    | Curse of electronic word processing:
|_|_) | Sysadmin WSR       | you keep polishing at your text until
| |   | hjp@hjp.at         | the parts of the sentence no longer
__/   | http://www.hjp.at/ | fits together. -- Ralph Babel

