From: Wil Taphoorn
Date: Thu, 09 Jun 2011 16:50:32 +0200
Newsgroups: comp.os.linux.development.system
Subject: Re: extract data between two regex

On 9-6-2011 15:49, Rudra Banerjee wrote:
> Dear friends,
> How can I extract data sandwiched between two regexes? Say, for a
> file (snipped from a gcstar HTML export) like the one pasted below.
> What I want to do is extract the title (sandwiched between
> and ) and export it to LaTeX (or another format).
> Hoping for your help.
>
> Band Theory and Electronic Properties of
> Solids (Oxford Master Series in Condensed Matter Physics)
> (9780198506447)
src="booklist_images/Band_Theory_and_Electronic_Properties_of_Solids__Oxford_Master_Series_in_Condensed_Matter_Physics___9780198506447__0.jpg" height="160" alt="Band Theory and Electronic Properties of Solids (Oxford Master Series in Condensed Matter Physics) (9780198506447)" title="Band Theory and Electronic Properties of Solids (Oxford Master Series in Condensed Matter Physics) (9780198506447)" border="0"/>
Try this:

$ lua -e 'io.read("*all"):gsub("([^<]+)",function(x)print(x)end)'
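
A different route, sketched in plain shell rather than Lua: the sample in the question also carries the title in the alt="..." attribute of the image tag, so one option is to pull that attribute value out instead of matching between two tags. This is only a rough sketch and not the Lua approach above. The name extract_titles is made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do) and that each alt value sits on a single line.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): read HTML on stdin and
# print the value of every alt="..." attribute, one per line.
extract_titles() {
    # grep -o prints each alt="..." match on its own line;
    # sed then strips the leading alt=" and the trailing quote.
    grep -o 'alt="[^"]*"' | sed -e 's/^alt="//' -e 's/"$//'
}
```

It would be run as something like `extract_titles < export.html`, where export.html stands in for whatever file gcstar wrote. Anything going on to LaTeX would probably still need characters such as & and _ escaped, which this sketch does not attempt.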