Groups > comp.lang.python > #37919 > unrolled thread
| Started by | RichD <r_delaney2001@yahoo.com> |
|---|---|
| First post | 2013-01-29 20:55 -0800 |
| Last post | 2013-01-30 20:40 -0500 |
| Articles | 13 — 11 participants |
security quirk RichD <r_delaney2001@yahoo.com> - 2013-01-29 20:55 -0800
Signal versus noise (was: security quirk) Ben Finney <ben+python@benfinney.id.au> - 2013-01-30 16:10 +1100
Re: security quirk Rodrick Brown <rodrick.brown@gmail.com> - 2013-01-30 00:14 -0500
Re: security quirk Chris Rebert <clp2@rebertia.com> - 2013-01-29 21:16 -0800
Re: security quirk Martin Musatov <marty.musatov@gmail.com> - 2013-01-30 06:37 -0800
Re: security quirk "Auric__" <not.my.real@email.address> - 2013-01-30 22:40 +0000
Re: security quirk Gandalf Parker <gandalf@the.dead.ISP.of.Community.net> - 2013-01-30 14:39 +0000
Re: security quirk RichD <r_delaney2001@yahoo.com> - 2013-01-30 11:39 -0800
Re: security quirk Joel Goldstick <joel.goldstick@gmail.com> - 2013-01-30 14:45 -0500
Re: security quirk alex23 <wuwei23@gmail.com> - 2013-01-30 17:16 -0800
Re: security quirk Gandalf Parker <gandalf@the.dead.ISP.of.Community.net> - 2013-01-31 14:07 +0000
Re: security quirk Big Bad Bob <BigBadBob-at-mrp3-dot-com@testing.local> - 2013-01-30 11:59 -0800
Re: security quirk Arne Vajhøj <arne@vajhoej.dk> - 2013-01-30 20:40 -0500
| From | RichD <r_delaney2001@yahoo.com> |
|---|---|
| Date | 2013-01-29 20:55 -0800 |
| Subject | security quirk |
| Message-ID | <b968c6c6-5aa9-4584-bd7a-5b097f17c54d@pu9g2000pbc.googlegroups.com> |
I read Wall Street Journal, and occasionally check
articles on their Web site. It's mostly free, with some items
available to subscribers only. It seems random, which ones
they block, about 20%.

Anywho, sometimes I use their search utility, the usual author
or title search, and it blocks, then I look it up on Google, and
link from there, and it loads! ok, Web gurus, what's going on?

--
Rich
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2013-01-30 16:10 +1100 |
| Subject | Signal versus noise (was: security quirk) |
| Message-ID | <mailman.1201.1359522634.2939.python-list@python.org> |
| In reply to | #37919 |
RichD <r_delaney2001@yahoo.com> writes:

> Anywho, sometimes I use their search utility, the usual author
> or title search, and it blocks, then I look it up on Google, and
> link from there, and it loads! ok, Web gurus, what's going on?

That evidently has nothing in particular to do with the topic of this forum: the Python programming language.

If you want to just comment on arbitrary things with the internet at large, you have many other forums available. Please at least try to keep this forum on-topic.

--
 \     “Outside of a dog, a book is man's best friend. Inside of a |
  `\    dog, it's too dark to read.” —Groucho Marx |
_o__)   |
Ben Finney
| From | Rodrick Brown <rodrick.brown@gmail.com> |
|---|---|
| Date | 2013-01-30 00:14 -0500 |
| Message-ID | <mailman.1202.1359522878.2939.python-list@python.org> |
| In reply to | #37919 |
On Tue, Jan 29, 2013 at 11:55 PM, RichD <r_delaney2001@yahoo.com> wrote:

> I read Wall Street Journal, and occasionally check
> articles on their Web site. It's mostly free, with some items
> available to subscribers only. It seems random, which ones
> they block, about 20%.
>
> Anywho, sometimes I use their search utility, the usual author
> or title search, and it blocks, then I look it up on Google, and
> link from there, and it loads! ok, Web gurus, what's going on?

It's Gremlins! I tell you Gremlins!!!

> --
> Rich
> --
> http://mail.python.org/mailman/listinfo/python-list
| From | Chris Rebert <clp2@rebertia.com> |
|---|---|
| Date | 2013-01-29 21:16 -0800 |
| Message-ID | <mailman.1203.1359522966.2939.python-list@python.org> |
| In reply to | #37919 |
On Tue, Jan 29, 2013 at 8:55 PM, RichD <r_delaney2001@yahoo.com> wrote:

> I read Wall Street Journal, and occasionally check
> articles on their Web site. It's mostly free, with some items
> available to subscribers only. It seems random, which ones
> they block, about 20%.
>
> Anywho, sometimes I use their search utility, the usual author
> or title search, and it blocks, then I look it up on Google, and
> link from there, and it loads! ok, Web gurus, what's going on?

http://www.google.com/search?btnG=1&pws=0&q=first+click+free

BTW, this has absolutely jack squat to do with Python. Please direct similar future inquiries to a more relevant forum.

Regards,
Chris
| From | Martin Musatov <marty.musatov@gmail.com> |
|---|---|
| Date | 2013-01-30 06:37 -0800 |
| Message-ID | <2b0bd8ac-1aa1-4b42-8e60-f83b64201d8e@d8g2000pbm.googlegroups.com> |
| In reply to | #37919 |
On Jan 29, 8:55 pm, RichD <r_delaney2...@yahoo.com> wrote:
> I read Wall Street Journal, and occasionally check<NotepadPlus>
<UserLang name="MUSATOV" ext=".myl" udlVersion="2.0">
<Settings>
<Global caseIgnored="no" allowFoldOfComments="no"
forceLineCommentsAtBOL="no" foldCompact="yes" />
<Prefix Keywords1="no" Keywords2="no" Keywords3="no"
Keywords4="no" Keywords5="no" Keywords6="no" Keywords7="no"
Keywords8="no" />
</Settings>
<KeywordLists>
<Keywords name="Comments" id="0">00commentBegin 01comment
02commentEnd 03 04</Keywords>
<Keywords name="Numbers, additional" id="1"></Keywords>
<Keywords name="Numbers, prefixes" id="2"></Keywords>
<Keywords name="Numbers, extras with prefixes" id="3"></
Keywords>
<Keywords name="Numbers, suffixes" id="4"></Keywords>
<Keywords name="Operators1" id="5">();</Keywords>
<Keywords name="Operators2" id="6"></Keywords>
<Keywords name="Folders in code1, open" id="7">Open</
Keywords>
<Keywords name="Folders in code1, middle" id="8">middle</
Keywords>
<Keywords name="Folders in code1, close" id="9">Close</
Keywords>
<Keywords name="Folders in code2, open" id="10">Open</
Keywords>
<Keywords name="Folders in code2, middle" id="11">middle</
Keywords>
<Keywords name="Folders in code2, close" id="12">Close</
Keywords>
<Keywords name="Folders in comment, open" id="13">Open</
Keywords>
<Keywords name="Folders in comment, middle"
id="14">middle</Keywords>
<Keywords name="Folders in comment, close" id="15">Close</
Keywords>
<Keywords name="Keywords1" id="16">%%</Keywords>
<Keywords name="Keywords2" id="17"></Keywords>
<Keywords name="Keywords3" id="18"></Keywords>
<Keywords name="Keywords4" id="19"></Keywords>
<Keywords name="Keywords5" id="20"></Keywords>
<Keywords name="Keywords6" id="21"></Keywords>
<Keywords name="Keywords7" id="22"></Keywords>
<Keywords name="Keywords8" id="23"></Keywords>
<Keywords name="Delimiters" id="24"></Keywords>
</KeywordLists>
<Styles>
<WordsStyle name="DEFAULT" styleID="0" fgColor="FFFFFF"
bgColor="000000" fontName="Monotype Corsiva" fontStyle="7"
fontSize="14" nesting="0" />
<WordsStyle name="COMMENTS" styleID="1" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="LINE COMMENTS" styleID="2"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="NUMBERS" styleID="3" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS1" styleID="4" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS2" styleID="5" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS3" styleID="6" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS4" styleID="7" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS5" styleID="8" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS6" styleID="9" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS7" styleID="10" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="KEYWORDS8" styleID="11" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="OPERATORS" styleID="12" fgColor="000000"
bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="FOLDER IN CODE1" styleID="13"
fgColor="FFFFFF" bgColor="000000" fontName="" fontStyle="7"
fontSize="10" nesting="0" />
<WordsStyle name="FOLDER IN CODE2" styleID="14"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="FOLDER IN COMMENT" styleID="15"
fgColor="FFFFFF" bgColor="000000" fontName="Times New Roman"
fontStyle="7" fontSize="8" nesting="0" />
<WordsStyle name="DELIMITERS1" styleID="16"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS2" styleID="17"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS3" styleID="18"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS4" styleID="19"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS5" styleID="20"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS6" styleID="21"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS7" styleID="22"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
<WordsStyle name="DELIMITERS8" styleID="23"
fgColor="000000" bgColor="FFFFFF" fontStyle="0" nesting="0" />
</Styles>
</UserLang>
</NotepadPlus>
> articles on their Web site. It's mostly free, with some items
> available to subscribers only. It seems random, which ones
> they block, about 20%.
>
> Anywho, sometimes I use their search utility, the usual author
> or title search, and it blocks, then I look it up on Google, and
> link from there, and it loads! ok, Web gurus, what's going on?
>
> --
> Rich
| From | "Auric__" <not.my.real@email.address> |
|---|---|
| Date | 2013-01-30 22:40 +0000 |
| Message-ID | <XnsA1589F8B53EBauricauricauricauric@78.46.70.116> |
| In reply to | #37940 |
Martin Musatov wrote:

> On Jan 29, 8:55 pm, RichD <r_delaney2...@yahoo.com> wrote:
>> I read Wall Street Journal, and occasionally check<NotepadPlus>
> <UserLang name="MUSATOV" ext=".myl" udlVersion="2.0">
[snip]
> </UserLang>
> </NotepadPlus>

Ignoring the big ol' unnecessary crosspost... What the fuck?

--
Oooh, I just learned a new euphemism.
| From | Gandalf Parker <gandalf@the.dead.ISP.of.Community.net> |
|---|---|
| Date | 2013-01-30 14:39 +0000 |
| Message-ID | <XnsA15843BB95369gandalfparker@78.46.70.116> |
| In reply to | #37919 |
RichD <r_delaney2001@yahoo.com> contributed wisdom to news:b968c6c6-5aa9-4584-bd7a-5b097f17c54d@pu9g2000pbc.googlegroups.com:

> Web gurus, what's going on?
>

That is the fault of the site itself. If they are going to block access to users then they should also block access to the automated spiders that hit the site to collect data.
| From | RichD <r_delaney2001@yahoo.com> |
|---|---|
| Date | 2013-01-30 11:39 -0800 |
| Message-ID | <badd4188-196b-45e3-ba8a-511d471282fa@nh8g2000pbc.googlegroups.com> |
| In reply to | #37941 |
On Jan 30, Gandalf Parker <gand...@the.dead.ISP.of.Community.net> wrote:

> > Web gurus, what's going on?
>
> That is the fault of the site itself.
> If they are going to block access to users then they should also block
> access to the automated spiders that hit the site to collect data.

well yeah, but what's going on, under the hood? How does it get confused? How could this happen? I'm looking for some insight, regarding a hypothetical programming glitch -

--
Rich
| From | Joel Goldstick <joel.goldstick@gmail.com> |
|---|---|
| Date | 2013-01-30 14:45 -0500 |
| Message-ID | <mailman.1223.1359575133.2939.python-list@python.org> |
| In reply to | #37953 |
On Wed, Jan 30, 2013 at 2:39 PM, RichD <r_delaney2001@yahoo.com> wrote:

> On Jan 30, Gandalf Parker <gand...@the.dead.ISP.of.Community.net>
> wrote:
> > > Web gurus, what's going on?
> >
> > That is the fault of the site itself.
> > If they are going to block access to users then they should also block
> > access to the automated spiders that hit the site to collect data.
>
> well yeah, but what's going on, under the hood?
> How does it get confused? How could this
> happen? I'm looking for some insight, regarding a
> hypothetical programming glitch -
>
> --
> Rich

As was pointed out, this really is off topic for this group. You might try googling. The NYTimes makes articles available by adding a parameter to the tail of the URL, I believe.

--
Joel Goldstick
http://joelgoldstick.com
| From | alex23 <wuwei23@gmail.com> |
|---|---|
| Date | 2013-01-30 17:16 -0800 |
| Message-ID | <ee336fe9-fb4f-455c-a345-18a6751db5be@qi8g2000pbb.googlegroups.com> |
| In reply to | #37953 |
On Jan 31, 5:39 am, RichD <r_delaney2...@yahoo.com> wrote:

> well yeah, but what's going on, under the hood?
> How does it get confused? How could this
> happen? I'm looking for some insight, regarding a
> hypothetical programming glitch -

As has been stated, this has nothing to do with Python, so please stop posting your questions here.

However, here's an answer to get you to stop repeating yourself: it's not uncommon to find that content you're restricted from accessing via a site's own search is available to you through Google.

This has to do with Google's policy of _requiring_ that pages that it is allowed to index _must_ be available for view. Any site that allows Google to index its pages and then blocks you from viewing them will swiftly find itself web site-a non gratis in Google search. As most websites are attention whores, they'll do anything to ensure they remain within Google's indices.
| From | Gandalf Parker <gandalf@the.dead.ISP.of.Community.net> |
|---|---|
| Date | 2013-01-31 14:07 +0000 |
| Message-ID | <XnsA1593E485EC8Egandalfparker@46.4.102.18> |
| In reply to | #37953 |
RichD <r_delaney2001@yahoo.com> contributed wisdom to news:badd4188-196b-45e3-ba8a-511d471282fa@nh8g2000pbc.googlegroups.com:

> On Jan 30, Gandalf Parker <gand...@the.dead.ISP.of.Community.net>
> wrote:
>> > Web gurus, what's going on?
>>
>> That is the fault of the site itself.
>> If they are going to block access to users then they should also block
>> access to the automated spiders that hit the site to collect data.
>
> well yeah, but what's going on, under the hood?
> How does it get confused? How could this
> happen? I'm looking for some insight, regarding a
> hypothetical programming glitch -

(from alt.hacker)

You don't understand. It is not in the code. It is in the site. It is as if someone comes and picks fruit off of your tree, and you are questioning the tree for how it bears fruit.

The site creates web pages. Google collects web pages. The site needs to set things like robots.txt to tell Google to NOT collect the pages in the archives. Which is not an absolute protection but at least it's an effort that works for most sites.
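Since this is comp.lang.python, here is a minimal sketch of how a well-behaved crawler might honor the robots.txt exclusion described above, using the standard library's `urllib.robotparser`. The robots.txt content, the `/archives/` path, the example.com URLs, and the `can_crawl` helper are all assumptions made up for illustration; nothing here reflects what any real site actually publishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /archives/ path is an assumption
# for illustration, not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_crawl("Googlebot", "http://example.com/archives/2013/story.html"))
    print(can_crawl("Googlebot", "http://example.com/news/story.html"))
```

As the post says, this is not an absolute protection: robots.txt is only a request, and it works because the major crawlers choose to check it before fetching.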
| From | Big Bad Bob <BigBadBob-at-mrp3-dot-com@testing.local> |
|---|---|
| Date | 2013-01-30 11:59 -0800 |
| Message-ID | <5e2dncvzz7Y55pTMnZ2dnUVZ_smdnZ2d@earthlink.com> |
| In reply to | #37919 |
On 01/29/13 20:55, RichD so wittily quipped:

> I read Wall Street Journal, and occasionally check
> articles on their Web site. It's mostly free, with some items
> available to subscribers only. It seems random, which ones
> they block, about 20%.
>
> Anywho, sometimes I use their search utility, the usual author
> or title search, and it blocks, then I look it up on Google, and
> link from there, and it loads! ok, Web gurus, what's going on?

in my last post, I quoted an article from 'The Register' where they talk about how Facebook (literally) "broke" that feature.

[this works in a LOT of places, but sometimes you have to enable cookies or javascript to actually see the content]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-30 20:40 -0500 |
| Message-ID | <5109cb8c$0$288$14726298@news.sunsite.dk> |
| In reply to | #37919 |
On 1/29/2013 11:55 PM, RichD wrote:

> I read Wall Street Journal, and occasionally check
> articles on their Web site. It's mostly free, with some items
> available to subscribers only. It seems random, which ones
> they block, about 20%.
>
> Anywho, sometimes I use their search utility, the usual author
> or title search, and it blocks, then I look it up on Google, and
> link from there, and it loads! ok, Web gurus, what's going on?

WSJ wants its articles to be findable from Google.

So they open up for Google indexing them.

If they require any type of registration to see an article, then Google will remove the link.

So therefore WSJ (and many other web sites!) gives more access if you come from Google than if not.

Arne
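For the "what's going on under the hood" question, a rough sketch of one way a site could give more access to visitors arriving from Google is to look at the HTTP `Referer` header of the request. This is only an assumption about the mechanism: the thread does not say how WSJ actually decides, and the host list, function name, and `is_subscriber` flag below are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of search-engine hosts; an assumption for illustration.
SEARCH_REFERRER_HOSTS = {"www.google.com", "google.com"}


def may_view_full_article(headers: dict, is_subscriber: bool = False) -> bool:
    """Decide whether to serve the full article for one request.

    Subscribers always get the article. Everyone else gets it only when
    the Referer header shows they clicked through from a listed search host.
    """
    if is_subscriber:
        return True
    referer = headers.get("Referer", "")
    # hostname is None when the header is missing or malformed.
    host = urlparse(referer).hostname or ""
    return host in SEARCH_REFERRER_HOSTS


if __name__ == "__main__":
    from_google = {"Referer": "http://www.google.com/search?q=first+click+free"}
    from_site_search = {"Referer": "http://example.com/search?q=some+title"}
    print(may_view_full_article(from_google))
    print(may_view_full_article(from_site_search))
```

Under this assumption there is no glitch or confusion at all: the same URL is deliberately served differently depending on where the click came from, which matches the behavior described in the original question.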