| Started by | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| First post | 2015-01-09 21:56 +1100 |
| Last post | 2015-01-10 07:57 -0600 |
| Articles | 15 — 8 participants |
Why do the URLs of posts here change? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-09 21:56 +1100
Re: Why do the URLs of posts here change? Skip Montanaro <skip.montanaro@gmail.com> - 2015-01-09 06:04 -0600
Re: Why do the URLs of posts here change? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-09 23:28 +1100
Re: Why do the URLs of posts here change? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-01-10 16:21 +1300
Re: Why do the URLs of posts here change? Chris Angelico <rosuav@gmail.com> - 2015-01-10 14:53 +1100
Re: Why do the URLs of posts here change? albert@spenarnc.xs4all.nl (Albert van der Horst) - 2015-01-17 16:39 +0000
Re: Why do the URLs of posts here change? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-01-18 11:31 +1300
Re: Why do the URLs of posts here change? Peter Otten <__peter__@web.de> - 2015-01-09 13:04 +0100
Re: Why do the URLs of posts here change? Rustom Mody <rustompmody@gmail.com> - 2015-01-09 06:09 -0800
Re: Why do the URLs of posts here change? Skip Montanaro <skip.montanaro@gmail.com> - 2015-01-09 08:15 -0600
Re: Why do the URLs of posts here change? Rustom Mody <rustompmody@gmail.com> - 2015-01-09 08:52 -0800
Re: Why do the URLs of posts here change? Chris Angelico <rosuav@gmail.com> - 2015-01-10 03:57 +1100
Re: Why do the URLs of posts here change? Rustom Mody <rustompmody@gmail.com> - 2015-01-09 09:19 -0800
Re: Why do the URLs of posts here change? Terry Reedy <tjreedy@udel.edu> - 2015-01-10 04:41 -0500
Re: Why do the URLs of posts here change? Skip Montanaro <skip.montanaro@gmail.com> - 2015-01-10 07:57 -0600
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-01-09 21:56 +1100 |
| Subject | Why do the URLs of posts here change? |
| Message-ID | <54afb3ed$0$12995$c3e8da3$5496439d@news.astraweb.com> |
I have come across this in the past, but today it annoyed me enough that I'm asking for an explanation.

Posts on this newsgroup/mailing list are archived on the web, but the URLs seem to change, which leaves dead links if you search for things.

For example, today I searched for a quote about floating point equality by William Kahan, and I came across this post by me:

https://mail.python.org/pipermail/python-list/2008-February/468598.html

But that's a dead link! Here's Google's cache of it:

http://webcache.googleusercontent.com/search?client=opera&rls=en&q=cache:i0cWb0Tjxe0J:https://mail.python.org/pipermail/python-list/2008-February/468598.html%2Bfloating+point+superstition+equality&oe=utf-8&channel=suggest&gws_rd=ssl&hl=en&&ct=clnk

And here is the actual URL, as it appears today:

https://mail.python.org/pipermail/python-list/2008-February/481374.html

Why has the URL changed? Surely this is a bug? Where can I report it?

--
Steven
| From | Skip Montanaro <skip.montanaro@gmail.com> |
|---|---|
| Date | 2015-01-09 06:04 -0600 |
| Message-ID | <mailman.17520.1420805082.18130.python-list@python.org> |
| In reply to | #83425 |
> Posts on this newsgroup/mailing list are archived on the web, but the URLs
> seem to change, which leaves dead links if you search for things.

Steven,

It's a known issue, but one which appears to be somewhat unavoidable, at least in Mailman 2.x. The problem is that every now and then, postmaster@python.org gets a legitimate request from someone for a message to be deleted from the list archive. The way this is done, is that the message is removed from the underlying mbox file, and the archive regenerated. That changes the counter for every message after that point - or maybe every message in the generated archive. (I have no idea why the numerical basename of your subject message would have changed so much. Maybe there is just a single ever incrementing counter for a given Mailman installation.)

From a technical standpoint, these sorts of requests are pretty futile, since comp.lang.python/python-list@python.org is archived in so many places, but that doesn't make the requests any less legitimate. Consequently, when they arrive at the postmaster address, they are generally taken care of in short order. In my experience, they have generally fallen into two categories:

1. Safety. I recall one request where a woman accidentally posted using an otherwise private email address. She was being stalked by her ex-husband, and that address was unknown to him.

2. Defamation. There was a spate of recent messages (in Italian) defaming a couple people, accusing them of being Nazis or pedophiles.

I will point out one class of messages which aren't deleted: those which demonstrate people's stupidity. People do dumb things - e.g., fly off the handle during a flame war - which they sometimes later realize reflects rather poorly on them (in future job searches and so forth). Those sorts of message deletion requests are rejected.

That all said, I don't know if Mailman 3 (or some other archiver than pipermail) will improve on this problem. I suggest a post to mailman-users@python.org if you're curious about the Mailman state-of-the-art in this area.

Skip
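The renumbering described above can be illustrated with a toy model. This is only a sketch under the assumption that page names are assigned by position in the mbox at regeneration time; `number_archive` and the `@example.invalid` Message-IDs are made up for illustration and are not Mailman code.

```python
def number_archive(message_ids, start=1):
    """Assign sequential page names in mbox order (hypothetical model)."""
    return {mid: f"{n:06d}.html" for n, mid in enumerate(message_ids, start)}


mbox = ["<a@example.invalid>", "<b@example.invalid>",
        "<c@example.invalid>", "<d@example.invalid>"]
before = number_archive(mbox)

# Remove one message, then regenerate the whole archive from scratch.
mbox_after = [mid for mid in mbox if mid != "<b@example.invalid>"]
after = number_archive(mbox_after)

# Every message that followed the deleted one now has a different page name.
moved = sorted(mid for mid in mbox_after if before[mid] != after[mid])
print(moved)
```

In this model, only the messages after the deleted one move, which would not by itself explain a jump as large as 468598 to 481374.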
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-01-09 23:28 +1100 |
| Message-ID | <54afc96b$0$12985$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #83430 |
Skip Montanaro wrote:

>> Posts on this newsgroup/mailing list are archived on the web, but the
>> URLs seem to change, which leaves dead links if you search for things.
[...]
> That all said, I don't know if Mailman 3 (or some other archiver than
> pipermail) will improve on this problem. I suggest a post to
> mailman-users@python.org if you're curious about the Mailman
> state-of-the-art in this area.

Thanks for the explanation!

--
Steven
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2015-01-10 16:21 +1300 |
| Message-ID | <chbk5cF7dv9U1@mid.individual.net> |
| In reply to | #83430 |
Skip Montanaro wrote:
> The way this is done, is
> that the message is removed from the underlying mbox file, and the
> archive regenerated. That changes the counter for every message after
> that point

Would it help to replace the message with a stub instead of deleting it altogether?

--
Greg
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-01-10 14:53 +1100 |
| Message-ID | <mailman.17551.1420862015.18130.python-list@python.org> |
| In reply to | #83487 |
On Sat, Jan 10, 2015 at 2:21 PM, Gregory Ewing <greg.ewing@canterbury.ac.nz> wrote:
> Skip Montanaro wrote:
>>
>> The way this is done, is
>> that the message is removed from the underlying mbox file, and the
>> archive regenerated. That changes the counter for every message after
>> that point
>
> Would it help to replace the message with a stub
> instead of deleting it altogether?

I had the same thought, but apparently not, according to the page Peter Otten linked to:

http://wiki.list.org/display/DEV/Stable+URLs

ChrisA
| From | albert@spenarnc.xs4all.nl (Albert van der Horst) |
|---|---|
| Date | 2015-01-17 16:39 +0000 |
| Message-ID | <54ba9041$0$6961$e4fe514c@dreader36.news.xs4all.nl> |
| In reply to | #83489 |
In article <mailman.17551.1420862015.18130.python-list@python.org>,
Chris Angelico <rosuav@gmail.com> wrote:
>On Sat, Jan 10, 2015 at 2:21 PM, Gregory Ewing
><greg.ewing@canterbury.ac.nz> wrote:
>> Skip Montanaro wrote:
>>>
>>> The way this is done, is
>>> that the message is removed from the underlying mbox file, and the
>>> archive regenerated. That changes the counter for every message after
>>> that point
>>
>> Would it help to replace the message with a stub
>> instead of deleting it altogether?
>
>I had the same thought, but apparently not, according to the page
>Peter Otten linked to:
>
>http://wiki.list.org/display/DEV/Stable+URLs

Knowing that the source is an mbox file, I don't need to follow that link to conclude that one is not very inventive. It suffices to replace the content of the message by a repetition of 'xxxx\n'. Maybe also the sender and the subject.

>
>ChrisA

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
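The in-place replacement suggested above might look roughly like the following, using the standard-library `mailbox` module. This is a sketch, not the python.org postmasters' procedure: `redact_message`, the placeholder sender and the demo Message-IDs are all hypothetical, and it only keeps the message's position in the mbox; it says nothing about how the archiver numbers pages afterwards.

```python
import mailbox
import os
import tempfile


def redact_message(mbox_path, message_id, placeholder="xxxx\n" * 4):
    """Overwrite one message with a stub while keeping its slot in the mbox."""
    box = mailbox.mbox(mbox_path)
    box.lock()
    try:
        # Find the key first so the mailbox is not modified while iterating.
        target = next(
            (key for key, msg in box.iteritems() if msg["Message-ID"] == message_id),
            None,
        )
        if target is None:
            return False
        stub = mailbox.mboxMessage()
        stub["Message-ID"] = message_id           # keep the ID of the removed post
        stub["From"] = "removed@example.invalid"  # made-up placeholder sender
        stub["Subject"] = "[removed]"
        stub.set_payload(placeholder)
        box[target] = stub
        box.flush()
        return True
    finally:
        box.unlock()
        box.close()


# Demo on a throwaway mbox holding three made-up messages.
path = os.path.join(tempfile.mkdtemp(), "demo.mbox")
demo = mailbox.mbox(path)
for n in (1, 2, 3):
    m = mailbox.mboxMessage()
    m["Message-ID"] = f"<{n}@example.invalid>"
    m["Subject"] = f"message {n}"
    m.set_payload(f"body {n}\n")
    demo.add(m)
demo.close()

redacted = redact_message(path, "<2@example.invalid>")
subjects = [m["Subject"] for m in mailbox.mbox(path)]
print(redacted, subjects)
```

The message count and order are unchanged, so any numbering that depends only on position in the mbox would be unaffected by this kind of edit.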
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2015-01-18 11:31 +1300 |
| Message-ID | <ci0662Fp9a3U1@mid.individual.net> |
| In reply to | #83931 |
Albert van der Horst wrote:
> Knowing that the source is an mbox file, I don't need to follow
> that link to conclude that one is not very inventive.
> It suffices to replace the content of the message by
> a repetition of 'xxxx\n'.

Editing the mbox file isn't the problem. From what I gather, telling mailman to regenerate the web pages from the mbox file causes all the messages to be given new ID numbers, even if they remain in the same places in the mbox. So the web pages as well as the mbox would have to be edited by hand, instead of using the auto regen process.

--
Greg
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-01-09 13:04 +0100 |
| Message-ID | <mailman.17521.1420805425.18130.python-list@python.org> |
| In reply to | #83425 |
Steven D'Aprano wrote:

> I have come across this in the past, but today it annoyed me enough that
> I'm asking for an explanation.
>
> Posts on this newsgroup/mailing list are archived on the web, but the URLs
> seem to change, which leaves dead links if you search for things.
>
> For example, today I searched for a quote about floating point equality by
> William Kahan, and I came across this post by me:
>
> https://mail.python.org/pipermail/python-list/2008-February/468598.html
>
> But that's a dead link! Here's Google's cache of it:
>
> http://webcache.googleusercontent.com/search?client=opera&rls=en&q=cache:i0cWb0Tjxe0J:https://mail.python.org/pipermail/python-list/2008-February/468598.html%2Bfloating+point+superstition+equality&oe=utf-8&channel=suggest&gws_rd=ssl&hl=en&&ct=clnk
>
> And here is the actual URL, as it appears today:
>
> https://mail.python.org/pipermail/python-list/2008-February/481374.html
>
> Why has the URL changed? Surely this is a bug? Where can I report it?

This is a flaw of the mailman software.

http://wiki.list.org/display/DEV/Stable+URLs

suggests that the developers are aware of it. I don't know if there is a version available that has stable urls...
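One general way to get URLs that survive regeneration is to derive the page name from something intrinsic to the message, such as its Message-ID, rather than from a counter. The sketch below illustrates that idea only; it makes no claim about what the linked wiki page proposes or what any Mailman release does, and `stable_basename` is a hypothetical helper.

```python
import base64
import hashlib


def stable_basename(message_id: str) -> str:
    """Derive a page name from the Message-ID alone (illustrative scheme)."""
    digest = hashlib.sha1(message_id.strip().encode("utf-8")).digest()
    # A 20-byte SHA-1 digest encodes to exactly 32 base32 characters, no padding.
    return base64.b32encode(digest).decode("ascii") + ".html"


# Message-ID of the first post in this thread.
mid = "<54afb3ed$0$12995$c3e8da3$5496439d@news.astraweb.com>"
name = stable_basename(mid)
print(name)
```

Because the name depends only on the message itself, deleting or reordering other messages in the mbox would leave it unchanged.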
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2015-01-09 06:09 -0800 |
| Message-ID | <28f970bd-188d-4455-b2b2-3320c546ad65@googlegroups.com> |
| In reply to | #83425 |
On Friday, January 9, 2015 at 4:26:58 PM UTC+5:30, Steven D'Aprano wrote:
> I have come across this in the past, but today it annoyed me enough that I'm
> asking for an explanation.
>
> Posts on this newsgroup/mailing list are archived on the web, but the URLs
> seem to change, which leaves dead links if you search for things.
>
> For example, today I searched for a quote about floating point equality by
> William Kahan, and I came across this post by me:
>
> https://mail.python.org/pipermail/python-list/2008-February/468598.html
>
> But that's a dead link! Here's Google's cache of it:
>
> http://webcache.googleusercontent.com/search?client=opera&rls=en&q=cache:i0cWb0Tjxe0J:https://mail.python.org/pipermail/python-list/2008-February/468598.html%2Bfloating+point+superstition+equality&oe=utf-8&channel=suggest&gws_rd=ssl&hl=en&&ct=clnk
>
> And here is the actual URL, as it appears today:
>
> https://mail.python.org/pipermail/python-list/2008-February/481374.html
>
> Why has the URL changed? Surely this is a bug? Where can I report it?

There's a new app/service that should solve your problem:
It's from Google... and called Groups <wink>
| From | Skip Montanaro <skip.montanaro@gmail.com> |
|---|---|
| Date | 2015-01-09 08:15 -0600 |
| Message-ID | <mailman.17526.1420812949.18130.python-list@python.org> |
| In reply to | #83440 |
On Fri, Jan 9, 2015 at 8:09 AM, Rustom Mody <rustompmody@gmail.com> wrote:
> There's a new app/service that should solve your problem:
> It's from Google... and called Groups <wink>

It solves one problem (moving archive URLs) by, I think, ignoring the other (archive posts which should really be removed).

Skip
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2015-01-09 08:52 -0800 |
| Message-ID | <f7e24119-fd65-4525-b10c-063a7eef4ad1@googlegroups.com> |
| In reply to | #83441 |
On Friday, January 9, 2015 at 7:46:42 PM UTC+5:30, Skip Montanaro wrote:
> On Fri, Jan 9, 2015 at 8:09 AM, Rustom Mody wrote:
> > There's a new app/service that should solve your problem:
> > It's from Google... and called Groups <wink>
>
> It solves one problem (moving archive URLs) by, I think, ignoring the other (archive posts which should really be removed).
>
> Skip

Is it?
OK, let's test that.
This is posted from google-groups.
After posting I shall remove it.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-01-10 03:57 +1100 |
| Message-ID | <mailman.17537.1420822660.18130.python-list@python.org> |
| In reply to | #83457 |
On Sat, Jan 10, 2015 at 3:52 AM, Rustom Mody <rustompmody@gmail.com> wrote:
> Is it?
> OK, let's test that.
> This is posted from google-groups.
> After posting I shall remove it.

Remove it from GG, maybe, but I doubt very much it'll be removed from the python.org archive. It's virtually impossible to remove something from everywhere... you have to find every copy and hope none have been web-archived yet.

ChrisA
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2015-01-09 09:19 -0800 |
| Message-ID | <a79802e4-dd05-4886-ba24-25567f782d84@googlegroups.com> |
| In reply to | #83458 |
On Friday, January 9, 2015 at 10:27:53 PM UTC+5:30, Chris Angelico wrote:
> On Sat, Jan 10, 2015 at 3:52 AM, Rustom Mody wrote:
> > Is it?
> > OK, let's test that.
> > This is posted from google-groups.
> > After posting I shall remove it.
>
> Remove it from GG, maybe, but I doubt very much it'll be removed from
> the python.org archive. It's virtually impossible to remove something
> from everywhere... you have to find every copy and hope none have been
> web-archived yet.
>
> ChrisA

Precisely my point.
Removing something from the web is really a meaningless activity [apart from some moral feel-good factor].
If that gesture means something to you, GG provides it.
And to the best of my knowledge it does not screw up links like mailman.

We can test that in some limited way, but I don't know how to do any test which will be reasonably exhaustive.
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-01-10 04:41 -0500 |
| Message-ID | <mailman.17561.1420883105.18130.python-list@python.org> |
| In reply to | #83425 |
On 1/9/2015 7:04 AM, Skip Montanaro wrote:
>> Posts on this newsgroup/mailing list are archived on the web, but the URLs
>> seem to change, which leaves dead links if you search for things.
>
> Steven,
>
> It's a known issue, but one which appears to be somewhat unavoidable,
> at least in Mailman 2.x. The problem is that every now and then,
> postmaster@python.org gets a legitimate request from someone for a
> message to be deleted from the list archive. The way this is done, is
> that the message is removed from the underlying mbox file,

The post could be replaced by a placeholder "This message deleted"

> and the
> archive regenerated. That changes the counter for every message after

A placeholder should avoid that.

> that point - or maybe every message in the generated archive. (I have
> no idea why the numerical basename of your subject message would have
> changed so much. Maybe there is just a single ever incrementing
> counter for a given Mailman installation.)

--
Terry Jan Reedy
| From | Skip Montanaro <skip.montanaro@gmail.com> |
|---|---|
| Date | 2015-01-10 07:57 -0600 |
| Message-ID | <mailman.17564.1420898282.18130.python-list@python.org> |
| In reply to | #83425 |
On Sat, Jan 10, 2015 at 3:41 AM, Terry Reedy <tjreedy@udel.edu> wrote:
> The post could be replaced by a placeholder "This message deleted"
>
> and the
>>
>> archive regenerated. That changes the counter for every message after
>
> A placeholder should avoid that.

I suspect (though don't know for certain) that just regenerating the archive without touching the mbox file will change the numbering. Steven's original post mentioned two very different basenames (468598 and 481374). As I indicated in an earlier response, those might be generated from an ever-growing counter, not just a shift as articles slide one closer to the first one of the month.

So, you'd have to edit the mbox file carefully (might need to edit headers) and also edit the generated HTML for the message. Neither is an insurmountable task, but both are going to be more error-prone than just cutting out an entire message and regenerating the archive.

I will pass along your suggestion to the postmaster folks (I don't get involved at that level - it's mostly the folks who directly maintain the Postfix setup who do this), though. They are a generally pretty responsive bunch.

Skip
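The "ever-growing counter" suspicion can be made concrete with a toy model. It is only an assumption about how the numbering might behave, as the post itself says; the `Archiver` class and the `@example.invalid` Message-IDs are hypothetical and not taken from Mailman.

```python
import itertools


class Archiver:
    """Toy archiver whose page counter is never reset (an assumed model)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def regenerate(self, message_ids):
        # Every regeneration keeps drawing numbers from the same counter.
        return {mid: next(self._counter) for mid in message_ids}


archiver = Archiver()
mbox = ["<a@example.invalid>", "<b@example.invalid>", "<c@example.invalid>"]

first = archiver.regenerate(mbox)
second = archiver.regenerate(mbox)  # same mbox, not edited at all

changed = [mid for mid in mbox if first[mid] != second[mid]]
print(first["<a@example.invalid>"], second["<a@example.invalid>"], len(changed))
```

Under this model a placeholder in the mbox would not help, since every page gets a new number on each regeneration whether or not the mbox was edited.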