Groups > comp.lang.java.programmer > #2722 > unrolled thread

HTTP and Java

Started by: Roedy Green <see_website@mindprod.com.invalid>
First post: 2011-04-01 23:35 -0700
Last post: 2011-04-04 22:18 -0700
Articles: 15 (8 participants)


Contents

  HTTP and Java Roedy Green <see_website@mindprod.com.invalid> - 2011-04-01 23:35 -0700
    Re: HTTP and Java Luuk <Luuk@invalid.lan> - 2011-04-02 10:55 +0200
    Re: HTTP and Java Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-04-02 21:58 +1300
    Re: HTTP and Java Tom Anderson <twic@urchin.earth.li> - 2011-04-02 14:22 +0100
    Re: HTTP and Java Patricia Shanahan <pats@acm.org> - 2011-04-02 06:26 -0700
      Re: HTTP and Java Roedy Green <see_website@mindprod.com.invalid> - 2011-04-02 23:17 -0700
        Re: HTTP and Java Luuk <Luuk@invalid.lan> - 2011-04-03 12:58 +0200
          Re: HTTP and Java Roedy Green <see_website@mindprod.com.invalid> - 2011-04-03 16:47 -0700
    Re: HTTP and Java Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-04-02 16:27 -0400
    Re: HTTP and Java Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-04-03 09:50 -0300
      Re: HTTP and Java Roedy Green <see_website@mindprod.com.invalid> - 2011-04-03 16:51 -0700
        Re: HTTP and Java Tom Anderson <twic@urchin.earth.li> - 2011-04-04 23:25 +0100
        Re: HTTP and Java Ian Shef <invalid@avoiding.spam> - 2011-04-04 23:06 +0000
          Re: HTTP and Java Roedy Green <see_website@mindprod.com.invalid> - 2011-04-04 22:17 -0700
          Re: HTTP and Java Roedy Green <see_website@mindprod.com.invalid> - 2011-04-04 22:18 -0700

#2722 — HTTP and Java

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-04-01 23:35 -0700
Subject: HTTP and Java
Message-ID: <kogdp6l1id6c90b2fr1dlufr5k6ukj9ft3@4ax.com>
If Java sent an identical HTTP header, to that sent by a browser,
including User-Agent to a website, is there a plausible mechanism by
which a website would treat the requests differently, namely reject
Java and accept the browser request?
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Doing what the user expects with respect to navigation is absurdly important for user satisfaction.
~ anonymous Google Android developer



#2729

From: Luuk <Luuk@invalid.lan>
Date: 2011-04-02 10:55 +0200
Message-ID: <4d96e489$0$41113$e4fe514c@news.xs4all.nl>
In reply to: #2722
On 02-04-2011 08:35, Roedy Green wrote:
> If Java sent an identical HTTP header, to that sent by a browser,
> including User-Agent to a website, is there a plausible mechanism by
> which a website would treat the requests differently, namely reject
> Java and accept the browser request?

How would this webserver know the difference, if there is none?

I don't think it's possible to do something different if you send the same
information.

-- 
Luuk



#2731

From: Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand>
Date: 2011-04-02 21:58 +1300
Message-ID: <in6ofg$5b5$2@lust.ihug.co.nz>
In reply to: #2722
In message <kogdp6l1id6c90b2fr1dlufr5k6ukj9ft3@4ax.com>, Roedy Green wrote:

> If Java sent an identical HTTP header, to that sent by a browser,
> including User-Agent to a website, is there a plausible mechanism by
> which a website would treat the requests differently, namely reject
> Java and accept the browser request?

Other than by pulling JavaScript tricks, you mean?



#2742

From: Tom Anderson <twic@urchin.earth.li>
Date: 2011-04-02 14:22 +0100
Message-ID: <alpine.DEB.2.00.1104021419470.16545@urchin.earth.li>
In reply to: #2722
On Fri, 1 Apr 2011, Roedy Green wrote:

> If Java sent an identical HTTP header, to that sent by a browser, 
> including User-Agent to a website, is there a plausible mechanism by 
> which a website would treat the requests differently, namely reject Java 
> and accept the browser request?

For a single request, no.

A server might be able to tell the difference between a browser and Java 
by looking at previous interactions, eg if the browser had made some AJAX 
calls that the Java program did not because it was not running JavaScript.

IME, if i have a Java program and a browser behaving differently, it's 
because the request is different in some way i hadn't realised. Get 
Wireshark on the case immediately, and save yourself some puzzlement.

tom

-- 
Freedom is the right of all sentient beings. -- Optimus Prime
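
A minimal sketch of the comparison Tom describes, assuming both requests have already been captured (for example with Wireshark) and their headers copied into maps. The class name HeaderDiff and any header values fed to it are illustrative and not taken from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
final class HeaderDiff {

    /** Lists every header whose value differs between two captured requests. */
    static List<String> diff(Map<String, String> browser, Map<String, String> program) {
        // Header names are case-insensitive in HTTP, so compare them that way.
        Map<String, String> b = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        Map<String, String> p = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        b.putAll(browser);
        p.putAll(program);

        // Union of the header names seen in either request, in sorted order.
        Set<String> names = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        names.addAll(b.keySet());
        names.addAll(p.keySet());

        List<String> differences = new ArrayList<>();
        for (String name : names) {
            String bv = b.get(name); // null means the header was absent
            String pv = p.get(name);
            if (!Objects.equals(bv, pv)) {
                differences.add(name + ": browser=" + bv + " | java=" + pv);
            }
        }
        return differences;
    }
}
```

An empty result only says the header values match. It says nothing about header order, connection reuse, or anything else that happens below the header level, so a packet capture is still the final word.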



#2744

From: Patricia Shanahan <pats@acm.org>
Date: 2011-04-02 06:26 -0700
Message-ID: <NOydne1eIvmMuQrQnZ2dnUVZ_gIAAAAA@earthlink.com>
In reply to: #2722
On 4/1/2011 11:35 PM, Roedy Green wrote:
> If Java sent an identical HTTP header, to that sent by a browser,
> including User-Agent to a website, is there a plausible mechanism by
> which a website would treat the requests differently, namely reject
> Java and accept the browser request?

I don't think it could tell a single request apart, without requiring
the requester to solve e.g. a text recognition problem. It might be able
to detect a difference in number and frequency of requests from a single
IP address, if the Java program sent multiple requests.

Patricia
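
The frequency check Patricia mentions could look roughly like this on the server side. It is only a sketch: the class name RequestRateMonitor, the request limit and the window length are all assumptions, and nothing in the thread shows that the site in question does anything of the kind.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sliding-window counter keyed by client IP address.
final class RequestRateMonitor {
    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Deque<Long>> recent = new HashMap<>();

    RequestRateMonitor(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /**
     * Records one request and reports whether this address has now made
     * more than maxRequests requests within the last windowMillis.
     */
    boolean recordAndCheck(String ip, long nowMillis) {
        Deque<Long> times = recent.computeIfAbsent(ip, k -> new ArrayDeque<>());
        // Forget timestamps that have fallen out of the window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.pollFirst();
        }
        times.addLast(nowMillis);
        return times.size() > maxRequests;
    }
}
```

A scraper that fetches a list page and then every linked detail page in a tight loop would trip a check like this quickly, while a person clicking through the same pages would not.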



#2796

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-04-02 23:17 -0700
Message-ID: <tr3gp6t8tm4ts79qj55he6ib11u3ddv6tl@4ax.com>
In reply to: #2744
On Sat, 02 Apr 2011 06:26:37 -0700, Patricia Shanahan <pats@acm.org>
wrote, quoted or indirectly quoted someone who said :

>I don't think it could tell a single request apart, without requiring
>the requester to solve e.g. a text recognition problem. It might be able
>to detect a difference in number and frequency of requests from a single
>IP address, if the Java program sent multiple requests.

I have sidestepped the problem by going to a different website that
has almost the same information.  That still leaves the puzzle.

The websites in question are www.ecs.com.tw and www.ecsusa.com

The 500 code I get back is supposedly a server side error, not my
fault.

In theory it could be a timeout, or something to do with sending
multiple gets. The first works. Or the timing between gets.  Yet my
experiments seemed to eliminate those causes.  I think I will have to
leave this. It is a black hole without much reward for the solution other
than satisfying curiosity. I just hoped someone would think of some
factor I had not considered.






#2798

From: Luuk <Luuk@invalid.lan>
Date: 2011-04-03 12:58 +0200
Message-ID: <4d9852c0$0$41117$e4fe514c@news.xs4all.nl>
In reply to: #2796
On 03-04-2011 08:17, Roedy Green wrote:
> On Sat, 02 Apr 2011 06:26:37 -0700, Patricia Shanahan <pats@acm.org>
> wrote, quoted or indirectly quoted someone who said :
> 
>> I don't think it could tell a single request apart, without requiring
>> the requester to solve e.g. a text recognition problem. It might be able
>> to detect a difference in number and frequency of requests from a single
>> IP address, if the Java program sent multiple requests.
> 
> I have sidestepped the problem by going to a different website that
> has almost the same information.  That still leaves the puzzle.
> 
> The websites in question are www.ecs.com.tw and www.ecsusa.com
> 
> the 500 code I get back is supposedly a server side error, not my
> fault.

If you get an ERROR 500 from www.ecs.com.tw, then it's your error....

luuk@opensuse:/tmp> wget -S -U "this is a weird browser" http://www.ecs.com.tw/
--2011-04-03 12:56:29--  http://www.ecs.com.tw/
Resolving www.ecs.com.tw... 210.17.27.2
Connecting to www.ecs.com.tw|210.17.27.2|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Content-Length: 123
  Content-Type: text/html
  Content-Location: http://www.ecs.com.tw/index.html
  Last-Modified: Mon, 17 May 2010 13:11:33 GMT
  Accept-Ranges: bytes
  ETag: "8ed7ab78c2f5ca1:aa4"
  Server: Microsoft-IIS/6.0
  X-Powered-By: ASP.NET
  Date: Sun, 03 Apr 2011 10:57:04 GMT
  Connection: keep-alive
Length: 123 [text/html]
Saving to: `index.html.12'



> 
> In theory it could be a timeout, or something to do with sending
> multiple gets. 

No, a timeout should give a different error.



-- 
Luuk



#2819

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-04-03 16:47 -0700
Message-ID: <nl1ip6t86vuiv9nu2hh9bdhl9oelc4so6c@4ax.com>
In reply to: #2798
On Sun, 03 Apr 2011 12:58:07 +0200, Luuk <Luuk@invalid.lan> wrote,
quoted or indirectly quoted someone who said :

>If you get an ERROR 500 from www.ecs.com.tw, then it's your error....

according to http://www.checkupdown.com/status/E500.html

"This error can only be resolved by fixes to the Web server software.
It is not a client-side problem. It is up to the operators of the Web
server site to locate and analyse the logs which should give further
information about the error."

Yet clearly it was something I was doing as a client that triggered
it.



#2772

From: Joshua Cranmer <Pidgeot18@verizon.invalid>
Date: 2011-04-02 16:27 -0400
Message-ID: <in80r9$dbh$2@dont-email.me>
In reply to: #2722
On 04/02/2011 02:35 AM, Roedy Green wrote:
> If Java sent an identical HTTP header, to that sent by a browser,
> including User-Agent to a website, is there a plausible mechanism by
> which a website would treat the requests differently, namely reject
> Java and accept the browser request?

A plausible technique is to separate out the real content into
things which require further loading--like requiring client-side 
scripting, cookies, use of CSS, iframes, images, embedded plugins (e.g., 
Flash or Silverlight), client-side redirects. You can then ensure that 
these things are also downloaded before accepting other page requests.

An additional factor could be if browsers actually use HTTP differently
from Java... e.g., if one pipelines and the other doesn't, or perhaps 
automatic SSL upgrading, etc. It all comes down to how many false 
negatives the server is willing to bear.

-- 
Beware of bugs in the above code; I have only proved it correct, not 
tried it. -- Donald E. Knuth
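
The first technique Joshua describes might be sketched as follows, assuming the server can tie requests to a client by some identifier such as a session cookie or an IP address. SubresourceGate, its method names and the idea of a fixed set of required paths are all hypothetical; the thread gives no evidence of how any particular site implements this.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical server-side bookkeeping: has this client fetched the
// stylesheets, scripts and images that a real browser would have loaded?
final class SubresourceGate {
    private final Set<String> required;
    private final Map<String, Set<String>> fetchedByClient = new HashMap<>();

    SubresourceGate(Set<String> requiredPaths) {
        this.required = new HashSet<>(requiredPaths);
    }

    /** Call for every request; only paths in the required set are remembered. */
    void recordFetch(String clientId, String path) {
        if (required.contains(path)) {
            fetchedByClient.computeIfAbsent(clientId, k -> new HashSet<>()).add(path);
        }
    }

    /** True once the client has fetched every required subresource. */
    boolean looksLikeBrowser(String clientId) {
        Set<String> fetched =
                fetchedByClient.getOrDefault(clientId, Collections.<String>emptySet());
        return fetched.containsAll(required);
    }
}
```

A Java program that requests only the HTML pages would never satisfy looksLikeBrowser, even with headers identical to a browser's. A browser with images or scripts disabled would fail the same check, which is the cost in misclassified clients that Joshua alludes to.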



#2803

From: Arved Sandstrom <asandstrom3minus1@eastlink.ca>
Date: 2011-04-03 09:50 -0300
Message-ID: <y2_lp.5804$0s5.3652@newsfe17.iad>
In reply to: #2722
On 11-04-02 03:35 AM, Roedy Green wrote:
> If Java sent an identical HTTP header, to that sent by a browser,
> including User-Agent to a website, is there a plausible mechanism by
> which a website would treat the requests differently, namely reject
> Java and accept the browser request?

When curling that first URL (www.ecs.com.tw), I get a

<meta http-equiv="Refresh"
content="0;url=http://www.ecs.com.tw/ECSWebSite/Index.aspx">

A curl on that gives me a 302 to

http://www.ecs.com.tw/ECSWebSite/Index.aspx?MenuID=0&amp;LanID=0

which when fetched (a curl -L on the 302-producing page) is a "real" page.

I think it's simply that your Java code is incorrect. Every request
above produces an <html>...</html> page.

AHS

-- 
That's not the recollection that I recall...All this information is
certainly in the hands of the auditor and we certainly await his report
to indicate what he deems has occurred.
-- Halifax, Nova Scotia mayor Peter Kelly, who is currently deeply in
the shit
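
HttpURLConnection follows HTTP 3xx redirects by default, but it does not interpret HTML, so a meta refresh like the one Arved found has to be followed by the calling code. Below is a rough sketch of pulling the target out of the page text. MetaRefresh is a made-up class name, and the regular expression is deliberately simple: it assumes http-equiv comes before content and a quoted content attribute, which matches the tag quoted above but not every page in the wild.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for finding the target of a meta refresh.
final class MetaRefresh {
    // Matches e.g. <meta http-equiv="Refresh" content="0;url=http://host/page">
    private static final Pattern META_REFRESH = Pattern.compile(
            "<meta\\s+http-equiv\\s*=\\s*[\"']?refresh[\"']?\\s+"
                    + "content\\s*=\\s*[\"']\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>\\s]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the refresh target URL if the page contains a meta refresh. */
    static Optional<String> target(String html) {
        Matcher m = META_REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

A program that stops at the first response and never follows this hop ends up parsing a 123-byte stub page instead of the real content, which can look like the site treating Java differently from a browser.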



#2821

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-04-03 16:51 -0700
Message-ID: <cp1ip61i4414vgbvvnlm8cnlvvjaca0prq@4ax.com>
In reply to: #2803
On Sun, 03 Apr 2011 09:50:38 -0300, Arved Sandstrom
<asandstrom3minus1@eastlink.ca> wrote, quoted or indirectly quoted
someone who said :

>When curling that first URL (www.ecs.com.tw), I get a

Sorry, that is the website where the problem is, but not the URL.  You
need to try a specific motherboard e.g. 
http://www.ecs.com.tw/ECSWebSite/Product/Product_Model.aspx?CategoryID=1&TypeID=68&MenuID=19&LanID=0#Socket%20AM3
then
http://www.ecs.com.tw/ECSWebSite/Product/Product_Detail.aspx?DetailID=1115&CategoryID=1&DetailName=Feature&MenuID=19&LanID=0



#2854

From: Tom Anderson <twic@urchin.earth.li>
Date: 2011-04-04 23:25 +0100
Message-ID: <alpine.DEB.2.00.1104042323360.5957@urchin.earth.li>
In reply to: #2821
On Sun, 3 Apr 2011, Roedy Green wrote:

> On Sun, 03 Apr 2011 09:50:38 -0300, Arved Sandstrom
> <asandstrom3minus1@eastlink.ca> wrote, quoted or indirectly quoted
> someone who said :
>
>> When curling that first URL (www.ecs.com.tw), I get a
>
> Sorry, that is the website where the problem is, but not the URL.  You
> need to try a specific motherboard e.g.
> http://www.ecs.com.tw/ECSWebSite/Product/Product_Model.aspx?CategoryID=1&TypeID=68&MenuID=19&LanID=0#Socket%20AM3
> then
> http://www.ecs.com.tw/ECSWebSite/Product/Product_Detail.aspx?DetailID=1115&CategoryID=1&DetailName=Feature&MenuID=19&LanID=0

I can download both URLs perfectly fine with both curl and
URLConnection. The data downloaded is the same for both means.

tom

-- 
Your words are mostly meaningless symbols -- Andrew, to Niall



#2859

From: Ian Shef <invalid@avoiding.spam>
Date: 2011-04-04 23:06 +0000
Message-ID: <Xns9EBDA3D0E3A77vaj4088ianshef@138.125.254.103>
In reply to: #2821
Roedy Green <see_website@mindprod.com.invalid> wrote in
news:cp1ip61i4414vgbvvnlm8cnlvvjaca0prq@4ax.com: 

<snip>
> Sorry, that is the website where the problem is, but not the URL.  You
> need to try a specific motherboard e.g. 
> http://www.ecs.com.tw/ECSWebSite/Product/Product_Model.aspx?CategoryID=1&TypeID=68&MenuID=19&LanID=0#Socket%20AM3
> then
> http://www.ecs.com.tw/ECSWebSite/Product/Product_Detail.aspx?DetailID=1115&CategoryID=1&DetailName=Feature&MenuID=19&LanID=0

Why do both?  The second one alone gets me the same result.

An HTTP GET without cookies or other mechanisms (e.g. Javascript, a session 
identifier in the URL, etc.) is stateless.

I have found WebScarab 

http://www.owasp.org/index.php/Category:OWASP_WebScarab_Project

useful for investigating these issues.  It's like using a cannon to kill a 
fly, but the cannon is free.
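
Cookies are the usual way state leaks into an otherwise stateless GET, and a browser sends them back automatically while a bare URLConnection does not. The sketch below uses java.net.CookieManager to show which Cookie header would accompany a second request after a first response set a cookie. CookieCarryOver is a made-up class name, and any cookie names or URLs passed to it are placeholders; whether the ECS site depends on a session cookie is not established in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical demonstration of cookie carry-over between two requests.
final class CookieCarryOver {

    /**
     * Feeds one Set-Cookie header from a response to 'first' into a fresh
     * CookieManager, then returns the Cookie values it would attach to a
     * request for 'second'.
     */
    static List<String> cookiesForSecondRequest(URI first, String setCookie, URI second) {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        try {
            Map<String, List<String>> responseHeaders =
                    Collections.singletonMap("Set-Cookie", Collections.singletonList(setCookie));
            manager.put(first, responseHeaders);

            Map<String, List<String>> requestHeaders =
                    manager.get(second, Collections.<String, List<String>>emptyMap());
            List<String> cookies = requestHeaders.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real program, calling CookieHandler.setDefault(new CookieManager()) once at startup makes HttpURLConnection do this bookkeeping itself, so that a detail page fetched after a list page carries the same cookies a browser would send.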



#2876

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-04-04 22:17 -0700
Message-ID: <jc9lp6dh0eu9dcsh4u9g741evs0cttu1dv@4ax.com>
In reply to: #2859
On Mon, 04 Apr 2011 23:06:13 GMT, Ian Shef <invalid@avoiding.spam>
wrote, quoted or indirectly quoted someone who said :

>Why do both?  The second one alone gets me the same result

To reproduce what I am doing. I read the first page and collect
information from it to create the individual motherboard URLs.

I suspect the problem has something to do with the order of reading
pages.



#2877

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-04-04 22:18 -0700
Message-ID: <8g9lp6dd58ces8i4u2bkieefv2608g3efh@4ax.com>
In reply to: #2859
On Mon, 04 Apr 2011 23:06:13 GMT, Ian Shef <invalid@avoiding.spam>
wrote, quoted or indirectly quoted someone who said :

>
>An HTTP GET without cookies or other mechanisms (e.g. Javascript, a session 
>identifier in the URL, etc.) is stateless.

It is supposed to be.  I suspect caching, reuse of connections ...
might mean it is not quite stateless.
