ONJava.com -- The Independent Source for Enterprise Java

URLs and URIs, Proxies and Passwords

Resolving Relative URIs

The URI class has three methods for converting back and forth between relative and absolute URIs.

public URI resolve(URI uri)

This method compares the uri argument to this URI and uses it to construct a new URI object that wraps an absolute URI. For example, consider these three lines of code:

URI absolute = new URI("http://www.example.com/");
URI relative = new URI("images/logo.png");
URI resolved = absolute.resolve(relative);

After they've executed, resolved contains the absolute URI http://www.example.com/images/logo.png.

If the invoking URI does not contain an absolute URI itself, the resolve( ) method resolves as much of the URI as it can and returns a new relative URI object as a result. For example, take these three statements:

URI top = new URI("javafaq/books/");
URI relative = new URI("jnp3/examples/07/index.html");
URI resolved = top.resolve(relative);

After they've executed, resolved now contains the relative URI javafaq/books/jnp3/examples/07/index.html with no scheme or authority.

public URI resolve(String uri)

This is a convenience method that simply converts the string argument to a URI and then resolves it against the invoking URI, returning a new URI object as the result. That is, it's equivalent to resolve(URI.create(uri)). Using this method, the previous two samples can be rewritten as:

URI absolute = new URI("http://www.example.com/");
URI resolved = absolute.resolve("images/logo.png");
URI top = new URI("javafaq/books/");
resolved = top.resolve("jnp3/examples/07/index.html");
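To try both overloads in one place, here is a minimal, self-contained sketch. The class name ResolveDemo and its helper methods are made up for illustration, and URI.create( ) is used instead of the constructor only so that the helpers don't have to declare the checked URISyntaxException. The third call is an extra case not discussed above: when the base path doesn't end in a slash, its last segment is replaced rather than extended.

```java
import java.net.URI;

class ResolveDemo {

    // Resolves a reference against a base using resolve(URI).
    static URI resolve(String base, String reference) {
        return URI.create(base).resolve(URI.create(reference));
    }

    // Same operation through the resolve(String) convenience overload.
    static URI resolveString(String base, String reference) {
        return URI.create(base).resolve(reference);
    }

    public static void main(String[] args) {
        // Absolute base: the result is absolute.
        System.out.println(resolve("http://www.example.com/", "images/logo.png"));
        // prints http://www.example.com/images/logo.png

        // Relative base: the result is still relative.
        System.out.println(resolveString("javafaq/books/", "jnp3/examples/07/index.html"));
        // prints javafaq/books/jnp3/examples/07/index.html

        // Base path without a trailing slash: the last segment is replaced.
        System.out.println(resolve("http://www.example.com/a/b.html", "c.html"));
        // prints http://www.example.com/a/c.html
    }
}
```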

public URI relativize(URI uri)

It's also possible to reverse this procedure; that is, to go from an absolute URI to a relative one. The relativize( ) method creates a new URI object from the uri argument that is relative to the invoking URI. The argument is not changed. For example:

URI absolute = new URI("http://www.example.com/images/logo.png");
URI top = new URI("http://www.example.com/");
URI relative = top.relativize(absolute);

The URI object relative now contains the relative URI images/logo.png.
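The following sketch wraps this in a runnable class and adds two checks that go slightly beyond the text: resolving the relative URI against the same base gets you back to the original, and relativize( ) hands back its argument unchanged when the two URIs don't share a scheme and authority. The class name RelativizeDemo and the www.example.org host are illustrative assumptions.

```java
import java.net.URI;

class RelativizeDemo {

    // Expresses target relative to base, where that is possible.
    static URI relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target));
    }

    public static void main(String[] args) {
        URI top = URI.create("http://www.example.com/");
        URI absolute = URI.create("http://www.example.com/images/logo.png");

        URI relative = top.relativize(absolute);
        System.out.println(relative);
        // prints images/logo.png

        // resolve( ) undoes relativize( ) for the same base.
        System.out.println(top.resolve(relative).equals(absolute));
        // prints true

        // Different authority: the argument comes back unchanged.
        System.out.println(relativize("http://www.example.com/",
                                      "http://www.example.org/images/logo.png"));
        // prints http://www.example.org/images/logo.png
    }
}
```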

Utility Methods

The URI class has the usual batch of utility methods: equals( ), hashCode( ), toString( ), and compareTo( ).

public boolean equals(Object o)

URIs are tested for equality pretty much as you'd expect. It's not a direct string comparison. Equal URIs must be either both hierarchical or both opaque. The scheme and the host part of the authority are compared without considering case. That is, http and HTTP are the same scheme, and www.example.com is the same authority as www.EXAMPLE.com. The rest of the URI is case-sensitive, except for the hexadecimal digits used in percent escapes. Escapes are not decoded before comparing, so http://www.example.com/A and http://www.example.com/%41 are unequal URIs.
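These rules are easy to check for yourself. The sketch below is only an illustration; the class name EqualsDemo, the helper method, and the caf%c3%a9 path are assumptions chosen to exercise each rule in turn.

```java
import java.net.URI;

class EqualsDemo {

    // Compares two URI strings using URI.equals( ).
    static boolean same(String first, String second) {
        return URI.create(first).equals(URI.create(second));
    }

    public static void main(String[] args) {
        // Scheme and host ignore case.
        System.out.println(same("http://www.example.com/A", "HTTP://www.EXAMPLE.com/A"));
        // prints true

        // The path is case-sensitive.
        System.out.println(same("http://www.example.com/A", "http://www.example.com/a"));
        // prints false

        // Escapes are not decoded before comparing.
        System.out.println(same("http://www.example.com/A", "http://www.example.com/%41"));
        // prints false

        // The hexadecimal digits inside an escape ignore case.
        System.out.println(same("http://www.example.com/caf%c3%a9",
                                "http://www.example.com/caf%C3%A9"));
        // prints true
    }
}
```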

public int hashCode( )

The hashCode( ) method is an ordinary hashCode( ) method, nothing special. Equal URIs have the same hash code, and unequal URIs are fairly unlikely to share the same hash code.

public int compareTo(Object o)

URIs can be ordered. The ordering is based on string comparison of the individual parts, in this sequence:

  • If the schemes are different, the schemes are compared, without considering case.

  • Otherwise, if the schemes are the same, a hierarchical URI is considered to be less than an opaque URI with the same scheme.

  • If both URIs are opaque URIs, they're ordered according to their scheme-specific parts.

  • If both the scheme and the opaque scheme-specific parts are equal, the URIs are compared by their fragments.

  • If both URIs are hierarchical, they're ordered according to their authority components, which are themselves ordered according to user info, host, and port, in that order.

  • If the schemes and the authorities are equal, the path is used to distinguish them.

  • If the paths are also equal, the query strings are compared.

  • If the query strings are equal, the fragments are compared.

URIs are not comparable to any type except themselves. Comparing a URI to anything except another URI causes a ClassCastException.
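Since URI implements Comparable, the simplest way to watch this ordering in action is to sort a list. The following sketch assumes Java 5 or later, where the list can be typed as List<URI>; the class name CompareDemo and the sample URIs are invented for the demonstration. Note that a URI with no query string sorts ahead of an otherwise identical URI that has one.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompareDemo {

    // Parses the strings and sorts them by URI's natural ordering.
    static List<URI> sorted(String... strings) {
        List<URI> uris = new ArrayList<URI>();
        for (String s : strings) {
            uris.add(URI.create(s));
        }
        Collections.sort(uris);
        return uris;
    }

    public static void main(String[] args) {
        List<URI> uris = sorted(
            "mailto:user@example.com",
            "http://www.example.com/b",
            "http://www.example.com/a?x=1",
            "ftp://ftp.example.com/",
            "http://www.example.com/a");

        for (URI uri : uris) {
            System.out.println(uri);
        }
        // Prints, in this order:
        // ftp://ftp.example.com/          (scheme ftp sorts first)
        // http://www.example.com/a        (same authority, path /a, no query)
        // http://www.example.com/a?x=1    (same path, query decides)
        // http://www.example.com/b        (path /b sorts after /a)
        // mailto:user@example.com         (scheme mailto sorts last)
    }
}
```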

public String toString( )

The toString( ) method returns an unencoded string form of the URI. That is, characters like é and \ are not percent-escaped unless they were percent-escaped in the strings used to construct this URI. Therefore, the result of calling this method is not guaranteed to be a syntactically correct URI. This form is sometimes useful for display to human beings, but not for retrieval.

public String toASCIIString( )

The toASCIIString( ) method returns an encoded string form of the URI. Characters like é and \ are always percent-escaped whether or not they were originally escaped. This is the string form of the URI you should use most of the time. Even if the form returned by toString( ) is more legible for humans, they may still copy and paste it into areas that are not expecting an illegal URI. toASCIIString( ) always returns a syntactically correct URI.
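The difference between the two forms is easiest to see side by side. This sketch uses only the é case; the class name StringFormsDemo, the helper, and the /café path are illustrative assumptions. The non-ASCII character is written as a Unicode escape in the source so the file compiles regardless of its encoding.

```java
import java.net.URI;

class StringFormsDemo {

    // Returns { toString( ) form, toASCIIString( ) form } for a URI string.
    static String[] bothForms(String uri) {
        URI u = URI.create(uri);
        return new String[] { u.toString(), u.toASCIIString() };
    }

    public static void main(String[] args) {
        // \u00e9 is the letter e with an acute accent.
        String[] forms = bothForms("http://www.example.com/caf\u00e9");

        // Display form: the accented letter is left as it was given.
        System.out.println(forms[0]);

        // Retrieval form: the letter is percent-escaped as UTF-8 bytes.
        System.out.println(forms[1]);
        // prints http://www.example.com/caf%C3%A9
    }
}
```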

Proxies

Many systems access the Web and sometimes other non-HTTP parts of the Internet through proxy servers. A proxy server receives a request for a remote server from a local client. The proxy server makes the request to the remote server and forwards the result back to the local client. Sometimes this is done for security reasons, such as to prevent remote hosts from learning private details about the local network configuration. Other times it's done to prevent users from accessing forbidden sites by filtering outgoing requests and limiting which sites can be viewed. For instance, an elementary school might want to block access to http://www.playboy.com. And still other times it's done purely for performance, to allow multiple users to retrieve the same popular documents from a local cache rather than making repeated downloads from the remote server.

Java programs based on the URL class can work through most common proxy servers and protocols. Indeed, this is one reason you might want to choose to use the URL class rather than rolling your own HTTP or other client on top of raw sockets.

System Properties

For basic operations, all you have to do is set a few system properties to point to the addresses of your local proxy servers. If you are using a pure HTTP proxy, set http.proxyHost to the domain name or the IP address of your proxy server and http.proxyPort to the port of the proxy server (the default is 80). There are several ways to do this, including calling System.setProperty() from within your Java code or using the -D options when launching the program. This example sets the proxy server to 192.168.254.254 and the port to 9000:

% java -Dhttp.proxyHost=192.168.254.254 -Dhttp.proxyPort=9000 com.domain.Program

If you want to exclude a host from being proxied and connect directly instead, set the http.nonProxyHosts system property to its hostname or IP address. To exclude multiple hosts, separate their names by vertical bars. For example, this code fragment proxies everything except java.oreilly.com and xml.oreilly.com:

System.setProperty("http.proxyHost", "192.168.254.254");
System.setProperty("http.proxyPort", "9000");
System.setProperty("http.nonProxyHosts", "java.oreilly.com|xml.oreilly.com");

You can also use an asterisk as a wildcard to indicate that all the hosts within a particular domain or subdomain should not be proxied. For example, to proxy everything except hosts in the oreilly.com domain:

% java -Dhttp.proxyHost=192.168.254.254 -Dhttp.nonProxyHosts="*.oreilly.com" com.domain.Program

If you are using an FTP proxy server, set the ftp.proxyHost, ftp.proxyPort, and ftp.nonProxyHosts properties in the same way.

Java does not support any other application layer proxies, but if you're using a transport layer SOCKS proxy for all TCP connections, you can identify it with the socksProxyHost and socksProxyPort system properties. Java does not provide an option for nonproxying with SOCKS. It's an all-or-nothing decision.

The Proxy Class

Java 1.5 allows more fine-grained control of proxy servers from within a Java program. Specifically, it lets you choose different proxy servers for different remote hosts. The proxies themselves are represented by instances of the java.net.Proxy class. There are still only three kinds of proxies: HTTP, SOCKS, and direct connections (no proxy at all). They are represented by three constants in the Proxy.Type enum:

  • Proxy.Type.DIRECT

  • Proxy.Type.HTTP

  • Proxy.Type.SOCKS

Besides its type, the other important piece of information about a proxy is its address and port, given as a SocketAddress object. For example, this code fragment creates a Proxy object representing an HTTP proxy server on port 80 of proxy.example.com:

SocketAddress address = new InetSocketAddress("proxy.example.com", 80);
Proxy proxy = new Proxy(Proxy.Type.HTTP, address);

Although there are only three kinds of proxy objects, there can be many proxies of the same type for different proxy servers on different hosts.
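As a rough sketch of how a Proxy object might then be used, the class below builds one and passes it to URL.openConnection(Proxy), which applies that proxy to a single connection. The class name ProxyDemo and its helper are hypothetical. It also departs from the fragment above in one respect: it calls InetSocketAddress.createUnresolved( ) so that no DNS lookup of proxy.example.com happens when the address is created. Nothing is sent over the network here, because a URLConnection doesn't connect until you ask it to.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.SocketAddress;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

class ProxyDemo {

    // Builds an HTTP proxy description without resolving the host name.
    static Proxy httpProxy(String host, int port) {
        SocketAddress address = InetSocketAddress.createUnresolved(host, port);
        return new Proxy(Proxy.Type.HTTP, address);
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = httpProxy("proxy.example.com", 80);
        System.out.println(proxy.type());
        // prints HTTP

        // Proxy.NO_PROXY is the predefined direct-connection proxy.
        System.out.println(Proxy.NO_PROXY.type());
        // prints DIRECT

        // Route just this one connection through the proxy.
        // The connection is only prepared here, not opened.
        URL url = URI.create("http://www.example.com/").toURL();
        URLConnection connection = url.openConnection(proxy);
        System.out.println(connection.getURL());
        // prints http://www.example.com/
    }
}
```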

The ProxySelector Class

Each running Java 1.5 virtual machine has a single java.net.ProxySelector object it uses to locate the proxy server for different connections. The default ProxySelector merely inspects the various system properties and the URL's protocol to decide how to connect to different hosts. However, you can install your own subclass of ProxySelector in place of the default selector and use it to choose different proxies based on protocol, host, path, time of day, or other criteria.

The key to this class is the abstract select( ) method:

public abstract List<Proxy> select(URI uri)

Java passes this method a URI object (not a URL object) representing the host to which a connection is needed. For a connection made with the URL class, this object typically has the form http://www.example.com/ or ftp://ftp.example.com/pub/files/, or some such. For a pure TCP connection made with the Socket class, this URI will have the form socket://host:port, for instance, socket://www.example.com:80. The ProxySelector object then chooses the right proxies for this type of object and returns them in a List<Proxy>.
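You can call select( ) yourself to see what the default selector would choose. The sketch below sets the http.proxyHost, http.proxyPort, and http.nonProxyHosts properties discussed earlier and then asks the default selector about two URIs. It assumes that nothing else in the virtual machine has replaced the default selector and that the java.net.useSystemProxies property is not set; the class name DefaultSelectorDemo is made up. The printed form of a Proxy varies between Java versions, so the example checks the proxy type instead.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

class DefaultSelectorDemo {

    // Asks the VM-wide default selector which proxies apply to a URI.
    static List<Proxy> proxiesFor(String uri) {
        return ProxySelector.getDefault().select(URI.create(uri));
    }

    public static void main(String[] args) {
        System.setProperty("http.proxyHost", "192.168.254.254");
        System.setProperty("http.proxyPort", "9000");
        System.setProperty("http.nonProxyHosts", "*.oreilly.com");

        // An ordinary host should be routed through the HTTP proxy.
        List<Proxy> proxied = proxiesFor("http://www.example.com/");
        System.out.println(proxied.get(0).type());

        // A host matching http.nonProxyHosts should get a direct connection.
        List<Proxy> direct = proxiesFor("http://java.oreilly.com/");
        System.out.println(direct.get(0).type());
    }
}
```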

The second abstract method in this class you must implement is connectFailed( ):

public abstract void connectFailed(URI uri, SocketAddress address, IOException ex)

This is a callback method used to warn a program that the proxy server isn't actually making the connection. Example 7-11 demonstrates with a ProxySelector that attempts to use the proxy server at proxy.example.com for all HTTP connections unless the proxy server has previously failed to resolve a connection to a particular URL. In that case, it suggests a direct connection instead.
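The listing for Example 7-11 is not reproduced on this page, so what follows is not that example but a minimal sketch of the same idea. The class name LocalProxySelector is borrowed from the setDefault( ) fragment in this section; the proxy port (80), the use of a concurrent set to remember failures, and the choice to connect directly for non-HTTP schemes are all assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class LocalProxySelector extends ProxySelector {

    // URIs for which the proxy has already failed once.
    private final Set<URI> failed = ConcurrentHashMap.newKeySet();

    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        // Connect directly for non-HTTP schemes and for URIs that failed before.
        if (failed.contains(uri) || !"http".equalsIgnoreCase(uri.getScheme())) {
            return Collections.singletonList(Proxy.NO_PROXY);
        }
        // Assumed proxy location; createUnresolved avoids a DNS lookup here.
        SocketAddress address =
            InetSocketAddress.createUnresolved("proxy.example.com", 80);
        return Collections.singletonList(new Proxy(Proxy.Type.HTTP, address));
    }

    @Override
    public void connectFailed(URI uri, SocketAddress address, IOException ex) {
        // Remember the failure so the next select( ) suggests a direct connection.
        failed.add(uri);
    }

    public static void main(String[] args) {
        LocalProxySelector selector = new LocalProxySelector();
        URI uri = URI.create("http://www.example.com/");

        System.out.println(selector.select(uri).get(0).type());
        // prints HTTP

        selector.connectFailed(uri, null, new IOException("proxy unreachable"));
        System.out.println(selector.select(uri).get(0).type());
        // prints DIRECT
    }
}
```

Because the set of failures is keyed on the whole URI, a failure for one URL doesn't affect other URLs on the same host. Keying on the host instead would be an equally reasonable design.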

As I already said, each running virtual machine has exactly one ProxySelector. To change the ProxySelector, pass the new selector to the static ProxySelector.setDefault( ) method, like so:

ProxySelector selector = new LocalProxySelector( );
ProxySelector.setDefault(selector);

From this point forward, all connections opened by that virtual machine will ask the ProxySelector for the right proxy to use. You normally shouldn't use this in code running in a shared environment. For instance, you wouldn't change the ProxySelector in a servlet because that would change the ProxySelector for all servlets running in the same container.
