Haskell Hierarchical Libraries (network package)
Network.URI
Portability: portable
Stability: provisional
Maintainer: Graham Klyne <gk@ninebynine.org>
Contents
The URI type
Parsing
Test for strings containing various kinds of URI
Relative URIs
Operations on URI strings
URI Normalization functions
Deprecated functions
Description

This module defines functions for handling URIs. It presents substantially the same interface as the older GHC Network.URI module, but is implemented using Parsec rather than a Regex library that is not available with Hugs. The internal representation of URI has been changed so that URI strings are more completely preserved when round-tripping to a URI value and back.

In addition, four functions are provided for parsing different kinds of URI string (as noted in RFC3986): parseURI, parseURIReference, parseRelativeReference and parseAbsoluteURI.

Further, four functions are provided for classifying different kinds of URI string (as noted in RFC3986): isURI, isURIReference, isRelativeReference and isAbsoluteURI.

The long-standing official reference for URI handling was RFC2396 [1], as updated by RFC2732 [2]; this was replaced in January 2005 by a new specification, RFC3986 [3]. RFC3986 has been used as the primary reference for constructing the URI parser implemented here, and the parser is intended to correspond directly to the syntax definition in that document.

RFC 1808 [4] contains a number of test cases for relative URI handling. Dan Connolly's Python module uripath.py [5] also contains useful details and test cases.

Some of the code has been copied from the previous GHC implementation, but the parser is replaced with one that performs more complete syntax checking of the URI itself, according to RFC3986 [3].

References

  1. http://www.ietf.org/rfc/rfc2396.txt
  2. http://www.ietf.org/rfc/rfc2732.txt
  3. http://www.ietf.org/rfc/rfc3986.txt
  4. http://www.ietf.org/rfc/rfc1808.txt
  5. http://www.w3.org/2000/10/swap/uripath.py
Synopsis
data URI = URI {
    uriScheme :: String,
    uriAuthority :: (Maybe URIAuth),
    uriPath :: String,
    uriQuery :: String,
    uriFragment :: String
}
data URIAuth = URIAuth {
    uriUserInfo :: String,
    uriRegName :: String,
    uriPort :: String
}
nullURI :: URI
parseURI :: String -> Maybe URI
parseURIReference :: String -> Maybe URI
parseRelativeReference :: String -> Maybe URI
parseAbsoluteURI :: String -> Maybe URI
isURI :: String -> Bool
isURIReference :: String -> Bool
isRelativeReference :: String -> Bool
isAbsoluteURI :: String -> Bool
isIPv6address :: String -> Bool
isIPv4address :: String -> Bool
relativeTo :: URI -> URI -> Maybe URI
nonStrictRelativeTo :: URI -> URI -> Maybe URI
relativeFrom :: URI -> URI -> URI
uriToString :: (String -> String) -> URI -> ShowS
isReserved :: Char -> Bool
isUnreserved :: Char -> Bool
isAllowedInURI :: Char -> Bool
isUnescapedInURI :: Char -> Bool
escapeURIChar :: (Char -> Bool) -> Char -> String
escapeURIString :: (Char -> Bool) -> String -> String
unEscapeString :: String -> String
normalizeCase :: String -> String
normalizeEscape :: String -> String
normalizePathSegments :: String -> String
parseabsoluteURI :: String -> Maybe URI
escapeString :: String -> (Char -> Bool) -> String
reserved :: Char -> Bool
unreserved :: Char -> Bool
scheme :: URI -> String
authority :: URI -> String
path :: URI -> String
query :: URI -> String
fragment :: URI -> String
The URI type
data URI

Represents a general Uniform Resource Identifier (URI) using its component parts.

For example, for the URI

   foo://anonymous@www.haskell.org:42/ghc?query#frag

the components are:

Constructors
URI
    uriScheme :: String                 foo:
    uriAuthority :: (Maybe URIAuth)     //anonymous@www.haskell.org:42
    uriPath :: String                   /ghc
    uriQuery :: String                  ?query
    uriFragment :: String               #frag
data URIAuth
Type for authority value within a URI
Constructors
URIAuth
    uriUserInfo :: String               anonymous@
    uriRegName :: String                www.haskell.org
    uriPort :: String                   :42
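The component breakdown above can be checked directly. The following is a minimal sketch, assuming the field names listed above; exampleText, describeURI and exampleComponents are illustrative names and not part of this module.

```haskell
import Network.URI (URI (..), URIAuth (..), parseURI)

-- The example URI used in the text above.
exampleText :: String
exampleText = "foo://anonymous@www.haskell.org:42/ghc?query#frag"

-- List each component alongside its field name. The authority is
-- optional, so its three sub-fields appear only when it is present.
describeURI :: URI -> [(String, String)]
describeURI u =
  [("uriScheme", uriScheme u)]
    ++ maybe [] describeAuth (uriAuthority u)
    ++ [ ("uriPath", uriPath u)
       , ("uriQuery", uriQuery u)
       , ("uriFragment", uriFragment u)
       ]
  where
    describeAuth a =
      [ ("uriUserInfo", uriUserInfo a)
      , ("uriRegName", uriRegName a)
      , ("uriPort", uriPort a)
      ]

-- Nothing if the example text fails to parse as an absolute URI.
exampleComponents :: Maybe [(String, String)]
exampleComponents = fmap describeURI (parseURI exampleText)
```

Note that each component keeps its delimiter (the trailing ":" of the scheme, the leading "?" of the query, and so on), so concatenating the components in order reproduces the original string.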
nullURI :: URI
Blank URI
Parsing
parseURI :: String -> Maybe URI

Turn a string containing a URI (an absolute URI with optional fragment identifier) into a URI value. Returns Nothing if the string is not a valid URI.

NOTE: this is different from the previous Network.URI, whose parseURI function works like parseURIReference in this module.

parseURIReference :: String -> Maybe URI
Parse a URI reference (an absolute or relative URI with optional fragment identifier) to a URI value. Returns Nothing if the string is not a valid URI reference.
parseRelativeReference :: String -> Maybe URI
Parse a relative reference (a relative URI with optional fragment identifier) to a URI value. Returns Nothing if the string is not a valid relative reference.
parseAbsoluteURI :: String -> Maybe URI
Parse an absolute URI (one without a fragment identifier) to a URI value. Returns Nothing if the string is not a valid absolute URI.
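To see how the four parsers differ, it can help to run the same string through all of them. A minimal sketch follows; parsers and acceptedBy are illustrative names and not part of this module.

```haskell
import Data.Maybe (isJust)
import Network.URI
  ( URI
  , parseAbsoluteURI
  , parseRelativeReference
  , parseURI
  , parseURIReference
  )

-- The four parsers, paired with their names for reporting.
parsers :: [(String, String -> Maybe URI)]
parsers =
  [ ("parseURI", parseURI)
  , ("parseURIReference", parseURIReference)
  , ("parseRelativeReference", parseRelativeReference)
  , ("parseAbsoluteURI", parseAbsoluteURI)
  ]

-- Names of the parsers that accept the given string.
acceptedBy :: String -> [String]
acceptedBy s = [name | (name, parse) <- parsers, isJust (parse s)]
```

For instance, acceptedBy "../a/b" should list only parseURIReference and parseRelativeReference, while acceptedBy "http://example.org/a#frag" should list only parseURI and parseURIReference, because parseAbsoluteURI rejects the fragment.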
Test for strings containing various kinds of URI
isURI :: String -> Bool
Test if string contains a valid URI (an absolute URI with optional fragment identifier).
isURIReference :: String -> Bool
Test if string contains a valid URI reference (an absolute or relative URI with optional fragment identifier).
isRelativeReference :: String -> Bool
Test if string contains a valid relative URI (a relative URI with optional fragment identifier).
isAbsoluteURI :: String -> Bool
Test if string contains a valid absolute URI (an absolute URI without a fragment identifier).
isIPv6address :: String -> Bool
Test if string contains a valid IPv6 address
isIPv4address :: String -> Bool
Test if string contains a valid IPv4 address
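The predicates above can be combined to sort input strings into categories. This is a sketch only; the Kind type and the kindOf and isIPLiteral helpers are hypothetical names introduced for illustration.

```haskell
import Network.URI
  ( isAbsoluteURI
  , isIPv4address
  , isIPv6address
  , isRelativeReference
  , isURI
  )

-- Illustrative categories for a candidate URI string.
data Kind
  = AbsoluteNoFragment
  | AbsoluteWithFragment
  | RelativeRef
  | Invalid
  deriving (Eq, Show)

-- Guards are ordered from most to least specific: every string that
-- satisfies isAbsoluteURI also satisfies isURI, so a string reaching the
-- second guard must carry a fragment delimiter.
kindOf :: String -> Kind
kindOf s
  | isAbsoluteURI s       = AbsoluteNoFragment
  | isURI s               = AbsoluteWithFragment
  | isRelativeReference s = RelativeRef
  | otherwise             = Invalid

-- True if the string is an IP address literal of either family.
isIPLiteral :: String -> Bool
isIPLiteral s = isIPv4address s || isIPv6address s
```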
Relative URIs
relativeTo :: URI -> URI -> Maybe URI
Compute an absolute URI for a supplied URI relative to a given base.
nonStrictRelativeTo :: URI -> URI -> Maybe URI

Returns a new URI which represents the value of the first URI interpreted as relative to the second URI. For example:

 "foo" `relativeTo` "http://bar.org/" = "http://bar.org/foo"
 "http:foo" `nonStrictRelativeTo` "http://bar.org/" = "http://bar.org/foo"

Algorithm from RFC3986 [3], section 5.2.2
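The examples above write URIs as string literals for brevity; in code, both operands must be URI values. The following is a minimal sketch, assuming the signature documented here (relativeTo returning Maybe URI); resolve is an illustrative helper and not part of this module. If the installed release of the package has relativeTo return a plain URI instead, the bind on its result would need to become a let.

```haskell
import Network.URI (parseURI, parseURIReference, relativeTo, uriToString)

-- Resolve a reference string against a base string. Nothing is returned
-- if either string fails to parse or the resolution itself fails.
resolve :: String -> String -> Maybe String
resolve refText baseText = do
  ref      <- parseURIReference refText  -- may be relative or absolute
  base     <- parseURI baseText          -- the base must be absolute
  resolved <- ref `relativeTo` base      -- assumes the Maybe-returning signature
  return (uriToString id resolved "")
```

With these assumptions, resolve "foo" "http://bar.org/" corresponds to the first example above.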

relativeFrom :: URI -> URI -> URI

Returns a new URI which represents the relative location of the first URI with respect to the second URI. Thus, the values supplied are expected to be absolute URIs, and the result returned may be a relative URI.

Example:

 "http://example.com/Root/sub1/name2#frag"
   `relativeFrom` "http://example.com/Root/sub2/name2#frag"
   == "../sub1/name2#frag"

There is no single correct implementation of this function, but any acceptable implementation must satisfy the following for any valid absolute URIs uabs and ubase:

 (uabs `relativeFrom` ubase) `relativeTo` ubase == uabs

(cf. http://lists.w3.org/Archives/Public/uri/2003Jan/0008.html and http://lists.w3.org/Archives/Public/uri/2003Jan/0005.html)
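The round-trip requirement can be written as a predicate over URI values. This is a sketch under the signatures documented here (relativeTo returning Maybe URI, and an Eq instance for URI); roundTrips and roundTripsText are hypothetical names.

```haskell
import Network.URI (URI, parseURI, relativeFrom, relativeTo)

-- True if making uabs relative to ubase and then resolving the result
-- against ubase gives back uabs. Assumes relativeTo returns Maybe URI.
roundTrips :: URI -> URI -> Bool
roundTrips uabs ubase =
  ((uabs `relativeFrom` ubase) `relativeTo` ubase) == Just uabs

-- The same check starting from strings; Nothing if either fails to parse.
roundTripsText :: String -> String -> Maybe Bool
roundTripsText absText baseText =
  roundTrips <$> parseURI absText <*> parseURI baseText
```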

Operations on URI strings
Support for putting strings into URI-friendly escaped format and getting them back again. This can't be done transparently in all cases, because certain characters have different meanings in different kinds of URI. The URI spec [3], section 2.4, indicates that all URI components should be escaped before they are assembled as a URI: "Once produced, a URI is always in its percent-encoded form"
uriToString :: (String -> String) -> URI -> ShowS

Turn a URI into a string.

Uses a supplied function to map the userinfo part of the URI.

The Show instance for URI uses a mapping that hides any password that may be present in the URI. Use this function with argument id to preserve the password in the formatted output.
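The effect of the userinfo mapping can be sketched as follows. The helper names and the sample URI are illustrative only; the final empty string is the ShowS continuation.

```haskell
import Network.URI (URI, parseURI, uriToString)

-- Format a URI with the userinfo part passed through unchanged,
-- so any password it contains is preserved.
withPassword :: URI -> String
withPassword u = uriToString id u ""

-- Format a URI with the userinfo part dropped entirely.
withoutUserInfo :: URI -> String
withoutUserInfo u = uriToString (const "") u ""

-- A sample URI string carrying a password in its userinfo part.
sampleText :: String
sampleText = "http://user:secret@example.org/path"
```

Applied to the parsed sampleText, withPassword should reproduce the input string, whereas show on the same value should not, since the Show instance hides the password.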

isReserved :: Char -> Bool
Returns True if the character is a "reserved" character in a URI. To include a literal instance of one of these characters in a component of a URI, it must be escaped.
isUnreserved :: Char -> Bool
Returns True if the character is an "unreserved" character in a URI. These characters do not need to be escaped in a URI. The only characters allowed in a URI are either "reserved", "unreserved", or an escape sequence (% followed by two hex digits).
isAllowedInURI :: Char -> Bool
Returns True if the character is allowed in a URI.
isUnescapedInURI :: Char -> Bool
Returns True if the character is allowed unescaped in a URI.
escapeURIChar :: (Char -> Bool) -> Char -> String
Escape character if supplied predicate is not satisfied, otherwise return character as singleton string.
escapeURIString
    :: (Char -> Bool)    a predicate which returns False if the character should be escaped
    -> String            the string to process
    -> String            the resulting URI string
Can be used to make a string valid for use in a URI.
unEscapeString :: String -> String
Turns all instances of escaped characters in the string back into literal characters.
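Because the right predicate depends on where the string will be used, a common pattern is to pick one predicate per context. The sketch below uses ASCII input only; encodeComponent and encodeWholeURI are illustrative names and not part of this module.

```haskell
import Network.URI
  ( escapeURIString
  , isUnescapedInURI
  , isUnreserved
  , unEscapeString
  )

-- Escape a single component (for example a query value): everything
-- except unreserved characters is percent-encoded, including reserved
-- delimiters such as '&' and '/'.
encodeComponent :: String -> String
encodeComponent = escapeURIString isUnreserved

-- Escape an already assembled URI string: reserved delimiters are kept,
-- and only characters that may not appear literally (such as a space)
-- are percent-encoded.
encodeWholeURI :: String -> String
encodeWholeURI = escapeURIString isUnescapedInURI

-- Undo the component encoding.
decodeComponent :: String -> String
decodeComponent = unEscapeString
```

For example, encodeComponent "a b&c" gives "a%20b%26c", and decodeComponent recovers the original string.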
URI Normalization functions
normalizeCase :: String -> String
Case normalization; cf. RFC3986 section 6.2.2.1. NOTE: authority case normalization is not performed.
normalizeEscape :: String -> String
Encoding normalization; cf. RFC3986 section 6.2.2.2
normalizePathSegments :: String -> String
Path segment normalization; cf. RFC3986 section 6.2.2.3
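The three normalization functions operate on URI strings and can be chained. This is a minimal sketch; normalize is a hypothetical helper, and the order of composition is one reasonable choice rather than a requirement of this module.

```haskell
import Network.URI (normalizeCase, normalizeEscape, normalizePathSegments)

-- Apply case normalization first, then percent-encoding normalization,
-- then path segment normalization.
normalize :: String -> String
normalize = normalizePathSegments . normalizeEscape . normalizeCase
```

For example, normalize "HTTP://example.org/a/./b/../%7euser" yields "http://example.org/a/~user": the scheme is lowercased, the escape %7e is replaced by the unreserved character it encodes, and the dot segments are removed. As noted above, the case of the authority is left unchanged.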
Deprecated functions
parseabsoluteURI :: String -> Maybe URI
escapeString :: String -> (Char -> Bool) -> String
reserved :: Char -> Bool
unreserved :: Char -> Bool
scheme :: URI -> String
authority :: URI -> String
path :: URI -> String
query :: URI -> String
fragment :: URI -> String
Produced by Haddock version 0.8