Title: NSURL / NSURLComponents
Author: Mattt Thompson
Category: Cocoa
Excerpt: Of all the one-dimensional data types out there, URIs reign supreme. Here, in a single, human-parsable string, is every conceivable piece of information necessary to encode the location of any piece of information that has, does, and will ever exist on a computer.
Revisions:
  • 2014-03-24: Original publication.
  • 2015-09-08: Added Swift translation & section on `stringByAddingPercentEncodingWithAllowedCharacters`.
Status: Swift 2.0, reviewed September 9, 2015

There is a simple beauty to—let's call them "one-dimensional data types": numbers or strings formatted to contain multiple values, retrievable through a mathematical operator or parsing routine. Like how the hexadecimal color #EE8262 can have red, green, and blue components extracted by masking and shifting its bits, or how regular expressions can match and capture complex patterns in just a few characters.
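To make the color example concrete, here is a minimal sketch of the masking and shifting in Swift. The variable names are only illustrative.

```swift
// Extract the 8-bit channels of a 24-bit RGB value by shifting and masking.
let color = 0xEE8262

let red   = (color >> 16) & 0xFF  // top byte:    0xEE
let green = (color >> 8) & 0xFF   // middle byte: 0x82
let blue  = color & 0xFF          // low byte:    0x62

print(red, green, blue)  // prints "238 130 98"
```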

Of all the one-dimensional data types out there, URIs reign supreme. Here, in a single, human-parsable string, is every conceivable piece of information necessary to encode the location of any piece of information that has, does, and will ever exist on a computer.

In its most basic form, a URI consists of a scheme name and a hierarchical part, with an optional query and fragment:

<scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]

Many protocols, including HTTP, specify a regular structure for information like the username, password, port, and path in the hierarchical part:

![URL Structure]({{ site.asseturl }}/nsurl.png)

A solid grasp of network programming is rooted in an unshakeable familiarity with URL components. As a software developer, this means having command of the URI functionality in your programming language's standard library.

If a programming language does not have a URI module in its standard library, run, don't walk, to a real language that does.


In Foundation, URLs are represented by NSURL.

NSURL instances are created using the NSURL(string:) initializer:

// Swift
let url = NSURL(string: "http://example.com/")

// Objective-C
NSURL *url = [NSURL URLWithString:@"http://example.com/"];

If the passed string is not a valid URL, this initializer will return nil.
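Because the initializer can fail, call sites have to unwrap the result. The following is a minimal sketch, written against current Swift, where Foundation spells NSURL as `URL`. The `candidates` array is made up for illustration, and the empty string stands in as one input that is rejected.

```swift
import Foundation

let candidates = ["http://example.com/", ""]

// Each element is nil when Foundation could not form a URL from the string.
let parsed: [URL?] = candidates.map { URL(string: $0) }

for (string, url) in zip(candidates, parsed) {
    if let url = url {
        print("parsed:", url.absoluteString)
    } else {
        print("rejected: \"\(string)\"")
    }
}
```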

NSString also has some vestigial functionality for path manipulation, as described a few weeks back, but that's being slowly migrated over to NSURL. While the extra conversion from NSString to NSURL is not the most convenient step, it's always worthwhile. If a value is a URL, it should be stored and passed as an NSURL; conflating the two types is reckless and lazy API design.

NSURL also has the initializer NSURL(string:relativeToURL:), which can be used to construct a URL from a string relative to a base URL. The behavior of this method can be a source of confusion, because of how it treats leading /'s in relative paths.

For reference, here are representative examples of how this method works:

// Swift
let baseURL = NSURL(string: "http://example.com/v1/")

NSURL(string: "foo", relativeToURL: baseURL)
// http://example.com/v1/foo

NSURL(string: "foo?bar=baz", relativeToURL: baseURL)
// http://example.com/v1/foo?bar=baz

NSURL(string: "/foo", relativeToURL: baseURL)
// http://example.com/foo

NSURL(string: "foo/", relativeToURL: baseURL)
// http://example.com/v1/foo/

NSURL(string: "/foo/", relativeToURL: baseURL)
// http://example.com/foo/

NSURL(string: "http://example2.com/", relativeToURL: baseURL)
// http://example2.com/

// Objective-C
NSURL *baseURL = [NSURL URLWithString:@"http://example.com/v1/"];

[NSURL URLWithString:@"foo" relativeToURL:baseURL];
// http://example.com/v1/foo

[NSURL URLWithString:@"foo?bar=baz" relativeToURL:baseURL];
// http://example.com/v1/foo?bar=baz

[NSURL URLWithString:@"/foo" relativeToURL:baseURL];
// http://example.com/foo

[NSURL URLWithString:@"foo/" relativeToURL:baseURL];
// http://example.com/v1/foo/

[NSURL URLWithString:@"/foo/" relativeToURL:baseURL];
// http://example.com/foo/

[NSURL URLWithString:@"http://example2.com/" relativeToURL:baseURL];
// http://example2.com/
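The comments above show the resolved form of each URL, which is what `absoluteString` returns. Here is a small sketch for checking a few cases yourself, using current Swift spellings (`URL(string:relativeTo:)`). The `resolve` helper is hypothetical, and the last case, a base URL without a trailing slash, is not in the list above. It is included on the assumption that standard relative-reference resolution applies.

```swift
import Foundation

// Hypothetical helper: resolve `string` against `base` and return the absolute form.
func resolve(_ string: String, against base: URL?) -> String? {
    return URL(string: string, relativeTo: base)?.absoluteString
}

let baseURL = URL(string: "http://example.com/v1/")

let relative = resolve("foo", against: baseURL)   // "http://example.com/v1/foo"
let rooted   = resolve("/foo", against: baseURL)  // "http://example.com/foo"

// Without a trailing slash, "v1" counts as the last path segment of the base,
// so a relative path replaces it instead of being appended to it.
let noSlashBase = URL(string: "http://example.com/v1")
let replaced = resolve("foo", against: noSlashBase)  // "http://example.com/foo"
```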

URL Components

NSURL provides accessor methods for each of the URL components as defined by RFC 2396:

  • absoluteString
  • absoluteURL
  • baseURL
  • fileSystemRepresentation
  • fragment
  • host
  • lastPathComponent
  • parameterString
  • password
  • path
  • pathComponents
  • pathExtension
  • port
  • query
  • relativePath
  • relativeString
  • resourceSpecifier
  • scheme
  • standardizedURL
  • user

The examples in NSURL's documentation are a great way to familiarize yourself with the different components.
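As a quick illustration, here is a sketch that reads several of these accessors from one URL. It uses current Swift, where the type is spelled `URL`, and the URL itself, including its credentials, is a made-up placeholder.

```swift
import Foundation

// Placeholder URL chosen so that every component is present.
let url = URL(string: "https://user:secret@example.com:8080/v1/report.json?page=2#summary")!

let scheme   = url.scheme             // "https"
let user     = url.user               // "user"
let password = url.password           // "secret"
let host     = url.host               // "example.com"
let port     = url.port               // 8080
let path     = url.path               // "/v1/report.json"
let last     = url.lastPathComponent  // "report.json"
let ext      = url.pathExtension      // "json"
let query    = url.query              // "page=2"
let fragment = url.fragment           // "summary"
```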

Although username & password can be stored in a URL, consider using NSURLCredential when representing user credentials, or persisting them to the keychain.
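A minimal sketch of that suggestion, in current Swift, where the class is spelled `URLCredential`. The user name and password are placeholders, and `.forSession` is just one of the available persistence options.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLCredential lives here on non-Apple platforms
#endif

// Keep the credential out of the URL string and in its own object instead.
let credential = URLCredential(user: "username",
                               password: "placeholder-password",
                               persistence: .forSession)

print(credential.user ?? "", credential.hasPassword)
```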

NSURLComponents

Quietly added in iOS 7 and OS X Mavericks was NSURLComponents, which can best be described by what it could have been named instead: NSMutableURL.

NSURLComponents instances are constructed in much the same way as NSURL: either from a string (NSURLComponents(string:)) or from an existing NSURL, optionally resolved against its base URL (NSURLComponents(URL:resolvingAgainstBaseURL:)). An instance can also be initialized without any arguments to create an empty storage container, similar to NSDateComponents.

The difference here, between NSURL and NSURLComponents, is that component properties are read-write. This provides a safe and direct way to modify individual components of a URL:

  • scheme
  • user
  • password
  • host
  • port
  • path
  • query
  • fragment

Attempting to set an invalid scheme string or negative port number will throw an exception.
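Here is a short sketch of both styles: modifying an existing URL, and building one from an empty container. It uses current Swift, where the type is the `URLComponents` struct, and the URLs are placeholders.

```swift
import Foundation

// Start from an existing URL string and change individual parts.
var components = URLComponents(string: "http://example.com/v1/foo")!
components.scheme = "https"
components.port = 8443
components.path = "/v2/bar"
components.query = "page=2"
let modified = components.url?.absoluteString
// "https://example.com:8443/v2/bar?page=2"

// Or start empty and fill in the parts, much like NSDateComponents.
var built = URLComponents()
built.scheme = "https"
built.host = "example.com"
built.path = "/search"
let assembled = built.string
// "https://example.com/search"
```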

NSURLComponents also has read-write properties for the percent-encoded versions of the components that can carry percent encoding:

  • percentEncodedUser
  • percentEncodedPassword
  • percentEncodedHost
  • percentEncodedPath
  • percentEncodedQuery
  • percentEncodedFragment

Getting these properties retains any percent encoding these components may have. Setting these properties assumes the component string is already correctly percent encoded. Attempting to set an incorrectly percent encoded string will cause an exception. Although ';' is a legal path character, it is recommended that it be percent-encoded for best compatibility with NSURL (String's stringByAddingPercentEncodingWithAllowedCharacters(_:) will percent-encode any ';' characters if you pass the URLPathAllowedCharacterSet).
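The relationship between the plain and percent-encoded properties can be sketched as follows, in current Swift with `URLComponents`. The paths are arbitrary examples.

```swift
import Foundation

var components = URLComponents()

// Setting the plain property takes an unencoded string...
components.path = "/docs/getting started"
let encodedPath = components.percentEncodedPath  // "/docs/getting%20started"

// ...while the percent-encoded property expects a string that is already encoded.
components.percentEncodedPath = "/docs/release%20notes"
let decodedPath = components.path                // "/docs/release notes"
```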

More recently, NSURLComponents gained a queryItems array, giving easy access to any key-value pairs encoded in a URL's query string as an array of NSURLQueryItem.
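A small sketch of reading and writing query items, in current Swift, where the item type is `URLQueryItem`. The search URL and parameter names are made up.

```swift
import Foundation

var components = URLComponents(string: "http://example.com/search?q=nsurl&page=2")!

// Reading: each key-value pair becomes one URLQueryItem.
let items = components.queryItems ?? []
let names = items.map { $0.name }                            // ["q", "page"]
let page = items.first(where: { $0.name == "page" })?.value  // "2"

// Writing: values are given unencoded and are percent-encoded for you.
components.queryItems = [URLQueryItem(name: "q", value: "hello world")]
let encodedQuery = components.percentEncodedQuery            // "q=hello%20world"
```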

Percent-Encoding

Speaking of percent-encoding...

OS X 10.9 and iOS 7 brought new String/NSString APIs for handling percent-encoding that improve upon (and deprecate) the old encoding-based NSString methods and the CFURL-related C APIs. To percent-encode a string, use stringByAddingPercentEncodingWithAllowedCharacters(_:) with the NSCharacterSet for the relevant component:

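// Swift
let title = "NSURL / NSURLComponents"
let components = NSURLComponents()
if let escapedTitle = title.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLQueryAllowedCharacterSet()) {
    // The value is already escaped, so assign it through the percent-encoded setter to avoid encoding it twice.
    components.percentEncodedQuery = "title=\(escapedTitle)"
}

// Objective-C
NSString *title = @"NSURL / NSURLComponents";
NSURLComponents *components = [NSURLComponents new];
NSString *escapedTitle = [title stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
components.percentEncodedQuery = [NSString stringWithFormat:@"title=%@", escapedTitle];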

Reversing the encoding is even easier, accomplished on an encoded string with stringByRemovingPercentEncoding. New character sets have been added for each of the six URL components that might require percent-encoding:

// Swift
// Predefined character sets for the six URL components and subcomponents which allow percent encoding. These character sets are passed to -stringByAddingPercentEncodingWithAllowedCharacters:.
extension NSCharacterSet {
    public class func URLUserAllowedCharacterSet() -> NSCharacterSet
    public class func URLPasswordAllowedCharacterSet() -> NSCharacterSet
    public class func URLHostAllowedCharacterSet() -> NSCharacterSet
    public class func URLPathAllowedCharacterSet() -> NSCharacterSet
    public class func URLQueryAllowedCharacterSet() -> NSCharacterSet
    public class func URLFragmentAllowedCharacterSet() -> NSCharacterSet
}

// Objective-C
// Predefined character sets for the six URL components and subcomponents which allow percent encoding. These character sets are passed to -stringByAddingPercentEncodingWithAllowedCharacters:.
@interface NSCharacterSet (NSURLUtilities)
+ (NSCharacterSet *)URLUserAllowedCharacterSet;
+ (NSCharacterSet *)URLPasswordAllowedCharacterSet;
+ (NSCharacterSet *)URLHostAllowedCharacterSet;
+ (NSCharacterSet *)URLPathAllowedCharacterSet;
+ (NSCharacterSet *)URLQueryAllowedCharacterSet;
+ (NSCharacterSet *)URLFragmentAllowedCharacterSet;
@end
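A round trip through encoding and decoding might look like the sketch below. It uses the current Swift names for these APIs: `addingPercentEncoding(withAllowedCharacters:)`, `removingPercentEncoding`, and `CharacterSet.urlQueryAllowed`.

```swift
import Foundation

let title = "NSURL / NSURLComponents"

// "/" is allowed in a query, so only the spaces are escaped here.
let escaped = title.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)
// "NSURL%20/%20NSURLComponents"

// Decoding returns nil if the input contains an invalid percent sequence.
let restored = escaped?.removingPercentEncoding
// "NSURL / NSURLComponents"
```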

Bookmark URLs

One final topic worth mentioning is bookmark URLs, which can be used to safely reference files between application launches. Think of them as a persistent file descriptor.

A bookmark is an opaque data structure, enclosed in an NSData object, that describes the location of a file. Whereas path- and file reference URLs are potentially fragile between launches of your app, a bookmark can usually be used to re-create a URL to a file even in cases where the file was moved or renamed.
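As a rough sketch of the round trip, assuming an Apple platform and current Swift spellings: create bookmark data for a file, then resolve it back into a URL. The `roundTripBookmark` helper and the temporary file are hypothetical. A real app would persist the `Data` between launches instead of resolving it immediately.

```swift
import Foundation

// Hypothetical helper: bookmark an existing file, then resolve the bookmark again.
func roundTripBookmark(for fileURL: URL) throws -> (url: URL, isStale: Bool) {
    // The opaque bookmark; this is the value an app would store between launches.
    let bookmark: Data = try fileURL.bookmarkData(options: [],
                                                  includingResourceValuesForKeys: nil,
                                                  relativeTo: nil)
    var isStale = false
    let resolved = try URL(resolvingBookmarkData: bookmark,
                           options: [],
                           relativeTo: nil,
                           bookmarkDataIsStale: &isStale)
    return (resolved, isStale)
}

// Usage with a throwaway file in the temporary directory.
let fileURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("bookmark-demo-\(UUID().uuidString).txt")
try "hello".write(to: fileURL, atomically: true, encoding: .utf8)

let result = try roundTripBookmark(for: fileURL)
print(result.url.lastPathComponent, result.isStale)
```

When `isStale` comes back true, the usual follow-up is to create fresh bookmark data from the resolved URL.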

You can read more about bookmark URLs in "Locating Files Using Bookmarks" from Apple's File System Programming Guide.


Forget jet packs and flying cars—my idea of an exciting future is one where everything has a URL, is encoded in Markdown, and stored in Git. If your mind isn't blown by the implications of a universal resource locator, then I would invite you to reconsider.

Like hypertext, universal identification is a philosophical concept that both pre-dates and transcends computers. Together, they form the fabric of our information age: a framework for encoding our collective understanding of the universe as a network of individual facts, in a fashion that is hauntingly similar to how the neurons in our brains do much the same.

We are just now crossing the precipice of a Cambrian Explosion in physical computing. In an internet of things, where every object of our lives has a URL and embeds an electronic soul, a digital consciousness will emerge. Not to say, necessarily, that the singularity is near, but we're on the verge of something incredible.

Quite lofty implications for a technology used most often to exchange pictures of cats.