this is totally gonna work…

Cocoa’s Ways of Talking

June 24th, 2009

Getting objects to talk to one another in Objective-C is as easy as passing a message from one to the other. These messages are typically passed through the message-invocation mechanism, which uses square brackets to bind a message and its arguments to a receiver. Most of the time this is a perfectly reasonable way to communicate. However, there are times when you need objects to communicate without having explicit knowledge of one another.

In this post we’ll look at three ways to let your objects communicate with each other in a highly decoupled manner using the Cocoa frameworks: notifications, Key-Value Observation and the delegate pattern.

Notifications

Both Cocoa and Cocoa Touch provide a notifications framework built on two classes: NSNotificationCenter and NSNotification. The former provides a singleton instance (via the defaultCenter class method) with which observers register themselves and through which senders post notifications. The latter is the “envelope” for the notification, which can carry an NSDictionary instance of custom data as its payload.

[Figure: NSNotificationCenter.png]

When an object registers for notifications, it specifies the name of the notification to listen for (as an NSString) and, optionally, a sender object. If a sender is given, only notifications posted by that object will be received. If the sender is nil, the notification is delivered to the receiver no matter which object posts it. Senders can optionally include an NSDictionary instance filled with domain-specific objects. You can use this as a way to further parameterize a notification, or eschew it altogether.
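To make the sender filter concrete, here is a minimal sketch that assumes ARC and a Foundation-only build. The NoteCounter class, the notification name and the CountDeliveries function are made up for illustration. It posts the same notification from two different senders and counts what a sender-filtered observer and an unfiltered observer each receive.

```objc
#import <Foundation/Foundation.h>

// Hypothetical notification name, used only in this sketch.
static NSString * const DemoNoteName = @"DemoNotification";

// Counts every notification it is handed.
@interface NoteCounter : NSObject
@property (nonatomic, assign) NSUInteger count;
- (void)didReceiveNote:(NSNotification *)note;
@end

@implementation NoteCounter
- (void)didReceiveNote:(NSNotification *)note {
  self.count += 1;
}
@end

// Posts the same notification from two senders and reports how many
// deliveries a sender-filtered observer and an unfiltered observer saw.
static void CountDeliveries(NSUInteger *filtered, NSUInteger *unfiltered) {
  NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
  NSObject *senderA = [[NSObject alloc] init];
  NSObject *senderB = [[NSObject alloc] init];
  NoteCounter *onlyA = [[NoteCounter alloc] init];
  NoteCounter *anyone = [[NoteCounter alloc] init];

  // onlyA cares about senderA alone; anyone passes nil and hears every sender.
  [center addObserver:onlyA selector:@selector(didReceiveNote:)
                 name:DemoNoteName object:senderA];
  [center addObserver:anyone selector:@selector(didReceiveNote:)
                 name:DemoNoteName object:nil];

  [center postNotificationName:DemoNoteName object:senderA];
  [center postNotificationName:DemoNoteName object:senderB];

  // Always unregister before an observer goes away.
  [center removeObserver:onlyA];
  [center removeObserver:anyone];

  *filtered = onlyA.count;
  *unfiltered = anyone.count;
}
```

Calling CountDeliveries from main (inside an @autoreleasepool) should report one delivery for the filtered observer and two for the unfiltered one.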

Consider the example below:

@implementation Foo
- (id)init {
  if ((self = [super init])) {
    // listen for "TheBigNotification" from any sender (object:nil)
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(didReceiveNote:)
                                                 name:@"TheBigNotification"
                                               object:nil];
  }
  return self;
}

- (void)dealloc {
  // unregister so the center never messages a deallocated object
  [[NSNotificationCenter defaultCenter] removeObserver:self];
  [super dealloc];
}

- (void)didReceiveNote:(NSNotification *)notification {
  NSLog(@"Hey, I got a notification with user info: %@", [notification userInfo]);
}
@end

@implementation Bar
- (void)doSomething {
  NSLog(@"I'm up to something!");
  // argument order is object first, then key: @"foo" is stored under @"bar"
  NSDictionary *stuff = [NSDictionary dictionaryWithObjectsAndKeys:@"foo", @"bar", nil];
  [[NSNotificationCenter defaultCenter] postNotificationName:@"TheBigNotification"
                                                      object:nil
                                                    userInfo:stuff];
}
@end

Here the Foo class registers itself as a notification receiver for the notification named “TheBigNotification” on any object. The Bar class will post the same notification when the doSomething method is invoked.

We could have been much more specific about which object to observe, but I think this violates one of the key features of notifications, which is that the sender and receiver are decoupled. To me, if the receiver is going to go to the trouble of listening for notifications from a specific object, you’d be better off declaring a custom protocol and using the delegate pattern (explained below).

While at the 2009 WWDC, I had a chance to pick the brains of some Apple engineers and one of them told me that notifications are really intended for one-to-many broadcasts. I think that’s a great rule of thumb, because the code can really feel like overkill for single object-to-object messages. However, I’d add one more clause to that rule: notifications also work for one-to-one messages when you would otherwise have to pass delegate instances around just to connect objects.

As an example, consider an iPhone application with a stack of view controllers in a UINavigationController instance. If you have a view controller several steps removed from another view controller (i.e. one is further up the stack in a tab bar controller) and it needs to call a method on the view controller lower down in the stack, each intermediate object would need a reference to the delegate just to pass it down the line. This seems like the worst kind of encapsulation breakage since you’re forcing a bunch of otherwise-unrelated classes to have knowledge of a class they don’t even use.

Maybe it’s my long track-record with Java, but I pay attention to the classes I import and I like to keep that list as short as possible. The more classes know about each other, the more difficult it is to modify them.

There’s one final thing worth knowing about the NSNotificationCenter, and that is how it works with threads. When one object posts a notification, that call blocks while the notification is delivered to all targets. This means that you want your notification-handling code in the receiving object to be as quick as possible. If you can live with deferred notifications and want to avoid the synchronous nature of notification delivery, you can use the NSNotificationQueue instead.
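Here is a small sketch of that blocking behavior, assuming ARC and a Foundation-only build. The SlowListener class, the notification name and the helper function are invented for this example. The handler sleeps briefly to stand in for slow work, and the helper checks whether the handler had already finished by the time the post call returned.

```objc
#import <Foundation/Foundation.h>

// Hypothetical observer whose handler is deliberately slow.
@interface SlowListener : NSObject
@property (nonatomic, assign) BOOL handled;
- (void)handleNote:(NSNotification *)note;
@end

@implementation SlowListener
- (void)handleNote:(NSNotification *)note {
  [NSThread sleepForTimeInterval:0.05]; // stand-in for slow work
  self.handled = YES;
}
@end

// Returns YES if the handler had finished before the post call returned,
// i.e. delivery happened synchronously on the posting thread.
static BOOL HandlerRanBeforePostReturned(void) {
  NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
  SlowListener *listener = [[SlowListener alloc] init];

  [center addObserver:listener selector:@selector(handleNote:)
                 name:@"SlowDemoNotification" object:nil];
  [center postNotificationName:@"SlowDemoNotification" object:nil];

  BOOL ranInline = listener.handled; // read immediately after the post
  [center removeObserver:listener];
  return ranInline;
}
```

The poster pays for every observer’s handler, which is why slow handlers are worth moving off the posting path.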

Key-Value Observation

When I first read about and used Key-Value Observation (KVO), it seemed like magic. You simply point one object at another and say “I’d like to know when this attribute changes” and Cocoa handles all of the notifications and threading automatically. Genius!

KVO also allows you to use a special syntax when specifying the attributes you want to observe so that you can register with an object and traverse its object graph. Let’s look at the classic customer/orders/line items example. If you want to know when changes are made to any line items in an order you would observe changes in the order like so:

Order *order = [self anyOrder];
[order addObserver:self
        forKeyPath:@"lineItems"
           options:NSKeyValueObservingOptionNew | NSKeyValueObservingOptionOld
           context:nil];

Whenever you register for KVO, you must implement the method - observeValueForKeyPath:ofObject:change:context:. Unlike notifications, you don’t get to specify a selector. This means that if your object is observing several other objects, you will have to inspect the arguments given in the callback method and dispatch appropriately.
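One common way to do that dispatching is with the context pointer. The sketch below assumes ARC and a Foundation-only build; DemoOrder, OrderWatcher and WatchOrder are hypothetical names, not classes from any real project. A single callback tells two registrations apart by comparing the context it is handed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class for this sketch.
@interface DemoOrder : NSObject
@property (nonatomic, assign) NSInteger quantity;
@property (nonatomic, copy) NSString *status;
@end

@implementation DemoOrder
@end

// Unique context pointers let the single callback tell registrations apart.
static void *QuantityContext = &QuantityContext;
static void *StatusContext = &StatusContext;

@interface OrderWatcher : NSObject
@property (nonatomic, assign) NSUInteger quantityChanges;
@property (nonatomic, assign) NSUInteger statusChanges;
@end

@implementation OrderWatcher
- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
  if (context == QuantityContext) {
    self.quantityChanges += 1;
  } else if (context == StatusContext) {
    self.statusChanges += 1;
  } else {
    // Not one of ours: let the superclass deal with it.
    [super observeValueForKeyPath:keyPath ofObject:object
                           change:change context:context];
  }
}
@end

// Changes quantity twice and status once, then reports what the watcher saw.
static void WatchOrder(NSUInteger *quantityChanges, NSUInteger *statusChanges) {
  DemoOrder *order = [[DemoOrder alloc] init];
  OrderWatcher *watcher = [[OrderWatcher alloc] init];

  [order addObserver:watcher forKeyPath:@"quantity"
             options:NSKeyValueObservingOptionNew context:QuantityContext];
  [order addObserver:watcher forKeyPath:@"status"
             options:NSKeyValueObservingOptionNew context:StatusContext];

  order.quantity = 3;
  order.quantity = 4;
  order.status = @"shipped";

  // Observers must be removed before the observed object is deallocated.
  [order removeObserver:watcher forKeyPath:@"quantity" context:QuantityContext];
  [order removeObserver:watcher forKeyPath:@"status" context:StatusContext];

  *quantityChanges = watcher.quantityChanges;
  *statusChanges = watcher.statusChanges;
}
```

Comparing contexts rather than key path strings also keeps a subclass and its superclass from stepping on each other when both observe the same key.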

Key paths have a ton of flexibility including the ability to traverse deep object graphs and observe aggregate functions on a collection (e.g. observe the max date of all of a customer’s orders). Check out the Key-Value Programming Guide for details; it is an indispensable reference.

KVO works like notifications in that the thread that invokes the change will block until all observers have been notified and their callback methods have been invoked. Unlike notifications, there is no built-in asynchronous KVO-triggering mechanism unless you explicitly mutate properties in another thread using something like the detachNewThreadSelector:toTarget:withObject: class method of NSThread.

For classes to be key-value observable they must be what the documentation calls “key-value compliant”. This generally means that each observable property exposes certain methods for accessing and mutating it. Again, the KVO guide is your go-to reference.

KVO is structurally similar to notifications, but is really intended for fine-grained messaging, generally around your model classes. Notifications are intended for broader application-level events. Both mechanisms give you a way to decouple classes that would otherwise need more intimate knowledge of one another.

Consider the simple case of managing a tabular view of line items in an order. When a user adds a line item, removes a line item or updates the quantity of a line item, we want a field showing the total price to be updated. Without KVO we would need to add extra logic in our event handling for the table view to also update the total field. With KVO our event-handling logic can simply focus on updating the underlying model. The total field will simply observe the total cost of the order and be updated appropriately. Now if we want to remove the total field or perhaps add another view of total information, we don’t have to touch the original event-handling code.
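A rough sketch of that arrangement follows, assuming ARC and a Foundation-only build. PricedOrder, TotalLabel and TotalShownAfterEdits are hypothetical stand-ins for the model and the total field. The model declares that its total depends on its line items, so anything observing the total hears about line-item edits without the editing code knowing the observer exists.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model: an order whose total is derived from its line item prices.
@interface PricedOrder : NSObject
@property (nonatomic, copy) NSArray *lineItemPrices; // NSNumber values
@property (nonatomic, readonly) double total;
@end

@implementation PricedOrder
// Tells KVO that "total" must be re-announced whenever "lineItemPrices" changes.
+ (NSSet *)keyPathsForValuesAffectingTotal {
  return [NSSet setWithObject:@"lineItemPrices"];
}

- (double)total {
  double sum = 0;
  for (NSNumber *price in self.lineItemPrices) {
    sum += [price doubleValue];
  }
  return sum;
}
@end

// Stand-in for the total field in the UI.
@interface TotalLabel : NSObject
@property (nonatomic, assign) double shownTotal;
@end

@implementation TotalLabel
- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
  if ([keyPath isEqualToString:@"total"]) {
    self.shownTotal = [[change objectForKey:NSKeyValueChangeNewKey] doubleValue];
  }
}
@end

// Edits the line items twice and returns what the "label" ends up showing.
static double TotalShownAfterEdits(void) {
  PricedOrder *order = [[PricedOrder alloc] init];
  TotalLabel *label = [[TotalLabel alloc] init];
  [order addObserver:label forKeyPath:@"total"
             options:NSKeyValueObservingOptionNew context:NULL];

  // The editing code only touches the model.
  order.lineItemPrices = @[@2.5, @4.0];
  order.lineItemPrices = [order.lineItemPrices arrayByAddingObject:@1.5];

  [order removeObserver:label forKeyPath:@"total"];
  return label.shownTotal;
}
```

The sketch replaces the whole array through the setter to keep the change obviously KVO-compliant; mutating an NSMutableArray in place behind KVO’s back would not trigger the callback.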

Delegates

If you’ve spent any time with the Cocoa docs you’ll have run across the delegate pattern. Like the quark, it is a fundamental building block of the Cocoa APIs. Conceptually, it’s really pretty simple. A delegate is simply an object that provides custom behavior for another object, often by implementing a specific protocol.

Let’s return to our running example. A table view of line items in an order has to handle a lot of things such as rendering each cell, handling scrolling, managing user-generated events and so on. These features are common enough that they are simply part of the table view class itself. What is specific to your application is the data that goes in those cells and the actions to be taken for user-generated events. Cocoa solves this by providing protocols for delegating that behavior to custom classes. In the case of providing data, a data-oriented delegate would be queried by the table view for the total number of rows, appropriate object to put in each row, the column headers, etc. In the case of event-handling, the table view will forward events to the event-handling delegate.
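As a minimal sketch of that division of labor (assuming ARC and a Foundation-only build), the ItemList class below plays the role of the table view and OrderLines plays the application-specific delegate. Every name here is hypothetical and only loosely modeled on the real table view protocols.

```objc
#import <Foundation/Foundation.h>

@class ItemList;

// Hypothetical protocol: two required data methods and one optional event hook.
@protocol ItemListDelegate <NSObject>
- (NSUInteger)numberOfRowsInItemList:(ItemList *)list;
- (NSString *)itemList:(ItemList *)list titleForRow:(NSUInteger)row;
@optional
- (void)itemList:(ItemList *)list didSelectRow:(NSUInteger)row;
@end

// Generic "view": knows how to render rows, but not where they come from.
@interface ItemList : NSObject
@property (nonatomic, assign) id<ItemListDelegate> delegate; // not retained
- (NSString *)render;
- (void)selectRow:(NSUInteger)row;
@end

@implementation ItemList
- (NSString *)render {
  NSMutableArray *titles = [NSMutableArray array];
  NSUInteger rows = [self.delegate numberOfRowsInItemList:self];
  for (NSUInteger row = 0; row < rows; row++) {
    [titles addObject:[self.delegate itemList:self titleForRow:row]];
  }
  return [titles componentsJoinedByString:@", "];
}

- (void)selectRow:(NSUInteger)row {
  // Optional methods must be checked for before they are called.
  if ([self.delegate respondsToSelector:@selector(itemList:didSelectRow:)]) {
    [self.delegate itemList:self didSelectRow:row];
  }
}
@end

// Application-specific delegate supplying the data and handling events.
@interface OrderLines : NSObject <ItemListDelegate>
@property (nonatomic, copy) NSArray *names;
@property (nonatomic, assign) NSUInteger lastSelectedRow;
@end

@implementation OrderLines
- (NSUInteger)numberOfRowsInItemList:(ItemList *)list {
  return [self.names count];
}
- (NSString *)itemList:(ItemList *)list titleForRow:(NSUInteger)row {
  return [self.names objectAtIndex:row];
}
- (void)itemList:(ItemList *)list didSelectRow:(NSUInteger)row {
  self.lastSelectedRow = row;
}
@end

// Wires a delegate to a list and renders it.
static NSString *RenderOrder(NSArray *names) {
  OrderLines *lines = [[OrderLines alloc] init];
  lines.names = names;
  ItemList *list = [[ItemList alloc] init];
  list.delegate = lines;
  return [list render];
}
```

ItemList compiles without ever importing OrderLines; the protocol is the only thing the two sides share.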

Notifications use the NSNotificationCenter (or by proxy, the NSNotificationQueue) as an intermediary between message-senders and message-receivers. KVO uses the observed object as the intermediary. In the delegate model there is no intermediate object, but you still have relatively decoupled code because the object that invokes methods on the delegate has no knowledge of what class of object it’s dealing with. Put another way, you could remove all of the classes in your project that implement a particular delegate protocol, and the class that uses the delegate would still compile correctly.

The delegate pattern isn’t so much about decoupling which objects are communicating with each other as it is about decoupling the classes of those objects. This doesn’t make it any less powerful than the other two methods. In a lot of cases a simple delegate model is much more straightforward and preferable to the alternatives.

The delegate pattern is Cocoa’s version of dependency injection. The object receiving the injected dependency has no idea of the type (and by extension, the implementation) of the object it is receiving, only what its capabilities are via a contract specified as a protocol. This is basic polymorphism where the interface and the implementation are separated to reduce coupling.

Delegating is a popular alternative to sub-classing. It is a preferred alternative because it’s less likely to break encapsulation. If customization is done through subclassing it can be difficult to know which methods to override and implement, and easy to break the parent class. This is why Java provides all sorts of safety features like marking methods abstract and final. However that is simply more of a headache to deal with and using polymorphism would be a much better design.

So there you have it, three ways to keep your code flexible and loosely-coupled!


The iPhone and Web APIs

May 26th, 2009

The iPhone Ecosystem

We live in a very interesting time for application development. The distinctions between desktop, browser and mobile applications are blurring more and more every day. “Ubiquitous platform” applications, like Evernote, are where the future is because they reduce the major hurdle of access for users. As time goes on users are going to expect applications to be available in a bunch of different ways that all need to work together.

The common way to do this is to base the application on shared state accessed via an HTTP-based API. This makes it relatively easy to create apps for various devices as well as offer up a public API for others to use. An interesting side-effect of this is that the API, the shared state and the data behind it become the differentiator between products. The end-user client software is simply the packaging, turning the stand-alone application market into something more akin to the “razors and razor-blades model”.

Okay, none of this should be earth-shattering news to anyone. You should be prepared to write more apps that are integrated with one (or possibly more) “backend services”. These services will, as likely as not, be connected over HTTP. As an iPhone developer you should know how to do this in your sleep.

The Evri API is absolutely crucial to the EvriVerse phone application. Without it, the app does nothing of interest on its own. Our iPhone application is merely one kind of presentation of our web APIs. I don’t think that’s unique. What good would Tweetie be without Twitter?

Dealing with Web APIs

An unfortunate fact of integrating with web APIs is that they are, relatively speaking, slow. We don’t want the user to get frustrated and give up on our app because they have their flow continually disrupted by waiting for data.

There are some things you can do to mitigate the latency costs if you control the API, such as using some sensible caching parameters. However, especially if you don’t control the web API, the bulk of what you should focus on is concurrency within your application. The biggest arrow in your quiver is figuring out how to turn unnecessarily serial operations into parallel ones. Now keep in mind that the iPhone doesn’t support true multi-core concurrency. What I’m talking about is taking advantage of the I/O wait times that are inherently part of working with a high-latency I/O source (the web).

Coming from a Java background, using threads seemed like a pretty natural approach, and after reading the docs on the NSThread class I felt pretty confident I knew how to make it work. However, I was surprised by how quickly the code got complicated. There is a lot to like about Objective-C, but I sorely miss Ruby’s blocks and even Java’s inner classes. In Objective-C you connect methods together dynamically by handing objects selectors. The problem with this is that it’s difficult to tell the high-level methods from the low-level callback methods just by looking at your class definition. Yes, you can use special comments and the like, but if you’ve worked in a language that has structural support for this differentiation, moving to a language without it feels like losing an arm.

So instead of managing your own threads, let Cocoa Touch do the work for you. The NSURLConnection class is inherently asynchronous: you can fire a request and immediately move on with your life. Now you just need a little structure for handling the response and getting it back into your views.

The EvriApi

We tried to push as much of the mechanics of URL-handling into a separate code-base we call EvriApi. A good bit of the design was inspired by Matt Gemmell’s MGTwitterEngine, so I can’t claim that I came up with the whole idea myself.

We wanted the API to be as simple to work with as possible. Each API call has the same structure:

  • Each API method takes zero-or-more domain-specific parameters, a selector and a target object.
  • Each API method returns a unique request identifier
  • The target object will have the given selector called when the request completes, and will be given an instance of the EvriAPIResponse class.
  • If the response is successful (check [response success]), the payload of the request is retrieved as a domain-object
  • If the response is unsuccessful, check the error message on the response
  • Requests can be cancelled by handing the request identifier back to the API
  • Attempts to cancel already-completed requests do nothing
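The contract in the list above can be sketched in a few dozen lines. This is not the actual EvriApi code: DemoApi, DemoResponse, DemoClient and the finishRequest:withResult: method (which stands in for the network completing) are all invented for illustration, and the sketch assumes ARC and a Foundation-only build.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the response envelope.
@interface DemoResponse : NSObject
@property (nonatomic, assign) BOOL success;
@property (nonatomic, retain) id result;
@end

@implementation DemoResponse
@end

@interface DemoApi : NSObject
- (NSString *)fetchWithSelector:(SEL)selector onTarget:(id)target;
- (void)cancelRequest:(NSString *)requestId;
// Simulates the network completing; a real API would call this internally.
- (void)finishRequest:(NSString *)requestId withResult:(id)result;
@end

@implementation DemoApi {
  NSMutableDictionary *_pending; // request id -> @[target, selector name]
}

- (instancetype)init {
  if ((self = [super init])) {
    _pending = [[NSMutableDictionary alloc] init];
  }
  return self;
}

- (NSString *)fetchWithSelector:(SEL)selector onTarget:(id)target {
  NSString *requestId = [[NSUUID UUID] UUIDString];
  // The array keeps the target alive until the request finishes or is cancelled.
  _pending[requestId] = @[target, NSStringFromSelector(selector)];
  return requestId;
}

- (void)cancelRequest:(NSString *)requestId {
  // Unknown or already-completed identifiers are simply ignored.
  [_pending removeObjectForKey:requestId];
}

- (void)finishRequest:(NSString *)requestId withResult:(id)result {
  NSArray *entry = _pending[requestId];
  if (entry == nil) return; // cancelled or already completed
  [_pending removeObjectForKey:requestId];

  DemoResponse *response = [[DemoResponse alloc] init];
  response.success = (result != nil);
  response.result = result;

  id target = entry[0];
  SEL selector = NSSelectorFromString(entry[1]);
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Warc-performSelector-leaks"
  [target performSelector:selector withObject:response];
#pragma clang diagnostic pop
}
@end

// Hypothetical client that remembers the last response it was handed.
@interface DemoClient : NSObject
@property (nonatomic, retain) DemoResponse *lastResponse;
- (void)didReceiveResponse:(DemoResponse *)response;
@end

@implementation DemoClient
- (void)didReceiveResponse:(DemoResponse *)response {
  self.lastResponse = response;
}
@end

// Returns YES if a live request is delivered and a cancelled one is not.
static BOOL DeliversUnlessCancelled(void) {
  DemoApi *api = [[DemoApi alloc] init];
  DemoClient *kept = [[DemoClient alloc] init];
  DemoClient *cancelled = [[DemoClient alloc] init];

  NSString *keptId = [api fetchWithSelector:@selector(didReceiveResponse:) onTarget:kept];
  NSString *cancelledId = [api fetchWithSelector:@selector(didReceiveResponse:) onTarget:cancelled];

  [api cancelRequest:cancelledId];
  [api finishRequest:keptId withResult:@"entity"];
  [api finishRequest:cancelledId withResult:@"entity"]; // ignored: cancelled
  [api cancelRequest:keptId];                           // no-op: completed

  return kept.lastResponse.success && cancelled.lastResponse == nil;
}
```

Keying everything on an opaque string identifier means the caller never holds a reference to the connection itself, which keeps cancellation cheap to offer.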

The NSURLConnection class makes asynchronous requests by default, and uses the familiar delegate model for handling the response. When a client invokes an API method, it returns immediately with a unique identifier (as an NSString instance) and the request is initiated on another thread which is handled automatically by NSURLConnection. The client is responsible for keeping track of that identifier and using it to cancel any outstanding requests.

When the request completes, the body of the response is parsed using the NSXMLParser class. The parser is given one of our specialized XML handlers as a delegate. These handlers deal with the XML event-stream and produce a model object or collection of model objects. The framework then packages this up into an EvriAPIResponse instance and invokes the selector given in the original request on the target given in the original request.

Wiring it Up

The EvriVerse app is primarily a “productivity” style application. In Interface Guidelines-parlance this means that we use a lot of hierarchical navigation. In general we have no more than one or two API calls associated with a particular view controller.

The common pattern for handling a user-initiated action that requires an API call is:

  1. A new view controller is instantiated, typically with an activity indicator visible.
  2. User interaction is disabled on the new view controller.
  3. The new view controller is displayed (usually by pushing it on the navigation controller stack).
  4. A request is submitted to the EvriApi, setting the new view controller as the target and some special method as the target selector.
  5. The request ID from the EvriApi is given to the newly-created view controller.
  6. When the API request finishes, the handler method on the new view controller executes, handling both success and failure cases. It also re-enables user interaction on the view.
  7. If the user navigates back before the request has completed, any outstanding requests are cancelled. This is usually handled in the viewWillDisappear: method of the view controller.

Stepping back, the actors and interactions look something like this:

[Figure: evri-api.png]

In code (snippets) it might look something like this:

// our parent view controller
@implementation ParentViewController
- (void)didSelectEntity {
  ChildViewController *cvc = [[[ChildViewController alloc] init] autorelease];

  [self.navigationController pushViewController:cvc animated:NO];

  NSString *requestId = [EvriApi fetchEntityByURI:entity.uri
                                  performSelector:@selector(didReceiveEntityResponse:)
                                         onTarget:cvc];
  // hand the request ID to the child so it can cancel the request if needed
  cvc.requestId = requestId;
}
@end
// our child view controller
@interface ChildViewController : UITableViewController {
  NSString *requestId;
}

@property (nonatomic, retain) NSString *requestId;
@end

@implementation ChildViewController

@synthesize requestId;

- (void)didReceiveEntityResponse:(EvriAPIResponse *)response {
  // TODO: hide the activity indicator
  // TODO: re-enable user interaction

  if ([response success]) {
    Entity *e = (Entity *)[response result];
    // TODO: redisplay as necessary
  }
  else {
    // TODO: show an alert
  }
}

- (void)viewWillDisappear:(BOOL)animated {
  [super viewWillDisappear:animated];
  // cancelling an already-completed request does nothing
  [EvriApi cancelRequest:requestId];
}

- (void)dealloc {
  [requestId release];
  [super dealloc];
}

@end

So there you have it, one developer’s way of integrating the iPhone with web APIs. More to come. Stay tuned.


Peepcode meets MacRuby

May 15th, 2009

I am pleased to announce the release of Peepcode’s latest screencast about MacRuby written by yours truly with Geoffrey Grossenbach and technical editing by Laurent Sansonetti.

Ever since I saw Laurent’s and Rich Kilmer’s presentations about MacRuby and HotCocoa at RubyConf ‘08, I’ve been jazzed about what MacRuby brings to the world of Cocoa development. So check out the screencast for the coolest new Ruby on the block.

Incremental Find on the iPhone

May 14th, 2009

This is the first in a series of posts I’ll be writing about my experiences developing the EvriVerse iPhone application for work. Since the end of 2008 I’ve been getting more and more into Cocoa for both the iPhone and Mac. The more I work on it, the more fun I have. I’ve got several side-projects in mind so I’ll be spending a lot more time in Cocoa-land and, as a result, blogging and tweeting about it a lot more.

I wanted to kick this off with a little appetizer. We’re going to look at how we implemented incremental find in EvriVerse. Evri has a huge database of structured entity data (people, places and things) and our users want to be able to search for any of them. After putting the initial prototype for EvriVerse on a few folks’ phones, a lot of them wanted a search feature that worked more like our website.


In the original implementation a user would type in the search field, then hit the “Search” button and then get their results back in the table view. This takes too many steps, and we wanted something that felt faster and more responsive.


So our incremental find solution needed the following properties:

  • It should only send a search request when there is a measurable pause in input
  • It should enable some kind of activity indicator while a search request is running
  • A user should be able to modify the contents of the search field while a search request is in-progress
  • It should not preclude the existing two-step search process
  • If a search request is in process when another one starts, the first one should be cancelled and the new one should be the only request in-process

At first I tried managing my own NSThread instance while tracking the timestamps of touch events in the searchBar:textDidChange: method (part of the UISearchBarDelegate protocol). If the gap between time samples was big enough I’d cancel any outstanding request and start a new one. This seemed like a good approach at the whiteboard, but really fell apart in the implementation. I had race conditions left and right and I had the sneaking suspicion that I was swimming upstream against “the Cocoa way”.

As an aside, if there’s one thing I’ve learned in my brief time with Cocoa, it’s that mucking about with your own threads is a pretty crummy way to do things asynchronously. Usually there’s a much more elegant, built-in solution in the Cocoa APIs that will handle the threading for you. I’ll riff on this theme in a later post.

So back to the problem at hand. What to do? When I get stuck like this I often start thinking of how I’d do this in another language. This got me to thinking about how one would go about implementing this sort of thing in JavaScript. This is standard Web 2.0 kind of stuff: right after drop-shadows and rounded corners, you learn about type-ahead search at AJAX School.

See, JavaScript has this very nifty little function named setTimeout where you hand it a function and tell it how long you want to wait until that function is executed, and it returns you a reference identifier. There is a companion function, clearTimeout, that allows you to cancel any previously-created “timeout” function by handing it the reference identifier you got earlier from setTimeout.

In the common AJAX-ified dynamic search that you often see, an event-handler observes the search field in a web form for key presses. Each key press cancels any existing timeout function, and starts a new one that will execute in whatever time-interval was chosen as the “quiescence period”. The trick is that because you’re scheduling future tasks, and you cancel any outstanding ones when you kick off a new one, you get the “measurable pause” feature in a simple, elegant solution.

Cocoa provides a very similar mechanism, through the NSTimer class. In my view-controller, the code ended up looking something like this:

@implementation FindViewController
// much implementation code has been elided for demonstration purposes

// observe text-field changes and (re)schedule the search task
- (void)searchBar:(UISearchBar *)searchBar textDidChange:(NSString *)searchText {
  // every change cancels the pending timer, so a search only fires
  // after a pause in typing
  if (timer) {
    [timer invalidate];
    self.timer = nil;
  }
  if ([searchText length] > 2) {
    NSDictionary *info = [NSDictionary dictionaryWithObjectsAndKeys:searchText,
                          @"Text",
                          nil];
    self.timer = [NSTimer scheduledTimerWithTimeInterval:1
                                                  target:self
                                                selector:@selector(submitSearchRequest:)
                                                userInfo:info
                                                 repeats:NO];
  }
}

// The method that runs in the NSTimer--equivalent to the function
// we'd pass to JavaScript's "setTimeout" function
- (void)submitSearchRequest:(NSTimer *)aTimer {
  NSDictionary *info = [aTimer userInfo];
  NSString *text = [info objectForKey:@"Text"];
  // only one request should be in flight: cancel the previous one, if any
  if (self.requestId) {
    [EvriApi cancelRequest:self.requestId];
  }
  [activityIndicator startAnimating];
  self.requestId = [EvriApi fetchEntitiesMatchingPrefix:text
                                        performSelector:@selector(didReceiveEntitiesDynamically:)
                                               onTarget:self];
}

// The callback from our API class, included for completeness
- (void)didReceiveEntitiesDynamically:(EvriAPIResponse *)response {
  [timer invalidate];
  self.timer = nil;
  [activityIndicator stopAnimating];
  if ([response success]) {
    [entities removeAllObjects];
    [entities addObjectsFromArray:(NSArray *)[response responseObject]];
    [tableView reloadData];
  }
  else {
    UIAlertView *error = [[UIAlertView alloc]
                          initWithTitle:@"Uh oh…"
                          message:@"Sorry, we were unable to execute your request. Please try again."
                          delegate:nil
                          cancelButtonTitle:@"OK"
                          otherButtonTitles:nil];
    [error show];
    [error release];
  }
}

@end

In the searchBar:textDidChange: method we check to see if our instance field, timer, points at an existing NSTimer instance. If it does we cancel it, clear the reference, and start a new timer instance. That timer is configured to run in one second and will execute our submitSearchRequest: method. This makes a call to our API and designates the didReceiveEntitiesDynamically: method as the callback for that asynchronous operation.

Finally, the didReceiveEntitiesDynamically: method handles the UI by disabling the wait indicator, dealing with any error cases and populating our table view if our request was successful. I’ll write another post later about how the EvriApi class works and its design.

So that’s it. Simple and elegant. One of the things I love about working in several different programming languages is figuring out how to apply the best tricks of one language to another.

That’s all for now, but stay tuned! More iPhone posts to come…

Introducing the EvriVerse

May 12th, 2009

After several months of work, I’m pleased to announce the release of Evri’s EvriVerse for the iPhone. We’ve tried to capture the unique things we do at Evri into the awesomeness that is the iPhone. I won’t spill all the details here, so check out the “official” company post.

I’ve been working on this thing for a few months, mostly on a part-time basis. We have a 20% time program at Evri where devs can pitch ideas they want to spend one day a week on. I was dying to write an iPhone app and jumped at the opportunity to turn it into a 20% project. After some early prototypes I got enough attention that I was given a solid two-week sprint to finish it off.

This was a ton of fun to work on and I have five or six posts on some things I discovered while working on it.

It’s free in the App Store now. We’d love to hear any feedback you may have, which you can post on the Evri Support Page.

The Great Git Setup

May 7th, 2009

One of the best ways to really learn the ins and outs of anything is to immerse yourself in all the gory details. Not only do you learn what works, what doesn’t, what’s elegant and what sucks, but you also start to grok the inner-workings. I just spent the last two weeks getting our own internal Git infrastructure up and running at work and I feel like Git and I have a new level of intimacy that we previously lacked. What follows is a review of the process and the solution that we’ve implemented.

For most of us at work, our primary Git experience is with GitHub. This is a good way to learn how Git’s distributed model works, but GitHub provides some nice infrastructure that you simply don’t have with a stock Git install. You have to cobble together the rest from a variety of sources. So GitHub guys, if you’re listening, it would be fantastic if you provided a real white-label GitHub solution for organizations to install locally. You guys have really spoiled us. (Oops, never mind. Check out GitHub:fi.) But until then, we have to roll our own…

The Goals

First, let’s review what exactly we were trying to accomplish. We wanted a solution that:

  • easily converted existing Subversion repositories to Git
  • provided a usable web view over all repos that was equal to, or exceeded, the stock SVN/DAV model
  • allowed us to designate a set of “authoritative” repositories
  • provided for “developer” repositories to allow developers to publish their changes
  • allowed for patch reviews (if teams decide to do that)

The Solution

Right away we knew that we wanted two different types of repositories: authoritative and developer. Authoritative repos are the ones that host the “official” source code and from which we create release builds. When a developer wants to start working on a project, these are the repos they clone to get started.

The other type of repository belongs to an individual developer. Developers have read/write access to their own repos, but others only have read-only access. This allows developers to share changes in a pull-request model similar to how GitHub works.

Permissions & Connectivity

Developers connect to these repositories in two different ways. We use the built-in git:// protocol (with git-daemon) for read-only access and ssh:// for read-write access. Developers clone authoritative repos and other developers’ repos using the read-only git:// protocol. Developers have a remote reference to their developer repository using ssh.

For each project, we designated one or more maintainers that have write-access to the authoritative repositories. These folks accept patches from other developers, test and integrate them on their local machines, and push final changes to the authoritative repositories. We manage the authoritative repos using gitosis. Check out this great tutorial for details on setting up gitosis.

gitosis

On our central “git box” we create a “git” user which owns both the gitosis-admin repository as well as all the authoritative repositories. Gitosis is bootstrapped with a single SSH public key that gives the associated user the ability to write to that Git repo over SSH. That user can then add more keys and configuration allowing other users to manage projects. Gitosis works a bit like the CVSROOT project in CVS—it is a special Git repo that allows us to configure other Git repositories. We use gitosis to configure our designated maintainers for each project.

Our authoritative repos are housed in /home/git/repositories. Our developer repositories are located in /opt/repos, where each developer has their own directory to put Git repos in.

git daemon

We run git daemon to allow for anonymous read-only access. We configured git daemon to serve up all Git repositories found in the /opt/repos directory with read-only access. We also symlink the authoritative repositories (/home/git/repositories) into this directory so that we can serve up read-only access to authoritative repositories also.

gitweb

For all the power Git has, there sure are a lot of half-baked web interfaces for it. We did our best to vet the tools listed on the Git site. A good number of them were either abandoned or simply didn’t work. In the end we went with the venerable old gitweb.

Gitweb has a ton of functionality—including all kinds of search capability, different views (by tag, by commit, by tree) and even serves up an Atom feed of changes—but it is definitely a designed-by-developers-for-developers. (NB: In case you missed the markup, that is considered an epithet on my team).

Frankly I was amazed that somebody hadn't built a Rails app to manage this stuff. Yes, we looked at gitorious and even tried to set it up. After three hours I still didn't have a working installation and my patience had run out. While gitorious appears to be a nice turn-key solution, I don't think it's actually that great of a fit for us. So gitweb it is. Sigh.

The Flow

With me so far? Maybe not. Let's look at a picture then…
98EE3AE7-C4F4-4D1D-B49B-DCC230EA7459.jpg

In this picture the authoritative and developer repos are drawn in separate boxes (since they're logically separate), but they are located on the same machine in reality. Let's cover a couple of common scenarios:

New Developer, Old Project

A developer starts by cloning an authoritative repo using the git:// protocol (remember, read-only). This will create, by default, a remote reference named origin that points back to the authoritative repo.

The next step is for them to create a developer repo in their directory for the same project on the central git machine. They also want to add a remote reference to their developer repo so that they can push changes to it. That would look something like:


git remote add alex ssh://git/opt/repos/alex/circus/clown-car

If this user is also a maintainer, they need to add a third remote reference which gives them read/write access to the authoritative repository. The way that gitosis works is by accepting SSH keys on behalf of the git user. So if you're properly configured, you can push changes over SSH by logging in as the git user. A maintainer would add a read/write reference like so:


git remote add main ssh://git@git/home/git/repositories/circus/clown-car

Because of the way gitosis configures the SSH keys in /home/git/.ssh/authorized_keys and the fact that the default protocol is ssh, this remote reference could also be added like this:


git remote add main git@git:circus/clown-car
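To see all three references side by side, here is a small sketch that wires them up in a scratch repo. Nothing is fetched or pushed, so the hosts don't need to exist. The git:// URL for origin is a guess on my part; the other two are the ones shown above.

```shell
#!/bin/sh
# Sketch: add the three remotes to an empty scratch repository and list them.
set -e
work=$(mktemp -d)
cd "$work"
git init -q clown-car
cd clown-car

# 1. Read-only reference to the authoritative repo. A real clone creates this
#    for you as "origin"; the URL path here is hypothetical.
git remote add origin git://git/circus/clown-car

# 2. Read/write reference to the developer's own repo.
git remote add alex ssh://git/opt/repos/alex/circus/clown-car

# 3. Maintainers only: read/write reference to the authoritative repo via gitosis.
git remote add main git@git:circus/clown-car

git remote -v
remotes=$(git remote | sort | tr '\n' ' ')
```

With this in place, day-to-day work is pull from origin, push to alex, and (for maintainers) push integrated changes to main.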

Now because there are a number of steps involved here, we wrote a little command-line tool (as a RubyGem) that takes care of these steps all in one go. Right now, it's very specific to our setup at Evri, but I can see how we might extract a common tool (oh boy…another side-project…).

Old Developer, New Project

When it's time to create a new project, a developer usually starts out by creating a new Git repo on their local box while they're building out the initial version. Once it's time to share, that developer can create a new developer repo on our central Git machine just by SSH'ing in and making the appropriate directory, adding the remote ref to their local repo and pushing changes. Again, our internal tool sets this up all in one go.

Once that project is ready to have an authoritative repo, a gitosis-enabled user will pull the latest changes for the gitosis-admin project. They'll edit the gitosis.conf file to add the new project (and members) and commit the changes. Then the developer adds a new remote ref (a read/write one as the git user) and pushes changes out to the main repo.

Sharing Patches

The most common flow is when developers share patches with each other. There are a couple of ways to do it. Developers keep their developer repository up-to-date and email requests to their teammates to pull changes from their repo (a la GitHub). Developers may also choose to share patches via email using git format-patch, git send-email and git am.
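The email route is easier to picture with a concrete round trip. This is a minimal sketch with two throwaway local repos standing in for a maintainer and a developer; the repo names, file names and identities are invented, and the actual mailing step (git send-email) is skipped so the patch file is simply handed over on disk.

```shell
#!/bin/sh
# Sketch: export a commit with format-patch and apply it elsewhere with am.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the maintainer's repo, with one commit.
git init -q upstream
cd upstream
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "honk" > horn.txt
git add horn.txt
git commit -q -m "Add horn"
cd ..

# The developer clones it and commits a change.
git clone -q upstream dev
cd dev
git config user.name "Dev"
git config user.email "dev@example.com"
echo "honk honk" > horn.txt
git commit -q -am "Make the horn louder"

# Write every commit after HEAD~1 as a mailbox-style patch file.
git format-patch -q -o "$work/patches" HEAD~1
cd ..

# The maintainer applies the patch; author and message travel with it.
cd upstream
git am -q "$work"/patches/*.patch
subject=$(git log -1 --format=%s)
author=$(git log -1 --format=%an)
echo "applied: $subject (by $author)"
```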

Ultimately the maintainers are responsible for gathering patches from developers and integrating them back into the authoritative repositories.

git-cycle.png

For folks used to version control systems like CVS or Subversion this feels like an awful lot of hoop-jumping. One part that I can't diagram or explain as a series of technical bullet-points is the stewardship the pro-Git folks have to take on. People who are switching to Git get frustrated by the byzantine nomenclature and steep learning curve, so a big part of the change and setup is simply helping people out when they get stuck.

Posted in Git

Using Git as a Safety Net

March 14th, 2009

I spent the last week on a top-secret iPhone application at work. It has been a blast, in part, because it’s been so fun to learn so much new information so quickly. That has meant trying out lots of ideas and, more often than not, rolling them back and trying again. The problem is that doing this kind of experimentation can be an absolute productivity-killer in terms of managing your changes—unless you have a good tool to manage large chunks of changes, you can spend a lot of time trying to do it manually (and probably screwing it up in the process).

While doing Java development, I’ve often used Eclipse’s file history feature to roll things back. The downside is that it’s file-by-file and not easy to tag the current state of your entire project in one go. It appears that Xcode has this feature, but since I’m using Git for my SCM, I figured why not use that instead?

So my new flow has been to stage changes to the index whenever I make any progress and the app is still in a working state. This is different from a commit, which I still like to think of as a succinct, whole change around a particular feature. The incremental staging is more like dribbling micro-changes to the next stage prior to committing. It can take several attempts to get to a real, first-class commit.

I like using magit, so I keep Emacs running in the background. When I get to a good checkpoint, I stage hunks or entire files to the index. When I get enough to make a full commit, I commit them. If you’re running git exclusively on the command-line, this would be the equivalent of using git add -i.
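For the command-line crowd, here is a minimal sketch of the checkpoint-and-roll-back loop using plain git add rather than magit or git add -i. The file name and contents are made up; the point is that git checkout -- restores a file from the index, not from the last commit.

```shell
#!/bin/sh
# Sketch: use the index as a checkpoint between real commits.
set -e
work=$(mktemp -d)
cd "$work"
git init -q app
cd app
git config user.name "Alex"
git config user.email "alex@example.com"

echo "v1" > view.m
git add view.m
git commit -q -m "Initial view"

# Some progress, and the app still works: checkpoint it in the index.
echo "v2 (works)" > view.m
git add view.m

# An experiment that does not pan out.
echo "v3 (broken)" > view.m

# Throw the experiment away by restoring the staged version of the file.
git checkout -- view.m
restored=$(cat view.m)
echo "working tree is back to: $restored"

# Once enough checkpoints add up to a whole change, commit for real.
git commit -q -m "Rework the view"
```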

When I first started using Git, I couldn’t see the value in Git’s stage-commit-push workflow. Now I get it. Like most Git tricks, this one is probably obvious to a lot of folks, but for me it’s been a real life-saver on this project. I’ve been able to be much more cavalier with experimentation because I can easily revert changes with a single keystroke. Nifty!

Posted in Git

What Jersey Means To Java

March 4th, 2009

In the last few days at work I’ve been migrating a home-grown REST framework over to the Jersey project (the reference implementation of JSR-311, a.k.a. JAX-RS). Previously I had done some work moving JRuby into the VM and launching Merb. It was satisfying to figure out how to do that, but involved an awful lot of wiring and special-casing.

As anyone who has read this blog recently knows, the luster is coming off of Java for me in a big way. These days my goals are simply to co-exist with it in a way that keeps me happy. I’m not going to be able to toss Java overboard so my working-life becomes a question of how to be happy with the situation.

While our home-grown framework has worked well for us, I’d much rather see us using something with wider adoption and use. So I took a look at Jersey and came away pretty impressed. Looking at how the API is built and what you need to do to build REST-ful web services in Java I couldn’t help but feel like Java has learned some lessons from the rest of the world.

The original Servlet API is what could be termed a “classic” Java API, wherein consistency and type-safety are the rules. In the Servlet API, you interact with requests and responses as monolithic objects. Each type of HTTP method (e.g. GET, POST, PUT and DELETE) has its own handler method (doGet, doPost and so on), which takes an HttpServletRequest and an HttpServletResponse object.

It’s a reasonable first solution, but it breaks down quickly. The primary issue is that by packing all possible request-related data into a single request object, and all possible response-related data into a single response object the classes themselves start to feel bloated. A more sinister second-order effect is that these classes, especially the response object, have some hidden state and “gotchas” because they are so general.

As an example, consider the case where you want to return a response with a non-200 response code, an entity body and some headers. The order in which you set these on the response object is important because of how it’s implemented. You need to set the status and headers before you write any output to the stream. This makes sense because you want a stream-oriented interface, which means not storing the entire response in memory and then flushing. But, due to the strict order of HTTP response messages, you need to flush the header information prior to sending the body. It’s not a particularly difficult thing to remember, but it is unnecessary mental overhead that is an artifact of the implementation.

The request object is its own special brand of fun to deal with. Again its broad coverage makes for a rather clunky API. You want request parameters? You have to grovel through String arrays if you want to capture all of them. When implementing a Servlet, often what you want is some combination of request parameters, cookies and headers; rarely do you need all of them at once. However what you get is one big über-object that has everything. Enjoy!

One final beef with the ServletRequest class is that getting path parameters out is an absolute nightmare. If you’re building REST resources you really really care about the path as it is the way to identify resources. The poor support the ServletRequest class provides for this is simply shocking. Here’s a String — you parse it and figure out what the hell the segments are.

In contrast, Jersey has a much looser philosophy with how requests and responses are handled. First, the monolithic request and response objects go away. Instead your methods provide the narrowest possible interface, expressing only what they need in exacting terms. This is done by making extensive use of Java annotations to mark up simple method parameters. For example, if you have a resource that needs a request parameter, use the @QueryParam annotation. Need a header? Just use @HeaderParam. Interested in cookies? Use the @CookieParam annotation. Here’s an example:

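public MyResult getMyResult(@QueryParam("name") String name) {
  // "name" has already been pulled out of the query string for us
  return buildResult(name); // buildResult() is a hypothetical helper
}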

This does away with a tremendous amount of “busy-work”. You want the “name” parameter? By god you’re going to get it—no intermediate objects to reach into and pull stuff out of.

A really nice side-effect of this is that writing tests for your resources becomes so much cleaner than with the more general Servlet API. Instead of setting up the monolithic request and response objects, you simply pass in the bits you want.

What about the response side? I think that it’s in this area that we really see a fundamental philosophical shift emerge. In an earlier time the textbook answer to how to design an API like this would be to have each request method return some kind of superclass or interface. This would allow us to use polymorphism to vary the response, but keep our type-safety.

Jersey takes a different approach based on real-world needs. Whereas the Servlet API is intended to keep everyone equally happy by making sure everyone suffers the same amount of pain, Jersey lets you vary what you return—no special markers, no special configuration. If you have a simple case where you want to serialize an object as the entity response, just return the object.

What if you need to fiddle with the response some more? Maybe set some headers or alter the status code? In this case your method returns a Response instance. Now this may sound monolithic and perhaps, under the covers, it is. What saves it from degenerating into the Servlet API is the fact that you build a Response object with only as much as you need.

In the Servlet API you might have to set a moderately complicated response like this:

public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
  // do some stuff
  // headers and status have to be set before anything is written to the body
  resp.addHeader("Expires", computeExpires());
  resp.setStatus(201);
  writeResponse(resp.getWriter(), new MyStuff());
}

private void writeResponse(Writer writer, MyStuff stuff) {
  // do whatever you have to do to serialize your object
}

In Jersey it looks like this:

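public Response getMyStuff() {
  MyStuff stuff = new MyStuff();
  // Response.created() takes the URI of the new resource rather than an entity,
  // so set the 201 status and the entity explicitly.
  // Here computeExpires() needs to return a java.util.Date.
  return Response.status(201).entity(stuff).expires(computeExpires()).build();
}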

The amount of code probably comes out to be the same, but the fact that you can build the response in a single line feels really good to me. The amount of vertical space dedicated to response-building is much more proportional to its conceptual space in the method in Jersey than in the Servlet API.

Now I realize that comparing Jersey to the Servlet API is perhaps unfair. The Servlet API is really designed to operate a layer below where Jersey plays. But I think the same arguments apply to a majority of the other popular web frameworks. The models are all essentially the same.

So here’s where the philosophical sea-change comes in. To vary the response type to do the right thing means that after a decade of existence, people are finally embracing Java’s reflection capabilities. Up until pretty recently this was considered the domain of the lunatic fringe.

The primary argument against reflection has always been that it’s too slow. While it’s true that there can be a penalty under certain circumstances, the modern perspective is that it’s an acceptable price to pay for a kinder, gentler API. Simply put, this level of dynamism is essential for reducing the amount of boilerplate and busywork.

I think if other frameworks like Rails, Django and Merb hadn’t gained so much traction in the last few years Java would have continued on its merry way. However I think the popularity of those alternatives has forced the Java greybeards to learn a thing or two. Strict static type-safety can become a burden and using reflection to vary behavior instead of polymorphism can make for some very clean APIs.

After spending a few days with Jersey I find myself a little re-energized to work with Java again. We’re even considering Groovy as a way to cut down Java’s chubby syntax for another small productivity win. Of course, my preference would be to do this in JRuby since I don’t find Groovy to be terribly compelling on its own. But since Jersey is so dependent on annotations, using JRuby is a non-starter.

Who knows? This could be a complete fiasco and another frustrated attempt at tilting at windmills. I’m already a little suspicious of the “enterprisey” way this thing gets configured. However if nothing else, it’s refreshing to see some new ideas make their way into Java after all these years.

Posted in Java, REST

You Put Merb In My Jetty!

February 11th, 2009

In the latest update of The Chronicles of Stuff Alex Figures out at Work, our intrepid hero figures out how to run Merb inside an embedded Jetty instance!

Now you may ask yourself, “for the love of God, why would you want to do something like this?” Well, at work we do a lot of internal web services. For my particular team, we’ve found a real sweet-spot by using an embedded Jetty server sitting right next to a BDB instance. There are no extra processes or packages to manage (e.g. Apache or an RDBMS). However we were becoming dissatisfied with our current web layer, which is a homegrown REST framework that sits on top of the Servlet API. So in a fit of rage, I decided to see if I could stuff Merb in the middle of this mess.

You may also be asking yourself, “why not use the jruby-rack gem directly?” The answer is that the jruby-rack gem makes a lot of assumptions about how you want to run your application. First it assumes that you’re cool with packaging things up as a WAR (which I’m not) and, secondly, that your application is primarily a Rails/Merb application. In my case, for better or worse, our app is really a BDB application with a Merb app glommed onto the side for web visibility.

The Solution

I can’t take complete credit for this solution. If I hadn’t found Jan Berkel’s post on putting Rails in Jetty I would have never figured out how to stuff Merb in there. To give yourself some context, take a look at that post first. Then take a look at the “Merb-ified” version of the same recipe below. Both solutions assume that you’re configuring Jetty within JRuby.

require 'java'

# Class locations assume Jetty 6 and jruby-rack on the classpath.
# Environment is this project's own path helper (not shown).
java_import org.mortbay.jetty.Handler
java_import org.mortbay.jetty.servlet.Context
java_import org.mortbay.jetty.servlet.DefaultServlet
java_import org.mortbay.jetty.servlet.ServletHolder
java_import org.jruby.rack.merb.MerbServletContextListener

server = org.mortbay.jetty.Server.new
thread_pool = org.mortbay.thread.QueuedThreadPool.new
thread_pool.min_threads  = 5  # adjust as needed
thread_pool.max_threads  = 50
server.set_thread_pool(thread_pool)
context = Context.new(nil, "/", Context::NO_SESSIONS)
context.add_filter("org.jruby.rack.RackFilter", "/*", Handler::DEFAULT)
context.set_resource_base(Environment.resolve)
context.add_event_listener(MerbServletContextListener.new)
context.set_init_params(java.util.HashMap.new('merb.root' => Environment.resolve,
    'merb.environment' => 'production',
    'public.root' => Environment.resolve('public'),
    'gem.path' => Environment.resolve('gems'),
    'org.mortbay.jetty.servlet.Default.relativeResourceBase' => '/public',
    'jruby.max.runtimes' => '1'))
context.add_servlet(ServletHolder.new(DefaultServlet.new), "/")
server.set_handler(context)
server.start

Tweaking

At first blush our performance seemed to be pretty lacking. This required two tweaks: putting Merb in “production” mode and dealing with poor I/O due to logging. In the previous snippet you will notice that we set the merb.environment to production. Yes we lose the quick dev turnaround, but since there is a lot of Java in this project we usually have to recompile, which requires a restart anyway (phooey).

As for the I/O issue, a little digging revealed that shutting up Merb as much as possible would help reduce the amount of JRuby-level IO. In our config/init.rb we configure logging like so:

# One assignment per line: a trailing comma after each value would chain the
# assignments together and turn the values into nested arrays.
Merb::Config.use { |c|
  c[:environment]         = 'production'
  c[:framework]           = {}
  c[:log_level]           = :warn
  c[:log_file]            = Merb.root / "logs" / "merb.log"
  c[:use_mutex]           = false
  c[:session_store]       = 'cookie'
  c[:session_id_key]      = '_facet-store_session_id'
  c[:session_secret_key]  = '49411912879b879e13f89a9280c0f6aaa2e3ab58'
  c[:exception_details]   = true
  c[:reload_classes]      = false
  c[:reload_templates]    = false
}

Here we set the environment to “production” again (yes, you need to do both). We also raised the log level to “warn”, which significantly reduced the amount of logging Merb does. With these tweaks in place we found that the Merb port of our service was operating at about 80% of the level of performance we were getting from our pure-Java solution.

Benchmarking was done by running httperf tests against the resources we expose and comparing both the number of requests per second and the average response time. Given that the options for generating XML, HTML and JSON were all so much easier than what we were doing in the servlet version, we were willing to live with the performance hit.

Rewriting History with Git

January 31st, 2009

This past week I spent some quality time with git’s history-rewriting capabilities. Over the past few weeks I had been working on a rather long-lived branch full of JRuby and Merb patches. Some of the fixes and changes were ready to go in the next release, others were still a wee bit experimental and so my plan was to split the patches in two. The ones that were ready would get pushed upstream while the not-ready-for-primetime stuff stayed on a local branch.

That seemed like a great plan until I realized just how tangled some of the patches were. It isn’t difficult for this to happen. As you extend the functionality of a system, you often refine earlier work. This was especially true in my case since we were introducing JRuby to a previously Java-only project and so there was a lot of experimentation and refinement. What I was really trying to do was re-write my commit history to separate the changes. Small cleanup commits could be collapsed with others to make the entire commit-set something that my teammates could easily understand.

Getting Started

The most basic kind of re-writing you can do is simply amending your last commit. On the command line this is accomplished with git commit --amend. The bigger rewrite-hammer is git rebase -i <ref>. This command will pop off all of your commits to the point specified by <ref>, provide you with a control file to edit and then re-apply your patches as directed.
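As a quick illustration of the smaller hammer, here is a sketch that folds a forgotten file into the previous commit with git commit --amend. The repo, files and messages are invented for the example.

```shell
#!/bin/sh
# Sketch: amend the most recent commit instead of adding a follow-up commit.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Alex"
git config user.email "alex@example.com"

echo "a" > a.txt
git add a.txt
git commit -q -m "Add a"

# Oops, b.txt belonged in that commit too.
echo "b" > b.txt
git add b.txt
git commit -q --amend -m "Add a and b"

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "$count commit(s), latest: $subject"
```

The amended commit replaces the old one, so the history still shows a single commit.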

The control file (a term I just made up) looks kinda like this:

pick 1cea777 Initial introduction of Merb.
pick 5fe7a19 Favor XML over HTML as the "default" content-type.
pick 6791134 Cleaned up the 'views' directory, out with the old, in with the new.
pick ae6adac Put lots of back-navigation links to make it nice 'n' easy to use.
pick 1da31ca Removed last vestiges of the RestServlet-related configuration and code.
pick b9fe3b9 Added error views, updated error-handling and improvised content dispatch.
pick bf92292 Routing cleanup. This is much more pleasant to read.
pick ec74c63 An attempt to get RSpec working with Maven and JRuby.

# Rebase 92ddc87..ec74c63 onto 92ddc87
#
# Commands:
#  p, pick = use commit
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

The top section lists all of your patches, one per line. You can edit this section to do any of the following:

  • Reorder the commits (just move lines up or down)
  • Edit a commit (replace “pick” with “edit” or simply “e”)
  • Remove a commit (just remove the line)
  • Merge with a previous commit (replace “pick” with “squash” or “s”)

Think of this list like a program of execution. Once you save the file and return control to git, it will attempt to execute this program. I say attempt here because sometimes git runs into conflicts when it comes to re-ordering patches.
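If you want to watch that "program" run without an editor in the way, here is a sketch that scripts the edit. GIT_SEQUENCE_EDITOR is the hook git uses to open the control file; I point it at a sed one-liner that deletes the first line, which drops the oldest of the two commits. The repo contents are made up.

```shell
#!/bin/sh
# Sketch: drop a commit by deleting its line from the rebase control file.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Alex"
git config user.email "alex@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
echo "one" > one.txt
git add one.txt
git commit -q -m "Experimental spike"
echo "two" > two.txt
git add two.txt
git commit -q -m "Routing cleanup"

# sed stands in for the human editor: remove line 1 ("pick ... Experimental spike").
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1d'" git rebase -i HEAD~2

subjects=$(git log --format=%s | tr '\n' '|')
echo "$subjects"
```

Because the two commits touch different files, removing one does not produce a conflict; the experimental file is simply gone afterwards.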

Editing Commits

When you mark a commit for editing (with “edit” or “e”) git will attempt to apply all patches up to and including that commit, and then stop. This threw me at first because for some reason I had the unreasonable expectation that somehow the changes in the last commit would only be applied to the working tree and, perhaps, the index. Instead, that commit is fully applied, but all commits past that are pending. To add to my confusion, there isn’t an easily accessible marker to indicate that you are in the middle of a rebase (unless you frequently scan the .git directory as a matter of habit). At this point you could amend the commit (with git commit --amend) or insert other commits.

Sometimes I use this just to fix up the commit message. When I’m working on a new feature that has a lot of trial-and-error I tend to mark the first commit message with “WIP” to remind myself to review that patch and the ones following to see if I can clean them up. This is an area where git really shines. In a system like Subversion your audience (other coders who look at your changes) ends up walking through whatever little mini-epic you’ve composed as you try things out, add things and remove things. This makes for some difficult reading for consumers of those patches and so a lot of folks tend to stack up big changes and then send in the big über-commit.

The problem with building the mega-patch is that you don’t have a lot of scaffolding under you while you are building it. If you go off and explore something that doesn’t turn out right, you generally have to do a lot of manual reverting. This simply sucks and is a waste of time. With git I can work in lots of incremental commits, then go back and edit them into something sensible once I’m ready to push changes.

Once you’ve finished doing whatever edit you want to do for that commit, simply type git rebase --continue and the remainder of the “script” will execute. If any other “edits” are in the pipeline, the process will repeat itself until the rebase is completed.
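Here is the whole edit cycle as a sketch: mark the first commit as "edit", reword it while the rebase is stopped, then continue. As before, a sed one-liner plays the part of the editor, and the commits are invented.

```shell
#!/bin/sh
# Sketch: stop a rebase at a commit, amend it, and continue.
set -e
export GIT_EDITOR=true   # never block waiting for an interactive editor
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Alex"
git config user.email "alex@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "WIP: first try"
echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add docs"

# Change "pick" to "edit" on line 1, so the rebase stops after the WIP commit.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD~2

# The WIP commit is fully applied at this point; give it a proper message.
git commit -q --amend -m "Add feature"

# Run the rest of the "script".
git rebase --continue

subjects=$(git log --format=%s | tr '\n' '|')
echo "$subjects"
```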

Squashing Commits

I looooove squashing commits in git. For any moderately complicated work, it’s rare that I get it right the first time. Usually as I go along I find some mistake I made or find a refinement that cleans up the original work. As often as not, it’s usually a couple of commits away so amending the last commit isn’t going to help me. But hey, no worries! I simply commit the change and leave a log message for myself to merge it with another commit. Then, when I run the interactive rebase, I can simply move this commit up the list and change it from “pick” to “squash” (or “s”) and let git merge the two commits.

When rebasing encounters a “squash” it will apply the changes in both commits. If successful, git will prompt you with a commit message file that includes the original commit messages of both commits. You can choose either, both or neither of these and save the file to continue.
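The move-it-up-and-squash routine looks something like this when scripted. The sed expression is just my stand-in for doing it by hand: it holds line 2, turns line 3 into a squash, and puts line 2 back after it. GIT_EDITOR=true accepts the combined commit message as offered. File names and messages are hypothetical.

```shell
#!/bin/sh
# Sketch: move a late fix up next to the commit it belongs to and squash it.
set -e
export GIT_EDITOR=true   # accept the combined commit message unchanged
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Alex"
git config user.email "alex@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
echo "GET /" > routes.txt
git add routes.txt
git commit -q -m "Add routes"
echo "index" > views.txt
git add views.txt
git commit -q -m "Add views"
echo "POST /" >> routes.txt
git commit -q -am "Fix routes (squash into Add routes)"

# Control file before: 1 = Add routes, 2 = Add views, 3 = Fix routes.
# After the sed edit: Add routes, squash Fix routes, Add views.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2{h;d;}' -e '3{s/^pick/squash/;G;}'" \
  git rebase -i HEAD~3

count=$(git rev-list --count HEAD)
route_lines=$(grep -c . routes.txt)
git log --format=%s
```

Four commits go in and three come out, with both route changes living in one commit.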

Resolving Conflicts

Sometimes when applying a commit, git will run into conflicts and bail out on a merge. Whenever this happens to me I always have a little moment of panic as if I’ve broken something, but fear not: you can fix this. When this happens git tries to stage as many of the changes as it can. Any conflicts are left unstaged and need to be edited (look for the standard conflict markers), and then re-staged into the index. Then run git rebase --continue. Git should show you the original commit message of the patch it was attempting to apply, which you can edit or keep, and once that commit succeeds the rebase carries on with the remaining commits.
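To take some of the panic out of it, here is a sketch that provokes a conflict on purpose and then resolves it. It uses a plain (non-interactive) rebase of a topic branch onto master, since the recovery steps are the same; the branch names, file and contents are invented.

```shell
#!/bin/sh
# Sketch: hit a rebase conflict, fix the file, stage it, and continue.
set -e
export GIT_EDITOR=true   # keep the original commit message when continuing
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the example
git config user.name "Alex"
git config user.email "alex@example.com"

echo "port=80" > server.conf
git add server.conf
git commit -q -m "Base"

git checkout -q -b topic
echo "port=8080" > server.conf
git commit -q -am "Use port 8080"

git checkout -q master
echo "port=8000" > server.conf
git commit -q -am "Use port 8000"

# Both branches changed the same line, so this rebase has to stop.
git checkout -q topic
if git rebase master >/dev/null 2>&1; then
  echo "expected a conflict" >&2
  exit 1
fi
git status --short   # the conflicted file is listed as unmerged

# Resolve by writing the content we want (no conflict markers left), then stage.
echo "port=8080" > server.conf
git add server.conf
git rebase --continue

subjects=$(git log --format=%s | tr '\n' '|')
echo "$subjects"
```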

The Big Red Button

Sometimes you may just give up on a rebase. Either you can’t commit the time to it, you don’t really want to go through with it, or your patience has reached its limit. At any point, before the rebase has completed, you can execute git rebase --abort and your working tree, index and commit-log will all be returned to the state prior to starting the rebase. Think of this as the big red “stop” button for rebasing.
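A small sketch of the big red button in action: start a rebase, stop partway through, abort, and confirm that the branch points exactly where it did before. The commits are made up and sed again stands in for the editor.

```shell
#!/bin/sh
# Sketch: abort a rebase midway and verify nothing moved.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Alex"
git config user.email "alex@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
echo "one" > one.txt
git add one.txt
git commit -q -m "One"
echo "two" > two.txt
git add two.txt
git commit -q -m "Two"

before=$(git rev-parse HEAD)
branch_before=$(git symbolic-ref --short HEAD)

# Stop after the first commit, leaving the rebase half-finished.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD~2

# Changed our minds: put everything back.
git rebase --abort

after=$(git rev-parse HEAD)
branch_after=$(git symbolic-ref --short HEAD)
echo "before=$before after=$after"
```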

Playing Fast and Loose with Branches

If you come from other SCM systems like Subversion, you tend to treat branches as expensive and something you only use on occasion. With git, branches are cheap and easy like drinks in Tijuana. Be fearless! Not sure about some exploration? Make a branch!

If rebasing makes you nervous, and you’re not entirely confident that aborting will save you, I suggest creating a branch on which to rebase. Simply create a new branch from where you’re at with git checkout -b <branch>. You’ll be immediately switched to that branch with your commit log looking exactly like the branch you came from. Here you can rebase, edit, remove, add and generally go crazy with your commits.

Once you’ve got your commits where you want them it’s simply a matter of getting them back into whatever you consider your “main” branch to be and pushing any changes upstream. This works really well when you have a rendezvous point like a GitHub repo or an SVN server (we do a lot of git on top of Subversion at work). Assuming that you’ve been adding changes on your master branch, you might create a new branch off of master called exp. On the exp branch you can rebase and wreak all sorts of havoc on the commit log. Once you have your commits where you want them, switch back to your master branch and reset it to point to the head of your upstream repo. If you’re using git all the way, simply merge your exp branch over. If you’re running git on top of Subversion, simply cherry-pick the commits on the exp branch (in order!!!) and push the changes upstream.
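Here is the all-git variant of that dance as a sketch. A local branch named upstream stands in for the head of the shared repo (in real life that would be a remote-tracking branch), and the commits are invented. The Subversion variant with cherry-pick is not shown.

```shell
#!/bin/sh
# Sketch: clean up history on a scratch branch, then bring it back to master.
set -e
export GIT_EDITOR=true   # accept the combined squash message unchanged
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the example
git config user.name "Alex"
git config user.email "alex@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Base"
git branch upstream   # hypothetical stand-in for the shared repo's head

echo "one" >> app.txt
git commit -q -am "WIP: first pass"
echo "two" >> app.txt
git commit -q -am "WIP: second pass"

# Go crazy on a scratch branch; master keeps the original commits for now.
git checkout -q -b exp
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/'" git rebase -i upstream

# Happy with exp: point master back at upstream and merge the clean history.
git checkout -q master
git reset -q --hard upstream
git merge -q --ff-only exp

count=$(git rev-list --count master)
echo "master now has $count commits"
```

If the rebase on exp had gone badly, deleting the branch would have been the whole cleanup, since master was never touched until the end.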

It’s hard to stress how important it is to shift your mindset into thinking in terms of sharing patches. Treat your commits like individual, digestible changes. Even if you’re working by yourself or in a small team, I still think there’s value in being disciplined with your commits.