Team API Development

Just for a laugh, and in my spare time, I am going to develop an API for our VCE CTO team.  I’m going to do this “in the open” which means making live edits to this page.   That means you’ll see mistakes and missing bits, but that’s ok.  Feel free to give me pointers because if you’ve taken the time to read this then I’ll surely take the time to listen to you.

If you’re wondering why I’m doing this, let me quickly explain – this isn’t an official company project, it’s just an exercise to explore the development of an API using a real-world case that I am familiar with.  If it works, the end result might well be useful; if not, nobody gets hurt and I might be just a little bit wiser, and every little bit helps.

The structure of this page is simple – the greyed-out ones haven’t been started yet!

  • The development approach
  • The design assertions
  • Some code (yay!)
If you don’t want to listen to my prattling dialogue below, which is kinda just brain dumped out with what little personal filter I have, then I will be summarising each section in sub-pages so you can get straight to the meat’n'potatoes.  Later.

Development approach

From reading a lot of material on API design, I’ve come to the conclusion (solidified by reading Apigee/Brian Mulloy’s API Facade pattern) that I need to follow three simple steps:

  1. Design the front end of the API – the URLs, request parameters, responses, payloads, headers, query parameters etc.  This should be self-consistent.
  2. Implement the design with data stubs so you can test drive it without the complexity of connecting to the back end systems.
  3. Integrate between the API facade and the real back end systems and resources.

Visually, it looks like this:

From one big problem into three little problems
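
To make those three steps a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the names `Backend`, `StubBackend` and `TeamApiFacade`, the single route, and the stub data are all invented for this example and are not part of any finished design. The facade is the front end (step 1), the stub stands in for the back end (step 2), and a real back end could later be swapped in behind the same facade (step 3).

```python
from typing import Optional, Protocol


class Backend(Protocol):
    """Hypothetical interface: anything that can look up a person by name."""

    def fetch_person(self, name: str) -> Optional[dict]: ...


class StubBackend:
    """Step 2: canned data, so the design can be test driven without real systems."""

    def __init__(self) -> None:
        self._people = {"steve-chambers": {"name": "steve-chambers", "bio": "stub bio"}}

    def fetch_person(self, name: str) -> Optional[dict]:
        return self._people.get(name)


class TeamApiFacade:
    """Step 1: the front end the consumer sees. Step 3 passes in a real backend instead."""

    PEOPLE_PREFIX = "/api/cto/people/"

    def __init__(self, backend: Backend) -> None:
        self._backend = backend

    def get(self, path: str) -> tuple[int, dict]:
        # Only one route is sketched here: /api/cto/people/<name>
        if not path.startswith(self.PEOPLE_PREFIX):
            return 404, {"error": "unknown resource"}
        person = self._backend.fetch_person(path[len(self.PEOPLE_PREFIX):])
        if person is None:
            return 404, {"error": "no such person"}
        return 200, person


api = TeamApiFacade(StubBackend())
status, body = api.get("/api/cto/people/steve-chambers")
```

The consumer only ever talks to `TeamApiFacade`, so replacing `StubBackend` with something real should not change what the consumer sees.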

However, I’m going to add a step before this, and that’s to take note of industry experts and their writing and go through a process to flush out as many considerations and assertions as possible.  This way I reduce the risk (a bit) of me doing something daft early on, combined with the pragmatic Occam’s Razor / OOP approach of only doing what I need to do at first, marking other important-but-not-urgent things for later.  I don’t know what I don’t know, so best get knowing.

Design assertions

There are people out there that have been “doing” APIs for years, and it makes sense to listen to these folks before embarking on this voyage.

Take the inimitable George Reese, for example: he’s been both consuming and producing APIs for years through his work as enStratus CTO.  In this (beautiful) age of sharing and openness George took the time out to pen his candid thoughts in his book The REST API Design Handbook.  This isn’t George saying “I know best, so STFU and do what I say”, instead it was George just sharing his thoughts and views and, well, if you don’t like it then by inference you can go boil your head but we’re still friends, even though your head has been shrivelled up.

Really, though, the main reason I’m starting with George as a guide is I don’t want to end up in this chapter he wrote:

Here are a lot of pieces from George’s book that I am using as design assertions (thank you, George! And for the REST of you, go buy his book!).

I’m using content from other sources too and will name them as much as possible because I’m really grateful for them sharing.

  • Audience
    • The API’s primary audience is other business units and functions within VCE.
    • This makes this a private API.
    • This use case expects the audience within VCE to be application users mostly, people consuming information.
    • Developer users, coding applications to consume the API, will be the same people as those that develop the API in this first use case.
    • The CTO team are the primary producers of information that is made visible by the API.  If they aren’t writing it, they are the ones surfacing it.  If there is content in the CTO API space that’s not authored by the CTO team, there’s something wrong.
    • This initial, focused audience is just for the API development test and maybe one day it will grow up to be a public API and have a more developer focus.
    • Signup for the API should be via a simple web page that is linked into Role Based Access Control.
    • If you sign up as a CTO guy, then you get to POST and PUT content in the CTO API space.  If you sign up as someone who is not a CTO, then you get to GET, HEAD, and OPTIONS content in the CTO API space.
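
The sign-up rule above could be sketched as a simple lookup from role to permitted verbs. The role names `cto` and `consumer` are hypothetical, and I am assuming that CTO team members can read as well as write, which the rule above does not spell out.

```python
READ_ONLY = frozenset({"GET", "HEAD", "OPTIONS"})

# Hypothetical role names. Assumption: CTO team members may read as well as write.
ALLOWED_METHODS = {
    "cto": READ_ONLY | {"POST", "PUT"},
    "consumer": READ_ONLY,
}


def is_allowed(role: str, method: str) -> bool:
    """Return True if the role may use the HTTP method in the CTO API space."""
    return method.upper() in ALLOWED_METHODS.get(role, frozenset())
```

An unknown role gets nothing at all, which seems like the safe default for a private API.
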
  • Don’t Do Anything Weird
    • Use convention – normal HTTP verbs and codes, OAuth and not some mad science security scheme.
  • Less Is More
    • Start with minimal functionality, then add (but never take away!) functionality in subsequent versions.
  • Richardson’s Maturity Model
    • The design goal is to
  • HATEOAS
    • Hypermedia as the Engine of Application State.
    • Clients should not bookmark “deep” URIs, such as http://api.vce.com/api/cto/corner-plant.
    • Clients should find resources by using hyperlinks to find the URN (corner-plant).
    • This is *important* because it allows the system to grow and change, changing URIs but always keeping the same URN.
    • HATEOAS allows you to move home and always be found.
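
Here is a small sketch of what "follow the links, don't bookmark" might look like from the client side. The in-memory `DOCUMENTS` dictionary pretends to be the API, and its link layout is invented purely for illustration; a real client would fetch each document over HTTP instead.

```python
from typing import Optional

# A pretend API: each URI maps to a document listing hyperlinks to child resources.
# The link layout is made up for this example.
DOCUMENTS = {
    "http://api.vce.com/api": {"links": ["http://api.vce.com/api/cto"]},
    "http://api.vce.com/api/cto": {"links": ["http://api.vce.com/api/cto/corner-plant"]},
    "http://api.vce.com/api/cto/corner-plant": {"links": []},
}


def find_by_urn(root: str, urn: str, documents: dict) -> Optional[str]:
    """Walk hyperlinks breadth-first from the root until a URI ending in the URN turns up."""
    queue, seen = [root], set()
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        if uri.rstrip("/").rsplit("/", 1)[-1] == urn:
            return uri
        queue.extend(documents.get(uri, {}).get("links", []))
    return None
```

If corner-plant moves to a different URI, the client still finds it as long as the root end point links to its new home.
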
  • The API will meet the six REST constraints identified by Fielding:
    • Client/Server – these can evolve independently from each other with limited, well-structured modes of communication between them.  This should be renamed Machine:Machine decoupling, because as well as being an important separation between Consumer <-> API, it is also important between API <-> Back End Systems like persistent data stores.
    • Stateless – The API has no concept of state.  The external systems (consumer, backend) take care of that.
    • Caching – API should indicate what can be cached.
    • Uniform Interface – George describes this as a complex one, causing battles between the RESTafarians and API devs.  Mainly: don’t redefine HTTP verbs, stick with the pattern!
    • Layered System – this is what the API is really for, to use a Facade pattern to hide the complexity of back-end stores of resources from the consumer (in my use case, anyway!).
    • Code on Demand – George describes this as bizarre because it is an “optional constraint” (oxymoron?).  What it means is you can return executable code to the consumer, which sounds pretty good to me in some use cases (think of an adaptive system).
  • There are six HTTP verbs used to interact with the API.  The key here is: one URI; many verbs.
    • GET – to retrieve a resource via a unique URI.
    • POST – to add a new resource to the model at a new unique URI.
    • PUT – to update a resource that already exists at a unique URI in the model.
    • DELETE – I’ll let you work that one out :)
    • HEAD – just return the Response headers, no body.  Useful for testing.
    • OPTIONS – part of the “self documenting” approach, a Resource can respond with what HTTP verbs/request details it accepts.
  • This API is resource-based, not service-based, as per the REST design constraint.
    • You can access the team artefacts like people bios and written papers.
    • You cannot execute the “give_the_gal_a_raise()” method on  a staff object :)
  • The API end point
    • This is the root of the API and must be designed first.
    • The root url is http://api.vce.com/api.
    • This root end point will point to child resources, which may be a list of other supported APIs.
    • The consumer should be able to discover everything about the API from the API.
  • The resource model
    • Don’t tightly couple the resource model to the back end (real?) resource implementation.  No tight coupling!
    • Think of the resource model from the consumer view: the API is for the consumer, primarily.  What should their view of your universe be?  Why not ask them?
  • The methods
    • GET – two consecutive gets return the same resource.
    • POST – creates, does NOT update.
    • PUT – updates, does NOT create.
    • DELETE – I wish this worked on cranky 3yr olds.
    • But WAIT!  Should these verbs apply to all resources?  Maybe only in certain permission-based ways – i.e. an author can DELETE and PUT to his resource, but other consumers cannot?
    • In the cases where the verbs don’t work (e.g. I can’t delete JimBob’s documents) then the API returns an HTTP Error Code, not a fancy HTML page.  Got it? :)
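
A rough sketch of those method rules, using an in-memory store instead of a real HTTP server. The class name `ResourceStore` is hypothetical, and the specific failure codes (404, 403, 409) are my own assumptions about which HTTP codes fit; the rules themselves only say that an HTTP error code comes back.

```python
class ResourceStore:
    """In-memory stand-in for the resource model, keyed by URI."""

    def __init__(self) -> None:
        self._items = {}  # uri -> (author, body)

    def get(self, uri):
        if uri not in self._items:
            return 404, None
        return 200, self._items[uri][1]

    def post(self, uri, author, body):
        # POST creates, does NOT update (409 Conflict is an assumed choice).
        if uri in self._items:
            return 409, None
        self._items[uri] = (author, body)
        return 201, body

    def put(self, uri, requester, body):
        # PUT updates, does NOT create.
        if uri not in self._items:
            return 404, None
        if self._items[uri][0] != requester:
            return 403, None  # only the author may change the resource
        self._items[uri] = (requester, body)
        return 204, None

    def delete(self, uri, requester):
        if uri not in self._items:
            return 404, None
        if self._items[uri][0] != requester:
            return 403, None  # I can't delete JimBob's documents
        del self._items[uri]
        return 204, None
```
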
  • Representations
    • XML or JSON?  I loved George’s comment of “Don’t drag me into your dirty little war” :)
    • Give the consumer the choice by supporting the Accept header MIME type, and tell them what’s coming back with the Content-Type HTTP header.
    • Consider that some resources (long ones needing pagination, for example) are better in one format (XML) than others (JSON).
    • Use an HTML representation for those pesky humans using a browser.
    • Matching Accept with Content-Type is a best effort, after all the consumer isn’t always right…
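
Best-effort matching of Accept to Content-Type might look something like this. It is a deliberately simplified sketch: it understands `q=` weights and `*/*` but ignores partial wildcards such as `text/*`, and the fallback to JSON is my own assumption rather than a design decision.

```python
SUPPORTED = ["application/json", "application/xml", "text/html"]


def choose_content_type(accept: str, default: str = "application/json") -> str:
    """Pick the supported MIME type the client prefers, else fall back to a default."""
    candidates = []
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        mime, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unreadable weight as "not wanted"
        candidates.append((q, mime))
    # Highest weight first; sorted() is stable, so ties keep the client's order.
    for q, mime in sorted(candidates, key=lambda c: -c[0]):
        if q > 0 and mime in SUPPORTED:
            return mime
        if q > 0 and mime == "*/*":
            return default
    # Best effort: nothing matched, so send the default anyway.
    return default
```

Whatever this returns is what goes into the Content-Type header of the Response.
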
  • Response Codes and Error Handling
    • Minimal response to a request is an HTTP code, even if there’s no content.
    • Do NOT embed custom response codes into the body of the message or (worse) the URI. Stick. To. HTTP. Or. Die.
    • Do use the HTTP response codes AND put more detail in the message body to describe the error and, most importantly, what to do about it!
    • Success – GET returns 200, 204, 206.  HEAD returns 204.  POST returns 201, 202.  PUT returns (rare) 201, 202, 204.  DELETE returns 202, 204.
    • Failure of the request – HTTP 4xx codes (it’s the consumer’s fault)
    • Failure of the system – HTTP 5xx codes (file a bug with the provider)
  • API must cover the full problem domain
    • In my use case, this is quite limited so we focus on the primary goal (developing an API) compared to solving a big hairy problem of which API is but one facet.
    • The problem domain here is limited to the people and artefacts in a team, and exposing them in a programmatic way.
    • People might look human, but really they’re just an astral projection to abstract the fact they are a complex set of resources :)  Things / resources to access about a person: their name, title, bio, picture, other interesting and as yet unknown resources they are masking!
    • Artefacts is an abstract name for things like documents, papers, presentations etc.  This is the real crux of the problem: these artefacts are liberally spread around many persistent data stores, sometimes mistakenly referred to as collaboration suites.  The API should liberate these resources to the consumer by abstracting away the back end complexity.
  • Resource identifiers – URIs, URLs and URNs
    • The URI is the full name and address of a resource, containing the URL + URN.
    • The URL is the street address of the resource, and can change (mutable).
    • The URN is the name of the resource, and cannot change (immutable).
    • Example: http://api.vce.com/api/cto/people/steve-chambers
      • The URI is http://api.vce.com/api/cto/people/steve-chambers
      • The URL is http://api.vce.com/api/cto/people/
      • The URN is steve-chambers
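
Using the definitions on this page (URI = URL + URN), the split is a one-liner. The function name `split_uri` is just something I made up for the example.

```python
def split_uri(uri: str) -> tuple[str, str]:
    """Split a URI into (URL, URN) using this page's definitions."""
    url, _, urn = uri.rstrip("/").rpartition("/")
    return url + "/", urn


url, urn = split_uri("http://api.vce.com/api/cto/people/steve-chambers")
```
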
  • Authentication
    • HTTP Basic Authentication is ruled out (why: SSL, Logout, Client retains auth info)
    • HTTP Digest Authentication is ruled out (why: SSL)
    • Public Key Authentication is ruled out (why: complexity)
    • Credentials in Request is ruled out (why: creds should not be passed in each Request)
    • Token Keys with Request Signing (of all headers and payload for zero collisions) are selected.
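
To show roughly what request signing involves, here is a sketch using HMAC-SHA256 from the Python standard library. The choice of HMAC-SHA256, the canonical string layout, and the function names are all my assumptions; nothing above fixes the actual signing scheme yet. The idea is that the client and the API share a secret tied to the token, and the signature covers the method, path, headers and payload.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, headers: dict, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the method, path, headers and payload."""
    lines = [method.upper(), path]
    # Lower-case and sort header names so both sides build the same string.
    for name, value in sorted(headers.items(), key=lambda kv: kv[0].lower()):
        lines.append(f"{name.lower()}:{value.strip()}")
    message = "\n".join(lines).encode("utf-8") + b"\n\n" + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, headers: dict,
                   body: bytes, signature: str) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    expected = sign_request(secret, method, path, headers, body)
    return hmac.compare_digest(expected, signature)
```

Because the secret itself never travels with the Request, this avoids the "creds in every Request" problem, and any tampering with headers or payload changes the signature.
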
  • Versioning
    • Never break existing client code.  Never deprecate.  Support all old clients.
    • Version numbering should be easy for humans and machines, with date format (year-month-day) being best.
    • Negotiation – not in the URI or a query parameter, instead client uses a custom HTTP Request header (x-vce-version).
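
One way the date-based negotiation could work, as a sketch only. The release dates are made up, and two rules are my own assumptions: a client asking for a date gets the newest version released on or before that date, and a client that sends no `x-vce-version` header gets the latest version.

```python
from datetime import date

# Hypothetical release dates; each one is an API version that stays supported forever.
SUPPORTED_VERSIONS = [date(2012, 8, 1), date(2012, 11, 15), date(2013, 2, 1)]


def resolve_version(headers: dict) -> date:
    """Pick the newest supported version that is not later than the one requested."""
    requested = headers.get("x-vce-version")
    if requested is None:
        return max(SUPPORTED_VERSIONS)  # assumption: no header means latest
    wanted = date.fromisoformat(requested)  # year-month-day, e.g. "2012-12-25"
    eligible = [v for v in SUPPORTED_VERSIONS if v <= wanted]
    # A request older than the first release falls back to the first release.
    return max(eligible) if eligible else min(SUPPORTED_VERSIONS)
```

A client written against the 2012-11-15 behaviour keeps sending that date and keeps getting that behaviour, no matter how many versions ship afterwards.
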
  • Asynchronous Operation (deferred)
    • Although the team API is a facade and should be loosely coupled in both resource model and operation, I’m going to leave Asynchronous Operation until later, as a more advanced topic.
    • I have in mind some future operations that will really need this, but crawl-walk-run, eh?
  • Rate Limiting (not required).
  • Polling and Event Notifications (deferred)
    • Polling should be avoided because it’s a waste.
    • Event Notifications should be used so that clients can register their interest and be notified by the API.
    • Clients should be able to register their interests in resource state changes (GET, PUT, POST and DELETE).
    • Notifications should be asynchronous and require implementation of a message queue.
    • Message queue should run externally to the API.
    • API has no knowledge of message queues or who has registered for what, it just sends notification data structures to the external “Notification System”.
    • The list of recipients should not be a data structure but should be a queue itself for a notification delivery system to process.
    • The recipient should be able to verify the authenticity of the message.
    • This feature is important but not urgent and is deferred to stage 2 or 3 of crawl-walk-run.
  • Pagination and Streaming (deferred)
    • The data structures in the Response from the API to the client are small sized.
    • The message body in the Response can be large as it is a document from a remote system.
    • In future, system-specific data structures (e.g. list of product components) will be returned and this topic will need to be addressed.
  • Documentation
    • Should be delivered through the API, starting by pointing a browser at the root end point URI (http://api.vce.com/api).
    • Should describe how authentication works, with real examples in different languages, showing responses, headers, signing mechanisms etc.
    • Should describe all HTTP headers and status codes used with rationale.  Missing info = perceived ignorance.
    • Enumerate all resources and methods that apply to each of those resources.
    • Provide API call examples with real stuff, like unit-test output.

One Comment

  • Ryan Schipper
    Posted 22 August, 2012 at 08:22

    Hi Steve,

    I found this project via Mark O’Neill’s blog entry (http://www.soatothecloud.com/).

    I was wondering why you ruled out public key authentication as too complex for the project. That is, what are the specific bugbears?

    Feel free to email me.

    Full disclosure: My employer sells an authentication product that uses public key authentication (http://www.keyvault.com.au).
