Thoughts on development & design: api

Showing posts with label api. Show all posts

The video for the talk I gave at the 2018 API Conference is now available online.

I have talked about a bit before, as well as sharing the slides, but one of my main take-aways is that we are all (mostly) in the business of building products - on a daily basis, whether we are coding, write docs, tests, change requests, specifications, designs - there is almost always an end product of our work, and the product decisions we make whilst building it has a direct impact on the end-users (people will have to read/amend your code, read your specifications, translate your designs, consume your APIs etc). With that in mind, it seems sensible that we look at what lessons we can take from the discipline of Product Management to help us make smart decisions in our day-to-day.

Last week I attended and spoke at the 2018 API Conference in Berlin.

Having written about the topic before, my talk was titled: Your API as a Product - Thinking like a Product Manager (really aimed at engineers/architects/technologists).

It was recorded, so hopefully I will be able to share the video in the future, but the thrust of the talk was based around the concept that we are all building products on some level, and even if we don't have direct input into a commercial product that we might be writing code for, we have some output: code, bug reports, designs etc. So it makes sense, if we are all building products, and those products will all have users and therefore a user experience (your code, bug reports, design docs etc will be used by others, or maybe you yourself will be the user), that we try and learn from the discipline of Product Management, as it is focussed on building better products and better user experiences for the end users.

Slides are below, and really your best bet is to scroll down to the references section at the end and start watching the real Product Managers' talks.

API Conference Berlin 2018 - Product Management for Engineers from Rob Hinds

Something I have been thinking about for a while is the idea of APIs as your Product - the idea that you should design, build and test your API like it's a public facing product (as it might actually be if its a public API), and what lessons we should be learning from Product Managers about how we achieve this.

In the latest instalment of their "Technology Radar", ThoughtWorks recommended the concept "API as a Product" as ready to trial (although I think it's far beyond trial stage - I can see no real risk of adopting the lessons from Product Management for your API).

I have generally been interested in Product Management for several years - I think it's hard not to get interested in the science of Product Management when you are building side projects, as you naturally have to think about the product you are building - sure, with side projects they can easily turn into vanity projects, where you just build what you think is cool, but I think it's still fun to think about real users and good Product Management techniques that can be applied even to side-projects.

Having tried to gain some understanding, I have turned to some classic material on the subject of Product Management: The Lean Startup, Don't Make Me Think, How to Build a Billion Dollar App. I have also been lucky enough to attend Mind The Product Conference twice (though volunteering - but managed to watch lots of the talks), and to have worked with several great Product Managers.

So what lessons can we start applying? I think, if one statement sums it up, its an idea from Kathy Sierra:

Does your product make your users awesome?

This is a great way to think about APIs (I'm mostly talking about REST/GraphQL type APIs here - but this thinking is a great way of approaching design for normal class/interface/library APIs too).

As engineers, I'm sure we have all used APIs that don't achieve this - APIs that make us have to think, that make us feel stupid, that seem overly complicated (for example, the old Java Date library - the number of times I had to google just to find out how to simply create a specific date for something simple like a unit test is ridiculous). And hopefully, we have all used APIs that make us feel awesome - APIs that make us productive, that let us do what we expect to be able to do, and can (with the help of the IDE's auto-complete) let us guess what the method calls are.

Another idea Kathy Sierra mentions in that talk, and is also made famous from the title of the Product Management classic, is:

Don't make me think

I have written before about the StackOverflow and GitHub APIs, but they deserve a mention again, if you have ever written an integration with them, you will know. The APIs are designed in such a way that sensibly models the domain model.

For example, take StackOverflow, if we were going to describe their domain model, what first class objects would we have? I guess Questions, Answers, Comments etc - and sure enough, their API models these:

/questions - will get you all the questions on the site

/answers - will get you all the answers,

Now, how do you think you might get all answers for a given question? Well, a list of answers should probably be a child of a question, so something that represents that, something like:

/questions/{Question ID}/answers

Sure enough, thats the URL - and the behaviour is consistent through the design, I'm sure you can guess how you might retrieve all comments for a given answer.

I'll be honest, developing against an API where I can guess the endpoints makes me feel pretty awesome.

MVP and the Lean Product

The idea of a Minimum Viable Product is a core concept in Lean Startup - the idea that you build the minimum version of the product that can get you feedback as to whether you are heading in the right direction before investing more effort in building out a fuller fledged product.

Whilst I think an appreciation of an MVP is a good awareness for any engineer - I have generally found it is useful in facilitating conversation when building a new product - if as an engineer you recognise some new feature is going to be considerable effort, its helpful to think about what hypothesis the feature is trying to test, and is there a simpler engineering solution to test the hypothesis.

However, the principal of not over engineering the API is also a good one - once you have a well designed domain model, its quite tempting to get carried away with REST design best practices and start building endpoints for all different variations of the model - however, unless you need them, it's better to start with less (if you are building a public REST API from the start then it might be harder to know what isn't needed, but you can still certainly build an MVP version of the API to test the hypothesis - e.g. whether people actually want to use your API)

User Testing

Another important part of building a product is collecting user feedback - Whilst a REST API may not be well suited to A/B testing (your consumers won't be very happy if switch the API design on them after they have implemented against a specific variation!), listening to user feedback is still important - because the consumers will likely be engineers (writing the client code), then hopefully they will be expecting REST standards and good API design principles, but another way to think about this is the importance of not making breaking changes to your API.

A technical way of thinking about this is Consumer Driven Contracts Testing - As RESTful APIs are de-coupled from the client code, the API doesn't have any idea what clients are using or doing, so this approach allows consumers to define what they expect from the API, and then the API can always test against these contracts, so be sure that any change to the API is still in keeping with existing consumers, or users, of the API.

Conclusion

I think, in general, all engineers can benefit from thinking about how good products are built, what concepts are important, and be able to think about the products or services they are building in this way - both in terms of thinking about the actual implementation as a product that someone else will be the user of (either externally if a RESTful API, but also internally if its cross-team or even just someone coming to take over your code once you have moved on).

This is very much an opinionated rant about APIs, so it's fine if you have a different opinion. These are just my opinions. Most of the examples I talk through are from the Stack Exchange or GitHub API - this is mostly just because I consider them to be well designed APIs that are well documented, have non-authenticated public endpoints and should be familiar domains to a lot of developers.

URL formats

Resources

Ok, lets get straight to one of the key aspects. Your API is a collection of URLs that represent resources in your system that you want to expose. The API should expose these as simply as possible - to the point that if someone was just reading the top level URLs they would get a good idea of the primary resources that exist in your data model (e.g. any object that you consider a first-class entity in itself). The Stack Exchange API is a great example of this. If you read through the top level URLs exposed you will probably find they match the kind of domain model you would have guessed:

/users
/questions
/answers
/tags
/comments
etc

And whilst there is no expectation that there will be anyone attempting to guess your URLs, I would say these are pretty obvious. What’s more, if I was a client using the API I could probably have a fair shot and understanding these URLs without any further documentation of any kind.

Identifying resources

To select a specific resource based on a unique identifier (an ID, a username etc) then the identifier should be part of the URL. Here we are not attempting to search or query for something, rather we are attempting to access a specific resource that we believe should exist. For example, if I were to attempt to access the GitHub API for my username: https://api.github.com/users/robhinds I am expecting the concrete resource to exist.

The pattern is as follows (elements in square braces are optional):
/RESOURCE/[RESOURCE IDENTIFIER]

Where including an identifier will return just the identified resource, assuming one exists, else returning a 404 Not Found (so this differs from filtering or searching where we might return a 200 OK and an empty list) - although this can be flexible, if you prefer to return an empty list also for identified resources that don’t exist, this is also a reasonable approach, once again, as long as it is consistent across the API (the reason I go for a 404 if the ID is not found is that normally, if our system is making a request with an ID, it believes that the ID is valid and if it isn't then its an unexpected exception, compared to if our system was querying filtering user by sign-up dates then its perfectly reasonable to expect the scenario where no user is found).

Subresources

A lot of the time our data model will have natural hierarchies - for example StackOverflow Questions might have several child Answers etc. These nested hierarchies should be reflected in the URL hierarchy, for example, if we look at the Stack Exchange API for the previous example:
/questions/{ids}/answers

Again, the URL is (hopefully) clear without further documentation what the resource is: it is all answers that belong to the identified questions.

This approach naturally allows many levels of nesting as necessary using the same approach, but as many resources are top level entities as well, then this prevents you needing to go much further than the second level. To illustrate, let’s consider we wanted to extend the query for all answers to a given question, to instead query all comments for an identified answer - we could naturally extend the previous URL pattern as follows
/questions/{ids}/answers/{ids}/comments

But as you have probably recognised, we have /answers as a top level URL, so the additional prefixing of /questions/{ids} is surplus to our identification of the resource (and actually, supporting the unnecessary nesting would also mean additional code and validation to ensure that the identified answers are actually children of the identified questions)

There is one scenario where you may need this additional nesting, and that is when a child resource’s identifier is only unique in the context of its parent. A good example of this is Github’s user & repository pairing. My Github username is a global, unique identifier, but the name of my repositories are only unique to me (someone else could have a repository the same name as one of mine - as is frequently the case when a repository is forked by someone). There are two good options for representing these resources:

The nested approach described above, so for the Github example the URL would look like:
/users/{username}/repos/{reponame}

I like this as it consistent with the recursive pattern defined previously and it is clear what each of the variable identifiers is relating to.
Another viable option, the approach that Github actually uses is as follows:
/repos/{username}/{reponame}

This changes the repeating pattern of {RESOURCE}/{IDENTIFIER} (unless you just consider the two URL sections as the combined identifier), however the advantage is that the top level entity is what you are actually fetching - in other words, the URL is serving a repository, so that is the top level entity.

Both are reasonable options and really come down to preference, as long as it's consistent across your API then either is ok.

Filtering & additional parameters

Hopefully the above is fairly clear and provides a high level pattern for defining resource URLs. Sometimes we want to go beyond this and filter our resources - for example we might want to filter StackOverflow questions by a given tag. As hinted at earlier, we are not sure of any resources existence here, we are simply filtering - so unlike with an incorrect identifier we don’t want to 404 Not Found the response, rather return an empty list.
Filtering controls should be entered as part of the URL query parameters (e.g. after the first ? in the URL). Parameter names should be specific and understandable and lower case. For example:
/questions?tagged=java&site=stackoverflow

All the parameters are clear and make it easy for the client to understand what is going on (also worth noting that https://api.stackexchange.com/2.2/questions?tagged=awesomeness&site=stackoverflow for example returns an empty list, not a 404 Not Found). You should also keep your parameter names consistent across the API - for example if you support common functions such as sorting or paging on multiple endpoints, make sure the parameter names are the same.

Verbs

As should be obvious in the previous sections, we don’t want verbs in our URLs, so you shouldn’t have URLs like /getUsers or /users/list etc. The reason for this is the URL defines a resource not an action. Instead, we use the HTTP methods to describe the action: GET, POST, PUT, HEAD, DELETE etc.

Versioning

Like many of the RESTful topics, this is hotly debated and pretty divisive. Very broadly speaking, two approaches to define API versioning is:

Part of the URL
Not part of the URL

Including the version in the URL will largely make it easier for developers to map their endpoints to versions etc, but for clients consuming the API it can make it harder (often they will have to go and find-and-replace API URLs to upgrade to a new version). It can also make HTTP caching harder - if a client POSTs to /v2/users then the underlying data will change, so the cache for GET-ting users from /v2/users is now invalid, however, the API versioning doesn’t affect the underlying data so that same POST has also invalidated the cache for /v1/users etc. The Stack Exchange API uses this approach (as of writing their API us based at https://api.stackexchange.com/2.2/)

If you choose to not include the version in your API then two possible approaches are HTTP request headers or using content-negotiation. This can be trickier for the API developers (depending on framework support etc), and can also have the side affect of clients being upgraded without knowing it (e.g. if they don’t realise they can specify the version in the header, they will default to the latest). The GitHub API uses this approach https://developer.github.com/v3/media/#request-specific-version

I think this sums it up quite nicely:

I wish more people would ask how to avoid breaking changes in HTTP APIs instead of asking how to version them.
— Darrel Miller (@darrel_miller) March 13, 2016

Response format

JSON is the RESTful standard response format. If required you can also provide other formats (XML/YAML etc), which would normally be managed using content negotiation.

I always aim to return a consistent response message structure across an API. This is for ease of consumption and understanding across calling clients.

Normally when I build an API, my standard response structure looks something like this::

[ code: "200", response: [ /** some response data **/ ] ]

This does mean that any client always needs to navigate down one layer to access the payload, but I prefer the consistency this provides, and also leaves room for other metadata to be provided at the top level (for example, if you have rate limiting and want to provide information regarding remaining requests etc, this is not part of the payload but can consistently sit at the top level without polluting the resource data).

This consistent approach also applies to error messages - the code (mapping to HTTP Status codes) reflects the error, and the response in this case is the error message returned.

Error handling

Make use of the HTTP status codes appropriately for errors. 2XX status codes for successful requests, 3XX status codes for redirecting, 4xx codes for client errors and 5xx codes for server errors (you should avoid ever intentionally returning a 500 error code - these should be used for when unexpected things go wrong within your application).

I combine the status code with the consistent JSON format described above.

At Twitter's Flight conference this week, they announced a couple of new toys for developers: Fabric and Digits.

Fabric is a new platform for building mobile apps

The Fabric platform is made of three modular kits that address some of the most common and pervasive challenges that all app developers face: stability, distribution, revenue and identity.

It seems everyone has to be building a platform these days, but really it sounds like Twitter, having previously shut down developers on their API/platform a while ago, realised that actually letting devs on their platform (now that it includes advertising APIs - thanks to their MoPub acquisiting a while ago) is going to be profitable for them.

Maybe it will be well received, and the way the market changes maybe people will go for it - but I wouldn't be building an app based around a Twitter platform given their track record. The company are a lot older, and maturer now (and they probably released their API too early last time around), but it still seems like quite a risk to have something as central as login/revenue tied to Twitter (I dont like third party logins anyway..).

Anyway, the much more interesting product they released was Digits - which allows sign-in with your mobile number - which as I have ranted before, is an idea that I love, and seems like a really smart move by Twitter. Both in as much that it will move twitter to that space of smartphone is your social-identity/network and also offers a lot more value than just another platform that serves ads and supports login etc.

I have recently been working on an android app and found myself needing a Twitter style input box that allowed certain words to be looked up using an API driven autocomplete but also supporting freetext (if you have used the compose tweet input on android, like this - free text, but if you type @ then you get a user name autocomplete).

I google'd for a while, and stumbled through a few StackOverflow answers but none of there were 100% clear, and further more, just copy-pasting the code into a dummy project I was working in didn't work. So here is a break down of the what and how this works.

There are four main components:

The XML Layout: MultiAutoCompleteTextView

This is simple - just add the MultiAutoCompleteTextView just like you would any input. The only point of interest here is the "completionThreshold" - this is just the number of characters that have to be typed before auto-complete kicks in. We have set this to one char so it kicks in early.

We then just setup the auto-complete in our Activity onCreate

SetTokeniser: UsernameTokenizer

This is a custom class that implements the Tokenizer interface. This will be used by our autocomplete imput box to work out whether or not it should be displaying a dropdown menu. If you are using a static list for lookups (countries, fixed codes from your app, etc) then this is the only thing you really need to do - then you can just set the fields ArrayAdapter as the list of Strings etc and it will automatically kick in.

In our case, we are using the @ character to identify the start of pieces of text that should be lookedup and spaces to determine the end of the look up token.

The three methods are relatively straight forward - and will drive when your app presents the drop down:

terminateToken(CharSequence text) - this basically just provides a presentable version of our token with a proper terminator at the end (e.g. makes sure a trailing single space in this case)
findTokenStart(CharSequence text, int cursor) - Just finds the position of the start of the token. It does this by iterating backwards through the provided CharSequence (the text input in the input box) starting from the position of the cursor, which is just the position of the last character edited (this means if you go back and edit text in the middle of a block it still finds the correct token). If no valid token start is found (e.g. we go backwards and we can't find a @ character) then the current position is returned - no dropdown is displayed.
findTokenEnd(CharSequence text, int cursor) - As above, but finds the end position. Iterates forward until a token terminator (in our case a space) or the end of the text is found

As long as you implement these to support your token identification pattern then you will get a dropdown appearing appropriately.

Adding a text changed listener

For ease of use on this one, I have just set the activity to implement the TextWatcher interface - the reason for this is just convenience - the implementation is going to handle calling our API asyncronously, so it is easier if it has the activity context.

There are three methods that need to be implemented - four our case there will be two no-op methods and just one implementation:

The method onTextChanged is implemented - unfortunately the tokenizer will only indicate to the application when to display the dropdown - it doesn't actually call the API, so we have to slightly repeat ourselves here in handling the API invocation. In this method we need to again check for and find the relevant valid token in the input, and if a valid token found, then pass it to our API to lookup the dataset.

To help with that, I also added some additional methods to help check for the existence of a valid token (reading them should be self explanatory, but comments included inline)

In this case, as it was just an experiment, and I didn't actually have a user lookup API I have just used the LinkedIn skills API (this code also taken in part from a StackOverflow answer).

The Async task should also be straight forward, and can be implemented in whatever pattern you are using for Async calls - but the key point to note is how we are updating our dropdown list.

The API driven, multi-autocomplete/free text field should then be working as expected

Thoughts on development & design

Your API as a Product: Thinking like a Product Manager [VIDEO]

API Conf 2018 - Product Management for Engineers

Building a RESTful API like a Product Manager

MVP and the Lean Product

User Testing

Conclusion

RESTful API Design: An opinionated guide

URL formats

Resources

Identifying resources

Subresources

Filtering & additional parameters

Verbs

Versioning

Response format

Error handling

Twitter API & Smartphone as your identity

Building a Twitter style autocomplete for Android

The XML Layout: MultiAutoCompleteTextView

SetTokeniser: UsernameTokenizer

Adding a text changed listener

ROB HINDS

Popular Posts

Blog Archive

DZone MVB