Sunday, November 24, 2013

On the Uniform Access Principle

The UAP is not a good idea for generalised use, and Properties (e.g. C#, ObjC) should be avoided. There, I said it :) I think this is the post with the most non-standard viewpoint so far, so I wanted to state it very clearly up front. I look forward to hearing why my logic is wrong (I'm guessing there are a number of reasons), so as always, please add feedback after reading this post.

Uniform Access Principle definition

Firstly, some background. The Uniform Access Principle was coined by Bertrand Meyer, the creator of the Eiffel language, and states:
All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation. 
I'm not 100% certain of the exact reasons why this was considered the best idea, but from reading some sources it seems that the benefits derive primarily from abstracting the source of the data away from callers - i.e. if you call getFoo(), you don't need to know whether the value is stored directly or computed, which also means that the provider of the data can change that source without having to change all callers.
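As a minimal sketch of what the principle looks like in practice, here is a Python version using a property. The class names StoredFoo and ComputedFoo, the doubling calculation and the describe() helper are all made up for illustration; the only point is that the caller writes obj.foo in both cases.

```python
class StoredFoo:
    """Hypothetical type whose foo is held directly in storage."""

    def __init__(self, foo: int) -> None:
        self.foo = foo  # plain stored attribute


class ComputedFoo:
    """Hypothetical type whose foo is computed on every access."""

    def __init__(self, base: int) -> None:
        self._base = base

    @property
    def foo(self) -> int:
        # Arbitrary calculation, chosen only for the example.
        return self._base * 2


def describe(obj) -> str:
    # The caller uses the same notation for both types and cannot
    # tell from the call site which one is stored and which is computed.
    return f"foo={obj.foo}"
```

Swapping StoredFoo for ComputedFoo requires no change to describe(), which is exactly the benefit (and, as argued in the rest of this post, the problem).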

On hiding the data source

The main criticism of the UAP seems to be that hiding the cost of computing some data can be problematic - e.g. there is a big difference between returning a member and computing it via an RPC to somewhere that has to do a long calculation before the value is known. This is certainly a problem, and even more so in a network-based client-server world, but in itself it seems to be mostly solved by simply not hiding expensive calculations in get() methods. That is, getCommentCount()#int that requires a database index look-up is better written as queryCommentCount(Callback<int>)#void - callers know this will take a while, must provide an asynchronous solution, and everyone is happy.
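A rough sketch of the query-style alternative, in Python. CommentStore is a hypothetical class, and the in-memory list stands in for the database look-up. Unlike the #void signature above, this version returns the worker thread so that the example can wait for completion; that is a convenience of the sketch, not part of the idea.

```python
import threading
from typing import Callable, Iterable


class CommentStore:
    """Hypothetical store; the list stands in for a slow database index."""

    def __init__(self, comments: Iterable[str]) -> None:
        self._comments = list(comments)

    def query_comment_count(self, callback: Callable[[int], None]) -> threading.Thread:
        """Compute the count off the calling thread and pass it to callback."""

        def work() -> None:
            # A real implementation would do the expensive look-up here.
            callback(len(self._comments))

        worker = threading.Thread(target=work)
        worker.start()
        return worker  # returned only so callers in this sketch can join()


# Usage: the callback shape makes it obvious the answer arrives later.
results = []
store = CommentStore(["first!", "nice post"])
store.query_comment_count(results.append).join()
```

The name query_comment_count and the callback parameter already tell the caller that this is not a cheap member read.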

On hiding the dependencies

The more fundamental problem I see is making a calculated value appear like a normal member, independent of the other members of the containing object. Firstly, if the calculation only looks up the value from an external source (i.e. it's not stored locally) then that's no issue, and is mostly solved by fixing the querying as mentioned above - e.g. having some way to convert an UnresolvedFoo into a ResolvedFoo, from which the required members can be retrieved locally.
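One possible shape for the UnresolvedFoo to ResolvedFoo conversion, sketched in Python. The field names (name, comment_count), the foo_id key and the fetch function are assumptions for the example; a real fetch would go over the network and would probably be asynchronous.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class ResolvedFoo:
    """All members are local, so reading them is cheap and cannot fail."""

    name: str
    comment_count: int


class UnresolvedFoo:
    """Holds only an identifier; nothing here pretends to be a local member."""

    def __init__(self, foo_id: str) -> None:
        self.foo_id = foo_id

    def resolve(self, fetch: Callable[[str], Mapping]) -> ResolvedFoo:
        # fetch is a hypothetical stand-in for the remote look-up.
        record = fetch(self.foo_id)
        return ResolvedFoo(
            name=record["name"],
            comment_count=record["comment_count"],
        )
```

The expensive step happens once, in resolve(), and the type of the object tells callers whether the data is available locally yet.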

No, the main problem comes from calculated 'members' which depend on other members of the object. The best example of this is a collection's size, but also things like displayName (= firstName + ' ' + lastName) or BMI (= mass/(height * height)). Having the value of a 'member' depend on the values of others embeds a scoping issue within it - if the value of one of the dependencies changes, so does the value of the calculated member, and this should be made clear to callers of the API.

On commutativity

To express this more clearly, there's a concept of commutativity of an API - that is, the order in which you call methods does not matter. Or, to put it another way, the methods are as independent/orthogonal as possible, which is good for maintainability, and is also theorised to improve scalability.

Obviously, there are some things for which this can't hold, but ideally they're as easy to spot as possible. For example, for a mutable member, setFoo() and getFoo() do not commute, but this is clear from the fact they operate on the same state. Calling setFoo() resets the foo scope, and everything that was using the value (displaying it on a screen, using it as input for a calculation, ...) is now invalid, and needs to acquire the new value.

So as long as our members are kept separate, commutativity is clear. With dependencies between a local member and a calculated member, however, it becomes much harder to see - how should a caller know that calling setFirstName() does not commute with getDisplayName()? This knowledge is required in order to have the correct value when needed (e.g. to reset the display name scope when the first name is changed), but it is exactly this dependency that the UAP is hiding!
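To make the display name case concrete, here is a small Python sketch. The Person class and the two helper functions are hypothetical, and the names used as data are just placeholders. The two helpers perform the same pair of calls in opposite orders and get different answers.

```python
class Person:
    """Hypothetical type: display_name looks like a member but is computed."""

    def __init__(self, first_name: str, last_name: str) -> None:
        self.first_name = first_name
        self.last_name = last_name

    @property
    def display_name(self) -> str:
        # Hidden dependency on first_name and last_name.
        return f"{self.first_name} {self.last_name}"


def read_then_write(person: Person) -> str:
    shown = person.display_name    # value captured, e.g. for a label on screen
    person.first_name = "Grace"    # silently makes the captured value stale
    return shown


def write_then_read(person: Person) -> str:
    person.first_name = "Grace"
    return person.display_name
```

Nothing in the notation person.display_name warns the caller in read_then_write() that the value it is holding has just been invalidated by a write to a different member.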

On serialization

Another, smaller area where the distinction between local members and computed properties is required is in serialization of each type, e.g. across a network or into persistent storage. When sending or storing an object, generally only the individual members are written out. The calculated properties remain calculated: the definition of the calculation (i.e. the function) is defined once for the type rather than stored with each value. This distinction (sent as values on the wire vs. sent as a function in code) is quite important given that cross-language/cross-platform member serialization is mostly a solved problem (see previous post) but there is no equivalent good solution for function definitions (...if you know one, it would be great to hear!).
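A small sketch of that split using Python's dataclasses and json modules. The Person type and the to_wire()/from_wire() helpers are made up for the example, and JSON is just one possible wire format. Only the declared fields are written out; display_name() lives in the code on each side.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Person:
    # Stored members: these are what goes on the wire.
    first_name: str
    last_name: str

    def display_name(self) -> str:
        # Calculated value: defined once in code, never serialized.
        return f"{self.first_name} {self.last_name}"


def to_wire(person: Person) -> str:
    # asdict() includes only the dataclass fields, not methods.
    return json.dumps(asdict(person), sort_keys=True)


def from_wire(data: str) -> Person:
    return Person(**json.loads(data))
```

Because display_name() is visibly a method, there is no question about whether it belongs in the serialized form.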

Immutability

It is important to add that immutable data structures are a special case for this - for those, I see much less of a problem in not knowing what is calculated and what is not, other than the calculation cost mentioned above. Because of the immutability, none of the values can change, and hence the update scoping between dependencies is no longer a problem. Note that immutable objects also have the nice property that, without a set() method, members with multiple dependencies do not have to decide their update semantics. In the example above, getBMI() depended on both height and mass, so it is not clear what setBMI() would do - it could lock mass and update height, or vice versa, or even some combination in the middle. One caution: these need to be actually immutable, not just behind an unmodifiable API. If the data can change, even if only remotely, then dependencies still need to be clear.
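For illustration, an immutable snapshot of the BMI example as a frozen Python dataclass. BodySnapshot and its field names are hypothetical, and the units (kilograms and metres) are my assumption. There is no setter at all, so the question of what setBMI() should do never comes up, and a change means building a new snapshot.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class BodySnapshot:
    mass_kg: float
    height_m: float

    def bmi(self) -> float:
        # Safe to expose: the inputs can never change on this instance.
        return self.mass_kg / (self.height_m * self.height_m)


before = BodySnapshot(mass_kg=80.0, height_m=2.0)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    before.mass_kg = 100.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# "Changing" a value means taking a new snapshot; the old one stays valid.
after = replace(before, mass_kg=100.0)
```

Note that frozen=True only protects the instance itself; if a field held a mutable object (a list, say), the caution above about being actually immutable would still apply.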

Summary

Seriously, don't use the UAP. Keep your mutable members as members, and make sure they're all orthogonal (i.e. the value of one does not depend on the value of another). Serialization and persistence will be much easier, callers will better know when values update, and correctness in general should be simpler. Try to define members that actually represent what you need, as changing them to calculated values will now be harder, not only because the clients will need to be updated, but also because they probably made assumptions about the life-cycles of the values they are using, which you just broke (and which wouldn't have been noticed if using the UAP). And take immutable snapshots whenever possible.