Saturday, August 19, 2017

Nine points for defining stateful systems.

I also talk about state in the context of HTTP in another blog post.

Reminder by liftarn

What exactly is state? Wikipedia says: 

A program is described as stateful if it is designed to remember preceding events or user interactions; the remembered information is called the state of the system. 
[...] 
[I]nformation about previous data characters or packets received is stored in variables and used to affect the processing of the current character or packet. This is called a "stateful protocol" and the data carried over from the previous processing cycle is called the "state". In others, the program has no information about the previous data stream and starts "fresh" with each data input; this is called a "stateless protocol".
In casual conversation with programmer friends, there always seems to be confusion surrounding stateful protocols and applications.  Programmers learn early about state while reading programming books, but we sometimes fail to see how it applies to the real world.  So here are 9 points to help in defining stateful systems.

1. A stateless application can be written on top of a stateful protocol.  A particular stateless application might be able to ignore stateful components provided by a stateful protocol, but this may not be possible if a protocol requires specific stateful mechanisms.


For example, there are many old-school games hosted on websites with no game saves.  These applications are stateless: every time the URL is reloaded, the player starts the game over.  No state is remembered even though the website might use HTTP cookies for analytics or advertising.  The game simply ignores stateful information provided by the browser over HTTP.  
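
As a rough sketch of that idea, here is a game handler that never reads the cookies it is sent.  The function name, the header dictionary, and the game fields are all made up for illustration and are not taken from any real game or framework.

```python
def start_game(request_headers):
    """Return a brand-new game regardless of what the request carries."""
    # A "Cookie" header may well be present (analytics, ads), but the game
    # never reads it, so no earlier visit can influence this one.
    return {"level": 1, "score": 0, "lives": 3}


fresh = start_game({})
returning = start_game({"Cookie": "tracking_id=abc123"})
```

Both calls produce the same starting position, which is the sense in which the application stays stateless on top of a stateful mechanism.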


2. Inversely, a stateful application can be written on top of a stateless protocol.  In this case stateful applications must supplement the utilized stateless protocol with a stateful mechanism.  It shouldn't be difficult to create stateful applications on top of even primitive protocols.  
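
One possible way to picture this, assuming a toy request/response exchange where each request is a plain dictionary: the "protocol" remembers nothing, and the application adds its own session token and an in-memory store.  The names `sessions` and `handle_request` are hypothetical.

```python
import secrets

# Hypothetical in-memory session store kept by the application,
# not by the protocol.
sessions = {}


def handle_request(request):
    """Handle one self-contained request and return a response dict."""
    token = request.get("session_token")
    if token not in sessions:
        # Unknown or missing token: start a new session for this client.
        token = secrets.token_hex(16)
        sessions[token] = {"visits": 0}
    sessions[token]["visits"] += 1
    return {"session_token": token, "visits": sessions[token]["visits"]}


first = handle_request({})
second = handle_request({"session_token": first["session_token"]})
stranger = handle_request({})
```

The second request is recognized only because the client sends the token back; a request without a token is treated as a new visitor.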


Additionally, sometimes a stateful application can ignore protocol stateful mechanisms and use its own. This happens in the real world frequently with legacy applications when security or performance needs do not fit legacy protocol offerings.  Seasoned engineers are sure to have seen many hacks and protocol abuses designed to get applications to work as needed.  


JOSE's JWT for cookies is a great example of using public/private key signatures as a stateful mechanism.  Just because a server doesn't store a particular piece of information doesn't mean it can't maintain state!  JWTs allow servers to "remember" information by being "reminded" by the client, as long as the server remembers its own public keys, which are then used to verify signed messages.  This looks different from traditional stateful cookies, but nowhere in JOSE cookie usage is there a hint of statelessness.  
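
The "reminded by the client" idea can be sketched with the standard library alone.  Note the substitution: this uses an HMAC with a shared secret instead of the public/private key signatures discussed above, and the token layout is a simplification, not the real JWT format.  `SECRET_KEY`, `issue_token`, and `read_token` are illustrative names.

```python
import base64
import hashlib
import hmac
import json

# Assumption: this key is the only thing the server has to remember.
SECRET_KEY = b"server-side-secret"


def issue_token(claims):
    """Serialize the claims and append a signature over them."""
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + signature


def read_token(token):
    """Return the claims if the signature verifies, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user": "alice", "cart_items": 3})
claims = read_token(token)
```

The server stores nothing per user, yet the state survives between requests because the client carries it and the server can check that it was not altered.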


3. A stateless protocol is still stateless even while delegating state to other protocols.  This is the philosophical approach of HTTP/1, which originally delegated state to external mechanisms and thereby maintained stateless purity.  But don't be confused! HTTP still relied on stateful software stacks, including stateful protocols like TCP and stateful components such as cookies.  HTTP's explicit deferment of state to the outside is what allowed early HTTP versions to maintain their claim of statelessness.  


4. An agnostic or optionally stateful approach is best defined by majority usage. It's mostly meaningless to talk about state if it's entirely up to individual applications.  But this rarely matches reality.  Most protocols are used consistently, either statefully or statelessly, across diverse applications.  When it comes to state, why split hairs when majority usage is usually overwhelmingly biased?  Instead, shouldn't we refer to the overwhelming majority? 


Programmers sometimes love being pedantic and pointing at edge cases, but most protocols are only used in very specific ways.  Applications using a particular stack of protocols will consistently use those protocols either statefully or statelessly.  


For example, almost all web applications use cookies, meaning the vast majority of web applications are stateful.  There are no web applications I use on a daily basis that are not stateful.  To say, "web applications are stateless" is absurd.  


HTTP is almost always used statefully via the cookie extension.  If one is to talk broadly of HTTP being stateless, why?  Why refer to statelessness when modern real-world use cases are overwhelmingly the opposite?  It's mostly useless to talk of HTTP being stateless other than to make the distinction between core features and the dependence of those features on stateful mechanisms.  


Moreover, cookies are long established in the HTTP specification, as cookies extended the original standard. Early HTTP revisions, which are now ancient in protocol terms, cited cookies as the stateful mechanism, acknowledging cookies as a part of the HTTP standard.  Is it really fair to say HTTP/1.* was entirely stateless when state was added in this era?  I would argue the introduction of cookies killed "stateless HTTP".  HTTP has been stateful since the introduction of cookies into the HTTP standard, and the march of adding stateful components didn't stop with cookies.


I have yet to work on a RESTful API or other backend use of HTTP where rate limiting, authentication, or other mechanisms requiring state aren't considered.  
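
To see why something like rate limiting cannot avoid state, consider a minimal fixed-window limiter.  This is only a sketch under simple assumptions (one process, time passed in explicitly as seconds); the class and its fields are hypothetical and not drawn from any particular library.

```python
class RateLimiter:
    """Allow at most `limit` requests per client in each `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        # The state: client id -> (window start time, requests seen so far).
        self.counters = {}

    def allow(self, client_id, now):
        start, count = self.counters.get(client_id, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so begin a new one.
            start, count = now, 0
        if count >= self.limit:
            return False
        self.counters[client_id] = (start, count + 1)
        return True


limiter = RateLimiter(limit=2, window=60)
results = [limiter.allow("client-a", now) for now in (0, 1, 2)]
```

Deciding whether the third request is allowed depends entirely on remembering the first two, which is exactly the kind of "preceding events" the Wikipedia definition describes.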


5. A protocol requiring state is stateful.  This should be obvious, but this seems to be an issue from time to time.  

If any "itsy bitsy teeny weeny" necessary part of a protocol is stateful, then I consider the whole protocol stateful.  It's black and white, no 50 shades of Grey, and no ifs, ands, or buts.  


6.  An opinionated protocol or application that favors state is almost certainly stateful.  This is an edge case, but I mention it to demonstrate the dominance of stateful protocols and applications.  


Protocols that are opinionatedly stateful, although perhaps optional, are best described as stateful.  
A protocol designed to be used efficiently with state is stateful.  The conservative stance must agree with the protocol's guidance.  


7.  Stateful components pollute their supersets.  Statefulness works itself up from the smallest components to the largest.


I could diagram a system like this:


protocols ⇒ applications ⇒ application stacks ⇒ ecosystems


Stateful components confer their statefulness to larger supersets.  At any point in my chain, if there is a stateful component, all larger supersets will be stateful.  The same is not true for stateless components, as stateless components do not confer statelessness to a stateful system.  Once a system requires state in any component at any level, the system as a whole is stateful.  


Yes, an individual application in an application stack might be stateless, but if you mention this to a programmer friend on your team, it is almost certainly to highlight its dependence on other stateful components.  My JSON HTTP API server might be stateless, but without a stateful database it's useless.  
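
The rule in this point can be written down as a small recursive check: a system is stateful if it keeps state itself or if any of its parts does.  The `Component` class and the example names below are illustrative assumptions, not a model of any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    keeps_state: bool = False
    parts: list = field(default_factory=list)

    def is_stateful(self):
        # One stateful part anywhere below is enough to make the whole stateful.
        return self.keeps_state or any(p.is_stateful() for p in self.parts)


api_server = Component("JSON API server")
database = Component("database", keeps_state=True)
stack = Component("application stack", parts=[api_server, database])
ecosystem = Component("ecosystem", parts=[stack])
```

Here `api_server.is_stateful()` is false on its own, while the stack and the ecosystem that contain the database both come out stateful.  Statelessness never propagates upward the same way, because `any` only needs one stateful part.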


Compare this to a biological cell.  Biological systems larger than any individual living component can be said to be "alive".  The point of distinction is at the level of the cell, below which "living" is no longer an appropriate classification.  Sure, there is pedantic grey space among biologists, but the conservative stance is that the cell is the fundamental unit for large living biological systems.  Organs are alive because cells are alive.  Organisms are alive because organs are living.  Colonies are alive because individuals are alive.  When I eat sugar, sugar does not confer its non-living state to my living body.  That would be silly, but so too is expecting stateless components to confer statelessness to larger stateful systems.  


8. Complexity and security beget state. Concerning complexity versus state, gains in protocol performance are almost always achieved by adding complexity.  It's difficult to make stateless protocol stacks performant.  Security too requires state.  


A protocol requiring state is the norm for any sufficiently complex system.  There are many capabilities only possible with state.  State is an early stepping stone to larger and more complex systems.  Only simple systems can squeak by and remain stateless.  


Your best performance and security benefits will almost inevitably require state.  As a protocol stack or application stack becomes more complex and performant, state is all but guaranteed to emerge.  





Which brings us to the final conclusion:


9. Statefulness is the norm for mature systems. Almost all everyday applications are stateful.  Statelessness is uncommon and the exception, usually reserved for immature systems or individual subcomponents.  


The world is full of stateful applications.  Almost everything is stateful!


Statelessness was more important when computer resources were limited.  In modern systems where resources are plentiful, the distinction is much less useful.  


Defining a stateful system is simple.  Is any part of the system stateful?  If the answer is yes, then it is stateful.  It's that simple.  



For more real-world examples, see my post on HTTP/2 and state.