Saturday, May 6, 2017

Is HTTP/2 a Stateful Protocol?

TL;DR:

  • HTTP/2 is a stateful protocol.
  • HTTP/1 was not originally a stateful protocol.  
  • Your existing HTTP/1 application is probably stateful anyway.

What is a stateful protocol or application?  Go see my other post on state and then come back here.

HTTP/1's Stateless History

HTTP/1 was deliberately designed to be stateless, long before the Internet was in common use. Work on HTTP started in 1989, and its first documented version was published in 1991, years before the Internet was in the average American home.  The point of keeping HTTP/1 stateless was simplicity: state introduces complexity, and complexity could have slowed adoption.  In the uncharted waters of the early World Wide Web, it was decided that defining stateful mechanisms was excessive until clear use cases were known.


By 1995 the Internet was booming, but before cookies existed websites had a huge issue: how do you keep track of customers over a stateless HTTP?  Since HTTP was stateless, there was no way for customers to "log in" unless each website created its own session management mechanism on top of HTTP.


For example, a site could include a unique secret string in every self-referencing link on the site. This secret string would be used to track a particular user and ensure that only someone with the string could access that user's account information.  Every page served to a logged-in user would have to re-include the secret string in each of its links back to the site.  If a single link left it out, the user would be logged out and would need to log in again.  Developers would have to design the entire application around stateful links, and this user information would pollute every link on every page. What a management nightmare!  There were also usability issues with such an approach.  The browser's URL bar would expose the secret string to the end user, something the user shouldn't have to care about.  And what if a customer shared a URL containing their secret string?  It could leak sensitive account information, an obvious usability and security problem.
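To make the URL approach concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `sid` parameter name, the in-memory `SESSIONS` table, and the helper functions are hypothetical and do not describe how any particular site did it.

```python
import secrets
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

# Hypothetical server-side table mapping secret strings to users.
SESSIONS = {}

def log_in(username):
    """Create a secret string for this user and remember it."""
    token = secrets.token_urlsafe(16)
    SESSIONS[token] = username
    return token

def add_token(url, token):
    """Rewrite a self-referencing link so it carries the secret string."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query["sid"] = [token]  # "sid" is an assumed parameter name
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

def user_for(url):
    """Return the user a request URL belongs to, or None if the string is missing."""
    query = parse_qs(urlparse(url).query)
    token = query.get("sid", [None])[0]
    return SESSIONS.get(token)

token = log_in("alice")
link = add_token("/account?tab=orders", token)
print(user_for(link))                    # the link carries the secret string
print(user_for("/account?tab=orders"))   # one forgotten link and the user is anonymous
```

Note that `add_token` has to be called for every link on every page, which is exactly the management burden described above.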


What if a site wanted to automatically log users in? Homepages like ebay.com had no way of knowing who the user was. The only option was to manually log in once again.  Yuck!


In short, session management in HTTP was a pain. It was obvious that HTTP needed state management independent of this back and forth with stateful URLs.

1994: Introducing State via Cookies

The stateless HTTP/1 issue was resolved with the introduction of cookies during the Netscape glory days in 1994.  The cookies RFC is titled "HTTP State Management Mechanism", and cookies quickly became the primary HTTP session mechanism. Cookies made user accounts easy to create and manage all around the web.  When cookies first gained public awareness in 1996, the media had a frenzy over the privacy implications, but cookies stuck and we use them heavily today.
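As a rough sketch of how a cookie carries session state, the snippet below uses Python's standard `http.cookies` module. The cookie name `session`, the `SESSIONS` table, and both helper functions are assumptions made up for this example, not part of any standard.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side table mapping session tokens to users.
SESSIONS = {"abc123": "alice"}

def set_cookie_header(token):
    """Build the value of a Set-Cookie response header for a session token."""
    cookie = SimpleCookie()
    cookie["session"] = token          # "session" is an assumed cookie name
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True
    return cookie["session"].OutputString()

def user_from_cookie_header(header):
    """Look up the user from the Cookie request header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(header)
    morsel = cookie.get("session")
    return SESSIONS.get(morsel.value) if morsel else None

print(set_cookie_header("abc123"))
print(user_from_cookie_header("session=abc123; theme=dark"))
print(user_from_cookie_header("theme=dark"))
```

The server sets the cookie once, and the browser replays it on every later request, so no link on the page has to carry the secret.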

For a modern example of cookies, look no further than Facebook.  You cannot log into a Facebook account with cookies disabled (go ahead, try!).  A quick test shows a new login to Facebook uses 23 cookies!  


You've heard it said that "HTTP is stateless". That was once 100% true, as the original protocol was specifically designed to be stateless. In everyday modern practice, though, we use HTTP/1 statefully via HTTP cookies, an accepted extension to the original protocol that is mentioned in later revisions.


This is all before we even talk about HTTPS, which uses TLS to add state (and security) to HTTP.  


Said again: even though HTTP/1 was designed to be stateless, in everyday practice we use HTTP statefully. With the long-past addition of stateful components, you can consider HTTP no longer stateless.


On to HTTP/2

There are more problems with stateless protocols than just user logins, chiefly performance.  Without a session, browsers must make many requests to load a modern HTTP page, and each request comes with its own performance overhead.  As pages grow more complex, browsers make more HTTP requests and applications begin to feel sluggish due to the performance overhead of a stateless protocol.


HTTP/2 was specifically designed to address some performance issues with HTTP/1 and it accomplishes a lot of this through state.  


So what components of HTTP/2 are stateful? For starters, Section 5.1 of the HTTP/2 RFC, which defines the states a stream moves through, is a great example of the stateful mechanisms defined by the HTTP/2 standard.  I won't detail it here, but another great starting point is the SPDY Wikipedia page.  HTTP/2 is a whole lot more than HTTP/1, and most of the new goodies use state.
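To give a feel for what Section 5.1 describes, here is a deliberately simplified Python sketch of an HTTP/2 stream's lifecycle. It leaves out the two "reserved" states used by server push and ignores many details in the RFC; the event names such as `send_headers` are my own labels for this example, not names from the standard.

```python
from enum import Enum

class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"

S = StreamState

# Simplified transition table; the event names are hypothetical.
TRANSITIONS = {
    (S.IDLE, "send_headers"): S.OPEN,
    (S.IDLE, "recv_headers"): S.OPEN,
    (S.OPEN, "send_end_stream"): S.HALF_CLOSED_LOCAL,
    (S.OPEN, "recv_end_stream"): S.HALF_CLOSED_REMOTE,
    (S.HALF_CLOSED_LOCAL, "recv_end_stream"): S.CLOSED,
    (S.HALF_CLOSED_REMOTE, "send_end_stream"): S.CLOSED,
}

# In this sketch, a reset closes any stream that is open or half-closed.
RESETTABLE = {S.OPEN, S.HALF_CLOSED_LOCAL, S.HALF_CLOSED_REMOTE}

def advance(state, event):
    """Return the next stream state, or raise ValueError for an invalid event."""
    if event in ("send_rst_stream", "recv_rst_stream") and state in RESETTABLE:
        return S.CLOSED
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.value!r}")

# A client request: send headers, finish the request, then receive the full response.
state = S.IDLE
for event in ["send_headers", "send_end_stream", "recv_end_stream"]:
    state = advance(state, event)
    print(event, "->", state.value)
```

The point is that both endpoints must remember where each stream is in this lifecycle, which is state defined by the protocol itself rather than by your application.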


Does this mean that you can't use HTTP/2 statelessly?  There's no reason why you can't use HTTP/2 statelessly just like HTTP/1.  Your application can easily remain stateless even while being built on top of stateful protocols.


The difference is that, unlike HTTP/1, HTTP/2 defines stateful components in its standard, something HTTP/1.0 intentionally avoided.

HTTP is nearly as old as me, only one year younger, and in that time a lot has changed.  Don't be surprised that HTTP/2 introduced state!

Summary


HTTP/2 is a stateful protocol, but that doesn't preclude a particular HTTP/2 application from using a subset of HTTP/2 features to maintain statelessness.


Yes, you can have stateless HTTP/2 applications.

No, your HTTP/1.1 application is probably stateful, even though people may say "HTTP is stateless".

Most of all, HTTP/2 is a stateful protocol, no ifs, ands, or buts.  

Still confused about stateful systems? See my other blog post for nine points in defining stateful systems.