Sunday, August 2, 2020

Modern HTTP is stateful. Old-timey HTTP was stateless.


Imagine the year 1988

In 1988 HTTP and I were both stateless.  I hadn't been born, and Sir Timothy John Berners-Lee, while working at CERN, was envisioning the World Wide Web with HTTP as a stateless protocol.  On March 12, 1989, he submitted a proposal to his boss.


Once released, the WWW conquered the globe, but it had difficult aspects for businesses.  Tim succeeded in making HTTP stateless, but his design shoved responsibility for state downstream to individual developers.


How could someone log in to a website or add an item to a shopping cart without state?  The simple, stateless HTTP resulted in complex web applications littered with stateful links.


What was a "stateful URL"? For a contrived example, after a user logged in to a shopping site, every URL on the site had to change from:


http://www.example.com/shopping?product=12345


to


http://www.example.com/shopping?product=12345&user=4T85KGJHXHT3M3R6LQKQT2


where "user" is a website-wide token the web application used for state.  Every link on a page needed to repeat this stateful token or risk the user needing to log in again.  And mind you, the login was sent in unencrypted plain text because state is also required for encryption.  Businesses had limited options to secure their websites.
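
To make the chore concrete, here is a minimal Python sketch of the link rewriting a web application of that era had to perform on every page.  The helper name add_session_token is hypothetical, and the sketch assumes conventional query-string syntax ("?" before the first parameter, "&" between parameters) with "user" as the token parameter from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_token(url: str, token: str, param: str = "user") -> str:
    """Return url with the session token carried in its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Drop any stale token so the parameter appears exactly once.
    query = [(key, value) for key, value in query if key != param]
    query.append((param, token))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Every link on the page has to be rewritten, or the state is lost.
links = [
    "http://www.example.com/shopping?product=12345",
    "http://www.example.com/cart",
]
rewritten = [add_session_token(link, "4T85KGJHXHT3M3R6LQKQT2") for link in links]
```

Miss a single link and the token, along with the login, disappears on the next click.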

HTTP's statelessness was the wrong level of abstraction.  The obvious solution was to increase the scope of the original stateless HTTP standard to include state, but who would hear Silicon Valley's demands for stateful HTTP?   


Now teleport to the cypherpunk utopia of 1994

Once again HTTP and I had something in common.  I had state and HTTP did too because Netscape invented cookies.  


HTTP cookies are one of many stateful mechanisms for HTTP.  


Famously, in a 1996 article, the Financial Times raised privacy concerns, but cookies were an instant hit, and in 1997 they were absorbed into the official HTTP standards under the aptly named RFC 2109, "HTTP State Management Mechanism".
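
For a feel of how a cookie carries state, here is a small sketch using Python's standard http.cookies module.  It only builds and parses header values and sends nothing over a network; the cookie name "session" and its value are made up for illustration.

```python
from http.cookies import SimpleCookie

# Server side: issue a session identifier in a Set-Cookie response header.
response_cookie = SimpleCookie()
response_cookie["session"] = "4T85KGJHXHT3M3R6LQKQT2"
response_cookie["session"]["path"] = "/"
set_cookie_header = response_cookie["session"].OutputString()

# Browser side: remember the cookie and echo it back on the next request.
jar = SimpleCookie()
jar.load(set_cookie_header)
cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())

# Server side again: recover the state from the request's Cookie header.
request_cookie = SimpleCookie()
request_cookie.load(cookie_header)
session_id = request_cookie["session"].value
```

The browser's jar is the state: the server can recognize the same visitor on a later request without any token in the URL.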


The stateful fun didn't stop there!  Many new stateful goodies were added to HTTP:


  • HTTPS, also invented in 1994 by Netscape, is stateful.  The "S" part, SSL/TLS, needs state.  This was officially integrated into the HTTP standard by RFC 2818 in 2000.

  • HTTP authentication is stateful and was defined in 1997 by RFC 2068 Section 11, and later in 2014 by its own RFC 7235.  As a foreshadowing sidenote, this token is the origin of HTTP/2's upgrade token.

  • HTTP caching is stateful and was defined in 1997 by RFC 2068 Section 13 and later in its own RFC 7234 in 2014.

  • HTML itself also added state with mechanisms like Web Storage, which was defined and in use across the industry by 2011.
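
As one illustration from the list above, caching state can be sketched with a validator round trip.  This is a simplified model and not a real server: the functions make_etag and respond are hypothetical, and the "client cache" is just a dictionary.  The ETag and If-None-Match header names are the ones HTTP/1.1 uses for this purpose.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of body."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client already holds this version, so send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


page = b"<h1>Shopping cart</h1>"
client_cache: Dict[str, object] = {}

# First request: nothing cached yet, so the full body comes back.
status, headers, payload = respond(page, None)
client_cache = {"etag": headers["ETag"], "body": payload}

# Second request: the remembered validator turns the answer into a 304.
status_again, _, payload_again = respond(page, client_cache["etag"])
```

The second response only makes sense because of what the client remembered from the first one.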

Now fondly remember the world of 2015

Specifically, May 14, 2015, when HTTP/2 was released as RFC 7540. The stateful fun was in full swing!  Many new stateful components were built on top of HTTP 1.1's existing stateful components.


No longer did HTTP ceremoniously call itself "stateless" in honor of Sir Tim's original vision, a label that had been inaccurate for nearly two decades.  A Control-F search of the HTTP/2 RFC returns 125 hits for "state" and zero for "stateless". HTTP/2 finally banished the stateless masquerade and embraced the long-established reality of the web's statefulness.
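
To see why something like HTTP/2 header compression cannot be stateless, consider this toy Python sketch.  It is not the real HPACK wire format; the class ToyHeaderTable and its encoding are invented for illustration.  It assumes only the general idea that both ends of a connection keep a table of previously seen headers and refer to entries by index.

```python
class ToyHeaderTable:
    """Toy stand-in for a header compression table; not real HPACK."""

    def __init__(self):
        self.entries = []  # state that outlives any single request

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in self.entries:
                # Seen before on this connection: send a small index.
                out.append(("index", self.entries.index(pair)))
            else:
                # First sighting: send it in full and remember it.
                self.entries.append(pair)
                out.append(("literal", pair))
        return out

    def decode(self, encoded):
        headers = []
        for kind, value in encoded:
            if kind == "index":
                headers.append(self.entries[value])
            else:
                self.entries.append(value)
                headers.append(value)
        return headers


sender, receiver = ToyHeaderTable(), ToyHeaderTable()
request = [("user-agent", "example-browser"), ("cookie", "session=abc")]

first = sender.encode(request)    # all literals
second = sender.encode(request)   # all indexes into the shared table
decoded_first = receiver.decode(first)
decoded_second = receiver.decode(second)
```

A receiver that missed the first request cannot decode the second one, which is exactly what "stateful" means.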


These parts of HTTP/2 are stateful additions to the HTTP corpus:

  • Stream identifiers.

  • Header blocks.

  • Frames.

  • Header compression.

  • Opportunistic encryption.


The stateful present

Is HTTP stateless? HTTP can be stateless if you:

  • Don't use stateful URLs or other pre-cookie stateful gimmicks.

  • Don't use cookies.

  • Don't use HTTPS.

  • Don't use HTTP authentication.

  • Don't use HTTP caching.

  • Don't use web storage.

  • Don't use HTTP/2's:

    • Stream identifiers.

    • Header blocks.

    • Frames.

    • Header compression.

    • Opportunistic encryption.


...But then what's the point in calling HTTP "stateless"?  It's reasonable for a system of HTTP's maturity to be stateful.  


Is it possible to develop a stateless HTTP web application?  Of course!  Your web application can ignore the many stateful components of HTTP and operate statelessly, but your one stateless application would not reflect the pervasively stateful web apps on the web at large.  Wikipedia, Facebook, Google, Reddit, Hacker News, Spotify, Netflix, and more all use HTTP statefully.  It is difficult to name websites that even work without state.


Without careful consideration, your new "stateless" HTTP application may still allow browsers and HTTP servers to use HTTP statefully.  But why bother with such mental gymnastics?  It's far harder to use HTTP statelessly than just to accept the efficiency benefits of state.


Our once-simple HTTP has grown up from its humble stateless roots into a full-fledged stateful system with many useful stateful components.  The next time a 1988-nostalgic web developer says "HTTP is stateless" while in the same breath mentioning cookies, chortle knowingly with the stateful HTTP truth.


HTTP is stateful.  Almost all of the web is stateful, and that's to be expected with its complexity and maturity.