Saturday, September 7, 2019

A Network Reading List

I get to be a mentor to an incoming program manager where I work. Most newcomers aren't super familiar with networking, and many people aren't familiar with the actual history of our part of computer technology.

Here's a hopefully useful, curated list of links to source material on the creation of the internet.

The first internet was created for ARPA by BBN

The American Department of Defense "Advanced Research Projects Agency" (ARPA) asked for the internet in this original request. The winning proposal was from Bolt, Beranek and Newman (BBN). About a decade later, BBN wrote the Completion Report.


A ton of BBN history is online.

RFCs and Protocols everyone should know

The internet is governed (as much as it is governed) by a set of "Request for Comments" (RFCs). There's also this handy list for all 8649 current RFCs. These are all numbered; there's a society that curates them. However, anyone can be part of the RFC process. In addition to RFCs, there are also "IEN" (Internet Experiment Note) papers with their own numbering.

One of the most accessible papers (and a great way to start) is the Endian paper, "On Holy Wars and a Plea for Peace", published by Danny Cohen of USC/ISI as IEN 137. A good introduction to network humor is RFC 1149.
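If the byte-order argument feels abstract, a tiny sketch helps. This one uses Python's standard struct module to lay out the same 32-bit number both ways; the value 0x0A0B0C0D is just an arbitrary example, not something from the paper.

```python
import struct

value = 0x0A0B0C0D  # arbitrary 32-bit example value

# ">" means big-endian (most significant byte first, the usual "network order").
big = struct.pack(">I", value)
# "<" means little-endian (least significant byte first).
little = struct.pack("<I", value)

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a

# Reading bytes with the wrong assumption silently produces a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0xd0c0b0a
```

The misread at the end is the whole problem in miniature: nothing fails, you just get the wrong number.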

As you might guess, creating an RFC can be a long, slow slog: starting with basic proposals in a proposed state, getting consensus from experts, converging on a draft standard, and finally being voted in as an official internet standard. You might also guess that many important protocols have gone all the way through the entire process, but here you would be wrong. The IMAP email standard, RFC 3501 (from 2003), is still "proposed".

You might also think that conforming to the official standards is critical. Actually, not so much. 

Internet Mail

Email was one of the earliest super-popular uses of the Internet. There are two main ways to read email: the Post Office Protocol (POP) family, which runs from POP (RFC 918) and POP2 (RFC 937) to the current and most popular POP3 (RFC 1939), and the much more powerful and complex IMAP (currently IMAP4rev1, RFC 3501 from 2003). Sending email is via SMTP, which is a different protocol, and calendars use CalDAV (or other protocols).

The Internet before HTTP

Telnet (RFC 318) lets you type into a computer here and have it be accepted over there. It has now been replaced by SSH.

Gopher (RFC 1436) is kind of like HTTP/HTML, but is more of a "menu" oriented system. There still exist Gopher servers, and there's even a yearly conference. Amazingly, there's only one Gopher program for Windows in the Windows Store (and I wrote it).

Usenet News (RFC 977) is like a distributed Reddit but without reputation points.

FTP (RFC 959) is how we used to download files. Unlike many other protocols, two sockets are used: one carries commands and their replies, and the other carries the data being transferred. This actually works really badly with firewalls. FTP is also a great example of RFCs being updated: the original RFC is 114, and the revisions run through 172, 265, 354, 542 and 765 to 959. After that there is merely a string of updates. Also notice that the original email protocols were handled by FTP.
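To make the two-socket design concrete: in passive mode, the server answers on the command connection with a "227" reply that tells the client which address and port to open the second connection to. The sketch below only parses such a reply; the function name parse_pasv_reply and the sample address are made up for illustration, and a real client (such as Python's ftplib) does a lot more.

```python
import re
from typing import Tuple


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply.

    Hypothetical helper for illustration; not part of any library.
    """
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"field out of range in reply: {reply!r}")
    # The first four numbers are the IPv4 address, one byte each.
    host = ".".join(str(n) for n in numbers[:4])
    # The last two numbers are the high and low bytes of the port.
    port = numbers[4] * 256 + numbers[5]
    return host, port


host, port = parse_pasv_reply("227 Entering Passive Mode (192,168,1,20,19,137)")
print(host, port)  # 192.168.1.20 5001
```

That address and port are negotiated inside the command stream, which is a big part of why firewalls and address translators have such a hard time with FTP: they have to read the conversation to know which second connection to allow.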

Finger (RFC 742) was the Facebook of its day. People would write information in their .plan file, and it would be retrieved by anyone who ran finger on them.

Internet Relay Chat (IRC) (RFC 1459) is like Teams (or text messages). Unlike some of these other protocols, plenty of people still use IRC.

The internet and HTTP + HTML

HTML is the markup language; HTTP is the protocol for sending HTML over the internet. An HTTP response is the HTML plus a bunch of headers. HTTP and HTML were first created in 1991 by Tim Berners-Lee. The NeXT computer he used when creating them is on display at the Science Museum in London.

HTTP was originally defined in RFC 1945, which also defines Uniform Resource Locators (URLs).
HTML was defined in RFC 1866 and has been heavily updated since then.
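Here is a minimal sketch of what "HTML plus a bunch of headers" looks like on the wire. The response bytes are hand-built for illustration (the page and header values are made up), and the parsing is deliberately naive; real code should use a proper HTTP library.

```python
# A made-up HTML page to serve as the body.
body = b"<html><body><h1>Hello</h1></body></html>"

# A hand-built HTTP/1.0 response: status line, headers, blank line, then the HTML.
raw_response = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n"
) + body

# The first blank line (CRLF CRLF) separates the headers from the payload.
head, _, payload = raw_response.partition(b"\r\n\r\n")
status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

headers = {}
for line in header_lines:
    name, _, value = line.partition(":")
    headers[name.strip().lower()] = value.strip()

print(status_line)              # HTTP/1.0 200 OK
print(headers["content-type"])  # text/html
print(payload.decode("ascii"))  # the HTML itself
```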

Security and the internet

A common thread through all of the original RFCs is a complete lack of care about security. For example, RFC 475 says:

It has been suggested that FTP specification should require that mail
   function (for receiving mail) should be "free", i.e., FTP servers
   should not require the user to "login" (send the USER, PASS, and ACCT
   commands).

Furthermore, note that all communications are sent in the clear with no encryption whatsoever. Amazingly, a recent Senate proposal for all states to email a number of sensitive documents to a Senate committee proposed using a completely in-clear mail system!

Reading the Transport Layer Security (TLS) RFCs is not recommended. They are very complex, and won't help you understand the protocol. You should know that the current common TLS version is 1.2 and the new one is 1.3 (but it's not in common use yet). TLS 1.0 is barely acceptable for security and is expected to be completely broken before long. The old Secure Sockets Layer (SSL) is completely broken and must never be used if you want security.
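In practice, "don't use the old versions" usually comes down to one setting in whatever TLS library you use. As a sketch, assuming Python 3.7 or later and its standard ssl module, this refuses anything older than TLS 1.2. The helper negotiated_version is a made-up name for illustration, and it is not called here because it needs a real server to talk to.

```python
import socket
import ssl

# Start from the library defaults (certificate and hostname checks stay on).
context = ssl.create_default_context()
# Refuse SSL, TLS 1.0 and TLS 1.1; TLS 1.2 and 1.3 are still allowed.
context.minimum_version = ssl.TLSVersion.TLSv1_2


def negotiated_version(host: str, port: int = 443) -> str:
    """Connect to host and report the TLS version that was agreed on.

    Hypothetical helper; requires network access to a real TLS server.
    """
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()


print(context.minimum_version.name)  # TLSv1_2
```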

VPN and WiFi access points often use RADIUS (RFC 2058) for authentication and authorization. One neat thing about RADIUS is that it was designed to be secure, but thanks to advances in breaking encryption, it's not at all secure on the internet.

You'll see tons of stuff about X.509 certificates. The most important thing to know is that all certificates are X.509. Other than that:
  1. A certificate is just a thin wrapper for a security key with some extra goo
  2. People handled their certificates incorrectly so often (e.g., using the code signing cert for TLS) that they are now marked with the intended purpose.
  3. The Public Key Infrastructure (PKI) is a way to 'chain' from one certificate to the next. This is always done so that there are just a few super-trusted certs and everything else flows down from there.
  4. There are custom mechanisms for trusting a certificate. Azure uses a custom mechanism inside the datacenter; for their purposes it's much faster, more robust and more secure than using the standard PKI
  5. Combining #3 and #4: the bar for inventing a new scheme to validate certificates is spending three months with the security experts. 
  6. "Browser level" certificate checking is pretty secure. The entire chain is examined; each cert along the chain is validated for expiration and to make sure that each cert correctly signed its part of the chain.
  7. Adding to the browser-level certificate check is safe (security-wise). 
  8. Bypassing the browser-level certificate check is not safe. People screw this up constantly.
Read The Most Dangerous Code in the World for how common security mess-ups are.
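
The difference between adding to the check and bypassing it is easy to see in code. This is a sketch using Python's standard ssl module; matches_pin is a made-up helper showing one possible extra check (comparing a certificate's SHA-256 fingerprint against a value you already trust), not a recommendation of a particular pinning scheme.

```python
import hashlib
import ssl

# The default context does the "browser level" work: chain validation,
# expiration checks, and matching the certificate against the host name.
safe = ssl.create_default_context()
print(safe.verify_mode == ssl.CERT_REQUIRED)  # True
print(safe.check_hostname)                    # True


def matches_pin(der_cert: bytes, expected_sha256_hex: str) -> bool:
    """Extra check layered on top of normal validation (hypothetical helper).

    der_cert is the server certificate in DER form, for example from
    SSLSocket.getpeercert(binary_form=True) after a verified handshake.
    """
    return hashlib.sha256(der_cert).hexdigest() == expected_sha256_hex.lower()


# What NOT to do: this turns off both checks, so any certificate from
# anyone is accepted. It is shown only so you can recognize it in code review.
unsafe = ssl.create_default_context()
unsafe.check_hostname = False       # must be disabled before verify_mode can be relaxed
unsafe.verify_mode = ssl.CERT_NONE  # no chain validation at all
```

If you see the last two lines in production code, usually added "temporarily" to make a test environment work, that is the mess-up the paper is about.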

