Saturday, March 30, 2019

Any port in a storm!

Gopher, of course, runs off of port 70, and has since the Gopher RFC 1436 from March, 1993. Since then, many protocols have embraced SSL (or TLS, its more modern and secure form), often sending the encrypted traffic over a separate port. HTTP, for example, uses port 80 for unencrypted traffic and 443 for encrypted traffic.

Heat map for the most popular ports in GopherSpace


So what about poor Gopher? How do Gophers in the wild handle SSL/TLS? I've seen a gopher/s protocol on the internet where a port number > 100,000 is assumed to mean TLS. The problem is that although such a URL is technically valid, some programs (like OneNote) don't seem to like it much.

Alternatively, khzae.net treats port 105 (normally the CSO port, and therefore pretty much unused in the real world) as the SSL port for Gopher.
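Here's a rough sketch of how a client might apply those two conventions to an advertised port number. The name `guess_transport` is mine, and the rule that a port above 100,000 maps to the real TCP port by subtracting 100,000 is an assumption on my part: TCP ports top out at 65535, so some mapping has to happen, but I haven't seen this one written down.

```python
TLS_OFFSET = 100_000  # assumption: real TCP port = advertised port - 100000
CSO_PORT = 105        # khzae.net's convention: Gopher over TLS on the CSO port


def guess_transport(port):
    """Return (use_tls, tcp_port) for a port number found in a Gopher link."""
    if port > TLS_OFFSET:
        # "Very large port" convention: the large number only signals TLS.
        return True, port - TLS_OFFSET
    if port == CSO_PORT:
        # Port 105 convention: TLS, connect to the port as given.
        return True, port
    # Anything else is treated as plain-text Gopher.
    return False, port
```

Under these assumptions, `guess_transport(100105)` and `guess_transport(105)` both end up as a TLS connection to port 105, while `guess_transport(70)` stays plain text.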

What I want to know, of course, is the real-world distribution of port numbers in the existing GopherSpace. As always, I'm using a bunch of data from an earlier Gopher crawl.

No surprise, the most common specifically mentioned port is port 70, the gopher port. There was just a single reference to a port > 100105, so the new standard of using very large ports to indicate SSL/TLS hasn't taken off yet.

The file entry overwhelmingly uses port 70; failing that, port 9999 is popular.
The directory entry is also mostly port 70 (really, not a surprise at all), with port 9999 the second-most popular port along with a smattering of 7070, 7006, and 7005 ports.
The HTTP (h) entry is also mostly port 70, with port 80 (the official HTTP port) being the runner-up.
The info (i) entry has an unsurprising amount of variance. Since the info entry isn't a selectable entry, the port and host for it aren't actually used; developers can pick whatever random values they want. One common value is no value at all.
Lastly, the Image (I) entry is once again most commonly on port 70 with essentially no variation. Interestingly, the GIF (g) entry is also essentially always on port 70.
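For anyone who wants to run the same kind of count on their own crawl, here's a minimal sketch of the tally. It isn't the code from my scanner; the function names are made up for illustration, and it assumes the standard RFC 1436 menu line layout (a one-character type, then display text, selector, host, and port separated by tabs). Ports are kept as strings so that an empty port on an info line gets counted as its own value.

```python
from collections import Counter, defaultdict


def parse_menu_line(line):
    """Split one Gopher menu line into (type, display, selector, host, port).

    Returns None for blank lines, the terminating "." line, and lines
    that lack the four tab-separated fields.
    """
    line = line.rstrip("\r\n")
    if not line or line == ".":
        return None
    item_type, rest = line[0], line[1:]
    fields = rest.split("\t")
    if len(fields) < 4:
        return None
    display, selector, host, port = fields[:4]
    return item_type, display, selector, host, port


def tally_ports(lines):
    """Count how often each port appears, grouped by item type."""
    counts = defaultdict(Counter)
    for line in lines:
        parsed = parse_menu_line(line)
        if parsed is None:
            continue
        item_type, _display, _selector, _host, port = parsed
        counts[item_type][port.strip()] += 1
    return counts
```

Calling `tally_ports(menu_lines)["1"].most_common(5)` would then give the five most common directory ports in whatever menu lines you feed it.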

What can we learn from all this? My biggest takeaway is that there are enough users of non-standard ports that any gopher client that's worth anything should honor the port number given in each menu entry rather than assuming port 70. I suspect that I'd get a lot more secure gopher (perhaps on port 105 like khzae.net does it) showing up on my Gopher scan if the scanner actually supported TLS/SSL Gopher -- right now, it won't correctly follow most of the secure Gopher links.

What I don't know how to do is to correctly follow a secure link. If I'm at a secure Gopher site (like maybe the gophers://khzae.net:105 site), should I simply assume that all links are secure links? Or should I assume that all non-70 links are? Or just the 105 links? I suspect that the only way to know is to try to pull data from each Gopher directory in both TLS/SSL mode and plain text mode, and see what works.
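The "try both and see what works" idea might look something like the sketch below. This is not what my scanner does today; `fetch_plain`, `fetch_tls`, and `probe` are hypothetical names, and the code assumes the usual Gopher exchange of sending a selector followed by CRLF and reading until the server closes the connection. The TLS attempt uses Python's default certificate verification, so a server with a self-signed certificate would be reported as not answering in TLS mode.

```python
import socket
import ssl


def _request(sock, selector):
    """Send a selector and read the reply until the server closes."""
    sock.sendall(selector.encode("utf-8") + b"\r\n")
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch_plain(host, port, selector="", timeout=10):
    """Fetch a selector over plain-text Gopher."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return _request(sock, selector)


def fetch_tls(host, port, selector="", timeout=10):
    """Fetch a selector over Gopher wrapped in TLS."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as sock:
            return _request(sock, selector)


def probe(host, port, selector="", fetchers=None):
    """Try each transport; return {mode: reply} for the ones that answered."""
    if fetchers is None:
        fetchers = {"tls": fetch_tls, "plain": fetch_plain}
    results = {}
    for mode, fetch in fetchers.items():
        try:
            data = fetch(host, port, selector)
        except OSError:
            # Covers refused connections, timeouts, and ssl.SSLError.
            continue
        if data:
            results[mode] = data
    return results
```

A non-empty reply is only a weak signal: a TLS server poked in plain-text mode may answer with a few bytes of TLS alert, so a stricter version would also check that the reply parses as a Gopher menu.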
