Tuesday, March 5, 2019

Directory entry says what? Current Gopher type field types

Ready to dig into the details of the modern Gopher ecosystem?  

The Gopher network protocol is a precursor to the modern web. Like a web browser, a Gopher client can display links and text, download files, and display images. Unlike a web browser, a Gopher client is usually very spare, with no opportunity for colorful, interactive displays.

A typical Gopher screen is a directory, a menu-like list of lines, each of which does one thing. A line might display some text, link to a file to view or download, link to an image, or link to another Gopher directory, possibly on another server.
The Amadeus Gopher Server, served up by my own Simple Gopher Client
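
To make the "each line does one thing" idea concrete, here is a minimal sketch of how one directory line can be split apart. RFC 1436 lays a line out as a one-character type field followed by tab-separated display text, selector, host, and port. The names `MenuEntry` and `parse_menu_line` and the sample host `gopher.example.org` are made up for illustration; this is not the code from Simple Gopher Client.

```python
from typing import NamedTuple


class MenuEntry(NamedTuple):
    item_type: str  # single-character type field, e.g. "0", "1", "i"
    display: str    # text shown to the user
    selector: str   # string sent back to the server to fetch the item
    host: str
    port: int


def parse_menu_line(line: str) -> MenuEntry:
    """Split one tab-separated Gopher directory line into its parts."""
    line = line.rstrip("\r\n")
    if not line:
        raise ValueError("empty directory line")
    item_type, rest = line[0], line[1:]
    fields = rest.split("\t")
    # Assumption: tolerate short lines (some servers trim 'i' lines).
    fields += [""] * (4 - len(fields))
    display, selector, host, port = fields[:4]
    # Fall back to 70, the standard Gopher port, if the port is missing.
    return MenuEntry(item_type, display, selector, host,
                     int(port) if port.isdigit() else 70)


entry = parse_menu_line("1Software archive\t/software\tgopher.example.org\t70\r\n")
print(entry.item_type, entry.display, entry.host, entry.port)
```

The type field is simply the first character of the line, which is why a crawl only needs to look at one character per entry to classify it.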

I started looking more closely at modern Gopher sites and how they actually work while creating my own Simple Gopher Client for the Windows Store.

The Gopher RFC 1436 lists 14 different directory entry type fields, each identified by a single character. 0, for example, means that the entry refers to a text file that can be displayed; 1 means that the entry is a directory, that is, another Gopher menu, possibly on another server. The 'g' type field is for a GIF file and 'I' is for an image of some other kind (the exact format isn't given). Unusually, directory entries can also point to other protocols: '8' for a Telnet server, '2' for a CSO phone-book lookup server, '7' for an index-search server (a search engine), and 'T' for an IBM 3270-style terminal connection.
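
The RFC's type characters fit naturally into a small lookup table. The sketch below lists the 14 characters from RFC 1436 plus a few informal ones, described the way the table at the end of this post describes them. The names `RFC_1436_TYPES`, `INFORMAL_TYPES`, and `describe_type` are hypothetical, chosen only for this example.

```python
from typing import Tuple

# The 14 type characters defined in RFC 1436.
RFC_1436_TYPES = {
    "0": "Text file",
    "1": "Directory",
    "2": "CSO phone-book server",
    "3": "Error",
    "4": "BinHexed Macintosh file",
    "5": "DOS binary archive",
    "6": "UUEncoded file",
    "7": "Index-search server",
    "8": "Telnet session",
    "9": "Binary file",
    "+": "Redundant server",
    "T": "TN3270 session",
    "g": "GIF image",
    "I": "Image",
}

# Informal additions, named as in this post's table (not in the RFC).
INFORMAL_TYPES = {
    "d": "Document",
    "h": "HTML link",
    "i": "Information",
    "s": "Sound",
    "P": "PDF file",
    "p": "PNG image",
    ";": "Video",
}


def describe_type(type_char: str) -> Tuple[str, bool]:
    """Return (description, is_in_rfc) for a one-character type field."""
    if type_char in RFC_1436_TYPES:
        return RFC_1436_TYPES[type_char], True
    return INFORMAL_TYPES.get(type_char, "Unknown"), False


print(describe_type("g"))
print(describe_type("d"))
```

Note that the characters are case-sensitive: 'I' (image) and 'i' (information) are different type fields.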

Over the years, other type fields have been informally added to the list. I recently crawled Gopher space as it existed in February 2019 to see which directory entry type fields are currently in use.
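
The core of a crawl like this can be sketched in a few lines: fetch a menu, then count the first character of every line. This is only an illustration under stated assumptions, not the crawler used for the numbers in this post. The function names `fetch_menu` and `tally_types` are invented here, and a real crawl would also need to follow type 1 entries, avoid revisiting pages, and be polite to servers.

```python
import socket
from collections import Counter


def fetch_menu(host: str, selector: str = "", port: int = 70,
               timeout: float = 10.0) -> str:
    """Fetch one Gopher item: send selector + CRLF, read until close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def tally_types(menu_text: str) -> Counter:
    """Count the type field (first character) of each directory line."""
    counts = Counter()
    for line in menu_text.splitlines():
        if not line or line == ".":  # a lone "." marks the end of a menu
            continue
        counts[line[0]] += 1
    return counts


# A tiny hand-written menu, so the tally can be tried without a network.
sample = (
    "iWelcome to the sample server\t\tnull.host\t1\r\n"
    "0About this server\t/about.txt\tgopher.example.org\t70\r\n"
    "1Files\t/files\tgopher.example.org\t70\r\n"
    ".\r\n"
)
print(tally_types(sample))
```

Summing the `Counter` objects from every menu visited gives per-type totals of the kind reported in the rest of this post.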

Many Gopher files are served up with generic type fields rather than more precise ones. Type 9, "Binary file", is the most common, followed by type 'd', document (a modern addition that's not part of the official Gopher spec). These two account for 87% of the non-image files served up by Gopher. The third most common type is type 5, DOS Binary, followed by BinHex, PDF and UUEncoded.
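
The share quoted above can be reproduced from the counts in the table at the end of this post. The sketch assumes that "non-image files" means exactly the six type fields named in the paragraph (9, d, 5, 4, P, and 6); that grouping is my reading of the text, and the variable names are only illustrative.

```python
# Counts copied from the table at the end of the post.
# Assumption: these six type fields make up the "non-image files".
file_counts = {
    "9": 4799,  # File (binary)
    "d": 1590,  # File (document)
    "5": 631,   # File (DOS Binary)
    "4": 223,   # File (BinHex)
    "P": 26,    # PDF File
    "6": 3,     # File (UUEncoded)
}

total = sum(file_counts.values())
top_two_share = (file_counts["9"] + file_counts["d"]) / total

# Most common first.
ranked = sorted(file_counts, key=file_counts.get, reverse=True)

print(f"Types 9 and d together: {top_two_share:.1%} of {total} entries")
print("Ranking:", ", ".join(ranked))
```

Under that assumption, the two generic types come out between 87% and 88% of the total, and the ranking matches the order given in the paragraph.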


Something similar happens with image formats. The generic 'I' image type field is used about 30x more often than the more specific 'g' GIF type field.


Looking at the most popular type fields, 0=file, 1=directory and i=information are the top three by far, accounting for about 90% of all entries.
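
Both the image ratio and the top-three share can be checked against the table at the end of this post. The sketch below uses only the counts listed there, so anything left out of the table is not in the total. With these numbers, the top three come to roughly 87% of the listed entries, and 'I' appears about 32 times as often as 'g'.

```python
from collections import Counter

# Counts copied from the table at the end of the post.
counts = Counter({
    ";": 11, "+": 0, "0": 60976, "1": 29335, "2": 5, "3": 36,
    "4": 223, "5": 631, "6": 3, "7": 257, "8": 479, "9": 4799,
    "D": 12, "d": 1590, "g": 102, "H": 4, "h": 3914, "I": 3300,
    "i": 13216, "M": 115, "P": 26, "p": 15, "s": 278, "T": 0, "w": 9,
})

total = sum(counts.values())
top_three = counts.most_common(3)
top_three_share = sum(n for _, n in top_three) / total
image_ratio = counts["I"] / counts["g"]

print("Top three:", top_three)
print(f"Share of listed entries: {top_three_share:.1%}")
print(f"'I' to 'g' ratio: {image_ratio:.1f}")
```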




Some of the original type fields are hardly present. The 'T' type field, which indicates an IBM TN3270-style interaction, is entirely missing. The '2' type field, CSO phone book lookup, is present on just 4 pages in total, and most of those seem to be samples of what a CSO phone book would look like, not a real phone book. There are actual type field '3' error pages and, no surprise, they seem to result from scripts that generate Gopher pages handling their errors correctly. There are also no '+' Duplicate Server type field entries.


(Note: I excluded from the counts any test pages whose purpose is to validate Gopher clients.)

Type Fields (Alphabetical order) 
I'll finish this blog post with a handy table of existing Gopher type fields. Every type field that was found at least 10 times, or is part of the official RFC, is listed here.

Field  Count  Type                            Status
;         11  Video
+          0  Duplicated Server               RFC
0      60976  File                            RFC
1      29335  Directory                       RFC
2          5  CSO Phone                       RFC
3         36  Error                           RFC
4        223  File (BinHex)                   RFC
5        631  File (DOS Binary)               RFC
6          3  File (UUEncoded)                RFC
7        257  Index-search server (Veronica)  RFC
8        479  Telnet                          RFC
9       4799  File (binary)                   RFC
D         12  Some kind of binary file?
d       1590  File (document)
g        102  Image (gif)                     RFC
H          4
h       3914  HTML Link
I       3300  Image                           RFC
i      13216  Information
M        115  Mail file?
P         26  PDF File
p         15  Image (PNG)
s        278  Sound
T          0  IBM TN3270                      RFC
w          9  Wiki edit link




1 comment:

Unknown said...

This is awesome analysis! I would love to see the urls for the gopher menus that contain some of the more uncommon link types.