Lezione 1: Supplement

A supplement for non-computer scientists reading Website Architecture Lezione 1: HTML.

Browser & server

Each of these topics will be elaborated on in later lessons; for now, the details are not so important. Browsers ("web browsers") are the programs that display web pages, such as Firefox, Chrome, Safari, and Opera. Servers are computers that hold websites on the internet and correspond with browsers, delivering the site (including supplementary files such as images and video) to the browser that requests it. Hence websites are typically downloaded from the internet, specifically from "web[site] servers".

Opening & editing files

For the first several lessons, through more than the first half of the course, you will not need access to a server or even the internet. You can have the entire website on your hard drive, like any collection of regular files, and you can run the site from there. Just double-click on HTML files and they will open in your browser.

For editing files, Notepad++ is good on Windows; on Mac, jEdit is apparently an option, and so is TextEdit. If using TextEdit, just make sure that when you save the file, it is saved as plain text (or "HTML" if that is an option), not as RTF or some other special file type. HTML files are simply text files whose extension has been renamed to .html. (In general, a file extension is only a label: renaming it doesn't change the file contents, although it does tell the operating system which program to open the file with.) Don't open HTML files in Microsoft Word, because it will try to add an unholy amount of crap that you didn't want.
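To make "HTML files are simply text files" concrete, here is a minimal sketch of a page you could type into a plain text editor and save with the .html extension. The title and paragraph text are placeholders, and the main lesson may use a slightly different skeleton.

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <!-- Placeholder title; it appears in the browser tab -->
    <title>My first page</title>
  </head>
  <body>
    <!-- Placeholder content; replace with your own -->
    <p>Hello, world!</p>
  </body>
</html>
```

Double-clicking the saved file should open it in your browser; opening the same file in your text editor shows exactly the text above.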

index.html

Conventionally, the main HTML file of a site is called index.html; later, we will learn more about this. In the meantime, you might as well call the first/home page of your sites index.html.

Absolute & relative paths

When specifying the location of an external file, such as when specifying a link to another page or specifying the location of an image, you can use an absolute or a relative path. A relative path is a path to the external file which is expressed in relation to the location of the "current" file, such as the HTML file you're editing. An absolute path is the location of the file without context or consideration for the "current" file.

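Absolute paths typically begin with a scheme such as "http://", "https://", or "file://". Note that an address starting with just "www." (no scheme in front) is not treated as absolute: the browser reads it as a relative path, so write "http://www.site.com", not "www.site.com".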

Relative paths may simply be the filename of the external file, like "cat.jpg", if the external file is in the same folder as the current file. If the external file is in a child directory, it may look like "images/cat.jpg" or "images/cats/cat.jpg". If the external file is in the parent directory of the current file, you have to use a ".." syntax.

For example, the current file may be "site.com/reports/july/24.html" and you want to include "reportHeading.jpg" which is inside of "reports". You could refer to the image via "../reportHeading.jpg". You could also keep going back up the tree for other files, such as "../../images/siteLogo.jpg", referring to "site.com/images/siteLogo.jpg".
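The example above might look as follows inside the HTML of 24.html. This is only a sketch: "chart.jpg" and the alt texts are made-up placeholders, while the other paths are the ones from the example.

```html
<!-- Fragment of site.com/reports/july/24.html (the "current" file) -->

<!-- Same folder: refers to site.com/reports/july/chart.jpg (hypothetical file) -->
<img src="chart.jpg" alt="July chart">

<!-- Parent folder: refers to site.com/reports/reportHeading.jpg -->
<img src="../reportHeading.jpg" alt="Report heading">

<!-- Up two levels, then down into images: refers to site.com/images/siteLogo.jpg -->
<img src="../../images/siteLogo.jpg" alt="Site logo">

<!-- Absolute path: means the same thing no matter where the current file lives -->
<a href="http://site.com/images/siteLogo.jpg">Site logo</a>
```

If 24.html were moved to a different folder, the three relative paths would have to be updated, but the absolute one would keep working.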

HTML special characters

In the main PDF, there is an allusion to a "huge list" of possible special characters in HTML. You could Google "HTML special characters", or you could go to one of these lists: UTexas, Wikipedia.
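As a quick reminder of what these special characters look like in practice, here is a small sketch using a few of the most common ones. The sentences themselves are placeholders.

```html
<!-- &lt; and &gt; display as < and > without being mistaken for tags -->
<p>5 &lt; 10 and 10 &gt; 5</p>

<!-- &amp; displays as a literal ampersand -->
<p>Tom &amp; Jerry</p>

<!-- &copy; displays as the copyright sign -->
<p>&copy; Example Site</p>
```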

Special characters in URLs

Similar to the problem of special characters in HTML, there are special characters that should not occur directly in URLs (web addresses). They need to be encoded first. This encoding scheme is called "URL encoding" or "percent encoding", because each encoded character is written as a percent sign followed by a two-digit hexadecimal code. For example, the "at" sign ("@") is encoded as "%40". It might occur in a URL as http://www.site.com/hey%40something. If you need to paste a URL containing such characters into your HTML, specifically as the href of a link, you would need to encode it in this scheme first. Fortunately, there are online tools that can do this for you. Search for "url encoder" and try one of the first results.
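As a rough illustration, encoded URLs used as the href of a link might look like this. The first address is the one from the paragraph above; the second, with a space encoded as "%20", is a made-up file name added for comparison.

```html
<!-- "@" encoded as %40 -->
<a href="http://www.site.com/hey%40something">Link with an encoded at sign</a>

<!-- A space encoded as %20 (hypothetical file named "my report.html") -->
<a href="http://www.site.com/my%20report.html">Link with an encoded space</a>
```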

Deprecated HTML tags & attributes

The word deprecate occurs often in computer science; it is (close to) the opposite of appreciate, meaning something has been devalued or has fallen out of favor. When there is an old feature of a programming language, it may be deprecated, meaning that it is no longer officially supported and it is being phased out (it may still work in the meantime). A few of the HTML features we've seen are deprecated, and thus they would give warnings or errors during validation. Most typically, the deprecated features we have seen are related to the presentational/graphical qualities of the page, which are now better served by CSS. For example, the table attribute border (e.g. border="1") is deprecated; we would now use CSS for borders. If you see that a feature is deprecated, don't freak out; it was probably just used for convenience while teaching, e.g. so we didn't have to discuss CSS prematurely.
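For the curious, here is a sketch of the border example in both styles. The CSS version uses inline style attributes only to keep the sketch short; the cell contents and the exact border style are arbitrary choices, and CSS itself is covered properly later.

```html
<!-- Deprecated, presentational attribute -->
<table border="1">
  <tr><td>Cell</td></tr>
</table>

<!-- Roughly equivalent borders using CSS instead -->
<table style="border: 1px solid black;">
  <tr><td style="border: 1px solid black;">Cell</td></tr>
</table>
```

Both tables should display with visible borders in current browsers, but only the second would pass validation without a complaint about the border attribute.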