DOCTYPE in HTML

<!DOCTYPE html>

If you have written any HTML, you have typed the line above many times. Most people call it a tag. But is it?

<!DOCTYPE html>

is a declaration, not a tag. It is the first line you write in an HTML or XHTML document.

To understand further,

LET'S REWIND THE CLOCK.

BROWSER WARS : As the Internet grew popular around 1995, web pages were a novelty that amazed new users, who could suddenly get information on their computer screens through a browser. The competition was mostly between Netscape Navigator and Microsoft Internet Explorer. There was no standardization, so developers had to choose a browser to target, and a page written to one browser's rules often did not display as intended in the other.

W3C : The W3C (World Wide Web Consortium) set out to tackle this problem by publishing standards that all browsers should follow.

So why do we write it?

  • We write this to tell the browser what type of document to expect.
  • It's essential because, once the standards were set, many existing web pages did not meet them. Rendered strictly by the standards, those pages would no longer display as their authors expected.
  • Browser vendors handled this by creating 3 rendering modes.
    • Full Standards / No-Quirks Mode.
    • Quirks Mode.
    • Limited-Quirks / Almost Standards Mode.
  • Full Standards Mode : In this mode, pages are displayed according to the W3C standards.
  • Quirks Mode : Here, pages that do not follow the standards are displayed using the legacy behavior of older browsers.
  • Limited-Quirks / Almost Standards Mode : It's similar to Full Standards Mode but keeps a small number of quirks.
  • To sum it up, we write
<!DOCTYPE html>

just to prevent browsers from switching to quirks mode.
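
One way to see which mode a page ended up in is to ask the browser. The sketch below is a minimal page, assuming a browser that exposes the standard document.compatMode property, which reports "BackCompat" in quirks mode and "CSS1Compat" otherwise. The id "mode" and the message text are just names chosen for this example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Rendering mode check</title>
</head>
<body>
  <p id="mode"></p>
  <script>
    // document.compatMode is "BackCompat" in quirks mode and
    // "CSS1Compat" in no-quirks and limited-quirks mode.
    var mode = document.compatMode === "BackCompat" ? "quirks mode" : "standards mode";
    // "mode" is a hypothetical element id used only in this sketch.
    document.getElementById("mode").textContent = "This page is rendered in " + mode + ".";
  </script>
</body>
</html>
```

Deleting the first line and reloading the page should change the message to quirks mode, since a document without a DOCTYPE is rendered in that mode.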

SYNTAX:

  • Because HTML5 is not based on SGML (Standard Generalized Markup Language), it needs no reference to a DTD and we just write

        <!DOCTYPE HTML>
    

    (the declaration is case-insensitive)

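  • Whereas previous versions are defined by a DTD (Document Type Definition), an SGML DTD for HTML 4.01 and an XML DTD for XHTML, and the declaration has a syntax like this

        <!DOCTYPE root PUBLIC "public-identifier" "URL">
        <!DOCTYPE root SYSTEM "URL">
    
  • root names the root element, the starting point of the tree structure.

  • PUBLIC/SYSTEM specify whether the DTD is a publicly known standard, referred to by a public identifier, or a private one located only by its URL or file path.

  • URL specifies where the DTD file is located. If it sits locally inside the same package, we just write its file name, such as filename.dtd.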

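  • HTML 4.01 and XHTML 1.0 each come in three variations

    • Strict DTD: excludes deprecated presentational elements and attributes.
    • Transitional DTD: also allows the deprecated elements and attributes.
    • Frameset DTD: the Transitional DTD plus support for frames.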
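
For reference, these are the DOCTYPE declarations that HTML 4.01 defines for the three variations. Each one names the root element (HTML), uses the PUBLIC keyword with a public identifier, and then gives the URL of the DTD. Only one of them goes at the top of a given document; they are listed together here for comparison, and the comments are labels for this listing only.

```html
<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- HTML 4.01 Transitional -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<!-- HTML 4.01 Frameset -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
```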

DTD? What's that?

A DTD defines the legal building blocks of an XML/HTML document. From a DTD's point of view, every document is made up of elements, attributes, entities, PCDATA and CDATA. In simple words, it describes a structure similar to the tree data structure: what the root is, what type of data each element holds, which elements may be nested inside which, and so on.

  1. Elements are the HTML/XML elements, such as a heading element or the body element.

  2. Attributes give additional information about an element. They are written inside the element's opening tag. Every attribute has a name and a value.

  3. Entities are named references to special characters in XML/HTML. Each entity reference stands for its own character, for example &lt; stands for <.

  4. #PCDATA is parsed character data. It is the text between an element's opening tag and closing tag, usually the information that is meant to be displayed. The parser examines this text for markup and entities.

  5. CDATA is character data that the parser does not examine for markup.

Let's take an example. Consider a scenario where a student sends a message to his professor to introduce himself, as part of cohort guidelines.

The XML document for it is below:

<?xml version="1.0"?>
<!DOCTYPE msg [
<!ELEMENT msg (name,title,body)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<msg>
<name>Jake</name>
<title> Introduction </title>
<body> Hi, this is jake from cs-101 </body>
</msg>

Here, there are 4 elements: msg, name, title and body.

Explanation of code:

  • The DOCTYPE declaration here specifies what the root of the structure will be, in this case msg.
  • The elements nested inside msg are specified in the parentheses of its declaration.
  • So, msg contains the name, title and body elements, in that order, and each of those contains only text (#PCDATA).

With this article at OpenGenus, you should now have a clear idea of DOCTYPE.
