HTML Data Types
In HTML, a "data type" describes the kind of data allowed in the content of an element or in the value of an attribute.
This section covers the HTML data types, grouped by the specification that defines them:
- Basic HTML data types
- Data types defined by the RFC and IANA documentation
- Data types defined by W3C Specifications
The sections below briefly explain the data types in each of these three groups.
HTML Basic Data Types
In HTML, the most commonly used data types are classified as "basic data types." There are four of them:
- Character data type: a sequence of characters from the document character set, which includes letters, digits, symbols, spaces, and punctuation.
- Text data type: a string of human-readable text, such as the value of a title attribute. The HTML specification itself does not set a maximum length.
- Name data type: an identifier given to an element or other item so that it can be referred to elsewhere. In HTML 4.01, a name must begin with a letter and may continue with letters, digits, hyphens, underscores, colons, and periods.
- Number data type: a value made up of at least one digit (0 to 9), such as the value of a maxlength attribute. Unlike character data, a number can be used in arithmetic operations.
Different types of alphanumeric text
The character data type can contain different kinds of alphanumeric text. The following table lists the most common ones:
Type | Characters |
---|---|
Letters | A...Z and a...z |
Digits | 0...9 |
Symbols | @ , # , $ , % , ^ , & , * , () , _ , - , + , = , \ , \| , {} , [] , ~ |
Punctuation | Comma, full stop, and exclamation mark |
Note: Arithmetic operations cannot be performed on the numbers included under the character data type.
Data types defined by the RFC and IANA documentation
An RFC is a memorandum that describes methods, behaviors, or research related to the workings of the Internet. IANA is the organization that oversees global IP address allocation, media types, and other Internet protocol-related assignments.
Note: The RFC stands for "Request for Comments," and the IANA stands for "Internet Assigned Numbers Authority."
The RFC and IANA documentation defines the following four data types:
- Uniform Resource Identifier (URI)
- Content-Type
- Language code
- Character set
Each of these four data types is described below.
HTML Uniform Resource Identifier (URI)
A uniform resource identifier (URI) is a string of characters that identifies, and often locates, a specific resource on the web. A URI is made up of several components, as the following example demonstrates:
http://william@fresherearth.com:80/over/there/index.dtb;type=anime1?name=ferret#nose
In the above example,
URI Component | Description |
---|---|
http | scheme name |
william | user information (also known as userinfo) |
fresherearth.com | host name |
80 | port |
william@fresherearth.com:80 | authority |
/over/there/index.dtb;type=anime1 | path |
index | file name |
dtb | extension |
type=anime1 | parameter |
name=ferret | query |
nose | fragment |
The following table briefly defines the URI components used in the above example:
Component | Description |
---|---|
Scheme | refers to the specification for assigning identifiers; it determines how the rest of the URI is interpreted. Schemes used in URIs include http (Hypertext Transfer Protocol), ftp (File Transfer Protocol), mailto, urn (Uniform Resource Name), tel, rtsp (Real Time Streaming Protocol), and file. |
User Information | refers to personal information, such as a user name and password, that is used to access websites or resources. |
Authority | refers to the part that consists of optional user information terminated with @, a host name, and an optional port number preceded by a colon (:). |
Host name | refers to the registered name (usually a domain name) or IP address that identifies the host on the Internet. Using a domain name reuses the registration already made in the Domain Name System (DNS), which saves the cost of deploying another registration. |
Port | refers to the optional decimal number that follows the host after a colon (:). Each scheme also defines its default port number. For example, http has 80 as its default port number. |
Path | consists of a sequence of text segments that are separated by a forward slash (/). |
File Name | refers to any name that can be given to a targeted file. |
Extension | refers to a short code, commonly three or four characters, that follows the file name and is preceded by a dot (.). By convention, it indicates the kind of information contained in the file. The .html extension signifies that the file contains an HTML document, and the .jpg extension signifies that the file is an image file. |
Query | starts with a question mark (?) and ends at the number sign (#) or at the end of the URI. It typically carries the parameters to be passed to a server-side program when the URI requests a program to run rather than a file to be retrieved. |
Fragment | refers to a particular point in the accessed file. |
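The breakdown above can be checked programmatically. The following is a minimal Python sketch, assuming the standard-library urllib.parse and posixpath modules; the variable names (uri, parts, components) are illustrative only. Note that urlparse reports the parameter separately from the path, so the path it returns does not include ";type=anime1".

```python
import posixpath
from urllib.parse import urlparse

# Example URI taken from the text above.
uri = "http://william@fresherearth.com:80/over/there/index.dtb;type=anime1?name=ferret#nose"

parts = urlparse(uri)

# File name and extension are conventions applied to the last path segment,
# not URI components in their own right.
file_name, extension = posixpath.splitext(posixpath.basename(parts.path))

components = {
    "scheme": parts.scheme,
    "userinfo": parts.username,
    "host": parts.hostname,
    "port": parts.port,
    "authority": parts.netloc,
    "path": parts.path,          # urlparse strips the ";type=anime1" parameter
    "file name": file_name,
    "extension": extension.lstrip("."),
    "parameter": parts.params,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:10} {value}")
```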
HTML Content-Type
The Content-Type (also known as the media type or MIME type) represents the type of content used in an embedded or linked resource. For instance, the Content-Type can be plain text or a JPEG image. It is not case-sensitive. Its syntax has two parts: a top-level type and a subtype, separated by a slash (/). Following are some examples of the Content-Type:
- text/plain: represents plain text.
- image/jpeg: represents a JPEG-compressed image.
- audio/basic: represents a basic audio file.
- video/mpeg: represents an MPEG-compressed video file.
- application/octet-stream: represents arbitrary binary data.
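To illustrate the two-part syntax and the case-insensitivity described above, here is a small Python sketch. The helper name split_content_type is hypothetical, and the handling of a trailing parameter such as "; charset=UTF-8" is an assumption added for the example rather than something covered in this tutorial.

```python
from typing import Tuple


def split_content_type(value: str) -> Tuple[str, str]:
    """Split a Content-Type value into (top-level type, subtype), lowercased."""
    # Drop any parameters after the first semicolon, e.g. "; charset=UTF-8".
    media = value.split(";", 1)[0].strip().lower()
    top_level, slash, subtype = media.partition("/")
    if not (top_level and slash and subtype):
        raise ValueError(f"not a valid Content-Type: {value!r}")
    return top_level, subtype


# Because the value is not case-sensitive, these two spellings compare equal.
print(split_content_type("Image/JPEG"))
print(split_content_type("image/jpeg") == split_content_type("IMAGE/JPEG"))
```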
HTML Language Code
The language code identifies the natural (human) language in which the content of an HTML document is written. It is not case-sensitive and is specified with the lang attribute.
The implementation of the language code is shown in the following example:
<html lang="en"> ... ... ... </html>
The following table lists some commonly used language codes.
Language | Language Code |
---|---|
English | en |
French | fr |
German | de |
Japanese | ja |
Arabic | ar |
Chinese | zh |
Dutch | nl |
Italian | it |
Korean | ko |
Russian | ru |
Spanish | es |
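As a rough sketch of how a program might read the language code from a document, the following Python example uses the standard-library html.parser module. The class name LangFinder and the function document_language are hypothetical names chosen for illustration; the value is lowercased because language codes are not case-sensitive.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Records the lang attribute found on an <html> start tag."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag == "html" and self.lang is None:
            for name, value in attrs:
                if name == "lang" and value:
                    self.lang = value.lower()


def document_language(markup):
    finder = LangFinder()
    finder.feed(markup)
    return finder.lang


print(document_language('<html lang="EN"><body>Hello</body></html>'))  # prints: en
```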
HTML Character Set
A character set is a collection of characters taken from languages and scripts around the world, in which each character is represented by a unique code point. A code point is the unique integer, paired with a unique name, that is assigned to a character to identify it.
Following are some examples of characters found in a character set:
- dollar symbol
- yen symbol
- lower case letters
- upper case letters
- delta
- omega
- exclamation mark
- quotation marks
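The idea of a code point can be made concrete with a short Python sketch. It assumes Unicode as the character set and uses the standard-library unicodedata module; the helper name code_point and the particular sample characters are illustrative choices.

```python
import unicodedata


def code_point(ch):
    """Return the code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# Dollar, yen, lower- and upper-case letters, delta, omega,
# exclamation mark, and quotation mark.
samples = ["$", "\u00a5", "a", "A", "\u0394", "\u03a9", "!", '"']

for ch in samples:
    print(code_point(ch), unicodedata.name(ch))
```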
Data types defined by W3C Specifications
W3C is an international community that develops standards to ensure the long-term growth of the Web. W3C stands for "World Wide Web Consortium."
The W3C specifies the following five additional data types for HTML:
- DateTime format
- RGB triplet
- Color names
- Link types
- Media types
HTML DateTime Format
DateTime uses the ISO date format (ISO 8601), that is, "YYYY-MM-DDThh:mm:ssTZD." The components of the given format are described using the following table:
Component | Description |
---|---|
YYYY | represents a year in four-digit format, for example: 2022. |
MM | represents the month as a two-digit number (01 through 12). |
DD | represents the day of the month (01 through 31). |
T | acts as the separator between the date and time, and it must be written as a capital letter. |
hh | represents the hour, which ranges from 00 through 23. |
mm | represents the minutes, which range from 00 through 59. |
ss | represents the seconds, which range from 00 through 59. |
TZD | stands for Time Zone Designator (Z or +hh:mm or -hh:mm). |
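A value in this format can be read with Python's standard-library datetime module, as in the sketch below. The format string and the helper name parse_datetime are assumptions made for this example, and the timestamp is an arbitrary sample value.

```python
from datetime import datetime, timedelta

# strptime directives matching YYYY-MM-DDThh:mm:ssTZD.
# %z accepts "+hh:mm", "-hh:mm", and "Z" (Python 3.7 or later).
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_datetime(value):
    """Parse a DateTime string into a timezone-aware datetime object."""
    return datetime.strptime(value, DATETIME_FORMAT)


stamp = parse_datetime("2022-07-16T19:20:30+01:00")
print(stamp.year, stamp.month, stamp.day)
print(stamp.hour, stamp.minute, stamp.second)
print(stamp.utcoffset() == timedelta(hours=1))
```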
HTML RGB Triplet
The RGB triplet denotes the three primary colors of light: red, green, and blue. A wide range of colors can be created by combining these three colors in various intensities.
A color is written as a six-digit hexadecimal number prefixed with a number sign, in the form #xxyyzz, where:
- The first two digits (xx) give the intensity of red. A color containing only red is written as #xx0000.
- The middle two digits (yy) give the intensity of green. A color containing only green is written as #00yy00.
- The last two digits (zz) give the intensity of blue. A color containing only blue is written as #0000zz.
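The mapping from the six hexadecimal digits to the three color intensities can be sketched in Python as follows. The function name parse_rgb_triplet is hypothetical, and the sketch assumes the full six-digit form with a leading number sign.

```python
import string


def parse_rgb_triplet(value):
    """Convert '#xxyyzz' into a (red, green, blue) tuple of integers 0-255."""
    digits = value[1:]
    if (
        len(value) != 7
        or not value.startswith("#")
        or any(ch not in string.hexdigits for ch in digits)
    ):
        raise ValueError(f"not a six-digit RGB triplet: {value!r}")
    # Each pair of hexadecimal digits is one color channel.
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


print(parse_rgb_triplet("#FF0000"))  # prints: (255, 0, 0)
print(parse_rgb_triplet("#C0C0C0"))  # prints: (192, 192, 192)
```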
HTML Color Names
In HTML, 16 colors can be called directly by their names rather than their hexadecimal values. This feature makes it easy for the users of HTML to call a color by its name if they are unaware of its hexadecimal number or the concept of the RGB triplet.
The following table represents 16 color names along with their hexadecimal values:
Color Name | Hexadecimal Value |
---|---|
Black | #000000 |
Silver | #C0C0C0 |
Gray | #808080 |
White | #FFFFFF |
Maroon | #800000 |
Green | #008000 |
Lime | #00FF00 |
Olive | #808000 |
Yellow | #FFFF00 |
Red | #FF0000 |
Purple | #800080 |
Fuchsia | #FF00FF |
Navy | #000080 |
Blue | #0000FF |
Teal | #008080 |
Aqua | #00FFFF |
HTML Link Types
Link types describe the relationship between the current document and a linked resource. Browsers, search engines, and other tools can use the recognized link types and their standard interpretations. Link types are not case-sensitive, which means you can write a link type in lowercase or uppercase characters.
The following link types are available in HTML:
- Alternate: refers to a substitute version of the document in which the link occurs. When used with the lang attribute, it represents a translated version of the current document. When combined with the media attribute, it represents a version intended for a different medium.
- Stylesheet: refers to an external style sheet. Used together with the alternate link type, it lets the user select among alternate style sheets.
- Start: refers to the first document in a collection of documents. It tells search engines which document the author considers the starting point of the collection.
- Next: refers to the next document in a linear sequence of documents.
- Prev: refers to the previous document in a linear sequence of documents.
- Contents: refers to a document serving as a table of contents.
- Index: refers to a document that provides an index for the current document.
- Glossary: refers to a document that provides a glossary of terms for the current document.
- Copyright: refers to a copyright statement for the current document.
- Chapter: refers to a document serving as a chapter in a collection of documents.
- Section: refers to a document serving as a section in a collection of documents.
- Subsection: refers to a document serving as a subsection in a collection of documents.
- Appendix: refers to a document serving as an appendix in a collection of documents.
- Help: refers to a document that offers help, such as further information or links to other sources.
- Bookmark: refers to a bookmark, which is a link to a key entry point within an extended document.
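Because link types are not case-sensitive and several can be combined in one value (for example, alternate together with stylesheet), a program reading them would normally normalize the value first. The Python sketch below shows one way to do that; the function name link_types is hypothetical, and it assumes the link types arrive as a single space-separated string such as the value of a rel attribute.

```python
def link_types(rel_value):
    """Return the set of lowercased link types in a space-separated value."""
    return {token.lower() for token in rel_value.split()}


# Different spellings of the same combination normalize to the same set.
print(sorted(link_types("Alternate StyleSheet")))
print(link_types("ALTERNATE stylesheet") == link_types("alternate Stylesheet"))
```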
HTML Media Types
Media types let you specify how an HTML document is presented on various media, such as paper, a computer screen, or an aural browser. Combined with CSS, they make it possible to display the text of an HTML page in different fonts, colors, and sizes according to the medium being used.
The following media types are available:
- screen: represents a computer screen.
- tty: represents media using a fixed-pitch character grid with limited display capabilities.
- tv: denotes a television.
- projection: represents a projector.
- handheld: represents hand-held devices, such as mobile phones, which typically have small screens and limited bandwidth.
- print: represents paged, printed material and documents viewed on screen in print preview mode.
- aural: represents a speech synthesizer.
- all: represents the media type that is suitable for all devices.