What are Character Sets?

To display an HTML page correctly, a web browser must know which character encoding the page uses. A character set is a collection of characters in which each character is assigned a unique number (its code point). A character encoding defines how those numbers are stored as bytes. In HTML, the two terms are often used interchangeably.
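The link between a character and its code point can be inspected directly. The following is a minimal sketch in Python, used here only for illustration; the helper name code_point and the sample characters are assumptions chosen for this example, not part of any HTML API.

```python
def code_point(ch: str) -> str:
    """Return the Unicode code point of one character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"

# Sample characters (arbitrary choices): a letter, the euro sign, an emoji.
samples = ["A", "\u20ac", "\U0001F60A"]

for ch in samples:
    print(ch, code_point(ch))

# chr() is the inverse of ord(): it maps a number back to its character.
print(chr(0x41))
```

The numbers printed here are the same regardless of which encoding later stores the text as bytes.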

The ASCII Character Set

The first character set used on the web was ASCII (American Standard Code for Information Interchange). It defines 128 characters: the English letters A-Z and a-z, the digits 0-9, common punctuation marks, and a set of control characters.

The ISO-8859-1 Character Set

As the web grew, ISO-8859-1 (Latin-1) became the default character set in HTML 2.0. It extends ASCII with characters used in many Western European languages, such as é, ñ, and ü, and stores each character in a single byte.
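The difference in coverage between ASCII and ISO-8859-1 can be checked by trying to encode some text. This is a rough sketch, assuming Python's built-in "ascii" and "iso-8859-1" codecs; the helper name fits and the sample words are hypothetical.

```python
def fits(text: str, encoding: str) -> bool:
    """Return True if every character in text can be stored in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# "cafe", "café" (with e-acute), and "5 €" (with the euro sign)
words = ["cafe", "caf\u00e9", "5 \u20ac"]

for word in words:
    print(word, "ascii:", fits(word, "ascii"), "latin-1:", fits(word, "iso-8859-1"))

# In ISO-8859-1 the accented letter takes exactly one byte.
print("caf\u00e9".encode("iso-8859-1"))
```

The euro sign fails in both cases, which illustrates why a single-byte Western European character set was not enough for the web as a whole.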

UTF-8: The Modern Standard

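Today, UTF-8 is the character encoding used by almost all websites. It can represent every character in the Unicode standard, which covers nearly all of the world's writing systems and symbols. It is also backward-compatible with ASCII: any ASCII text is already valid UTF-8.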

<meta charset="UTF-8">
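UTF-8 achieves this range by using a variable number of bytes per character. The sketch below, again in Python for illustration only, shows the byte count for a few arbitrary sample characters and checks the ASCII compatibility claim; the helper name utf8_bytes is an assumption.

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 byte sequence for a piece of text."""
    return ch.encode("utf-8")

# "A", "é", "€", and the smiling-face emoji
samples = ["A", "\u00e9", "\u20ac", "\U0001F60A"]

for ch in samples:
    encoded = utf8_bytes(ch)
    print(ch, len(encoded), "byte(s):", encoded.hex())

# ASCII text produces identical bytes in both encodings.
print(utf8_bytes("Hello") == "Hello".encode("ascii"))
```

Characters from the ASCII range take one byte each, so an English-only page is the same size in UTF-8 as in ASCII.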

How to Specify the Character Set

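You should always specify the character set in the <head> of your HTML document using the <meta> tag. Place it as early as possible, because the declaration must appear within the first 1024 bytes of the document: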

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Character Set Example</title>
</head>
<body>
  <p>Hello World! 😊</p>
</body>
</html>
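If the declared character set does not match the bytes actually stored in the file, the text is decoded incorrectly. The following sketch simulates that mismatch in Python; it assumes the page is saved as UTF-8 and read as windows-1252 (Python's "cp1252" codec), which is how browsers treat the ISO-8859-1 label. The function name decode_page is hypothetical.

```python
def decode_page(raw: bytes, declared: str) -> str:
    """Decode page bytes the way a reader trusting the declared charset would."""
    return raw.decode(declared)

text = "Hello World! \U0001F60A"
raw = text.encode("utf-8")  # the bytes as saved on disk or sent by the server

right = decode_page(raw, "utf-8")   # declaration matches the stored bytes
wrong = decode_page(raw, "cp1252")  # declaration does not match

print(right)
print(wrong)  # the 4-byte emoji turns into four unrelated characters
```

The ASCII part of the sentence survives either way; only the characters outside ASCII are garbled, which is why such errors often go unnoticed on English-only pages.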