UTF-8, UTF-16, UTF-32: Encoding Methods
These key facts cover the UTF-8, UTF-16 and UTF-32 encoding methods, part of Character Sets in 3.3 Data Representation for GCSE Computer Science. This is section 6 of 11 in the topic, which comes with 15 exam-style questions and 18 flashcards. The topic appears less often in exams, but it can still be a useful differentiator on mixed-topic papers.
Unicode Transformation Formats (UTF):
Unicode defines WHAT each character's code is. UTF defines HOW to store those codes in bytes.
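The split between "what" and "how" can be seen in a short Python sketch. The helper name `describe` is made up for illustration, and the big-endian codec names (`utf-16-be`, `utf-32-be`) are chosen here only so that Python does not add a byte-order mark to the output.

```python
# Unicode gives each character a number (its code point).
# A UTF encoding then decides which bytes store that number.
def describe(ch: str) -> dict:
    """Return the code point of one character and its bytes in each UTF (as hex)."""
    return {
        "code_point": f"U+{ord(ch):04X}",          # WHAT: the Unicode number
        "utf8": ch.encode("utf-8").hex(),          # HOW: stored as UTF-8 bytes
        "utf16": ch.encode("utf-16-be").hex(),     # HOW: stored as UTF-16 bytes
        "utf32": ch.encode("utf-32-be").hex(),     # HOW: stored as UTF-32 bytes
    }

info = describe("é")
print(info)
```

The code point is the same in every case (U+00E9); only the stored bytes differ between the three encodings.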
UTF-8 (Most Common):
- Variable length: 1 to 4 bytes per character
- ASCII compatible: ASCII characters still use 1 byte (efficient!)
- English text: 1 byte per character (same size as ASCII)
- Accented letters: 2 bytes (é, ñ, ü)
- Chinese/Japanese: 3 bytes per character (for most characters)
- Emoji: 4 bytes
- Advantages: Efficient for English, backward compatible with ASCII
- Disadvantage: Chinese/Japanese text takes 3 bytes per character instead of 2 in UTF-16 (about 1.5× the space)
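The byte counts listed for UTF-8 can be checked with a small Python sketch. The function name `utf8_length` and the sample characters are illustrative choices, not part of any specification.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 uses to store a single character."""
    return len(ch.encode("utf-8"))

# One sample from each group: English letter, accented letter, Chinese, emoji
samples = ["A", "é", "中", "\U0001F600"]
lengths = [utf8_length(ch) for ch in samples]
print(lengths)  # [1, 2, 3, 4]

# ASCII compatibility: plain English text gives identical bytes in both encodings
same_as_ascii = "Hello".encode("utf-8") == "Hello".encode("ascii")
print(same_as_ascii)  # True
```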
UTF-16:
- Variable length: 2 or 4 bytes per character
- Most characters: 2 bytes (including Chinese, Japanese, Korean)
- Emoji & rare: 4 bytes (surrogate pairs)
- Use case: Windows internals, Java, JavaScript strings
- Advantage: Efficient for Asian languages
- Disadvantage: English takes 2× space vs ASCII/UTF-8
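As a rough check of the UTF-16 points, the Python sketch below counts bytes for a few sample characters and compares UTF-16 with UTF-8 on short English and Chinese strings. The name `utf16_length` is hypothetical, and `utf-16-be` is used only so that no 2-byte byte-order mark is added to the count.

```python
def utf16_length(text: str) -> int:
    """Number of bytes UTF-16 uses for the text (no byte-order mark)."""
    return len(text.encode("utf-16-be"))

# Most characters take 2 bytes; an emoji needs 4 bytes (a surrogate pair)
lengths = [utf16_length(ch) for ch in ["A", "中", "\U0001F600"]]
print(lengths)  # [2, 2, 4]

# The emoji U+1F600 is stored as two 16-bit units: D83D then DE00
pair = "\U0001F600".encode("utf-16-be").hex()
print(pair)  # d83dde00

# Size comparison (UTF-8 bytes, UTF-16 bytes) for English and Chinese text
english = (len("Hello".encode("utf-8")), utf16_length("Hello"))
chinese = (len("中文".encode("utf-8")), utf16_length("中文"))
print(english)  # (5, 10)
print(chinese)  # (6, 4)
```

English text doubles in size under UTF-16, while the Chinese sample is smaller in UTF-16 than in UTF-8.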
UTF-32:
- Fixed length: Exactly 4 bytes per character (always)
- Advantage: Simple - every character same size, easy indexing
- Disadvantage: Wastes space - 'A' takes 4 bytes (0x00000041)
- Use case: Internal processing where speed matters more than space
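The fixed width of UTF-32 is what makes indexing easy: character number n always starts at byte n × 4. A minimal Python sketch, assuming big-endian bytes with no byte-order mark (`utf-32-be`) and a made-up helper `char_at`:

```python
text = "A中\U0001F600"              # English letter, Chinese character, emoji
encoded = text.encode("utf-32-be")  # big-endian, no byte-order mark

print(len(encoded))       # 12 (3 characters x 4 bytes each)
print(encoded[:4].hex())  # 00000041 ('A' padded out to 4 bytes)

def char_at(data: bytes, index: int) -> str:
    """Fetch one character by jumping straight to byte offset index * 4."""
    start = index * 4
    return data[start:start + 4].decode("utf-32-be")

third = char_at(encoded, 2)
print(third == "\U0001F600")  # True
```

With UTF-8 or UTF-16 the same lookup would mean reading from the start of the text, because earlier characters can have different lengths.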
Practice Questions for Character Sets
How many bits does standard ASCII use to represent each character?
Explain why using Unicode to store a text file produces a larger file than using ASCII to store the same text.