Knowledge Organiser: Character Sets
Part of Character Sets · GCSE Computer Science revision
This summary covers the Knowledge Organiser for Character Sets, part of Memory & Storage in GCSE Computer Science. It is section 11 of 11 in this topic and comes with 15 exam-style questions and 18 flashcards. The topic appears less often in exams, but it can still be a useful differentiator on mixed-topic papers. Use this summary to connect the idea to the wider topic before moving on to the questions and flashcards.
Key Terms
- Character set: A defined list of characters and their binary codes
- ASCII: American Standard Code for Information Interchange — 7-bit, 128 characters
- Unicode: Universal character set with well over 140,000 characters, covering the world's major languages, symbols and emoji
- UTF-8: Variable-length encoding (1–4 bytes); web standard for Unicode
- UTF-16: Variable-length encoding (2 or 4 bytes per character); used internally by Windows and Java
- Extended ASCII: 8-bit version with 256 characters, adds accented letters
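The difference between a character's code (its position in the character set) and the bytes used to store it can be checked with a short Python sketch. The sample characters and the helper name `byte_lengths` are illustrative choices, not part of any exam specification.

```python
def byte_lengths(ch):
    """Return (code point, bytes in UTF-8, bytes in UTF-16) for one character."""
    # "utf-16-le" is used so Python does not prepend a 2-byte byte order mark.
    return (ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))

# A Latin letter, an accented letter, the euro sign and an emoji,
# written as escapes so the source stays plain ASCII.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    code_point, utf8_len, utf16_len = byte_lengths(ch)
    print(f"code point {code_point}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")

# code point 65: UTF-8 = 1 bytes, UTF-16 = 2 bytes
# code point 233: UTF-8 = 2 bytes, UTF-16 = 2 bytes
# code point 8364: UTF-8 = 3 bytes, UTF-16 = 2 bytes
# code point 128512: UTF-8 = 4 bytes, UTF-16 = 4 bytes
```

The same code point is stored in a different number of bytes depending on the encoding, which is why Unicode (the set) and UTF-8 or UTF-16 (the encodings) are separate terms.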
Must-Know Facts
- ASCII is 7-bit (NOT 8-bit) — 128 characters total
- ASCII code for 'A' = 65; 'a' = 97; '0' = 48; space = 32
- Lowercase letter = uppercase + 32 (e.g. 'A'=65, 'a'=97)
- Unicode covers all languages, symbols and emoji
- UTF-8 is backward compatible with ASCII (same codes 0–127)
- Unicode has far more characters than ASCII, so its encodings can need more bits per character (up to 32), which can mean larger files
- UTF-8 is the dominant encoding on the web (used by 95%+ of websites)
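The ASCII facts above can be checked with Python's built-in `ord` and `chr` functions. This is a minimal sketch; the helper names `to_lower_ascii` and `to_7_bit_binary` are made up for illustration.

```python
def to_lower_ascii(ch):
    """Convert an uppercase ASCII letter to lowercase by adding 32 to its code."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch  # anything else is returned unchanged

def to_7_bit_binary(ch):
    """Return the code of an ASCII character as a 7-bit binary string."""
    return format(ord(ch), "07b")

print(ord("A"), ord("a"), ord("0"), ord(" "))  # 65 97 48 32
print(to_lower_ascii("A"))                     # a
print(to_7_bit_binary("A"))                    # 1000001

# UTF-8 stores codes 0-127 exactly as ASCII does, so the bytes are identical.
print("Hello".encode("ascii") == "Hello".encode("utf-8"))  # True
```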
Key Concepts
- Why ASCII is limited: Only 128 characters — covers English but not other languages
- Why Unicode was created: Single universal standard replacing hundreds of incompatible code pages
- Character set vs encoding: Unicode is the set; UTF-8/16/32 are ways to encode it in bytes
- File size trade-off: Unicode files can be larger than ASCII because each character may need more bytes
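The file size trade-off can be illustrated by encoding the same text in several ways and counting the bytes. This is a sketch under stated assumptions: the sample strings and the helper `encoded_sizes` are hypothetical, and real files may also carry extra bytes such as a byte order mark.

```python
def encoded_sizes(text):
    """Return the number of bytes needed to store text in each Unicode encoding."""
    # The "-le" variants are used so Python does not add a byte order mark,
    # which would otherwise inflate the UTF-16 and UTF-32 counts.
    encodings = ["utf-8", "utf-16-le", "utf-32-le"]
    return {name: len(text.encode(name)) for name in encodings}

english = "Hello"              # ASCII characters only
mixed = "Caf\u00e9 \u20ac5"    # "Café €5": includes two non-ASCII characters

print(len(english.encode("ascii")))  # 5
print(encoded_sizes(english))  # {'utf-8': 5, 'utf-16-le': 10, 'utf-32-le': 20}
print(encoded_sizes(mixed))    # {'utf-8': 10, 'utf-16-le': 14, 'utf-32-le': 28}
```

For plain English text, UTF-8 takes the same space as ASCII. The extra bytes only appear for characters outside the ASCII range, or when a wider encoding such as UTF-16 or UTF-32 is chosen.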
Common Mistakes
- Saying ASCII is 8-bit: Standard ASCII uses 7 bits (128 characters) — Extended ASCII uses 8 bits (256 characters); exam questions often test this distinction
- Confusing Unicode with UTF-8: Unicode is the character set (the list of characters and code points); UTF-8, UTF-16, and UTF-32 are encoding schemes that store Unicode in binary
- Saying Unicode always uses more storage than ASCII: UTF-8 encodes the first 128 characters identically to ASCII using just 1 byte — it only uses more bytes for characters beyond the ASCII range
- Forgetting that numbers and symbols have ASCII codes too: ASCII encodes digits 0–9, punctuation, and control characters as well as letters — every character a computer handles has a numeric code
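Two of these mistakes can be checked directly: how many bits a code needs, and the fact that a digit character has its own code that differs from its numeric value. The sketch below uses Python's `int.bit_length`; the helper name `bits_needed` is an illustrative choice.

```python
def bits_needed(code):
    """Return how many bits are needed to store a character code in binary."""
    return max(1, code.bit_length())  # code 0 still occupies one bit

print(bits_needed(127))  # 7  (highest standard ASCII code)
print(bits_needed(255))  # 8  (highest Extended ASCII code)

# The character '7' is stored as code 55, not as the number 7.
print(ord("7"))             # 55
print(ord("7") - ord("0"))  # 7  (subtracting the code for '0' recovers the value)

# Control characters have codes too: newline is code 10.
print(ord("\n"))  # 10
```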
Keep building this topic
Read this section alongside the surrounding pages in Character Sets. That gives you the full topic sequence instead of a single isolated revision point.
Practice Questions for Character Sets
How many bits does standard ASCII use to represent each character?
Explain why using Unicode to store a text file can produce a larger file than using ASCII to store the same text.