C Character Set
In C programming, the character set is the foundation of every program you write. It includes all the valid characters that can be used in writing source code, such as letters, digits, symbols, and whitespace. Understanding this character set is essential because every keyword, identifier, operator, and constant is built from it.
Most C compilers use ASCII (American Standard Code for Information Interchange) or a superset of it, although the C standard itself does not require ASCII. ASCII defines 128 standard characters, each represented by a unique integer value (ranging from 0 to 127).
The characters in C are grouped into the following categories:
Categories of the C Character Set
a. Letters
C supports the English alphabet in both:
- Uppercase letters: A to Z
- Lowercase letters: a to z
Letters are used to form identifiers, such as variable names and function names.
Example: `main`, `total`, `Result`
C is case-sensitive, so `Variable` and `variable` are considered different identifiers.
b. Digits
The digits from 0 to 9 are part of the character set and are used to create numeric constants, array indexes, loop counters, and more.
Example: `0`, `5`, `99`, `2025`
Note: Identifiers cannot begin with a digit, but digits can appear after the first character.
c. Special Characters
Special symbols in C have specific meanings and uses. Some of the most commonly used special characters include:
Character | Purpose |
---|---|
`+` | Addition or unary plus |
`-` | Subtraction or unary minus |
`*` | Multiplication or pointer |
`/` | Division |
`%` | Modulus (remainder) |
`=` | Assignment |
`<` `>` | Comparison |
`&` | Address-of operator or bitwise AND |
`\|` | Bitwise OR |
`!` | NOT (logical negation) |
`^` | Bitwise XOR |
`;` | Statement terminator |
`,` | Separator |
`.` | Structure member access |
`#` | Preprocessor directive |
`()` | Function call or grouping |
`{}` | Code block (compound statement) |
`[]` | Array declaration/access |
`"` | String literal delimiter |
`'` | Character literal delimiter |
`\` | Escape character |
Each of these characters contributes to how C instructions are structured and interpreted.
d. White Space Characters
Whitespace characters are used to separate tokens and improve readability. These include:
- Space
- Horizontal tab
- Newline
- Carriage return
- Form feed
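Outside string literals and character constants, the compiler treats whitespace only as a separator between tokens such as identifiers, keywords, and constants, so extra spaces and blank lines do not change program logic. Inside a string literal or character constant, every whitespace character is kept as data. Preprocessor directives are also line-based, so a newline ends a directive. Consistent use of whitespace is vital for clean code formatting.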
e. Escape Sequences
Escape sequences are combinations starting with a backslash (`\`) followed by a character. They are used to represent non-printable or special characters within strings or output.
Common escape sequences include:
Escape Sequence | Meaning |
---|---|
`\n` | New line |
`\t` | Horizontal tab |
`\\` | Backslash |
`\"` | Double quote |
`\'` | Single quote |
`\r` | Carriage return |
`\a` | Alert (bell) |
`\b` | Backspace |
Each escape sequence is written with characters from the character set, but it stands for a single character in a string literal or character constant.
Importance of the Character Set in C
- Every C program is made up of characters from this set.
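- A character outside this set that appears outside a string literal, character constant, or comment (for example, a stray `@` in an expression) typically results in a compilation error.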
- Understanding the valid characters helps in writing proper identifiers, operators, keywords, and statements.
Summary
The C Character Set includes:
Category | Examples |
---|---|
Letters | A–Z, a–z |
Digits | 0–9 |
Special Symbols | + - * / = < > { } ; , # ( ) [ ] |
Whitespace | space, tab, newline |
Escape Sequences | \n , \t , \\ , \" , \r , etc. |
Understanding the character set is your first step toward mastering the syntax and semantics of the C language.