scan
Parse a string according to a format specification.
Syntax
```tcl
scan string format ?varName ...?
```
Parameters
- string: The string to parse
- format: A format string with conversion specifiers (like C scanf)
- varName: Variable names to store the parsed values
Description
Parses the input string according to the format specification and stores the results in the specified variables. Returns the number of conversions successfully performed. If no variable names are provided, returns a list of the parsed values. Returns -1 when the end of the input string is reached before any conversions have been performed.
Format Specifiers
| Specifier | Description |
|---|---|
| %d | Decimal integer |
| %u | Unsigned decimal integer |
| %o | Octal integer |
| %x, %X | Hexadecimal integer |
| %b | Binary integer |
| %i | Auto-detect base integer (0x for hex, 0 for octal, otherwise decimal) |
| %f, %e, %E, %g, %G | Floating-point number |
| %s | Non-whitespace string |
| %c | Single character (returns Unicode codepoint value) |
| %[chars] | Character set matching |
| %[^chars] | Negated character set |
| %n | Count of characters scanned so far |
| %% | Literal percent sign |
Format Features
| Feature | Example | Description |
|---|---|---|
| Field width | %10s | Limits to 10 characters |
| Suppression | %*d | Discards the value (not stored) |
| Positional specifiers | %2$d | Assigns to 2nd variable |
| Size modifiers | %ld, %lld | Parsed for compatibility |
| Character ranges | %[a-z] | Matches lowercase letters |
| Bracket in charset | %[]abc] | ] as first character matches literally |
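Positional specifiers and character sets can be sketched as follows. This is a minimal illustration that assumes behavior matching standard Tcl `scan`; the input strings and variable names are made up for the example.

```tcl
# Positional specifiers: braces keep Tcl from substituting the "$".
# %2$d stores into the 2nd variable (age), %1$s into the 1st (name).
scan "30 alice" {%2$d %1$s} name age
puts "$name is $age"        ;# alice is 30

# Negated character set: read everything up to "=", then the rest.
scan "color=blue" {%[^=]=%s} key value
puts "$key -> $value"       ;# color -> blue
```

Writing the format in braces rather than double quotes avoids accidental variable and command substitution on `$` and `[`.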
Return Value Modes
| Mode | Description |
|---|---|
| Variable mode | With varNames, returns count of successful conversions |
| Inline mode | Without varNames, returns list of values |
| EOF detection | Returns -1 when input exhausted before conversion |
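The difference between "input exhausted" and "input did not match" in variable mode can be sketched like this. It is a minimal illustration, assuming the return values described in the table; `n` is an illustrative variable name.

```tcl
# Empty input: nothing to convert, so the EOF value is returned.
set eof [scan "" "%d" n]
puts $eof        ;# -1

# Input is present but is not a number: zero conversions succeed.
set miss [scan "abc" "%d" n]
puts $miss       ;# 0

# Input matches: one conversion succeeds.
set ok [scan "7" "%d" n]
puts $ok         ;# 1
```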
Unicode Character Handling
The %c specifier reads a single Unicode character and returns its codepoint value. This correctly handles multi-byte UTF-8 sequences:
- ASCII characters: Returns values 0-127
- 2-byte UTF-8: Returns codepoints U+0080 to U+07FF
- 3-byte UTF-8: Returns codepoints U+0800 to U+FFFF
- 4-byte UTF-8: Returns codepoints U+10000 to U+10FFFF (including emoji)
Examples
Basic parsing
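A minimal sketch of variable mode, assuming behavior matching standard Tcl `scan`; the input string and the names `qty` and `item` are illustrative.

```tcl
# Parse an integer and a word into two variables.
set count [scan "42 apples" "%d %s" qty item]
puts "$count conversions: qty=$qty item=$item"
# prints: 2 conversions: qty=42 item=apples
```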
Parsing without variables
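A minimal sketch of inline mode, assuming behavior matching standard Tcl `scan`; with no variable names given, the parsed values come back as a list.

```tcl
# No varNames: the result is a list of the converted values.
set values [scan "10 20 30" "%d %d %d"]
puts $values              ;# 10 20 30
puts [lindex $values 1]   ;# 20
```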
Parsing floating-point
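A minimal sketch using `%f`, assuming behavior matching standard Tcl `scan`; the `temp=` prefix in the format is literal text that must appear in the input, and `temp` is an illustrative variable name.

```tcl
# Literal text in the format must match the input exactly.
scan "temp=21.5" "temp=%f" temp
puts $temp                 ;# 21.5
puts [expr {$temp * 2}]    ;# 43.0
```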
Hexadecimal parsing
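A minimal sketch that splits a color code into its components with `%x` and a field width of 2, assuming behavior matching standard Tcl `scan`; the names `r`, `g`, and `b` are illustrative.

```tcl
# Each %2x consumes exactly two hex digits.
scan "#1A2B3C" "#%2x%2x%2x" r g b
puts "$r $g $b"    ;# 26 43 60
```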
Parsing fixed-width fields
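A minimal sketch that cuts a packed date into year, month, and day with field widths, assuming behavior matching standard Tcl `scan`; the date string and variable names are illustrative.

```tcl
# %4d reads four digits, each %2d reads two; %d is always decimal,
# so the leading zero in "01" does not switch to octal.
scan "20240115" "%4d%2d%2d" year month day
puts "$year $month $day"    ;# 2024 1 15
```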
Count successful conversions
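A minimal sketch of checking the return value to detect a partial match, assuming behavior matching standard Tcl `scan`, where variables for failed conversions are left untouched; the names `first` and `second` are illustrative.

```tcl
# Start clean so the "info exists" check below is meaningful.
unset -nocomplain first second

# The second %d fails on "abc", so only one conversion succeeds.
set n [scan "12 abc" "%d %d" first second]
puts $n                      ;# 1
puts $first                  ;# 12
puts [info exists second]    ;# 0
```

Comparing the count against the number of expected fields is a simple way to validate input before using the variables.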
Binary parsing
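A minimal sketch of `%b`, assuming behavior matching standard Tcl 8.6 `scan`; `value` is an illustrative variable name.

```tcl
# "1011" in base 2 is 8 + 2 + 1.
scan "1011" "%b" value
puts $value    ;# 11
```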
Auto-detect base with %i
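A minimal sketch of `%i` in inline mode, assuming the prefix rules listed in the specifier table (`0x` for hex, leading `0` for octal); the variable names are illustrative.

```tcl
set hex [scan "0x1F" "%i"]
set oct [scan "017" "%i"]
set dec [scan "42" "%i"]

puts $hex    ;# 31
puts $oct    ;# 15 when a leading 0 selects octal, as the table states
puts $dec    ;# 42
```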
Unicode character codepoint
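A minimal sketch of `%c` on an ASCII character and on a character that takes three bytes in UTF-8, assuming behavior matching standard Tcl `scan`; the `\u20ac` escape is used so the script does not depend on the source file encoding.

```tcl
scan "A" "%c" ascii
puts $ascii    ;# 65

# U+20AC (euro sign) is a 3-byte UTF-8 sequence.
scan "\u20ac" "%c" euro
puts $euro     ;# 8364
```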
Suppression with %*
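A minimal sketch of skipping a field with `%*d`, assuming behavior matching standard Tcl `scan`; the suppressed field consumes input but is neither stored nor counted. The names `first` and `third` are illustrative.

```tcl
# The middle number is parsed and discarded.
set n [scan "10 20 30" "%d %*d %d" first third]
puts $n                 ;# 2
puts "$first $third"    ;# 10 30
```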
