The compiler treats every character as an integer value, so any character in the source code can be represented by its numeric equivalent. This is called a numeric escape sequence. The character is represented by typing a backslash (\), followed by the character's octal or hexadecimal integer equivalent from the current character set (see Appendix C for the ASCII equivalence tables). For example, using the ASCII character set, the character A can be represented as \101 (the octal equivalent) or \x41 (the hexadecimal equivalent). A leading 0 in the octal example is not necessary because octal is the default in numeric escape sequences. A lowercase x following the backslash indicates a hexadecimal representation. For example, \x5A is equivalent to the character Z.
An example of numeric escape sequences follows:
    #define NUL '\0'     /* Defines logical null character */

    char x[] = {'\110','\145','\154','\154','\157','\41','\0'};
                         /* Initializes x with "Hello!"    */
The escape sequence extends to three octal digits, or the first character that is not an octal digit, whichever is first. Therefore, the string "\089" is interpreted as four characters: \0 , 8 , 9 , and \0 .
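The rules above can be checked with a short sketch. It assumes an ASCII execution character set, and the function names are hypothetical, chosen only for illustration:

```c
#include <stddef.h>

/* Returns 1 if the octal and hexadecimal escapes both denote 'A'.
   This holds only when the execution character set is ASCII. */
int escapes_match_ascii_A(void)
{
    return '\101' == 'A' && '\x41' == 'A';
}

/* Size in bytes of the string "\089". The octal escape stops at the
   first character that is not an octal digit, so the array holds
   '\0', '8', '9', and the terminating '\0'. */
size_t size_of_octal_example(void)
{
    return sizeof "\089";
}
```

On an ASCII system, the first function should return 1 and the second should return 4.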
With hexadecimal escape sequences, there is no limit to the number of characters in the escape sequence, but the result is not defined if the hexadecimal value exceeds the largest value representable by the unsigned char type for a normal character constant, or the largest value representable by the wchar_t type for a wide-character constant. For example, '\x777' is illegal. In addition, hexadecimal escape sequences with more than three characters provoke a warning if the error-checking compiler option is used.
String concatenation can be used to specify a hexadecimal digit following a hexadecimal escape sequence. In the following example, a is initialized to the same value in both cases:
    char a[] = "\xff" "f";
    char a[] = {'\xff', 'f', '\0'};
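As a minimal sketch, the equivalence of the two spellings can be verified by comparing the arrays byte for byte. The array and function names here are hypothetical:

```c
#include <string.h>

/* Two spellings of the same three bytes: 0xff, 'f', and the terminator.
   Without the concatenation, "\xfff" would be read as one (too large)
   hexadecimal escape sequence. */
static const char by_concatenation[] = "\xff" "f";
static const char by_initializer[]   = {'\xff', 'f', '\0'};

/* Returns 1 if both arrays have the same size and the same contents. */
int hex_spellings_agree(void)
{
    return sizeof by_concatenation == sizeof by_initializer
        && memcmp(by_concatenation, by_initializer,
                  sizeof by_concatenation) == 0;
}
```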
Using numeric escape sequences can result in a nonportable program if the executing machine uses a different character set. Another threat to portability exists if arithmetic operations are performed on the integer character values, because multiple character constants (such as 'ABC') can be represented differently on different machines.
1.8.4 Enumeration Constants
An enumerated type specifies one or more enumeration constants to define allowable values for the enumerated type. Enumeration constants have the type int. See Section 3.6 for details on the declaration and use of enumerated types.
1.9 Header Files
Header files are text files included in a source file during compilation. To include a header file in a compilation, the #include preprocessor directive must be used in the source file. See Chapter 8 for more information on this directive. The entire header file, regardless of content, is substituted for the #include preprocessor directive.
A header file can contain other #include preprocessor directives to include another file. You can nest #include directives to any depth.
Header files can include any legal C source code. They are most often used to include external variable declarations, macro definitions, type definitions, and function declarations. Groups of logically related functions are commonly declared together in a header file, such as the C library input and output functions listed in the stdio.h header file. Header files traditionally have a .h suffix (stdio.h, for example).
The names of header files must not include the ', \, ", or /* characters, because the behavior when these characters appear in a header name is undefined.
When referenced in a program, header names are surrounded by angle brackets or double quotation marks, as shown in the following example:
    #include <math.h>
    /* or */
    #include "local.h"
Chapter 8 explains the difference between the two formats. The
algorithm the compiler uses for finding the named files is discussed in
Section B.37. Chapter 9 describes the library routines in each of
the ANSI standard header files.
1.10 Limits
The ANSI C standard suggests several environmental limits on the use of the C language. These limits are an effort to define minimal standards for a conforming implementation of a C compiler. For example, the number of significant characters in an identifier is implementation-defined, with a minimum set required by the ANSI C standard.
The standard also includes several numerical limits that restrict the
characteristics of integral and floating-point types. For the most
part, these limits will not affect your use of the C language or
compiler. However, for unusually large or unusually constructed
programs, certain limits can be reached. The ANSI standard contains a
list of minimum limits, and your platform-specific Compaq C
documentation contains the actual limits used in Compaq C.
1.10.1 Translation Limits
As intended by the ANSI C standard, the Compaq C implementation avoids imposing many of the translation limits, allowing applications more flexibility. The Compaq C limits are:
1.10.2 Numerical Limits
Numerical limits define the sizes and characteristics of integral and floating-point types. Numerical limits are described in the limits.h and float.h header files. The limits are:
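The numerical limits for a given implementation are available as macros in limits.h and float.h. The following sketch is not a substitute for the platform-specific list of limits. It only illustrates how a few of the standard macros might be inspected and compared against the minimums required by the ANSI C standard. The function names are hypothetical:

```c
#include <limits.h>
#include <float.h>
#include <stdio.h>

/* Prints a few of the numerical limits of the current implementation. */
void print_some_limits(void)
{
    printf("CHAR_BIT    = %d\n",  CHAR_BIT);
    printf("INT_MAX     = %d\n",  INT_MAX);
    printf("LONG_MAX    = %ld\n", LONG_MAX);
    printf("FLT_DIG     = %d\n",  FLT_DIG);
    printf("DBL_EPSILON = %g\n",  DBL_EPSILON);
}

/* Returns 1 if three of the ANSI C minimum requirements are met:
   at least 8 bits per char, INT_MAX of at least 32767, and at least
   6 decimal digits of float precision. */
int meets_ansi_minimums(void)
{
    return CHAR_BIT >= 8 && INT_MAX >= 32767 && FLT_DIG >= 6;
}
```

The printed values differ from platform to platform, so no particular output is shown here.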
1.10.3 Character Display
Characters from the executable character set are output to the active position on the screen or in a file. The active position is defined by the ANSI C standard as the spot where the next output character will appear. After a character is output, the active position advances to the next position on the current line (to the left or right).
The Compaq C compiler moves the active position from left to right across an output line.
The C language was initially designed as a small, portable programming language used to implement an operating system. In its history, C has evolved into a powerful tool for writing all types of programs, and includes mechanisms to achieve most programming goals. C offers:
To help you take full advantage of C's features, the following sections provide a guide to the basic concepts of the language:
These sections represent an expanded glossary of selected C terms and
basic concepts. Understanding these concepts will provide a good
foundation for a working knowledge of C, and will help show the
relationship of these concepts to more complex ones in the language.
2.1 Blocks
A block in C is a section of code surrounded by braces { }. Understanding the definition of a block is very important to understanding many other C concepts, such as scope, visibility, and external or internal declarations.
The following example shows two blocks, one defined inside the other:
    main ()
    {                 /* This brace marks the beginning of the outer block */
       int x = 1;
       if (x != 0)
       {              /* This brace marks the beginning of the inner block */
          x++;
          return x;
       }              /* This brace marks the end of the inner block */
       return 0;
    }                 /* This brace marks the end of the outer block */
A block is also a form of a compound statement: a set of related C statements enclosed in braces. Declarations of objects used in the program can appear anywhere within a block and affect the object's scope and visibility. Section 2.3 discusses scope; Section 2.4 discusses visibility.
2.2 Compilation Units
A compilation unit is C source code that is compiled and treated as one logical unit. The compilation unit is usually one or more entire files, but can also be a selected portion of a file if, for example, the #ifdef preprocessor directive is used to select specific code sections. Declarations and definitions within a compilation unit determine the scope of functions and data objects.
Files included by using the #include preprocessor directive become part of the compilation unit. Source lines skipped because of the conditional inclusion preprocessor directives are not included in the compilation unit.
Compilation units are important in determining the scope of identifiers, and in determining the linkage of identifiers to other internal and external identifiers. Section 2.3 discusses scope. Section 2.8 discusses linkage.
A compilation unit can refer to data or functions in other compilation units in the following ways:
Programs composed of more than one compilation unit can be separately
compiled, and later linked to produce the executable program. A legal C
compilation unit consists of at least one external declaration, as
defined in Section 4.3.
A translation unit with no declarations is accepted with a compiler
warning in all modes except for the strict ANSI standard mode.
2.3 Scope
The scope of an identifier is the range of the program in which the declared identifier has meaning. An identifier has meaning if it is recognized by the compiler. Scope is determined by the location of the identifier's declaration. Trying to access an identifier outside of its scope results in an error. Every declaration has one of four kinds of scope: file scope, block scope, function scope, or function prototype scope.
An enumeration constant's scope begins at the defining enumerator in an
enumerator list. The scope of a statement label includes the entire
function body. The scope of any other type of identifier begins at the
identifier itself in the identifier's declaration. See the following
sections for information on when an identifier's scope ends.
2.3.1 File Scope
An identifier whose declaration is located outside any block or function parameter list has file scope. An identifier with file scope is visible from the declaration of the identifier to the end of the compilation unit, unless hidden by an inner block declaration. In the following example, the identifier off has file scope:
    int off = 5;        /* Declares (and defines) the integer identifier off. */

    main ()
    {
       int on;          /* Declares the integer identifier on. */

       on = off + 1;    /* Uses off, declared outside the function block of
                           main. This point of the program is still within
                           the active scope of off. */

       if (on <= 100)
       {
          int off = 0;  /* This declaration of off creates a new object that
                           hides the former object of the same name. The
                           scope of the new off lasts through the end of
                           the if block. */
          off = off + on;
          return off;
       }
    }
2.3.2 Block Scope
An identifier appearing within a block or in a parameter list of a function definition has block scope and is visible within the block, unless hidden by an inner block declaration.
Block scope begins at the identifier declaration and ends at the closing brace (}) completing the block. In the following example, the identifier red has block scope and blue has file scope:
    int blue = 5;               /* blue: file scope       */

    main ()
    {
       int x = 0, y = 0;        /* x and y: block scope   */
       int red = 10;            /* red: block scope       */
       x = red + blue;
    }
2.3.3 Function Scope
Only statement labels have function scope (see Chapter 7). An identifier with function scope is unique throughout the function in which it is declared. Labeled statements are used as targets for goto statements and are implicitly declared by their syntax, which is the label followed by a colon (:) and a statement. For example:
    int func1(int x, int y, int z)
    {
       label:  x += (y + z);    /* label has function scope */
       if (x > 1) goto label;
       return x;
    }

    int func2(int a, int b, int c)
    {
       if (a > 1) goto label;   /* illegal jump to undefined label */
       return a;
    }
See Section 7.1 for more information on statement labels.
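Because a label is unique only within its own function, two functions can use the same label name without conflict. The following sketch illustrates this. The functions sum_to and count_down and the label again are hypothetical names:

```c
/* Sums the integers 1 through n using a label and goto.
   The label 'again' has function scope within sum_to. */
int sum_to(int n)
{
    int total = 0;
    int i = 1;
again:
    if (i <= n) {
        total += i;
        i++;
        goto again;
    }
    return total;
}

/* A second function can reuse the label name 'again', because each
   label is known only within the function that contains it. */
int count_down(int n)
{
    int steps = 0;
again:
    if (n > 0) {
        n--;
        steps++;
        goto again;
    }
    return steps;
}
```

For example, sum_to(4) returns 10 and count_down(3) returns 3.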
2.3.4 Function Prototype Scope
An identifier that appears within a function prototype's list of parameter declarations has function prototype scope. The scope of such an identifier begins at the identifier's declaration and terminates at the end of the function prototype declaration list. For example:
    int students (int david, int susan, int mary, int john);
In this example, the identifiers (david, susan, mary, and john) have scope beginning at their declarations and ending at the closing parenthesis. The type of the function students is "function returning int with four int parameters." In effect, these identifiers are merely placeholders for the actual parameter names to be used after the function is defined.
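Because the prototype's parameter names go out of scope at the closing parenthesis, the function definition is free to use different names. The following sketch illustrates this. The function body (a simple sum) is an assumption added only so the example can be compiled and called:

```c
/* Prototype: the parameter names are placeholders with function
   prototype scope. */
int students(int david, int susan, int mary, int john);

/* Definition: the parameter names differ from those in the prototype.
   Only the types must agree. The body is hypothetical. */
int students(int a, int b, int c, int d)
{
    return a + b + c + d;
}
```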
2.4 Visibility
An identifier is visible only within a certain region of the program. An identifier has visibility over its entire scope, unless a subsequent declaration of the same identifier in an enclosed block overrides, or hides, the previous declaration. Visibility affects the ability to access a data object or other identifier, because an identifier can be used only where it is visible.
Once an identifier is used for a specific purpose, it cannot be used for another purpose within the same scope, unless the second use of the identifier is in a different name space. Section 2.15 describes the name space restrictions. For example, declaring two different data objects with the same identifier is illegal within the same scope.
When the scope of one of two identical identifiers is contained within the other (nested), the identifier with inner scope remains visible, while the identifier with wider scope becomes hidden for the duration of the inner identifier's scope.
In the following example, the identifier number is used twice: once as an integer variable and once as a floating-point variable. For the duration of the function main, the integer number is hidden by the floating-point number.
    #include <math.h>

    int number;               /* number is declared as an integer variable */

    main ()
    {
       float x;
       float number = 2.0f;   /* This declaration of number occurs in an
                                 inner block, and "hides" the outer
                                 declaration. The inner declaration
                                 creates a new object */
       x = sqrt (number);     /* x receives a floating-point value */
    }
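The hiding rule can also be observed directly by returning the value that each function sees. In this minimal sketch, the function names and the values 7 and 42 are hypothetical:

```c
int number = 7;               /* file scope */

/* The inner declaration hides the file-scope number for the
   duration of this block. */
int inner_value(void)
{
    int number = 42;          /* block scope: a new, separate object */
    return number;
}

/* No inner declaration here, so the file-scope number is visible. */
int outer_value(void)
{
    return number;
}
```

Here inner_value returns 42 and outer_value returns 7. The file-scope object is unchanged by the inner declaration.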
2.5 Side Effects and Sequence Points
The actual order in which expressions are evaluated is not specified for most of the operators in C. Because this sequence of evaluation is determined within the compiler depending on context, some unexpected results may occur when using certain operators. These unexpected results are caused by side effects.
Any operation that affects an operand's storage has a side effect. Side effects can be deliberately induced by the programmer to produce a desired result; in fact, the assignment operator depends on the side effect of altered storage to do its job. C guarantees that all side effects of a given expression will be completed by the next sequence point in the program. Sequence points are checkpoints in the program at which the compiler ensures that operations in an expression are concluded.
The most important sequence point is the semicolon marking the end of a statement. All expressions and their side effects are completely evaluated when the semicolon is reached. Other sequence points are as follows:
These operations do guarantee the order, or sequence, of evaluation (expr1, expr2, and expr3 are expressions). For each of these operators, the evaluation of expression expr1 is guaranteed to occur before the evaluation of expression expr2 (or expr3, in the case of the conditional expression).
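As a sketch of this guarantee, the following functions rely on the left-to-right evaluation that the C standard defines for the logical AND, comma, and conditional operators. It is an assumption that these are among the operators meant here. The function names are hypothetical:

```c
#include <stddef.h>

/* && evaluates its left operand first and skips the right operand
   when the left is false, so p is tested before *p is read. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* The comma operator completes i++ before the right operand runs. */
int comma_demo(void)
{
    int i = 1;
    int r = (i++, i * 10);    /* i is 2 when i * 10 is evaluated */
    return r;
}

/* ?: evaluates the condition, including its side effects, before
   evaluating the selected branch. */
int conditional_demo(void)
{
    int i = 0;
    return i++ ? -1 : i;      /* condition sees 0; the branch sees 1 */
}
```

With these definitions, comma_demo returns 20 and conditional_demo returns 1.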
Relying on the execution order of side effects, when none is guaranteed, is a risky practice because results are inconsistent and not portable. Undesirable side effects usually occur when the same data object is used in two or more places in the same expression, where at least one use produces a side effect. For example, the following code fragment produces inconsistent results because the order of evaluation of operands to the assignment operator is undefined.
    int x[4] = { 0, 0, 0, 0 };
    int i = 1;
    x[i] = i++;
If the increment of i occurs before the subscript is evaluated, the value of x[2] is 1. If the subscript is evaluated first, the value of x[1] is 1.
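One way to avoid the ambiguity is to split the expression into separate statements, so that a semicolon (a sequence point) separates the store from the increment. The two functions below are hypothetical and each spells out one of the two possible intentions:

```c
/* Stores the old value of i in x[i], then increments i.
   Starting from i == 1, x[1] becomes 1. Returns the new i. */
int store_then_increment(int x[], int i)
{
    x[i] = i;
    i++;
    return i;
}

/* Increments i first, then stores the old value in x[i].
   Starting from i == 1, x[2] becomes 1. Returns the new i. */
int increment_then_store(int x[], int i)
{
    int old = i;
    i++;
    x[i] = old;
    return i;
}
```

Either form produces the same result on every conforming compiler, because each statement's side effects are complete before the next statement begins.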
A function call also has side effects. In the following example, the order in which f1(y) and f2(z) are called is undefined:
    #include <stdio.h>

    int y = 0;
    int z = 0;
    int x = 0;

    int f1(int s)
    {
       printf ("Now in f1\n");
       y += 7;                  /* Storage of y affected */
       return y;
    }

    int f2(int t)
    {
       printf ("Now in f2\n");
       z += 3;                  /* Storage of z affected */
       return z;
    }

    main ()
    {
       x = f1(y) + f2(z);       /* Undefined calling order */
    }
The printf functions can be executed in any order even though the value of x will always be 10.
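If the calling order matters, it can be fixed by placing each call in its own statement. The following sketch is a simplified variant of the example: it drops the unused parameters and records the call order in a hypothetical call_log array instead of printing it.

```c
static int y = 0;
static int z = 0;
static int call_log[2];       /* records which function ran first, second */
static int call_count = 0;

static int f1(void) { call_log[call_count++] = 1; y += 7; return y; }
static int f2(void) { call_log[call_count++] = 2; z += 3; return z; }

/* Each call is a full statement, so f1 always completes before f2. */
int ordered_sum(void)
{
    int a, b;
    call_count = 0;           /* restart the log on every call */
    a = f1();                 /* sequence point at the semicolon */
    b = f2();
    return a + b;
}

/* Returns 1 if f1 ran first in the most recent call to ordered_sum. */
int f1_ran_first(void)
{
    return call_log[0] == 1;
}
```

On the first call, ordered_sum returns 10, as in the original example, and f1 is guaranteed to run before f2.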