C and C++ Programmer's Reference

Data Types, Variables, and Constants

C and C++ offer the programmer a rich assortment of built-in data types. Programmer-defined data types can be created to fit virtually any need. Variables can be created for any valid data type. Also, it is possible to specify constants of C/C++’s built-in types. In this section, various features relating to data types, variables, and constants are discussed.

The Basic Types

C89 defines the following elemental data types:

Type	Keyword
Character	char
Integer	int
Floating point	float
Double floating point	double
Valueless	void

To these, C99 adds the following:

Type	Keyword
Boolean (true/false)	_Bool
Complex	_Complex
Imaginary	_Imaginary

C++ defines the following basic types:

Type	Keyword
Boolean (true/false)	bool
Character	char
Integer	int
Floating point	float
Double floating point	double
Valueless	void
Wide character	wchar_t

As you can see, all versions of C and C++ provide the following five basic types: char, int, float, double, and void. Also notice that the keyword for the Boolean type is bool in C++ and _Bool in C99. No Boolean type is included in C89.

Several of the basic types can be modified using one or more of these type modifiers:

signed
unsigned
short
long

The type modifiers precede the type name that they modify. The basic arithmetic types, including modifiers, allowed by C and C++ are shown in the following table along with their guaranteed minimum ranges. Most compilers will exceed the minimums for one or more types. Also, if your computer uses two’s complement arithmetic (as most do), then the smallest negative value that a signed integer can store will be one lower than the minimum shown. For example, a 16-bit int on such a machine has the range –32,768 to 32,767. Whether type char is signed or unsigned is implementation dependent.

Type	Minimum Range
char	–127 to 127 or 0 to 255
unsigned char	0 to 255
signed char	–127 to 127
int	–32,767 to 32,767
unsigned int	0 to 65,535
signed int	same as int
short int	same as int
unsigned short int	0 to 65,535
signed short int	same as short int
long int	–2,147,483,647 to 2,147,483,647
signed long int	same as long int
unsigned long int	0 to 4,294,967,295
long long int	–(2⁶³–1) to 2⁶³–1 (C99 only)
signed long long int	same as long long int (C99 only)
unsigned long long int	0 to 2⁶⁴–1 (C99 only)
float	6 digits of precision
double	10 digits of precision
long double	10 digits of precision
wchar_t	same as unsigned int

When a type modifier is used by itself, int is assumed. For example, you can specify an unsigned integer by simply using the keyword unsigned. Thus, these declarations are equivalent.

unsigned int i; // here, int is specified
unsigned i; // here, int is implied

Declaring Variables

All variables must be declared prior to use. Here is the general form of a declaration:

type variable_name;

For example, to declare x to be a float, y to be an integer, and ch to be a character, you would write

float x;
int y;
char ch;

You can declare more than one variable of a type by using a comma-separated list. For example, the following statement declares three integers:

int a, b, c;

Initializing Variables

A variable can be initialized by following its name with an equal sign and an initial value. For example, this declaration assigns count an initial value of 100:

int count = 100;

An initializer can be any expression that is valid when the variable is declared. This includes other variables and function calls. However, in C, global variables and static local variables must be initialized using only constant expressions.

Identifiers

Variable, function, and user-defined type names are all examples of identifiers. In C/C++, identifiers are sequences of letters, digits, and underscores from one to several characters in length. (A digit cannot begin a name, however.)

Identifiers may be of any length. However, not all characters will necessarily be significant. There are two types of identifiers: external and internal. An external identifier will be involved in an external link process. These identifiers, called external names, include function names and global variable names that are shared between files. If the identifier is not used in an external link process, it is internal. This type of identifier is called an internal name and includes the names of local variables, for example. In C89, at least the first 6 characters of an external identifier and at least the first 31 characters of an internal identifier will be significant. C99 has increased these values. In C99, an external identifier has at least 31 significant characters and an internal identifier has at least 63 significant characters. In C++, at least the first 1,024 characters of an identifier are significant.

The underscore is often used for clarity, such as first_time, or to begin a name, such as _count. Uppercase and lowercase are different. For example, test and TEST are two different variables. C/C++ reserves all identifiers that begin with two underscores, or an underscore followed by an uppercase letter.

Classes

The class is C++’s basic unit of encapsulation. A class is defined using the class keyword. Classes are not part of the C language. A class is essentially a collection of variables and functions that manipulate those variables. The variables and functions that form a class are called members. The general form of a class is shown here:

class class-name : inheritance-list {
// private members by default
protected:
// private members that can be inherited
public:
// public members
} object-list;

Here, class-name is the name of the class type. Once the class declaration has been compiled, the class-name becomes a new type name that can be used to declare objects of the class. The object-list is a comma-separated list of objects of type class-name. This list is optional. Class objects can be declared later in your program by simply using the class name. The inheritance-list is also optional. When present, it specifies the base class or classes that the new class inherits. (See the following section entitled “Inheritance.”)

A class can include a constructor function and a destructor function. (Either or both are optional.) A constructor is called when an object of the class is first created. The destructor is called when an object is destroyed. A constructor has the same name as the class. A destructor has the same name as the class, but is preceded by a ~ (tilde). Neither constructors nor destructors have return types. In a class hierarchy, constructors are executed in order of derivation and destructors are executed in reverse order.

By default, all elements of a class are private to that class and can be accessed only by other members of that class. To allow an element of the class to be accessed by functions that are not members of the class, you must declare it after the keyword public. For example:

class myclass {
  int a, b; // private to myclass
public:
  // class members accessible by nonmembers
  void setab(int i, int j) { a = i; b = j; }
  void showab() { cout << a << ' ' << b << endl; }
} ;

myclass ob1, ob2;

This declaration creates a class type, called myclass, that contains two private variables, a and b. It also contains two public functions called setab( ) and showab( ). The fragment also declares two objects of type myclass called ob1 and ob2.

To allow a member of a class to be inherited, but to otherwise be private, specify it as protected. A protected member is available to derived classes, but is unavailable outside its class hierarchy.

When operating on an object of a class, use the dot (.) operator to reference individual members. The arrow operator (->) is used when accessing an object through a pointer. For example, the following accesses the putinfo( ) function of ob using the dot operator and the show( ) function using the arrow operator:

struct cl_type {
  int x;
  float f;
public:
  void putinfo(int a, float t) { x = a; f = t; }
  void show() { cout << x << ' ' << f << endl; }
} ;

cl_type ob, *p;

// ...

ob.putinfo(10, 0.23);

p = &ob; // put ob's address in p

p->show(); // displays ob's data

Inheritance

In C++, one class can inherit the characteristics of another. The inherited class is usually called the base class. The inheriting class is referred to as a derived class. When one class inherits another, a class hierarchy is formed. The general form for inheriting a class is

class class-name : access base-class-name {
// . . .
} ;

Here, access determines how the base class is inherited, and it must be either private, public, or protected. (It can also be omitted, in which case public is assumed if the derived class is a struct, or private if the derived class is a class.) To inherit more than one class, use a comma-separated list.

If access is public, all public and protected members of the base class become public and protected members of the derived class, respectively. If access is private, all public and protected members of the base class become private members of the derived class. If access is protected, all public and protected members of the base class become protected members of the derived class.

In the following class hierarchy, derived inherits base as private. This means that i becomes a private member of derived.

class base {
public:
  int i;
};

class derived : private base {
  int j;
public:
  derived(int a) { j = i = a; }
  int getj() { return j; }
  int geti() { return i; } // OK, derived has access to i
};

derived ob(9); // create a derived object

cout << ob.geti() << " " << ob.getj(); // OK

// ob.i = 10; // ERROR, i is private to derived!

Structures

A structure is created using the keyword struct. In C++, a structure also defines a class. The only difference between class and struct is that, by default, all members of a structure are public. To make a member private, you must use the private keyword. The general form of a structure declaration is like this:

struct struct-name : inheritance-list {
// public members by default
protected:
// private members that can be inherited
private:
// private members
} object-list;

In C, several restrictions apply to structures. First, they may contain only data members; member functions are not allowed. C structures do not support inheritance. Also, all members are public and the keywords public, protected, and private are not allowed.

Unions

A union is a class type in which all data members share the same memory location. In C++, a union may include both member functions and data. In a union, all of its members are public by default. To create private elements, you must use the private keyword. The general form for declaration of a union is

union class-name {
// public members by default
private:
// private members
} object-list;

In C, unions may contain only data members and the private keyword is not supported.

The elements of a union overlay each other. For example,

union tom {
  char ch;
  int x;
} t;

declares union tom, in which ch and x begin at the same address. Assuming 2-byte integers, x occupies two bytes and ch shares one of those bytes with it.

Like a class, the individual variables that comprise the union are referenced using the dot operator. The arrow operator is used with a pointer to a union.

There are several restrictions that apply to unions. First, a union cannot inherit any other class of any type. A union cannot be a base class. A union cannot have virtual member functions. No members may be declared as static. A reference member cannot be used. A union cannot have as a member any object that overloads the = operator. Finally, no object can be a member of a union if the object’s class explicitly defines a constructor or destructor function. (Objects that have only the default constructors and destructors are acceptable.)

Programming Tip

In C++, it is common practice to use struct when creating C-style structures that include only data members. A class is usually reserved for creating classes that contain function members. Sometimes the acronym POD is used to describe a C-style structure. POD stands for Plain Old Data.

There is a special type of union in C++ called an anonymous union. An anonymous union declaration does not contain a class name and no objects of that union are declared. Instead, an anonymous union simply tells the compiler that its member variables are to share the same memory location. However, the variables themselves are referred to directly, without using the normal dot or arrow operator syntax. The variables that make up an anonymous union are at the same scope level as any other variable declared within the same block. This implies that the union variable names must not conflict with any other names valid within their scope. For example, here is an anonymous union:

union { // anonymous union
  int a;   // a and f share
  float f; // the same memory location
};

// ...

a = 10; // access a
cout << f; // access f

Here, a and f both share the same memory location. As you can see, the names of the union variables are referred to directly without the use of the dot or arrow operator.

All restrictions that apply to unions in general apply to anonymous unions. In addition, anonymous unions must contain only data—no member functions are allowed. Anonymous unions may not contain the private or protected keywords. Finally, an anonymous union with namespace scope must be declared as static.

Enumerations

Another type of variable that can be created is called an enumeration. An enumeration is a list of named integer constants. Thus, an enumeration type is simply a specification of the list of names that belong to the enumeration.

To create an enumeration requires the use of the keyword enum. The general form of an enumeration type is

enum enum-name { list of names } var-list;

The enum-name is the enumeration’s type name. The list of names is comma separated.

For example, the following fragment defines an enumeration of cities called cities and the variable c of type cities. Finally, c is assigned the value “Houston”.

enum cities { Houston, Austin, Amarillo} c;
c = Houston;

In an enumeration, the value of the first (leftmost) name is, by default, 0; the second name has the value 1; the third has the value 2; and so on. In general, each name is given a value one greater than the name that precedes it. You can give a name a specific value by adding an initializer. For example, in the following enumeration, Austin will have the value 10:

enum cities { Houston, Austin=10, Amarillo };

In this example, Amarillo will have the value 11 because each name will be one greater than the one that precedes it.

C Tags

In C, the name of a structure, union, or enumeration does not define a complete type name. In C++, it does. For example, the following fragment is valid for C++, but not for C:

struct s_type {
  int i;
  double d;
};
// ...
s_type x; // OK for C++, but not for C

In C++, s_type defines a complete type name and can be used, by itself, to declare objects. In C, s_type defines a tag, which is not a complete type specifier. In C, you need to precede a tag name with either struct, union, or enum when declaring objects. For example,

struct s_type x; // now OK for C

The preceding syntax is also permissible in C++, but seldom used.

The Storage Class Specifiers

The type modifiers extern, auto, register, static, and mutable are used to alter the way C/C++ creates storage for variables. These specifiers precede the type that they modify.

extern

If the extern modifier is placed before a variable name, the compiler will know that the variable has external linkage. External linkage means that an object is visible outside its own file. In essence, extern tells the compiler the type of a variable without actually allocating storage for it. The extern modifier is most commonly used when there are two or more files sharing the same global variables.

auto

auto tells the compiler that the local variable it precedes is created upon entry into a block and destroyed upon exit from a block. Since all variables defined inside a function are auto by default, the auto keyword is seldom (if ever) used.

register

When C was first invented, the register modifier could be used only on local integer, character, or pointer variables because it caused the compiler to attempt to keep that variable in a register of the CPU instead of placing it in memory. This made all references to that variable extremely fast. The definition of register has since been expanded. Now, any variable may be specified as register and it is the compiler’s job to optimize accesses to it. For characters, integers, and pointers, this still means putting them into a register in the CPU, but for other types of data, it may mean using cache memory, for example. Keep in mind that register is only a request. The compiler is free to ignore it. The reason for this is that only so many variables can be optimized for speed. When this limit is exceeded, the compiler will simply ignore further register requests.

static

The static modifier instructs the compiler to keep a local variable in existence during the lifetime of the program instead of creating and destroying it each time it comes into and goes out of scope. Therefore, making local variables static allows them to maintain their values between function calls.

The static modifier may also be applied to global variables. When this is done, it causes that variable’s scope to be restricted to the file in which it is declared. This means that it will have internal linkage. Internal linkage means that an identifier is known only within its own file.

In C++, when static is used on a class data member, it causes only one copy of that member to be shared by all objects of its class.

mutable

The mutable specifier applies to C++ only. It allows a member of an object to override constness. That is, a mutable member can be modified by a const member function.

Type Qualifiers

The type qualifiers provide additional information about the variables they precede.

const

Objects of type const cannot be changed by your program during execution. Also, an object accessed through a pointer to const cannot be modified through that pointer. The compiler is free to place variables of this type into read-only memory (ROM). A const variable will receive its value either from an explicit initialization or by some hardware-dependent means. For example,

const int a = 10;

will create an integer called a with a value of 10 that may not be modified by your program. It can, however, be used in other types of expressions.

volatile

The modifier volatile tells the compiler that a variable’s value may be changed in ways not explicitly specified by the program. For example, a global variable’s address may be passed to the clock routine of the operating system and updated with each clock tick. In this situation, the contents of the variable are altered without any explicit assignment statements in the program. This is important because compilers will sometimes automatically optimize certain expressions by making the assumption that the contents of a variable are unchanging inside an expression. This is done to achieve higher performance. The volatile modifier will prevent this optimization in those rare situations where this assumption is not the case.

Programming Tip

If a class member function is modified by const, it cannot alter the object that invokes the function. To declare a const member function, put const after its parameter list. For example,

class MyClass {
  int i;
public:
  // a const function
  void f1(int a) const {
    i = a; // Error! can't modify invoking object
  }
  void f2(int a) {
   i = a; // OK, not const function
  }
};

As the comments suggest, f1( ) is a const function and it cannot modify the object that invokes it.

restrict

C99 adds a new type qualifier called restrict. This qualifier applies only to pointers. A pointer qualified by restrict is initially the only means by which the object it points to can be accessed. Access to the object by another pointer can occur only if the second pointer is based on the first. Thus, access to the object is restricted to expressions based on the restrict-qualified pointer. Pointers qualified by restrict are primarily used as function parameters, or to point to memory allocated via malloc( ). The restrict qualifier does not change the semantics of a program. restrict is not supported by C++.

Arrays

You may declare arrays of any data type, including classes. The general form of a singly dimensioned array is

type var-name[size];

where type specifies the data type of each element in the array and size specifies the number of elements in the array. For example, to declare an integer array x of 100 elements, you would write

int x[100];

This will create an array that is 100 elements long, with the first element at index 0 and the last at index 99. For example, the following loop will load the numbers 0 through 99 into array x:

for(t=0; t<100; t++) x[t] = t;

Multidimensional arrays are declared by placing the additional dimensions inside additional brackets. For example, to declare a 10 × 20 integer array, you would write

int x[10][20];

Arrays can be initialized by using a bracketed list of initializers. For example,

int count[5] = { 1, 2, 3, 4, 5 };

In C89 and C++, array dimensions must be specified by constant values. Thus, in C89 and C++, all array dimensions are fixed at compile time and cannot change over the lifetime of a program. However, in C99, the dimensions of a local array can be specified by any valid integer expression, including those whose values are known only at run time. This is called a variable-length array. Thus, the dimensions of a variable-length array can differ each time its declaration statement is encountered.

Defining New Type Names Using typedef

You can create a new name for an existing type using typedef. Its general form is

typedef type newname;

For example, the following tells the compiler that feet is another name for int:

typedef int feet;

Now, the following declaration is perfectly legal and creates an integer variable called distance:

feet distance;

Constants

Constants, also called literals, refer to fixed values that cannot be altered by the program. Constants can be of any of the basic data types. The way each constant is represented depends upon its type. Character constants are enclosed between single quotes. For example 'a' and '+' are both character constants. Integer constants are specified as numbers without fractional components. For example, 10 and –100 are integer constants. Floating-point constants require the use of the decimal point followed by the number’s fractional component. For example, 11.123 is a floating-point constant. You may also use scientific notation for floating-point numbers.

There are two floating-point types: float and double. Also, there are several flavors of the basic types that are generated using the type modifiers. By default, the compiler fits a numeric constant into the smallest compatible data type that will hold it. The only exceptions to the smallest-type rule are floating-point constants, which are assumed to be of type double. For many programs, the compiler defaults are perfectly adequate. However, it is possible to specify precisely the type of constant you want.

To specify the exact type of numeric constant, use a suffix. For floating-point types, if you follow the number with an F, the number is treated as a float. If you follow it with an L, the number becomes a long double. For integer types, the U suffix stands for unsigned and the L for long. Some examples are shown next:

Data Type	Constant Examples
int	1 123 21000 -234
long int	35000L -34L
unsigned int	10000U 987U
float	123.23F 4.34e-3F
double	123.23 12312333.0 -0.9876324
long double	1001.2L

C99 also allows you to specify a long long integer constant by specifying the suffix LL (or ll).

Hexadecimal and Octal Constants

It is sometimes easier to use a number system based on 8 or 16 instead of 10. The number system based on 8 is called octal and uses the digits 0 through 7. In octal, the number 10 is the same as 8 in decimal. The base 16 number system is called hexadecimal and uses the digits 0 through 9 plus the letters A through F, which stand for 10, 11, 12, 13, 14, and 15. For example, the hexadecimal number 10 is 16 in decimal. Because of the frequency with which these two number systems are used, C/C++ allows you to specify integer constants in hexadecimal or octal instead of decimal if you prefer. A hexadecimal constant must begin with a 0x (a zero followed by an x) or 0X, followed by the constant in hexadecimal form. An octal constant begins with a zero. Here are two examples:

int hex = 0x80; // 128 in decimal 
int oct = 012;  // 10 in decimal

String Constants

C/C++ supports one other type of constant in addition to those of the predefined data types: a string. A string is a set of characters enclosed by double quotes. For example, "this is a test" is a string. You must not confuse strings with characters. A single-character constant is enclosed by single quotes, such as 'a'. However, "a" is a string containing only one letter. String constants are automatically null terminated by the compiler. C++ also supports a string class, which is described later in this book.

Boolean Constants

C++ specifies two Boolean constants: true and false.

C99, which adds the _Bool type to C, does not specify any built-in Boolean constants. However, if your program includes the header <stdbool.h>, then the macros true and false are defined. Also, including <stdbool.h> causes the macro bool to be defined as another name for _Bool. Thus, it is possible to create code that is compatible with both C99 and C++. Remember, however, that C89 does not define a Boolean type.

Complex Constants

In C99, if you include the header <complex.h>, then the following complex and imaginary constants are defined.

_Complex_I	(const float _Complex) i
_Imaginary_I	(const float _Imaginary) i
I	_Imaginary_I (or _Complex_I if imaginary types are not supported)

Here, i represents the imaginary value, which is the square root of –1.

Backslash Character Constants

Enclosing character constants in single quotes works for most printing characters, but a few, such as the carriage return, are impossible to enter into your program’s source code from the keyboard. For this reason, C/C++ recognizes several backslash character constants, also called escape sequences. These constants are listed here:

Code	Meaning
\b	Backspace
\f	Form feed
\n	Newline
\r	Carriage return
\t	Horizontal tab
\"	Double quote
\'	Single quote
\\	Backslash
\v	Vertical tab
\a	Alert
\N	Octal constant (where N is an octal constant)
\xN	Hexadecimal constant (where N is a hexadecimal constant)
\?	Question mark

The backslash constants can be used anywhere a character can. For example, the following statement outputs a newline and a tab and then prints the string “This is a test”.

cout << "\n\tThis is a test";
