C17.md

Environment

  • a source file, together with all headers and source files included via #include, is called a preprocessing translation unit; after preprocessing, it is called a translation unit
  • a source file that is not empty shall end in a newline character which shall not be immediately preceded by a backslash character
  • in a freestanding environment, a C program may execute without any benefit of an operating system; library facilities beyond the minimal set required by clause 4 are implementation-defined, as are the name and type of the function called at startup
  • in a hosted environment
    • the startup function is main, with return type int and either no parameters or two parameters (conventionally named argc and argv)
    int main(void) {}
    int main(int argc, char *argv[]) {} // argc is non-negative; if argc > 0, argv[0] is the program name (an empty string if it is unavailable); argc, argv, and the strings argv points to are modifiable
    • if the return type of main is compatible with int, returning from the initial call to main is equivalent to calling exit with the returned value, and reaching the closing } is equivalent to return 0
  • integer and double-precision promotions conceptually convert each operand to int or double, apply the operator, and truncate the result afterward; the actual execution need only produce the same result, so the promotion may be omitted
    char c1, c2;
    c1 = c1 + c2;
    ----
    float f1, f2;
    f1 = f2 + 0.2;
  • rearrangement of floating-point expressions is often restricted because of limited precision and range
    double x, y, z;
    x = (x * y) * z;        // != x *= y * z
    z = (x - y) + y;        // != z = x
    z = x + x * y;          // != z = x * (1 + y)
    y = x / 5.0;            // != y = x * 0.2
  • on a machine where overflow produces an explicit trap, an expression cannot be regrouped even where the operators are mathematically associative
    int a, b;
    a = a + 32760 + b + 5;  // != a = a + b + 32760 + 5
  • the order in which operands are evaluated is unspecified
    int sum;
    int *p;
    sum = sum + 10 + (*p++ = getchar()); // the increment of p and the call to getchar can occur at any point between the previous and the next sequence point
  • trigraph sequences are replaced before any other processing (compilers often require an option such as -trigraphs to enable them)
    ??=define arraycheck(a, b) a??(b??) ??!??! b??(a??)
    ↓
    #define arraycheck(a, b) a[b] || b[a]
  • the digraphs <: :> <% %> %: %:%: behave, respectively, the same as [ ] { } # ## except for their spelling (visible when "stringized")
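
The promotion and regrouping notes above can be checked with a small sketch. The function names (sum_left, sum_right, add_promoted, add_truncated) are hypothetical, and the values quoted afterward assume IEEE 754 double arithmetic and an 8-bit unsigned char.

```c
/* Grouping as written in the source: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Mathematically equal regrouping: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}

/* Both operands are promoted to int, so the sum is computed in int. */
int add_promoted(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting back to unsigned char keeps only the low-order bits. */
unsigned char add_truncated(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

Under those assumptions, sum_left(1e30, -1e30, 1.0) yields 1.0 while sum_right(1e30, -1e30, 1.0) yields 0.0, because the 1.0 is lost when it is added to -1e30 first. Likewise add_promoted(200, 100) yields 300 and add_truncated(200, 100) yields 44.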

Language

Concepts

  • an identifier has one of four kinds of scope (function, file, block, and function prototype)
    • a label name is the only kind of identifier that has function scope
  • the scope of an enumeration constant begins just after its defining enumerator, and the scope of a struct, union, or enum tag begins just after the tag appears; any other identifier's scope begins after the completion of its declarator
    enum state {
      PENDING = 0,
      FAILED = PENDING,     // PENDING is already in scope here
    };
  • if a declaration with extern appears where a prior declaration of the same identifier is visible, and that prior declaration has internal or external linkage, the identifier keeps the prior linkage
    static int i = 10;
    extern int i;           // i still has internal linkage
    -----
    int i = 10;
    extern int i;           // i has external linkage
  • the lifetime of an object with automatic storage duration (other than a VLA) extends from entry into its enclosing block, even by a goto into the middle of the block, until execution of that block ends in any way (including by goto). The lifetime of a VLA begins only when execution passes through its declaration
    int main()
    {
      goto test;
      {
        int x = 1;
      test:
        printf("%d", x);    // x is alive, but its value is indeterminate because the initializer was skipped
      }
    }
  • when a non-lvalue expression has struct or union type, a member of array type refers to an object with temporary lifetime, which ends when the containing full expression ends
    struct X { char a[8]; };
    struct Y { char a; };
    struct X getX(void) {
      struct X result = { "world" };
      return result;
    }
    struct Y getY() {
      struct Y result = { 1 };
      return result;
    }
    int main() {
      char *i = getX().a;   // valid, but i dangles once the full expression ends
      char *j = &getY().a;  // invalid: getY().a is not an lvalue
    }
  • scalar types = arithmetic types + pointer types
    • arithmetic types = integer types + floating types
    • integer types = char types + signed/unsigned integer types + enumerated types
    • floating types = real floating types + complex types
  • a trap representation is an object representation that does not represent a value of the object type; the value of a struct or union object is never a trap representation, even if the value of one of its members is. For example, a C implementation may treat every _Bool representation other than 0 and 1 as a trap representation
  • most recent C compilers support only two's complement
  • the alignment of a struct object is at least as strict as the strictest (largest) alignment among its members, and valid alignment values are non-negative integral powers of two
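
The temporary-lifetime rule above can be sketched as follows: the array inside the returned struct may be used freely as long as the use finishes within the same full expression. The helper name copy_from_temporary is hypothetical; struct X and getX mirror the snippet in the list.

```c
#include <string.h>

struct X { char a[8]; };

static struct X getX(void)
{
    struct X result = { "world" };
    return result;
}

/* Copies the array out of the temporary while it is still alive.
 * Returns 1 if the copy holds "world". */
int copy_from_temporary(void)
{
    char buf[8];
    memcpy(buf, getX().a, sizeof buf);  /* the temporary lives until the ';' */
    return strcmp(buf, "world") == 0;
}
```

Storing getX().a in a pointer and reading through it in a later statement would instead access an object whose lifetime has ended.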

Expressions

  • bitwise operators yield values that depend on the internal representation of integers and have implementation-defined and undefined aspects for signed types
  • the storage duration of a compound literal depends on where it occurs: static outside the body of a function, otherwise automatic and associated with the enclosing block
      "/tmp/fileXXXXXX"                   // always static storage duration, and must not be modified
      (char []){"/tmp/fileXXXXXX"}        // storage duration as described above, and modifiable
      (const char []){"/tmp/fileXXXXXX"}  // same storage duration, but it can be placed in read-only memory and even be shared; not modifiable
      (const char []){"abc"} == "abc"     // might yield 1
  • each compound literal creates only a single object in a given scope; inside the body of an iteration statement, the object's lifetime is a single iteration, because the body block is entered anew each time
    struct s { int i; };
    int main() {
      struct s q;
      int j = 0;
    again:
      q = (struct s){j++};
      if (j < 2)
        goto again;
      return q.i == 1;      // true
    }
  • the integer promotions are performed on the operand of the unary +, - and ~ operators, and the result has the promoted type (unary + simply yields the value of the promoted operand)
    char x;
    printf("%zu %zu", sizeof(x), sizeof(+x)); // 1 4 where int is 4 bytes
  • the operand of sizeof is not evaluated unless its type is a variable length array type; _Alignof takes a type name, so nothing is evaluated
    size_t n = sizeof(printf("%d", 4)); // prints nothing; n == sizeof(int), typically 4
  • in an assignment expression, the evaluations of the left and right operands are unsequenced, but the update of the stored value of the left operand is sequenced after the value computations of both
  • & and * cancel each other when applied in succession to the same operand: neither operator is evaluated, so nothing is dereferenced
    int a[10];
    int *p = &*(int *)NULL; // p == NULL, no dereference happens
    int *q = &a[2];         // q = a + 2
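
Two of the notes above can be observed at run time. The first function is adapted from the goto loop in the C standard's own compound literal example; the second uses a hypothetical counting helper, bump, to show that the operand of sizeof is not evaluated. The function names are assumptions made for illustration.

```c
#include <stddef.h>

struct s { int i; };

/* The literal is evaluated twice but designates one object,
 * so the address from the first pass equals the second. */
int same_object_each_pass(void)
{
    struct s *p = 0, *q;
    int j = 0;
again:
    q = p, p = &((struct s){ j++ });
    if (j < 2)
        goto again;
    return p == q && q->i == 1;
}

static int calls = 0;

/* Hypothetical helper that records how often it is called. */
static int bump(void)
{
    return ++calls;
}

/* sizeof inspects only the type of bump(), so bump is never called. */
int sizeof_skips_evaluation(void)
{
    size_t n = sizeof(bump());
    return calls == 0 && n == sizeof(int);
}
```

Both functions return 1. Replacing the goto loop with a for or while loop would change the picture, because the literal's lifetime would then be limited to one pass through the loop body.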

Declarations

  • each enumerated type is compatible with an implementation-defined type (char, a signed integer type, or an unsigned integer type) that is capable of representing the values of all its enumeration constants; an implementation may delay the choice of integer type until the enumeration is complete
  • qualifiers between * and the identifier qualify the pointer itself, not the pointed-to type
    int n = 10;
    const int *p = &n;      // pointer to a const int
    *p = 10;                // invalid
    p = NULL;               // valid
    ----
    int * const p = &n;     // const pointer to a non-const int
    *p = 10;                // valid
    p = NULL;               // invalid
    ----
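    void f(int p[const]);   // parameters only: p is a const pointer to a non-const int
  • an identifier with a variably modified type must be an ordinary identifier declared at block scope or function prototype scope with no linkage; an object with static or thread storage duration must not have a VLA type (a pointer to a VLA is allowed)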
    int n = 10;
    int foo[n];             // invalid
    void func() {
      int baz[n];           // valid
      static int x[n];      // invalid
      extern int y[n];      // invalid
      static int (*z)[n];   // valid
    }
  • a modifiable lvalue must not have a const-qualified type, and a struct or union with a const-qualified member (at any level of nesting) is not a modifiable lvalue either, so it cannot be assigned to, even with an identical value
    struct { int a; const int b; } s1 = { .b = 1 }, s2 = { .b = 2 };
    s1 = s2; // invalid: s1 has a const member, so it is not a modifiable lvalue
  • if a struct or union object is declared with type qualifiers such as const or volatile, expressions designating its members acquire those qualifiers; qualifiers applied to an array type apply to its element type
  • struct and union declarations do not create a new scope, so nested types and enumerations introduced inside them are visible in the surrounding scope
  • a bit-field of width zero must be unnamed; it specifies that the next bit-field begins in a new storage unit
    struct foo {
      int x: 1;
      int y: 2;
      int z: 3;
    }; // sizeof(struct foo) == 4 on typical implementations with a 4-byte int
    struct baz {
      int x: 1;
      int : 0;
      int z: 3;
    }; // sizeof(struct baz) == 8 on typical implementations with a 4-byte int
  • register, restrict, and inline (functions only) are hints; a compiler is free to ignore them
  • function declarations can appear at block scope as well as file scope, unlike function definitions, which must appear at file scope (some compilers support block-scope function definitions, but that is a non-standard extension)
  • for static inline, the compiler can omit the function from the resulting object file if nothing references it, but each translation unit that does use it gets its own copy, which can waste space. For extern inline, the compiler has to emit an external definition in that translation unit's object file, and only one such definition may exist in the program. Plain inline provides only an inline definition: a call may use it or the external definition, so an external definition must still exist in some other translation unit, which is a frequent source of confusing link errors
  • an empty list in a function declaration that is not a definition means the function has an unspecified number of parameters, while an empty list in a function definition means the function has no parameters
    int foo();
    int main() {
      foo(1);               // compiles without a required diagnostic, but the behavior is undefined because foo is defined with no parameters
    }
    int foo() {
      return 1;
    }
  • a typedef name can be interpreted differently depending on context
    typedef signed int t;
    typedef int plain;
    struct tag {
      unsigned t:4;         // a 4-bit unsigned bit-field named t
      const t:5;            // an unnamed 5-bit const bit-field of type signed int
      plain r:5;            // a 5-bit bit-field named r; whether it is signed is implementation-defined
    };
    ----
    t f(t (t));             // f is a function returning signed int with one unnamed parameter: a pointer to a function returning signed int with one unnamed parameter of type signed int
    long t;                 // in an inner scope, declares an identifier t of type long
  • when a typedef names a variable length array type, the length is fixed at the time the typedef declaration is reached, not each time the name is used
    void copy(int n)
    {
      typedef int B[n];     // B is n ints, n evaluated now
      n += 1;
      B a;                  // a is n ints, n without += 1
      int b[n];             // a and b are different sizes
      for (int i = 1; i < n; i++)
        a[i-1] = b[i];
    }
  • providing more initializers than the array has elements is a constraint violation; compilers typically diagnose it and discard the excess
    int p[2] = {1, 2, 3, 4}; // the last two initializers, 3 and 4, are excess: diagnosed and typically discarded
  • if the initializer of a subobject does not begin with a left brace, the subobject takes only as many initializers from the list as it needs and leaves the remaining ones for the next element
    struct foo
    {
      int x, y, z;
    };
    int main()
    {
      struct foo baz[] = {1, 2, 3, 4};          // baz[0] = { .x = 1, .y = 2, .z = 3 }, baz[1] = { .x = 4 }
      struct foo bar[] = {1, 2, {3, 4, 5}, 6};  // bar[0] = { .x = 1, .y = 2, .z = 3 } (4 and 5 are excess initializers for the scalar z and are diagnosed), bar[1] = { .x = 6 }
      struct foo qux[] = {{1}, 2, 3};           // qux[0] = { .x = 1 }, qux[1] = { .x = 2, .y = 3 }
    }
    struct fred
    {
      int a[3], b;
    } w[] = {[0].a = {1}, [1].a[0] = 2};
  • a designator resets the current position in an initializer list, and the initializers that follow continue from there
    int a[MAX] = {
      1, 3, 5, 7, 9, [MAX-5] = 8, 6, 4, 2, 0    // if MAX > 10, some elements in the middle are zero
                                                // if MAX < 10, some of the values provided by the first 5 initializers are overridden
    };
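
The brace-elision and designator rules above can be verified with a short sketch. MAX is set to 12 here purely for illustration (an assumption, chosen so that MAX > 10), and the accessor names are hypothetical.

```c
#include <stddef.h>

struct foo { int x, y, z; };

/* Brace elision: 1, 2, 3 fill baz[0]; 4 starts baz[1]; the rest is zero. */
static const struct foo baz[] = { 1, 2, 3, 4 };

enum { MAX = 12 };   /* hypothetical size, chosen so that MAX > 10 */

/* The designator moves the current position to index MAX - 5 = 7. */
static const int table[MAX] = { 1, 3, 5, 7, 9, [MAX - 5] = 8, 6, 4, 2, 0 };

/* Number of elements the compiler deduced for baz. */
size_t baz_count(void)
{
    return sizeof baz / sizeof baz[0];
}

/* Sum of the three members of baz[i]. */
int baz_field_sum(size_t i)
{
    return baz[i].x + baz[i].y + baz[i].z;
}

/* Element i of the designated-initializer table. */
int table_at(size_t i)
{
    return table[i];
}
```

With these definitions baz has two elements, baz[1] holds only the value 4, and table[5] and table[6] are the zero-filled gap between the leading values and the designated tail that starts at table[7].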

Statements

  • declarations are not statements, and statements do not include declarations
  • in an if ... else statement, if the first substatement is reached via a label, the second substatement is not executed
  • after a jump into a loop body, the controlling expression of a for or while statement is not evaluated before the body runs, and neither is clause-1 of a for statement
    goto bump;
    ----
    for(int i = 0; i < 10 ; ++i) { // i is declared, but `i = 0` and `i < 10` are not evaluated before the first pass, so i is indeterminate
      bump:
        printf("%d", i);
    }
    ----
    while(i < 0) {                  // `i < 0` is not evaluated before the body runs for the first time
      bump:
        printf("%d", i);
    }
  • a goto must not jump from outside the scope of an identifier with a variably modified type to inside that scope (jumps within the scope are permitted)
    goto lab3; // invalid: going INTO scope of VLA.
    {
      double a[n];
      a[j] = 4.4;
    lab3:
      a[j] = 3.3;
      goto lab4; // valid: going WITHIN scope of VLA.
      a[j] = 5.5;
    lab4:
      a[j] = 6.6;
    }
    goto lab4; // invalid: going INTO scope of VLA.
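
The effect of jumping into a loop body can be made visible by counting how often the body runs. This is a minimal sketch with a hypothetical function name; it avoids reading any uninitialized variable by taking the loop variable as a parameter.

```c
/* Jumps straight into the body of the while loop, so the controlling
 * expression `i < 0` is first evaluated only after one pass of the body. */
int count_body_runs(int i)
{
    int runs = 0;
    goto inside;
    while (i < 0) {
    inside:
        runs++;
        i++;
    }
    return runs;
}
```

count_body_runs(5) returns 1 even though 5 < 0 is false, because the body executes once before the condition is checked. count_body_runs(-2) returns 2: one pass from the jump, one more while i is still negative.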

External definitions

  • "An identifier declared as a typedef name shall not be redeclared as a parameter"
    typedef int foo;
    int baz(foo) int foo;   // invalid
    {
      return 1;
    }
  • in the same scope, an identifier with linkage can be declared more than once but defined only once; an identifier with no linkage cannot be redeclared
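
A small sketch of repeated declarations next to a single definition, using hypothetical names (counter, read_counter):

```c
extern int counter;      /* declaration */
extern int counter;      /* repeated declaration: allowed */
int counter;             /* tentative definition */
int counter = 3;         /* the one actual definition */

int read_counter(void);  /* declaration */
int read_counter(void);  /* repeated declaration: allowed */

int read_counter(void)   /* the single definition */
{
    return counter;
}
```

Adding a second initialized definition such as int counter = 4; at file scope would be rejected, since it would define counter twice.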

Preprocessing directives

  • source file inclusion has 3 forms
    #include <h-char-sequence> new-line
    #include "q-char-sequence" new-line     // if this search is not supported or fails, the directive is reprocessed as form 1
    #include pp-tokens new-line             // the pp-tokens are macro-replaced and must then match one of the two forms above
  • when # is followed by an identifier, that identifier is not subject to macro replacement
  • if the name of the macro being replaced is found during rescanning of its replacement list, it is not replaced again (in some nested cases, whether the replacement happens is unspecified)
  • line control has 2 forms
    #line digit-sequence "s-char-sequence"(opt) new-line
    #line pp-tokens new-line                // the pp-tokens are macro-replaced and must then match the form above
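
The preprocessing notes above can be sketched in a few lines. This reads the note about # as referring to the stringizing operator, which is an assumption; the macro and function names (VALUE, STR, XSTR, limit, and the accessors) are hypothetical.

```c
#include <string.h>

#define VALUE 42
#define STR(x)  #x        /* x is stringized without being expanded */
#define XSTR(x) STR(x)    /* x is expanded first, then stringized */

static int limit = 7;
#define limit (limit + 1) /* the inner limit is not replaced again */

/* Returns 1: the argument of # keeps its spelling. */
int stringize_unexpanded(void)
{
    return strcmp(STR(VALUE), "VALUE") == 0;
}

/* Returns 1: one extra macro level lets VALUE expand to 42 first. */
int stringize_expanded(void)
{
    return strcmp(XSTR(VALUE), "42") == 0;
}

/* Expands to (limit + 1), where limit now names the variable. */
int read_limit(void)
{
    return limit;
}

int line_after_directive(void)
{
#line 100
    return __LINE__;      /* this source line is now numbered 100 */
}
```

read_limit() returns 8 and line_after_directive() returns 100. The #line directive also renumbers every following line in the file, so it is placed last in this sketch.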