Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CodeGen] Support compound literal expression #505

Open
wants to merge 1,429 commits into
base: main
Choose a base branch
from

Conversation

rusyaev-roman
Copy link
Member

No description provided.

htyu and others added 30 commits January 28, 2024 22:36
…n. (llvm#253)

Introducing `cir.ConstPtrAttr` to represent arbitrary absolute pointer
value initializations. Also incorporating previous `cir.nullptr` effort
into this work.
Following discussion in llvm#237 this adds support for `fabs` builtins which
are used extensively in llvm-test-suite.
Temporary workaround until we patch the codegen for record types.
This patch brings up the basic support for C++ virtual inheritance. VTT
(virtual table table) now can be laid out as expected for simple program
with single virtual inheritance. RTTI support is on the way.

This patch does not include LLVM lowering support.
This PR introduces bitfelds support.  This now works:

```
#include <stdio.h>

typedef struct {
    int a1 : 4;
    int a2 : 28;
    int a3 : 16;
    int a4 : 3;
    int a5 : 17;
    int a6 : 25;
} A;

void init(A* a) {
    a->a1 = 1;
    a->a2 = 321;
    a->a3 = 15;
    a->a4 = -2;
    a->a5 = -123;
    a->a6 = 1234;
}

void print(A* a) {
    printf("%d %d %d %d %d %d\n",
        a->a1,
        a->a2,
        a->a3,
        a->a4,
        a->a5,
        a->a6
    );
}

int main() {
    A a;
    init(&a);
    print(&a);
    return 0;
}

```
the output is:
`1 321 15 -2 -123 1234`
- Introduces `CIR/Interfaces/ASTAttrInterfaces` which model API of clang
AST nodes, but allows to plugin custom attribute, making `CIR` dialect
AST independent.

- Extends hierarchy of `DeclAttr`s to model `Decl` attributes more
faithfully.
- Notably all `CIRASTAttr`s are now created uniformly using
`makeAstDeclAttr` which builds corresponding Attribute based on
`clang::Decl`.
…lvm#263)

There is a bug in the code generation: the field index for the
`GetMemberOp` is taken from the `FieldDecl`, with no respect to the
record layout. One of the manifestation of the bug is the wrong index
generated for a field in a derived class that does not take into the
account the instance of the base class (that has index 0).

You can take a look at the example in
`test/CIR/CodeGen/derived-to-base.cpp`, i.e. the current test is not the
correct one
```
// CHECK-DAG: !ty_22C23A3ALayer22 = !cir.struct<class "C2::Layer" {!ty_22C13A3ALayer22, !cir.ptr<!ty_22C222>
// CHECK-DAG: !ty_22C33A3ALayer22 = !cir.struct<struct "C3::Layer" {!ty_22C23A3ALayer22

// CHECK: cir.func @_ZN2C35Layer10InitializeEv

// CHECK:  cir.scope {
// CHECK:    %2 = cir.base_class_addr(%1 : cir.ptr <!ty_22C33A3ALayer22>) -> cir.ptr <!ty_22C23A3ALayer22>
// CHECK:    %3 = cir.get_member %2[0] {name = "m_C1"} : !cir.ptr<!ty_22C23A3ALayer22> -> !cir.ptr<!cir.ptr<!ty_22C222>>
```
As one can see, the result (of ptr type to `!ty_22C222` ) must have the
index `1` in the corresponded struct `ty_22C23A3ALayer22`.

Basically the same is done in the `clang/CodeGen/CGExpr.cpp`, so we
don't invent something new here.

Note, this fix doesn't affect anything related to the usage of
`buildPreserveStructAccess` where the `field->getFieldIndex()` is used.
This PR adds MLIR lowering of `cir.scope`.

I also notice that the MLIR unit tests still uses old integer types. I
will fix those in a separate PR.
Recent work on exceptions broke an internal testcase, free the path
back while I work on a holistic solution.
We've integrated this functionality into the main clang-tidy and clangd
tools and no longer have a purpose for it. We only kept around to
maintain support for internal tooling built upon it. So remove it here.
Here is the promised patch that adds proper type conversion to CIR ->
MLIR conversion. I tried to keep the changes minimum but the existing
implementation doesn't use `TypeConverter`.

This should not have any functional changes except for one tiny fix that
registers the `cf` dialect, which should allow `goto.mlir` to pass.
Happy to break the PR into two if requested.
…lvm#268)

This is an updated PR for [PR
llvm#233](llvm#233) with some fixes
explained in [PR llvm#261](llvm#261) which
now can be safely closed.

First of all, let me introduce how do the bitfields looks like in CIR.
For the struct `S` defined as following:
```
typedef struct {
  int a : 4;
  int b : 27;
  int c : 17;
  int d : 2;
  int e : 15;
  unsigned f;
} S;
```
the CIR type is `!ty_22S22 = !cir.struct<struct "S" {!u32i, !u32i,
!u16i, !u32i} #cir.record.decl.ast>` where all the bitfields are packed
in the first three storages.

Also, the next bugs was fixed:
- type mismatch
- index out of bounds
- single bitfield of size < 8

The cases covered with tests.
llvm#270)

This is a minor fix similar to the one introduced in llvm#263.
Basically, all calls to the `buildLValueForFieldInitialization` are even
with the origin codegen `emitLValueForFieldInitialization` calls, i.e.
the field index is calculated from the record layout, but not from the
decl `field->getFieldIndex()`.

Added just one test, because looks like we need to implement some `NYI`
features first to test another places e.g. in `CIRGenExprAgg.cpp`,
though I could miss something.

Anyway, given the original codegen doesn't use `getFieldIndex` in these
places, we also should not.

All the remaining usages of `getFieldIndex` are ok.
Updates the lowering pass to use only opaque pointers. This essentially
involves updating the type converter to drop pointee types and
explicitly defining the types loaded/stored/GEPed by LLVM operations.

The reasons for this are twofold:

- LLVM dialect is currently transitioning to deprecate typed pointers,
allowing only opaque pointers. The sooner we transition the fewer
changes we will have to make.
- Opaque pointers greatly simplify lowering self-references, since all
self-references in records are wrapped in a pointer.
The wrong element type was being passed to LLVM's GEP op, generating an
invalid IR. Tests were also updated to properly validate the
`llvm.getelementptr` element type.

Fixes llvm#272
This PR adds `cir.brcond` lowering, which more or less follows the one
in LLVM dialect lowering.
This is a suggestion to relax the existing verification even more than
we did it in PR llvm#257.
Here we also skip verification if a field on the given index is also of
incomplete type - and we can not compare it with the result type of the
operation.

Now the next code fails with type mismatch error:
```
typedef struct Node {    
    struct Node* next;
} NodeStru;

void foo(NodeStru* a) {        
    a->next = 0;
}
```
because the result type is kind of full and the type of field is not
(for the reasons discussed in llvm#256).
Basically, the problem is in the `GetMemberOp` result type generated as
following (via `CIRGenTypes::convertType`)
`!cir.ptr<!cir.struct<struct "Node" {!cir.ptr<!cir.struct<struct "Node"
incomplete #cir.record.decl.ast>>} #cir.record.decl.ast>>`

where the field type at index differs from the record type - compare
with
 `!cir.ptr<!cir.struct<struct "Node" incomplete #cir.record.decl.ast>>`

We just slightly relax the previous solution in llvm#257 - and the
compilation won't fail in the case of recursive types.

Well, if there are some other thoughts?
Just a trivial fix that enables declaration of local structs. 
Basically, there it's a copy-pasta from the original `CodeGen`, without
debug info handling.

Co-authored-by: Bruno Cardoso Lopes <bcardosolopes@users.noreply.github.com>
This PR handles globals initializations for c++ code for the case when a
global is inited from a function call or another global.
Traditional Clang's codegen generates IDs for anonymous records (e.g.
"struct.anon.1") and ensures that they are unique. This patch does the
same for CIRGen, which, until now, would just identify any anonymous
record as "anon".

This will be required to support mutable structs uniquely identified by
their names.
Rename `typeName` to just `name`, also use `StringAttr`'s nullability to
identify if the record is identified or anonymous. Unnamed structs are
also no longer aliased, as they have no unique name to be used in the
alias.
This patch adds RTTI support for C++ virtual inheritance. 

This patch does not include LLVM lowering support.
Lowering Vtable and RTTI globals. Also lowering AddressPoint.

based on llvm#259
Lancern and others added 18 commits February 21, 2024 11:13
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: llvm#5,
llvm#78, and llvm#90.
This PR fixes a couple of  NIY features for bit fields. 

Basically, such expressions with bit fields `x->a++` and `x->a |= 42`
are supported now.

The main problem is `UnOp` verification - now it can be defined both by
`loadOp` and by `GetBitfieldOp` or even by `CastOp`. So shame on me, I
removed a test from `invalid.cir`
…#467)

CIR codegen always casts the no-proto function pointer to `FuncOp`. But
the function pointer may be result of cir operations (f.e. `cir.load`).
As a result in such cases the function pointer sets to `nullptr`. That
leads to compilation error.
So this PR removes the unecessary cast to 'FuncOp' and resolves the
issue.
This patch adds initial support for the pointer-to-data-member type.
Specifically, this commit includes:

- New ops, types, and attributes:
- CodeGen for pointer-to-data-member types and values
  - Lower C++ pointer-to-member type
  - Lower C++ expression `&C::D`
  - Lower C++ expression `c.*p` and `c->*p`

This patch only includes an initial support. The following stuff related
to pointer-to-member types are not supported yet:

- Pointer to member function;
- Conversion from `T Base::*` to `T Derived::*`;
- LLVMIR lowering.
As discussed in pull llvm#401 , The present `UnimplementedFeature` class is
made for the CIRGen submodule and we need similar facilities for code
under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new
`CIRDialectUnimplementedFeature` class that provides unimplemented
feature guards for CIR dialect code.
This PR adds lowering for `cir.asm`.

Also, two flags were added to the `cir.asm` : `hasSideEffects` and
`isStackAligned` in order to match with the llvm dialect.

Also, I added several simple tests for lowering.

I'm not sure but most likely the next PR will be the last one in this
story about assembly support )
This patch changes the emission of implicit returns from functions whose
return type is not `void`. Instead of emitting `cir.return`, this PR
aligns to the original clang CodeGen and emits a `cir.unreachable`
operation.

Related issue: llvm#457 .
This PR adds CIRGen support for the following built-in bit operations:

  - `__builtin_ffs{,l,ll,g}`
  - `__builtin_clz{,l,ll,g}`
  - `__builtin_ctz{,l,ll,g}`
  - `__builtin_clrsb{,l,ll,g}`
  - `__builtin_popcount{,l,ll,g}`
  - `__builtin_parity{,l,ll,g}`

This PR adds a new operation, `cir.bits`, to represent such bit
operations on the input integers. LLVMIR lowering support is not
included in this PR.

> [!NOTE]
> As a side note, C++20 adds the `<bit>` header which includes some bit
operation functions with similar functionalities to the built-in
functions mentioned above. However, these standard library functions
have slightly different semantics than the built-in ones and this PR
does not include support for these standard library functions. Support
for these functions may be added later, or amended into this PR if the
reviewers request so.

Co-authored-by: Bruno Cardoso Lopes <bcardosolopes@users.noreply.github.com>
This is part 4 of implementing vector types and vector operations in
ClangIR, issue llvm#284. This change has three small additions.

Implement a "vector splat" conversion, which converts a scalar into
vector, initializing all the elements of the vector with the scalar.

Implement incomplete initialization of a vector, where the number of
explicit initializers is less than the number of elements in the vector.
The rest of the elements are implicitly zero initialized.

Implement conversions between different vector types. The language rules
require that the two types be the same size (in bytes, not necessarily
in the number of elements). These conversions are always implemented
with a bitcast.

The first two changes only required changes to the AST -> ClangIR code
gen. There are no changes to the ClangIR dialect, so no changes to the
LLVM lowering were needed.

The third part only required a change to a validation rule. The code to
implement a vector bitcast was already present. The compiler just needed
to stop rejecting it as invalid ClangIR.
This PR adds the new `cir.expect` opcode which is similar to llvm.expect
intricnsics.
Codegen of `__builtin_expect` emits `cir.expect` opcode. Then
`cir.expect` will be lowered to `llvm.expect` intrinsic.

When implementing __builtin_expect I faced with minor issue.
CIR lowering of `if` often emits the lllvm IR with redundant cast
instructions. Like this:
```
%0 = call i64 @llvm.expect.i64(i64 %any, i64 1), !dbg !13
%1 = icmp ne i64 %0, 0
%2 = zext i1 %0 to i8   // redundant
%3 = trunc i8 %1 to i1 // redundant
br i1 %3, label %l1, label %l2
```
But the llvm pass `LowerExpectIntrinsicPass` (that should replace
`llvm.expect` with branch metadata) performs only simple
pattern-matching. And it can't handle this zext/trunc intructions. So
this pass in such cases just removes the `llvm.expect` without updating
a branch metadata.
In this reason this PR also avoid emitting the redundant zext/cast
instruction sequence.
Support `offset` expression in case when we can evaluate offset
expression as integer.
This PR adds the `cir.trap` operation, which corresponds to the
`__builtin_trap` builtin function. When executed, the operation
terminates the program abnormally in an implementation-defined manner.

This PR also includes CIRGen and LLVM lowering support for the new
operation.
Remove check since paths that aren't related to ClangIR shouldn't be affected
at all, even if the flag is ON. Also, this conservative error message wasn't
tested.
Implement the vector version of the ternary (`?:`) operator. This is a
separate MLIR op than the regular `?:` operator because the vector
version is not short-circuiting and always evaluates all its arguments.
Copy link

github-actions bot commented Mar 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Few remaining questions inline.

bool Destruct =
!CGF.getLangOpts().CPlusPlus && !Slot.isExternallyDestructed();
if (Destruct)
Slot.setExternallyDestructed();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this path also exercised in the tests?


A foo(void) {
return (A){1};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a RUN line that covers C++ mode?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or alternatively, create a new .cpp one.

@bcardosolopes
Copy link
Member

Any plans to continue the work here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet