
Inconsistent result with --sparsification-and-bufferization and tensor.empty #92069

Open
Anonymous15592 opened this issue May 14, 2024 · 1 comment

Consider the following MLIR program:
a.mlir:

module {
  func.func @tensor_i32(%arg0: tensor<1xi32>) -> i32 {
    %idx0 = index.constant 0
    %0 = tensor.extract %arg0[%idx0] : tensor<1xi32>
    return %0 : i32
  }
  func.func @func1() {
    %c1_i32 = arith.constant 1 : i32
    %c0_i32 = arith.constant 0 : i32
    %c0 = arith.constant 0 : index
    %5 = tensor.empty() : tensor<1xi32> // using empty
    // %5 = tensor.from_elements %c0_i32 : tensor<1xi32>
    
    %inserted_28 = tensor.insert %c1_i32 into %5[%c0] : tensor<1xi32>
    %31 = call @tensor_i32(%inserted_28) : (tensor<1xi32>) -> i32
    %308 = tensor.extract %5[%c0] : tensor<1xi32>
    // vector.print %31 : i32
    vector.print %308 : i32
    return
  }
}

It produces two different results depending on which of the following two pass sequences is applied:
pass sequence 1: --sparsification-and-bufferization --tensor-bufferize --func-bufferize --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts
pass sequence 2: --tensor-bufferize --func-bufferize --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts

The executable built with pass sequence 1 prints 1, while the one built with pass sequence 2 prints 0.
The only difference between the two sequences is the additional --sparsification-and-bufferization at the beginning of pass sequence 1.
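For completeness, the two results can be reproduced along these lines. This is a sketch, not necessarily the original setup: the pass flags are the ones quoted above, but the mlir-cpu-runner invocation, its entry-point flags, and the runner-utils library path are assumptions about a typical MLIR build.

# Lower a.mlir with each pass sequence.
mlir-opt a.mlir --sparsification-and-bufferization --tensor-bufferize --func-bufferize \
  --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm \
  --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts -o seq1.mlir
mlir-opt a.mlir --tensor-bufferize --func-bufferize \
  --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm \
  --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts -o seq2.mlir

# JIT-run @func1 from each module; the first prints 1, the second prints 0.
mlir-cpu-runner seq1.mlir -e func1 -entry-point-result=void \
  -shared-libs=build/lib/libmlir_c_runner_utils.so
mlir-cpu-runner seq2.mlir -e func1 -entry-point-result=void \
  -shared-libs=build/lib/libmlir_c_runner_utils.so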

I further analyzed the output of these two pipeline prefixes:
pass1: --sparsification-and-bufferization --tensor-bufferize
pass2: --tensor-bufferize
The result of pass1 is:

module {
  func.func @tensor_i32(%arg0: memref<1xi32>) -> i32 {
    %idx0 = index.constant 0
    %0 = memref.load %arg0[%idx0] : memref<1xi32>
    return %0 : i32
  }
  func.func @func1() {
    %c1_i32 = arith.constant 1 : i32
    %c0 = arith.constant 0 : index
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<1xi32>
    memref.store %c1_i32, %alloc[%c0] : memref<1xi32>
    %0 = call @tensor_i32(%alloc) : (memref<1xi32>) -> i32
    %1 = memref.load %alloc[%c0] : memref<1xi32>
    vector.print %1 : i32
    return
  }
}

The result of pass2 is:

module {
  func.func @tensor_i32(%arg0: tensor<1xi32>) -> i32 {
    %0 = bufferization.to_memref %arg0 : memref<1xi32>
    %idx0 = index.constant 0
    %1 = memref.load %0[%idx0] : memref<1xi32>
    return %1 : i32
  }
  func.func @func1() {
    %c1_i32 = arith.constant 1 : i32
    %c0_i32 = arith.constant 0 : i32
    %c0 = arith.constant 0 : index
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<1xi32>
    %alloc_0 = memref.alloc() {alignment = 64 : i64} : memref<1xi32>
    memref.copy %alloc, %alloc_0 : memref<1xi32> to memref<1xi32>
    memref.store %c1_i32, %alloc_0[%c0] : memref<1xi32>
    %0 = bufferization.to_tensor %alloc_0 : memref<1xi32>
    %1 = call @tensor_i32(%0) : (tensor<1xi32>) -> i32
    %2 = memref.load %alloc[%c0] : memref<1xi32>
    vector.print %2 : i32
    return
  }
}

It seems that --sparsification-and-bufferization --tensor-bufferize treats the operand and the result of tensor.insert as the same tensor (memref) when the operand of tensor.insert is created by tensor.empty: in the pass1 output the store is performed in place on %alloc, so the later read of %5 observes the inserted 1, while in the pass2 output the store goes to a copy (%alloc_0) and the read of %5 sees the untouched, uninitialized %alloc, which happens to print 0.

If I replace the tensor.empty with tensor.from_elements, or simply wrap the tensor.empty in a function, the modified MLIR program prints the same result under both pass sequences.
The modified program:

module {
  func.func @gen_tensor_i32() -> tensor<1xi32> {
    %c0_i32 = arith.constant 0 : i32
    %5 = tensor.empty() : tensor<1xi32>
    return %5 : tensor<1xi32>
  }
  func.func @tensor_i32(%arg0: tensor<1xi32>) -> i32 {
    %idx0 = index.constant 0
    %0 = tensor.extract %arg0[%idx0] : tensor<1xi32>
    return %0 : i32
  }
  func.func @func1() {
    %c1_i32 = arith.constant 1 : i32
    %c0_i32 = arith.constant 0 : i32
    %c0 = arith.constant 0 : index
    %5 = call @gen_tensor_i32() : () -> tensor<1xi32>
    // %5 = tensor.empty() : tensor<1xi32> // using empty
    // %5 = tensor.from_elements %c0_i32 : tensor<1xi32>
    
    %inserted_28 = tensor.insert %c1_i32 into %5[%c0] : tensor<1xi32>
    %31 = call @tensor_i32(%inserted_28) : (tensor<1xi32>) -> i32
    %308 = tensor.extract %5[%c0] : tensor<1xi32>
    // vector.print %31 : i32
    vector.print %308 : i32
    return
  }
}

I wonder if there is something wrong with --sparsification-and-bufferization and tensor.empty.
Then again, this inconsistency may not be a real problem, because tensor.empty only carries shape information: its contents are unspecified, so a program that reads from it has no single well-defined result.
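To make that concrete, here is a minimal sketch (hypothetical SSA names, same ops as the reproducer): only the extract from the tensor.insert result is guaranteed a value; the extract from the tensor.empty result may legally yield anything, which is exactly the freedom the in-place bufferization exploits.

%c0 = arith.constant 0 : index
%c1_i32 = arith.constant 1 : i32
// Shape-only allocation: the element values are unspecified.
%empty = tensor.empty() : tensor<1xi32>
%filled = tensor.insert %c1_i32 into %empty[%c0] : tensor<1xi32>
// Well-defined: always yields 1.
%ok = tensor.extract %filled[%c0] : tensor<1xi32>
// Unspecified: may yield 0, 1, or anything else, depending on bufferization.
%maybe = tensor.extract %empty[%c0] : tensor<1xi32>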

git version: 2163ae761808ca0e5478357384f6ddbacce279eb


llvmbot commented May 14, 2024

@llvm/issue-subscribers-mlir

Author: anonymous (Anonymous15592)

