Question: Is sub eax in code_asm unoptimized? #442

Open
Sewer56 opened this issue Aug 8, 2023 · 1 comment

Comments


Sewer56 commented Aug 8, 2023

Hi, I have a quick question regarding code generation with code_asm.
I couldn't find anything related in the issue list, so I figured I would ask here:

When I assemble with

a.sub(eax, 10)

I get the result:

0:  2d 0a 00 00 00          sub    eax,0xa 

I.e., it uses the SUB (EAX, I32) encoding.

However, the same instruction can also be assembled as SUB (R32, I8):

0:  83 e8 0a                sub    eax,0xa

This encoding is smaller: 3 bytes instead of 5.
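To make the size difference concrete, here is a minimal hand-rolled sketch of the two encodings. It does not use iced at all; the function names `encode_sub_eax_imm32` and `encode_sub_r32_imm8` are made up for illustration, and the second one only handles the eight legacy 32-bit registers (no REX prefix).

```rust
/// Encodes `sub eax, imm` with the accumulator-specific form:
/// opcode 0x2D followed by a little-endian 32-bit immediate (5 bytes total).
/// Hypothetical helper, not part of iced.
fn encode_sub_eax_imm32(imm: i32) -> Vec<u8> {
    let mut bytes = vec![0x2D];
    bytes.extend_from_slice(&imm.to_le_bytes());
    bytes
}

/// Encodes `sub r32, imm8` with the general form:
/// opcode 0x83, a ModRM byte with mod = 11 and reg field = 5 (the /5 extension
/// that selects SUB), then a sign-extended 8-bit immediate (3 bytes total).
/// `reg` is the 3-bit register number (0 = EAX, 1 = ECX, ...).
/// Hypothetical helper, not part of iced.
fn encode_sub_r32_imm8(reg: u8, imm: i8) -> Vec<u8> {
    assert!(reg < 8, "only the eight legacy registers are handled in this sketch");
    let modrm = 0b1100_0000 | (5 << 3) | reg;
    vec![0x83, modrm, imm as u8]
}
```

Calling `encode_sub_eax_imm32(10)` and `encode_sub_r32_imm8(0, 10)` reproduces the two byte sequences shown above.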

At first, I expected the EAX-specific instruction to be faster, so I had a look at uops.info. It appears that the non-specialized instruction is actually measured to be faster on modern CPUs.

Namely, on modern architectures like Zen 3, SUB (R32, I8) is measured at a throughput of 0.25, while SUB (EAX, I32) is measured at 0.33 (a higher value is worse). Intel CPUs follow the same trend.

I cross-referenced these values with Agner's well-known CPU Optimization Guide, and they matched. However, Agner does not have measurements for SUB (EAX, I32) specifically, only for SUB (R32, I8).


Note: In the Rust bindings, the code is auto-generated as:

#[rustfmt::skip]
impl CodeAsmSub<AsmRegister32, i32> for CodeAssembler {
	fn sub(&mut self, op0: AsmRegister32, op1: i32) -> Result<(), IcedError> {
		let code = if op0.register() == Register::EAX {
			Code::Sub_EAX_imm32
		} else if op1 >= i8::MIN as i32 && op1 <= i8::MAX as i32 {
			Code::Sub_rm32_imm8
		} else {
			Code::Sub_rm32_imm32
		};
		self.add_instr(Instruction::with2(code, op0.register(), op1)?)
	}
}

(I.e., it picks the EAX-specific encoding whenever the register is EAX, even if the immediate fits in 8 bits.)
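For illustration, one possible reordering of the checks is sketched below: try the 8-bit immediate form first, and only fall back to the EAX-specific form when the immediate does not fit. This is a standalone sketch, not iced's actual generator output; the `SubEncoding` enum and `pick_sub_encoding` function are hypothetical stand-ins for the `Code` variants above.

```rust
/// Hypothetical stand-in for the three `Code` variants used in the snippet above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SubEncoding {
    /// 83 /5 ib: 3 bytes with a register operand.
    Rm32Imm8,
    /// 2D id: 5 bytes.
    EaxImm32,
    /// 81 /5 id: 6 bytes with a register operand.
    Rm32Imm32,
}

/// Picks the shortest encoding for `sub r32, imm`.
/// Assumption: shorter is preferred, so the imm8 check runs before the EAX check.
fn pick_sub_encoding(is_eax: bool, imm: i32) -> SubEncoding {
    if imm >= i8::MIN as i32 && imm <= i8::MAX as i32 {
        SubEncoding::Rm32Imm8
    } else if is_eax {
        // The immediate needs 32 bits, so the EAX form saves the ModRM byte.
        SubEncoding::EaxImm32
    } else {
        SubEncoding::Rm32Imm32
    }
}
```

With this ordering, `sub eax, 10` would select the 3-byte form, while `sub eax, 1000` would still use the 5-byte EAX form rather than the 6-byte general one.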

Member

wtfsck commented Aug 26, 2023

Yes, looks like that can be improved.
