Improve shellcode flexibility #1153

wtdcode · 2022-05-10T19:29:52Z

Checklist

Which kind of PR do you create?

This PR only contains minor fixes.
This PR contains major feature update.
This PR introduces a new function/api for Qiling Framework.

Coding convention?

The new code conforms to Qiling Framework naming convention.
The imports are arranged properly.
Essential comments are added.
The reference of the new code is pointed out.

Extra tests?

No extra tests are needed for this PR.
I have added enough tests for this PR.
Tests will be added after some discussion and review.

Changelog?

This PR doesn't need to update Changelog.
Changelog will be updated after some proper review.
Changelog has been updated in my PR.

Target branch?

The target branch is dev branch.

One last thing

I have read the contribution guide

This PR is based on my previous mem slicing PR #1151, which is also one of several phases to make Qiling a bit more pythonic and easy to use.

The motivation of this PR is that, the current implementation is rather not flexible for shellcode usage, e.g.

The shellcode must be specified during Qiling.__init__.
The shellcode can't be changed anymore, or many our internal functions would break like ql.run.

In general, the previous design assumes that only one shellcode is written and never changed and thus this PR improves flexibility.

I also removed qilingcode dirty hack, which is a very old bad design and I believe shouldn't exist anymore.

wtdcode · 2022-05-10T19:45:08Z

I'm also going to test against #1148 after it gets merged.

elicn · 2022-05-10T21:06:07Z

There are several things I don't understand about this PR:

Do we really need flexibility around loading the shellcode? Why not just set it as part of init as we do today?
As far as I see it there is always ql.argv or ql.code, but never both. Though I agree that the handling of "shellcode mode" vs. "loaded file mode" is a bit cumbersome, these changes seem to make that even more complicated with a additional checks. What am I missing?
Wouldn't asserting there is either ql.argv or ql.code (but not both) suffice? Once od them must exist; what is the meaning of else in:

if self.ql.code:
  # shellcode mode
elif len(self.ql.argv) == 0:
  # loaded file mode
else:
  # ??

wtdcode · 2022-05-10T21:10:07Z

There are several things I don't understand about this PR:

Do we really need flexibility around loading the shellcode? Why not just set it as part of init as we do today?

As far as I see it there is always ql.argv or ql.code, but never both. Though I agree that the handling of "shellcode mode" vs. "loaded file mode" is a bit cumbersome, these changes seem to make that even more complicated with a additional checks. What am I missing?

Wouldn't asserting there is either ql.argv or ql.code (but not both) suffice? Once od them must exist; what is the meaning of else in:
if self.ql.code:
  # shellcode mode
elif len(self.ql.argv) == 0:
  # loaded file mode
else:
  # ??

Do we really need flexibility around loading the shellcode? Why not just set it as part of init as we do today?

Say, you have a shellcode and after ql.run you move it to elsewhere or changes its length and you call ql.run again, then it breaks.

As far as I see it there is always ql.argv or ql.code, but never both. Though I agree that the handling of "shellcode mode" vs. "loaded file mode" is a bit cumbersome, these changes seem to make that even more complicated with a additional checks. What am I missing?

Wouldn't asserting there is either ql.argv or ql.code (but not both) suffice? Once od them must exist; what is the meaning of else in:

The test for len(self.ql.argv) == 0 doesn't mean loaded file mode. It means something like Qiling(code=None).

elicn · 2022-05-11T00:03:00Z

Say, you have a shellcode and after ql.run you move it to elsewhere or changes its length and you call ql.run again, then it breaks.

Is this a real usecase..? I guess that all other scenarios will break as well since there are some manual adjustments to be made anyway if you want to re-run the same Qiling instance (e.g. clearing hooks, etc.)

The test for len(self.ql.argv) == 0 doesn't mean loaded file mode. It means something like Qiling(code=None).

My bad, I actually meant to ask what is the difference between if self.ql.code and elif len(self.ql.argv) == 0 where one of them must exist. I'm not sure what else means in that context..

wtdcode · 2022-05-11T20:43:16Z

Say, you have a shellcode and after ql.run you move it to elsewhere or changes its length and you call ql.run again, then it breaks.

Is this a real usecase..? I guess that all other scenarios will break as well since there are some manual adjustments to be made anyway if you want to re-run the same Qiling instance (e.g. clearing hooks, etc.)

Of course, consider something like Self Modified Code emulation. Of course, I'm not talking of current implementation can't do but it's quite not flexible. For other scenarios you mentioned, in most cases, we don't need such flexibility.

Just take a really simple scenario as an example:

The shellcode produces some new code.
I would like to execute the code with the existing Qiling instance. Unfortunately, it's impossible.

Another scenario as I mentioned already, the shellcode might not be available as the time of creating the Qiling instance.

The test for len(self.ql.argv) == 0 doesn't mean loaded file mode. It means something like Qiling(code=None).

My bad, I actually meant to ask what is the difference between if self.ql.code and elif len(self.ql.argv) == 0 where one of them must exist. I'm not sure what else means in that context..

self.ql.code is set => We got a shellcode in Qiling.__init__ so we help users to setup code space and possible stacks,
self.ql.code is not set while self.ql.argv is empty. Users are wiling to provide shellcode afterwards, do nothing.
else. Users provide a file and this is the most common case.

Again, let me clarify my motivation a bit more:

Using raw Unicorn bindings is painful and full of tedious work while Qiling sheld a light on this by providing numerous utilities. However, the assumption that the shellcode is already available (and some other internal assumptions as mentioned) as the time of creating Qiling instance doesn't always hold when interacting with shellcode. In this case, I would like Qiling to provide the maximum flexibility with various helpers (imagine debugging a shellcode through gdb to help understand better).

elicn · 2022-05-12T15:05:45Z

I think I am starting to understand the direction you're going, but I am still not sure why this cannot happen with the current implementation. A shellcode provided at initialization time has a certain size and Qiling copies it to memory, and... that is pretty much it.
For the specific unique case of an unknown shellcode (.. which honestly I still struggle to understand), why can't someone just initialize Qiling with a nopsled shellcode [large enough to make room for any shellcode they may choose] and then overwrite it in memory with a new one? At the end of the day the content of ql.code becomes meaningless as soon as it is copied to memory. Then all that matters is what is in that memory location.

Sorry for bothering, but I am trying to understand the justification here. The code around shellcode handling is already messy, and that PR makes it even more messy than it is today. I had some preliminary thoughts about redesigning the way Core, OS and Loader are interacting with each other, by introducing a concept of Process objects (in short, the Loader is static, it loads a Process and had it over to the OS. All the specific process context is saved there instead of being stored on Loader and OS). That would decouple Qiling from a specific program (or shellcode), and let the user load multiple processes on the same Qiling instance.

I would use this usecase as an additional incentive to drive this re-design.

wtdcode · 2022-05-13T20:54:13Z

I think I am starting to understand the direction you're going, but I am still not sure why this cannot happen with the current implementation. A shellcode provided at initialization time has a certain size and Qiling copies it to memory, and... that is pretty much it.

But still, I would like to somehow support Qiling(code=None). At least, I think it's a valid use case and it should not be too difficult to support.

At the end of the day the content of ql.code becomes meaningless as soon as it is copied to memory. Then all that matters is what is in that memory location.

Incorrect. The length of ql.code is used in ql.run, though can be fixed without messing up more.

Sorry for bothering, but I am trying to understand the justification here. The code around shellcode handling is already messy, and that PR makes it even more messy than it is today. I had some preliminary thoughts about redesigning the way Core, OS and Loader are interacting with each other, by introducing a concept of Process objects (in short, the Loader is static, it loads a Process and had it over to the OS. All the specific process context is saved there instead of being stored on Loader and OS). That would decouple Qiling from a specific program (or shellcode), and let the user load multiple processes on the same Qiling instance.

Any estimation on this? This looks cleaner indeed.

wtdcode added 6 commits May 9, 2022 00:12

Support memory slicing

2e8a703

Fix typo

11224c3

Check None correctly

9f78b69

mimic bytearray

41d6d30

Enforce ql.mem[addr:] to be more intuitive

295ea60

Support creating shellcode after creating Qiling instance

6915bc4

wtdcode changed the title ~~Raw shellcode~~ Improve shellcode flexibility May 10, 2022

Check if shellcode for gdb

8a2709e

Fix typo

1991934

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve shellcode flexibility #1153

Improve shellcode flexibility #1153

wtdcode commented May 10, 2022 •

edited

wtdcode commented May 10, 2022

elicn commented May 10, 2022

wtdcode commented May 10, 2022

elicn commented May 11, 2022

wtdcode commented May 11, 2022

elicn commented May 12, 2022

wtdcode commented May 13, 2022

Improve shellcode flexibility #1153

Are you sure you want to change the base?

Improve shellcode flexibility #1153

Conversation

wtdcode commented May 10, 2022 • edited

Checklist

Which kind of PR do you create?

Coding convention?

Extra tests?

Changelog?

Target branch?

One last thing

wtdcode commented May 10, 2022

elicn commented May 10, 2022

wtdcode commented May 10, 2022

elicn commented May 11, 2022

wtdcode commented May 11, 2022

elicn commented May 12, 2022

wtdcode commented May 13, 2022

wtdcode commented May 10, 2022 •

edited