Examples on Taint Analysis do not work or out of date #1392

Open
jstarink opened this issue Dec 5, 2023 · 1 comment

jstarink commented Dec 5, 2023

Description

It seems some of the Python examples no longer work. In particular, I am looking into the implementation of taint.py. Running this script as-is does not work or segfaults.

Findings

Based on the errors printed to stdout, I identified the following problems:

  1. The string comparison on line 25 should be a bytes comparison or it will never taint anything as fd_to_fname returns bytes.
  2. The call to panda.taint_label_ram requires a label argument.
  3. taint2 should be enabled/loaded before panda.run is called.
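
To illustrate point 1 in isolation: in Python 3, a bytes object never compares equal to a str, so a check written against a string literal silently evaluates to False. The snippet below is a minimal sketch that does not use PANDA at all; the helper names and the hard-coded fname value are hypothetical stand-ins for what fd_to_fname returns.

```python
# Stand-in for the value returned by fd_to_fname, which returns bytes.
fname = b"/etc/passwd"

def matches_passwd_str(name):
    # Comparison against a str literal: always False for a bytes input.
    return name == "/etc/passwd"

def matches_passwd_bytes(name):
    # Comparison against a bytes literal: matches as intended.
    return name == b"/etc/passwd"

print(matches_passwd_str(fname))
print(matches_passwd_bytes(fname))
```

Decoding the name first (fname.decode()) and comparing against a str would work equally well; the bytes literal is simply the smaller change.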

Points 1 and 2 are easy enough to address, but I have some trouble with point 3. When I add the following lines before the call to panda.run, PANDA crashes with a segfault upon calling panda.taint_enable.

panda.load_plugin("taint2")
panda.taint_enable()

Example output:

root@5a4e2fa4f486:/local# python taint.py
using generic x86_64
os_name=[linux-64-ubuntu:4.15.0-72-generic-noaslr-nokaslr]
PANDA[core]:os_familyno=2 bits=64 os_details=ubuntu:4.15.0-72-generic-noaslr-nokaslr
[PYPANDA] Panda args: [/usr/local/lib/python3.8/dist-packages/pandare/data/x86_64-softmmu/libpanda-x86_64.so -L /usr/local/share/panda /root/.panda/bionic-server-cloudimg-amd64-noaslr-nokaslr.qcow2 -display none -m 1024 -serial unix:/tmp/pypanda_s2hy2gyig,server,nowait -monitor unix:/tmp/pypanda_m90o4s3uq,server,nowait]
PANDA[osi_linux]:W> kernelinfo bytes [20-23] not read
PANDA[syscalls2]:using profile for linux x64 64-bit
PPP automatically loaded plugin syscalls2
PPP automatically loaded plugin taint2
PANDA[taint2]:propagation via pointer dereference ENABLED
PANDA[taint2]:taint operations inlining DISABLED
PANDA[taint2]:llvm optimizations DISABLED
PANDA[taint2]:taint debugging DISABLED
PANDA[taint2]:detaint if control bits 0 DISABLED
PANDA[taint2]:maximum taint compute number (0=unlimited) 0
PANDA[taint2]:maximum taintset cardinality (0=unlimited) 0
callstack_instr:  setting up threaded stack_type
PANDA[taint2]:taint2_enable_taint
taint2: Allocating small fast_shad (0 bytes) using malloc @ 0x29ac260.
taint2: Allocating small fast_shad (19200000 bytes) using malloc @ 0x7f9eb847a010.
taint2: Allocating small fast_shad (384 bytes) using malloc @ 0x2b9d590.
taint2: Allocating small fast_shad (3072 bytes) using malloc @ 0x2cf7dd0.
taint2: Allocating small fast_shad (1030272 bytes) using malloc @ 0x7f9ec8347010.
PANDA[taint2]:LLVM optimizations DISABLED
taint2: Initializing taint ops
taint2: Done initializing taint transformation.
Segmentation fault (core dumped)
root@5a4e2fa4f486:/local#

The stack trace according to GDB (see bottom of issue) seems to indicate that the crash happens in the PandaTaintVisitor class, called by the taint2_enable_taint function.

Am I missing something?

Details

Full (modified) script:
from pandare import Panda
panda = Panda(generic='x86_64')

@panda.queue_blocking
def driver():
    panda.revert_sync('root')
    print(panda.run_serial_cmd("grep root /etc/passwd"))
    panda.end_analysis()

panda.require("osi")
panda.require("osi_linux")

def fd_to_fname(cpu, fd):
    proc = panda.plugins['osi'].get_current_process(cpu)
    procname = panda.ffi.string(proc.name) if proc != panda.ffi.NULL else "error"
    fname_ptr = panda.plugins['osi_linux'].osi_linux_fd_to_filename(cpu, proc, fd)
    fname = panda.ffi.string(fname_ptr) if fname_ptr != panda.ffi.NULL else "error"
    return fname

@panda.ppp("syscalls2", "on_sys_read_return")
def read(cpu, tb, fd, buf, cnt):
    fname = fd_to_fname(cpu, fd)
    print(f"read {fname}")

    if fname == b"/etc/passwd": # <-- changed to bytes string
        for idx in range(cnt):
            panda.taint_label_ram(buf+idx, 1) # <-- added taint label 1 (not sure about the expected type?)

@panda.ppp("taint2", "on_branch2")
def something(addr, size, from_helper, tainted):
    print("Tainted branch")

# Added plugin loading/enabling
panda.load_plugin("taint2")
panda.taint_enable()

panda.run()

GDB stack trace

0x00007fc244615bb8 in llvm::PandaTaintVisitor::insertStateOp(llvm::Instruction&) ()
   from /usr/local/lib/panda/x86_64/panda_taint2.so
(gdb) bt
#0  0x00007fc244615bb8 in llvm::PandaTaintVisitor::insertStateOp(llvm::Instruction&) ()
   from /usr/local/lib/panda/x86_64/panda_taint2.so
#1  0x00007fc244619d35 in llvm::PandaTaintFunctionPass::runOnFunction(llvm::Function&) ()
   from /usr/local/lib/panda/x86_64/panda_taint2.so
#2  0x00007fc2446094c1 in taint2_enable_taint () from /usr/local/lib/panda/x86_64/panda_taint2.so
#3  0x00007fc259622ff5 in ?? () from /lib/x86_64-linux-gnu/libffi.so.7
#4  0x00007fc25962240a in ?? () from /lib/x86_64-linux-gnu/libffi.so.7
#5  0x00007fc2588810a7 in cdata_call (cd=<optimized out>, args=<optimized out>, kwds=<optimized out>)
    at src/c/_cffi_backend.c:3201
#6  0x00000000005f7506 in _PyObject_MakeTpCall ()
#7  0x0000000000570b8e in _PyEval_EvalFrameDefault ()
#8  0x00000000005f6ce6 in _PyFunction_Vectorcall ()
#9  0x000000000056b619 in _PyEval_EvalFrameDefault ()
#10 0x00000000005697da in _PyEval_EvalCodeWithName ()
#11 0x000000000068e547 in PyEval_EvalCode ()
#12 0x000000000067dbf1 in ?? ()
#13 0x000000000067dc6f in ?? ()
#14 0x000000000067dd11 in ?? ()
#15 0x000000000067fe37 in PyRun_SimpleFileExFlags ()
#16 0x00000000006b7c82 in Py_RunMain ()
#17 0x00000000006b800d in Py_BytesMain ()
#18 0x00007fc259d69083 in __libc_start_main (main=0x4ef140 <main>, argc=2, argv=0x7fffa36cf438, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffa36cf428) at ../csu/libc-start.c:308
#19 0x00000000005fb85e in _start ()

jstarink commented Dec 6, 2023

Based on taint_x86_64.py I have come to realize I should enable the taint analysis after the machine is set up, e.g., inside of a @panda.cb_after_machine_init callback. Is this the typical approach to take or is there another (better) way?
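
For reference, this is roughly the shape of what I mean. It is only a sketch: enable_taint_after_init is a hypothetical helper name, and FakePanda is a stand-in I wrote so the ordering can be checked without a guest image. With the real library, the object passed in would be the Panda instance from the script above, and I am assuming cb_after_machine_init can be used as a bare decorator there.

```python
def enable_taint_after_init(panda):
    """Defer panda.taint_enable() until the machine has been initialized."""
    @panda.cb_after_machine_init
    def setup(cpu):
        # Runs once the machine is set up, rather than before panda.run().
        panda.taint_enable()
    return setup


class FakePanda:
    """Hypothetical stand-in that only records the order of events."""

    def __init__(self):
        self.events = []
        self._init_callbacks = []

    def cb_after_machine_init(self, fn):
        # Mimics registering a callback through a decorator.
        self._init_callbacks.append(fn)
        return fn

    def taint_enable(self):
        self.events.append("taint_enable")

    def run(self):
        # Pretend the machine comes up, then fire the registered callbacks.
        self.events.append("machine_init")
        for fn in self._init_callbacks:
            fn(None)


fake = FakePanda()
enable_taint_after_init(fake)
fake.run()
print(fake.events)
```

In the script from the first comment, this would replace the top-level panda.load_plugin("taint2") and panda.taint_enable() calls placed before panda.run().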
