Python – Stop printing functions using disassembler in Python?


I have this function here, and after disassembling it, it looks like this:

def game_on():    
    def other_function():
        print('Statement within a another function')
    print("Hello World")
    sys.exit()
    print("Statement after sys.exit")

8           0 LOAD_CONST               1 (<code object easter_egg at 0x0000000005609C90, file "filename", line 8>)
              3 LOAD_CONST               2 ('game_on.<locals>.other_function')
              6 MAKE_FUNCTION            0
              9 STORE_FAST               0 (other_function)

10          12 LOAD_GLOBAL              0 (print)
             15 LOAD_CONST               3 ('Hello World')
             18 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             21 POP_TOP

11          22 LOAD_GLOBAL              1 (sys)
             25 LOAD_ATTR                2 (exit)
             28 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             31 POP_TOP

12          32 LOAD_GLOBAL              0 (print)
             35 LOAD_CONST               4 ('second print statement')
             38 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             41 POP_TOP
             42 LOAD_CONST               5 (None)
             45 RETURN_VALUE

Is there a way to modify the bytecode so that it does not print "Hello World"? In other words, I want to skip line 10 and move on to line 11.

There is a lot of information out there about things like inspect and settrace, but none of it is very straightforward. Does anyone have any information on this, or can someone point me to what I could do?

Solution

The best way to modify the bytecode of a function (well, assuming first that anything here could be called a good way…) is to use a third-party library. Currently, bytecode (https://github.com/vstinner/bytecode) seems to be the best, but for older versions of Python you may need byteplay; for 3.4 (which you seem to be using), specifically Seprex's 3.x port.

But you can do it all manually. It’s worth doing this at least once, just to make sure you understand everything (and understand why bytecode is such a cool library).

From the inspect documentation, you can see that a function is basically a wrapper around a __code__ object, with extra stuff (closure cells, default values, and reflective things like the name and type annotations), while a code object is a wrapper around a co_code bytestring full of bytecode, plus a whole bunch of extra stuff.
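To make that layering concrete, here is a small sketch that pokes at both levels. The function `greet` is a made-up example, not part of the question:

```python
def greet(name, punctuation="!"):
    return "Hello " + name + punctuation

code = greet.__code__            # the code object the function wraps

# Function-level extras live on the function itself...
print(greet.__defaults__)        # ('!',)
print(greet.__name__)            # greet

# ...while the compiled body and its metadata live on the code object.
print(type(code.co_code))        # <class 'bytes'>
print(code.co_varnames)          # ('name', 'punctuation')
print(code.co_consts)            # constants; exact contents vary by version
```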

So you'd think that removing some bytecode would just be a matter of:

del func.__code__.co_code[12:22]

Unfortunately, bytecode does everything in terms of offsets, from jump instructions to the line number table used for generating tracebacks. You could fix everything up, but it's a pain. So instead, you can replace the instructions you want to kill with NOPs. (Behind the scenes, the compiler and peephole optimizer put NOPs everywhere and then do a big fixup at the end. However, the code that performs that fixup is not exposed to Python.)

Also, bytecode is stored in immutable bytes, not mutable bytearrays, and the code objects themselves are immutable (and trying to change them behind the interpreter's back via C API hacks is a very bad idea). Therefore, you must build a new code object around the modified bytecode. But functions are mutable, so you can modify the function to point to the new code object.
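A quick way to see which layers are mutable is to try each one. This is only an illustrative sketch; `original` and `replacement` are hypothetical functions invented for the demonstration:

```python
def original():
    return "original"

def replacement():
    return "replacement"

code = original.__code__

# bytes does not support item deletion, so cutting bytecode out in place fails.
try:
    del code.co_code[0:2]
except TypeError as exc:
    print("bytes are immutable:", exc)

# Attributes of a code object are read-only.
try:
    code.co_code = b""
except AttributeError as exc:
    print("code objects are immutable:", exc)

# The function's __code__ slot, however, can be reassigned.
original.__code__ = replacement.__code__
print(original())   # replacement
```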


So, here is a function that can NOP out a series of instructions by offset:

import dis
import sys
import types

NOP = bytes([dis.opmap['NOP']])

def noprange(func, start, end):
    c = func.__code__
    cc = c.co_code
    if sys.version_info >= (3,6):
        if (end - start) % 2:
            raise ValueError('Cannot nop out partial wordcodes')
        nops = (NOP + b'\0') * ((end-start)//2)
    else:
        nops = NOP * (end-start)
    newcc = cc[:start] + nops + cc[end:]
    newc = types.CodeType(
        c.co_argcount, c.co_kwonlyargcount, c.co_nlocals, c.co_stacksize,
        c.co_flags, newcc, c.co_consts, c.co_names, c.co_varnames,
        c.co_filename, c.co_name, c.co_firstlineno, c.co_lnotab,
        c.co_freevars, c.co_cellvars)
    func.__code__ = newc

If you're wondering about the version check: in Python 2.x and 3.0-3.5, each instruction is 1 or 3 bytes long depending on whether it takes an argument, so a NOP is 1 byte; in 3.6+, every instruction is 2 bytes long, including NOP.
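If you want to check the instruction width on whatever interpreter you are running, a small sketch like this lists the instruction offsets. The function `sample` is made up for the example, and the exact offsets printed depend on the Python version:

```python
import dis
import sys

def sample():
    print("Hello World")

# Offsets at which each instruction starts.
offsets = [ins.offset for ins in dis.get_instructions(sample)]
print(sys.version_info[:2], offsets)

if sys.version_info >= (3, 6):
    # Wordcode: every instruction starts on an even offset,
    # and the bytecode length is a multiple of 2.
    assert all(off % 2 == 0 for off in offsets)
    assert len(sample.__code__.co_code) % 2 == 0
```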

Anyway, I've actually only tested this on 3.6, not 3.4 or 3.5, so hopefully I didn't get that part wrong. And hopefully I haven't used anything that was added to dis after 3.4. So, cross your fingers, and then:

noprange(game_on, 12, 22)

… will do exactly what you want. Or it will modify your function to raise a RuntimeError or segfault when you try to call it, but segfaults are part of learning, right? Anyway, if you call dis.dis(game_on), you should see the four instructions for line 10 replaced by a run of NOP lines, with the rest of the function unchanged, so try that before calling it.
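One caveat for newer interpreters: the positional types.CodeType call in noprange matches the older constructor signature, and the argument list changed in 3.8. A hedged alternative, assuming Python 3.8+ where code objects have a replace() method, is sketched below. The names `nop_code` and `demo` are hypothetical, and the patched code is only inspected here, not executed:

```python
import dis

NOP = dis.opmap["NOP"]

def nop_code(code, start, end):
    """Return a copy of `code` with bytes [start, end) replaced by NOPs.

    Assumes Python 3.8+ (for code.replace) and 2-byte instructions.
    """
    if start % 2 or end % 2 or not 0 <= start <= end <= len(code.co_code):
        raise ValueError("range must cover whole 2-byte instructions")
    nops = bytes([NOP, 0]) * ((end - start) // 2)
    new_bytes = code.co_code[:start] + nops + code.co_code[end:]
    return code.replace(co_code=new_bytes)

def demo():
    print("Hello World")

# Illustrative only: NOP out the first instruction and inspect the result.
# Do not call a function patched this carelessly.
patched = nop_code(demo.__code__, 0, 2)

print(patched.co_code[0] == NOP)                         # True
print(len(patched.co_code) == len(demo.__code__.co_code))  # True
```

To apply it for real, you would assign `func.__code__ = nop_code(func.__code__, start, end)`. Note that on 3.11+ the bytecode also contains inline CACHE entries after some instructions, so a range should cover an instruction together with its cache entries; this sketch does not check that.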


Once you're sure you've got this working, if you want to remove all of the instructions for one source line without having to dis the function and read the output manually, you can use findlinestarts to do it programmatically:

def nopline(func, line):
    linestarts = dis.findlinestarts(func.__code__)
    for offset, lineno in linestarts:
        if lineno > line:
            raise ValueError('No code found for line')
        if lineno == line:
            try:
                nextoffset, _ = next(linestarts)
            except StopIteration:
                raise ValueError('Do not nop out the last return')
            noprange(func, offset, nextoffset)
            return
    raise ValueError('No line found')

Now it’s just:

nopline(game_on, 10)

This has the nice advantage that you can use it in your code and it will work (or crash) the same way in 3.4 and 3.8, because the offsets may change between Python versions, but the way lines are numbered obviously won't.
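To see what findlinestarts actually yields before relying on it, here is a small sketch. The function `sample` is invented for the example; the offsets differ between Python versions, which is exactly why line numbers are the more stable handle:

```python
import dis

def sample():
    a = 1
    b = 2
    return a + b

first = sample.__code__.co_firstlineno
pairs = list(dis.findlinestarts(sample.__code__))

for offset, lineno in pairs:
    # On newer versions lineno may be None for code with no source line.
    rel = None if lineno is None else lineno - first
    print("offset", offset, "-> line", rel, "relative to the def line")
```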
