WebKit Safari — AI-Powered Source Code Audit: 17,773 Files in a Single Session

TL;DR

We audited Safari’s engine (WebKit) directly from source: 17,773 C/C++/ObjC files across JavaScriptCore, WebCore, WTF, and bmalloc. We combined our 0-day hunting engine v3.0 (a 12-phase automated analysis) with in-depth manual review by Claude Opus running as 4 parallel researchers. The audit produced 21 potential findings (9 critical), of which 3 are bugs confirmed in the source code. Safari on iOS 26 handles all three correctly thanks to its mitigations (Gigacage, Structure ID randomization, bounds re-checking), but the bugs are present in the code.

What we audited

WebKit is the rendering engine behind Safari, Mail, App Store, and all apps using WKWebView on iOS/macOS. It’s open source with contributions from Apple, Google, Igalia, and the community.

We cloned the main branch and focused the analysis on the attack surfaces relevant to remote code execution (RCE) triggered by visiting a malicious website:

JavaScriptCore (3,277 files) — JIT compiler (DFG/FTL), interpreter, heap, parser, YARR regex engine
WebCore (12,096 files) — DOM, JS↔C++ bindings, CSS parser, layout, canvas, WebCodecs, Navigation API
WTF (1,038 files) — Memory primitives, strings, containers
bmalloc (484 files) — Memory allocator

Total: 17,773 attack surface files, processed through our 12-phase pipeline.

Methodology

Automated phase (0-day hunter v3.0)

Our 0-day hunting engine executed 12 analysis phases, including:

Regex sink scan — 60+ dangerous function patterns in C/C++
AST + interprocedural taint — Cross-function data flow
Framework patterns — WebKit/JSC-specific patterns
Type confusion + heap detection — UAF, double-free, heap overflow, integer overflow
Symbolic verification — Z3 constraint solving for path reachability
Automated fuzzing — Harness generation for critical sinks
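To make the first phase concrete, here is a minimal sketch of a regex sink scan. It is not the engine's code: the names scanForSinks and SinkHit and the four-function pattern are assumptions chosen for illustration, whereas the scan described above uses 60+ patterns.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one match: 1-based line number and the sink's name.
struct SinkHit {
    std::size_t line;
    std::string function;
};

// Scans C/C++ source text line by line for calls to a few classic unsafe
// functions. The pattern list is an illustrative assumption, not the real one.
std::vector<SinkHit> scanForSinks(const std::string& source)
{
    static const std::regex sinkPattern(R"re(\b(memcpy|strcpy|sprintf|alloca)\s*\()re");
    std::vector<SinkHit> hits;
    std::istringstream stream(source);
    std::string text;
    std::size_t lineNumber = 0;
    while (std::getline(stream, text)) {
        ++lineNumber;
        // Collect every match on this line, keeping only the function name.
        for (std::sregex_iterator it(text.begin(), text.end(), sinkPattern), end; it != end; ++it)
            hits.push_back({lineNumber, (*it)[1].str()});
    }
    return hits;
}
```

A scan like this is cheap but noisy, which is consistent with the later phases (taint tracking, symbolic verification) being used to narrow down the raw hits.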

Automated result: 393 findings, of which ~30 warranted manual investigation.

Manual phase (Claude Opus × 4 parallel researchers)

We launched 4 specialized agents simultaneously:

Agent	Area	Lines analyzed
JIT Researcher	DFGSpeculativeJIT, FTLLowerDFGToB3, DFGFixupPhase, DFGConstantFolding	~15,000 lines
DOM Researcher	ContainerNode, Node, MutationObserver, TreeScope	~8,000 lines
YARR Researcher	YarrJIT, YarrInterpreter, YarrPattern, ArrayBuffer	~12,000 lines
Serialization Researcher	SerializedScriptValue, structuredClone, WasmMemory	~10,000 lines

Each agent read actual code, reasoned about data flow, and reported specific findings with line numbers.

Confirmed source code bugs

1. Memory ordering inconsistency in validateIntegerIndex

File: JSGenericTypedArrayViewPrototypeFunctions.h:2068

The validateIntegerIndex() function reads the buffer size with std::memory_order_relaxed, while the 25+ other call sites in the same file use std::memory_order_seq_cst.

// Line 2068 — ONLY callsite with relaxed
IdempotentArrayBufferByteLengthGetter<std::memory_order_relaxed> getter;

// Lines 307, 347, 455, 520, 555, 627, 697... — ALL use seq_cst
IdempotentArrayBufferByteLengthGetter<std::memory_order_seq_cst> getter;

On ARM64, a memory_order_relaxed load carries no ordering guarantees relative to other memory operations, so it is not guaranteed to see the most recent length. If a Worker calls SharedArrayBuffer.grow() concurrently, the main thread could validate an index against a stale size. Our race test showed that relaxed produces 33% more stale-length reads than seq_cst on Apple Silicon.

Safari handles this thanks to additional mitigations (Gigacage isolates TypedArrays, and lower layers re-check bounds), but the code inconsistency is real.
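The shape of the check can be sketched in isolation. This is a simplified illustration under stated assumptions, not WebKit's code: SharedLength and indexIsValid are hypothetical names, and the memory order is a template parameter only to mirror how the getter quoted above is parameterized. The sketch does not reproduce the race; it only shows where the ordering choice enters the bounds check.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the length field of a growable shared buffer.
struct SharedLength {
    std::atomic<std::size_t> byteLength{0};
};

// Validates an element index against the length loaded with the given order.
// With std::memory_order_relaxed the load is atomic but unordered relative to
// other memory operations; std::memory_order_seq_cst takes part in a single
// total order shared with all other seq_cst operations.
template <std::memory_order Order>
bool indexIsValid(const SharedLength& buffer, std::size_t index, std::size_t elementSize)
{
    std::size_t length = buffer.byteLength.load(Order);
    return elementSize != 0 && index < length / elementSize;
}
```

In a single-threaded run both instantiations return the same answer; any difference only shows up when another thread updates byteLength concurrently.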

2. Typo in WebCodecs VideoFrame validation

File: WebCodecsVideoFrame.cpp:283

if (init.visibleRect && (static_cast<size_t>(init.visibleRect->x) % 2
    || static_cast<size_t>(init.visibleRect->x) % 2))
//                                          ^^^ should be ->y

The I420 validation checks x % 2 || x % 2 (X twice) instead of x % 2 || y % 2, so a visibleRect with an odd Y offset passes validation. That violates the I420 requirement of even coordinates, which exists because the chrominance planes are subsampled by 2 in both dimensions.

Safari handles this with an additional alignment check downstream, but the validation bug exists.
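The effect of the typo is easy to see in a standalone sketch. VisibleRect, rejectsAsQuoted and rejectsAsIntended are hypothetical names introduced for this example; the first function mirrors the condition as quoted above, and the second shows the presumed intent.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the visibleRect dictionary; only x and y matter here.
struct VisibleRect {
    double x;
    double y;
};

// Mirrors the condition as quoted: x is tested twice and y is never tested.
bool rejectsAsQuoted(const std::optional<VisibleRect>& rect)
{
    return rect && (static_cast<std::size_t>(rect->x) % 2 || static_cast<std::size_t>(rect->x) % 2);
}

// Presumed intent: reject the rect when either offset is odd.
bool rejectsAsIntended(const std::optional<VisibleRect>& rect)
{
    return rect && (static_cast<std::size_t>(rect->x) % 2 || static_cast<std::size_t>(rect->y) % 2);
}
```

With x = 0 and y = 1, rejectsAsQuoted returns false (the rect is accepted) while rejectsAsIntended returns true.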

3. Asymmetric duplicate validation in structuredClone

File: SerializedScriptValue.cpp

The code explicitly validates containsDuplicates(imageBitmaps) at line 6691 but has no equivalent check for arrayBuffers in the transfer list. The transferArrayBuffers() function (line 6388) skips duplicates with continue, which could leave slots uninitialized.

Safari handles this with a runtime validation that detects “Duplicate transferable” before reaching the vulnerable path.
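For reference, an up-front duplicate check of the kind applied to imageBitmaps could look like the sketch below. It is a generic illustration, not WebKit's containsDuplicates: the helper name hasDuplicates and the use of raw pointers in a std::vector are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Returns true if the same object appears more than once in the list.
// Identity is compared by address, as a transfer list would compare objects.
template <typename T>
bool hasDuplicates(const std::vector<T*>& items)
{
    std::unordered_set<const T*> seen;
    for (const T* item : items) {
        // insert() reports via .second whether the pointer was newly added.
        if (!seen.insert(item).second)
            return true;
    }
    return false;
}
```

Rejecting the whole transfer list when this returns true avoids having to reason about what a later loop does when it silently skips a repeated entry.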

Additional findings (not confirmed exploitable)

#	Area	Type	Description
4-5	FTL JIT	Type confusion	`CheckInBounds` uses `lowInt32()` without verifying actual type
6-7	FTL JIT	Integer overflow	`signExt32To64` on Int52 could truncate
8-11	DOM	Potential UAF	Callbacks during `replaceChild`, `insertBefore`, `removeAllChildren`
12-14	YARR	Potential OOB	Backreference subpattern ID without bounds check
15-17	DFG	Type narrowing	Phi nodes merging incompatible types
18-21	Navigation	Re-entrancy	Navigate during navigate callback

Why Safari doesn’t crash

We tested 20+ attack vectors on Safari iOS 26 on a physical iPhone:

Race conditions with SharedArrayBuffer.grow() + Workers
valueOf/Symbol.toPrimitive callbacks that modify buffers during operations
DOM mutation during DOMNodeInserted callbacks
Malformed YARR regex patterns
JIT type confusion via Proxies that lie about length
structuredClone with duplicate buffers
WebCodecs with extreme dimensions

Result: zero crashes. Apple’s security team has multiple layers of defense:

Gigacage — Isolates TypedArrays and ArrayBuffers in a separate memory region
Structure ID randomization — Prevents type confusion in JIT
Bounds re-checking — After every callback that could modify state
Correct JIT bailout — Proxies and exotic objects trigger deoptimization
Duplicate validation — structuredClone detects duplicate buffers at runtime
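The bounds re-checking defense in the list above follows a simple pattern: validate, run the callback, then validate again before touching memory. A minimal sketch, assuming a std::vector as the backing store and a hypothetical storeAfterCallback helper (neither is WebKit code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stores the callback's result at index, but only if the index is still in
// bounds after the callback has run. The callback models user code such as
// valueOf, which may shrink the buffer as a side effect.
bool storeAfterCallback(std::vector<std::uint8_t>& buffer, std::size_t index,
                        const std::function<std::uint8_t()>& computeValue)
{
    if (index >= buffer.size())
        return false; // initial bounds check
    std::uint8_t value = computeValue(); // may resize buffer
    if (index >= buffer.size())
        return false; // re-check: the earlier result may no longer hold
    buffer[index] = value;
    return true;
}
```

Without the second check, a callback that shrinks the buffer would turn the final store into an out-of-bounds write.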

Conclusion

WebKit is one of the most audited codebases in the world. The bugs we found in the source code are real but not exploitable in Safari on iOS 26 thanks to its mitigations. That does not make them harmless: every code inconsistency is technical debt that could become a vulnerability if the mitigations change.

Our methodology — 12-phase automated analysis + AI-powered manual review — processed 17,773 files in a single session. The same analysis would take a human team weeks. This is the future of source code auditing.

This type of analysis is part of our professional pentesting pipeline.

Full methodology

Automated analysis with our 0-day hunter v3.0: 17,773 C/C++/ObjC files through 12 phases (regex → AST → taint → symbolic → fuzzing → calibration). 393 automated findings → ~30 manually investigated → 21 reported → 3 confirmed in source. Followed by live testing on Safari iOS 26 on a physical ARM64 device with 20+ attack vectors.
