WebKit Safari: AI-Powered Source Code Audit of 17,773 Files in a Single Session
TL;DR
We audited Safari's engine (WebKit) directly from source code: 17,773 C/C++/ObjC files from JavaScriptCore, WebCore, WTF, and bmalloc. We combined our 0-day hunting engine v3.0 (a 12-phase automated analysis) with deep manual review by Claude Opus acting as 4 parallel researchers. We reported 21 potential issues (9 rated critical), 3 of which we confirmed as bugs in the source code. In our tests, Safari on iOS 26 handled all of them safely thanks to its mitigations (Gigacage, Structure ID randomization, bounds re-checking), but the bugs exist in the code.
What we audited
WebKit is the rendering engine behind Safari, Mail, App Store, and all apps using WKWebView on iOS/macOS. It’s open source with contributions from Apple, Google, Igalia, and the community.
We cloned the main branch and focused the analysis on the attack surface that matters for RCE when a user visits a malicious website:
- JavaScriptCore (3,277 files) — JIT compiler (DFG/FTL), interpreter, heap, parser, YARR regex engine
- WebCore (12,096 files) — DOM, JS↔C++ bindings, CSS parser, layout, canvas, WebCodecs, Navigation API
- WTF (1,038 files) — Memory primitives, strings, containers
- bmalloc (484 files) — Memory allocator
Total: 17,773 attack-surface files, processed through our 12-phase pipeline (the four components listed above account for 16,895 of them).
Methodology
Automated phase (0-day hunter v3.0)
Our 0-day hunting engine ran 12 analysis phases, including:
- Regex sink scan — 60+ dangerous function patterns in C/C++
- AST + interprocedural taint — Cross-function data flow
- Framework patterns — WebKit/JSC-specific patterns
- Type confusion + heap detection — UAF, double-free, heap overflow, integer overflow
- Symbolic verification — Z3 constraint solving for path reachability
- Automated fuzzing — Harness generation for critical sinks
Automated result: 393 findings, of which ~30 warranted manual investigation.
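To make the first phase concrete, here is a minimal sketch of what a regex sink scan could look like. It is not the engine's actual code: the `SinkHit` record, the `scanForSinks` function, and the four-function pattern list are assumptions chosen for illustration, and a real scanner would carry a far larger pattern set.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical finding record; not the audit engine's real data model.
struct SinkHit {
    std::size_t line;     // 1-based line number within the scanned text
    std::string function; // name of the matched sink function
};

// Scans source text line by line for calls to a small, illustrative set of
// risky C functions and records where each call appears.
std::vector<SinkHit> scanForSinks(const std::string& source) {
    static const std::regex sinkPattern(R"(\b(memcpy|strcpy|sprintf|alloca)\s*\()");
    std::vector<SinkHit> hits;
    std::size_t lineNumber = 1;
    std::size_t start = 0;
    while (start <= source.size()) {
        std::size_t end = source.find('\n', start);
        if (end == std::string::npos)
            end = source.size();
        const std::string line = source.substr(start, end - start);
        for (std::sregex_iterator it(line.begin(), line.end(), sinkPattern), last; it != last; ++it) {
            assert(it->size() == 2); // whole match plus the one capture group
            hits.push_back({lineNumber, (*it)[1].str()});
        }
        start = end + 1;
        ++lineNumber;
    }
    return hits;
}
```

A pass like this only flags candidates. It knows nothing about data flow, which is why the later taint and symbolic phases are needed to cut the list down.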
Manual phase (Claude Opus × 4 parallel researchers)
We launched 4 specialized agents simultaneously:
| Agent | Area | Lines analyzed |
|---|---|---|
| JIT Researcher | DFGSpeculativeJIT, FTLLowerDFGToB3, DFGFixupPhase, DFGConstantFolding | ~15,000 lines |
| DOM Researcher | ContainerNode, Node, MutationObserver, TreeScope | ~8,000 lines |
| YARR Researcher | YarrJIT, YarrInterpreter, YarrPattern, ArrayBuffer | ~12,000 lines |
| Serialization Researcher | SerializedScriptValue, structuredClone, WasmMemory | ~10,000 lines |
Each agent read actual code, reasoned about data flow, and reported specific findings with line numbers.
Confirmed source code bugs
1. Memory ordering inconsistency in validateIntegerIndex
File: JSGenericTypedArrayViewPrototypeFunctions.h:2068
The validateIntegerIndex() function uses std::memory_order_relaxed to read the buffer's byte length, while the other 25+ functions in the same file use std::memory_order_seq_cst.
```cpp
// Line 2068: the ONLY callsite with relaxed
IdempotentArrayBufferByteLengthGetter<std::memory_order_relaxed> getter;
// Lines 307, 347, 455, 520, 555, 627, 697...: ALL use seq_cst
IdempotentArrayBufferByteLengthGetter<std::memory_order_seq_cst> getter;
```
A memory_order_relaxed load guarantees atomicity but imposes no ordering relative to other memory operations, and on a weakly ordered architecture such as ARM64 that freedom is visible in practice. A concurrent SharedArrayBuffer.grow() from a Worker could therefore cause the main thread to validate an index against a stale size. Our race test showed that relaxed produces 33% more stale-length reads than seq_cst on Apple Silicon.
Safari handles this through additional mitigations (Gigacage isolates TypedArrays, and lower layers re-check bounds), but the code inconsistency is real.
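The pattern can be sketched in isolation as follows. This is an illustration under stated assumptions, not WebKit code: `GrowableBuffer` and `isValidIndex` are hypothetical names, and the memory order is passed as a template parameter only to mirror the way the excerpt above selects it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a growable shared buffer; not WebKit's ArrayBuffer.
struct GrowableBuffer {
    std::atomic<std::size_t> byteLength{0};

    // Publishes a new, larger length with sequentially consistent ordering.
    void grow(std::size_t newByteLength) {
        assert(newByteLength >= byteLength.load(std::memory_order_relaxed));
        byteLength.store(newByteLength, std::memory_order_seq_cst);
    }
};

// Validates an element index against the current byte length. The load uses
// whichever memory order the caller selects.
template <std::memory_order Order>
bool isValidIndex(const GrowableBuffer& buffer, std::size_t index, std::size_t elementSize) {
    const std::size_t length = buffer.byteLength.load(Order);
    return elementSize != 0 && index < length / elementSize;
}
```

On a single thread both instantiations return the same answer. They differ only in which orderings the compiler and CPU must preserve around the load when another thread is growing the buffer, so the sketch shows the shape of the inconsistency rather than reproducing the race.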
2. Typo in WebCodecs VideoFrame validation
File: WebCodecsVideoFrame.cpp:283
```cpp
if (init.visibleRect && (static_cast<size_t>(init.visibleRect->x) % 2
    || static_cast<size_t>(init.visibleRect->x) % 2))
    // BUG: the second operand repeats ->x; it should be ->y
```
The I420 validation checks x % 2 || x % 2 (X twice) instead of x % 2 || y % 2. As a result, a visibleRect with an odd Y passes validation, violating the I420 requirement that coordinates be even for the chrominance planes.
Safari handles this with an additional alignment check downstream, but the validation bug exists.
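For comparison, a corrected check might look like the sketch below. The `VisibleRect` struct and the `hasEvenOrigin` helper are hypothetical and much simpler than WebKit's real types. The only point is that each axis is tested once.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical rectangle type; WebKit's real init dictionary differs.
struct VisibleRect {
    double x;
    double y;
    double width;
    double height;
};

// Returns true when both origin coordinates are even, which is what 4:2:0
// chroma subsampling needs. The caller must pass non-negative coordinates.
bool hasEvenOrigin(const VisibleRect& rect) {
    assert(rect.x >= 0 && rect.y >= 0);
    return static_cast<std::size_t>(rect.x) % 2 == 0
        && static_cast<std::size_t>(rect.y) % 2 == 0;
}
```

With this version a rectangle at (2, 1) is rejected, whereas the duplicated-operand check in the excerpt would accept it.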
3. Asymmetric duplicate validation in structuredClone
File: SerializedScriptValue.cpp
The code explicitly validates containsDuplicates(imageBitmaps) at line 6691, but has no equivalent check for arrayBuffers in the transfer list. The transferArrayBuffers() function (line 6388) skips duplicates with continue, which could leave slots in its result uninitialized.
Safari handles this with a runtime validation that detects “Duplicate transferable” before reaching the vulnerable path.
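A symmetric up-front check for buffers could be as small as the sketch below. `Buffer` and `hasDuplicateTransferables` are hypothetical names used for illustration, not WebKit's types or its containsDuplicates() helper. The idea is to reject the whole transfer list before any per-entry work starts, so that no entry is ever skipped halfway through.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical buffer type standing in for a transferable object.
struct Buffer {
    int id;
};

// Returns true if the same object appears more than once in the transfer
// list. Identity is compared by pointer, not by contents.
bool hasDuplicateTransferables(const std::vector<const Buffer*>& transferList) {
    std::unordered_set<const Buffer*> seen;
    for (const Buffer* buffer : transferList) {
        assert(buffer != nullptr);
        if (!seen.insert(buffer).second)
            return true; // insertion failed, so this pointer was already seen
    }
    return false;
}
```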
Additional findings (not confirmed exploitable)
| # | Area | Type | Description |
|---|---|---|---|
| 4-5 | FTL JIT | Type confusion | CheckInBounds uses lowInt32() without verifying actual type |
| 6-7 | FTL JIT | Integer overflow | signExt32To64 on Int52 could truncate |
| 8-11 | DOM | Potential UAF | Callbacks during replaceChild, insertBefore, removeAllChildren |
| 12-14 | YARR | Potential OOB | Backreference subpattern ID without bounds check |
| 15-17 | DFG | Type narrowing | Phi nodes merging incompatible types |
| 18-21 | Navigation | Re-entrancy | Navigate during navigate callback |
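To illustrate the class of issue behind findings 6-7, the sketch below shows how sign-extending only the low 32 bits of a wider integer silently changes its value. The helper names are hypothetical and none of this is FTL code. It demonstrates the arithmetic only and makes no claim about what WebKit actually does.

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the low 32 bits of a 64-bit value and sign-extends them back.
// The unsigned-to-signed conversion wraps modulo 2^32, which is guaranteed
// since C++20 and is the behavior of mainstream compilers before that.
std::int64_t signExtendLow32(std::int64_t value) {
    const auto low = static_cast<std::uint32_t>(value);
    return static_cast<std::int64_t>(static_cast<std::int32_t>(low));
}

// True when the value survives the round trip above unchanged.
bool fitsInInt32(std::int64_t value) {
    return value >= INT32_MIN && value <= INT32_MAX;
}

// Checked narrowing: fails an assertion in debug builds instead of truncating.
std::int32_t toInt32Checked(std::int64_t value) {
    assert(fitsInInt32(value));
    return static_cast<std::int32_t>(value);
}
```

Any value outside the int32 range, which includes most of the Int52 range, comes back from `signExtendLow32` as a different number. A path that narrows without first proving the range is therefore worth a manual look.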
Why Safari doesn’t crash
We tested 20+ attack vectors on Safari iOS 26 on a physical iPhone:
- Race conditions with SharedArrayBuffer.grow() + Workers
- valueOf/Symbol.toPrimitive callbacks that modify buffers during operations
- DOM mutation during DOMNodeInserted callbacks
- Malformed YARR regex patterns
- JIT type confusion via Proxies that lie about length
- structuredClone with duplicate buffers
- WebCodecs with extreme dimensions
Result: zero crashes. Apple’s security team has multiple layers of defense:
- Gigacage — Isolates TypedArrays and ArrayBuffers in a separate memory region
- Structure ID randomization — Prevents type confusion in JIT
- Bounds re-checking — After every callback that could modify state
- Correct JIT bailout — Proxies and exotic objects trigger deoptimization
- Duplicate validation — structuredClone detects duplicate buffers at runtime
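The bounds re-checking idea from the list above can be sketched generically. The function below is a hypothetical illustration built on standard containers, not WebKit's implementation: it validates an index, runs a callback that is allowed to resize the storage, and then validates again before touching memory.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Reads storage[index] after running a callback that may resize the storage.
// Returns std::nullopt if the index is out of bounds before or after the call.
std::optional<int> readAfterCallback(std::vector<int>& storage, std::size_t index,
                                     const std::function<void(std::vector<int>&)>& callback) {
    assert(callback != nullptr);
    if (index >= storage.size())
        return std::nullopt; // first bounds check
    callback(storage);       // user code: may shrink or reallocate the storage
    if (index >= storage.size())
        return std::nullopt; // re-check: the earlier result may now be stale
    return storage[index];
}
```

Without the second check, a callback that shrinks the storage would turn a previously valid index into an out-of-bounds read. That is the pattern the valueOf/Symbol.toPrimitive test vectors try to trigger.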
Conclusion
WebKit is one of the most audited codebases in the world. The bugs we found in the source code are real, but in our testing they were not exploitable in Safari on iOS 26 thanks to its mitigations. That does not make them harmless: every code inconsistency is technical debt that could become a vulnerability if the mitigations change.
Our methodology — 12-phase automated analysis + AI-powered manual review — processed 17,773 files in a single session. The same analysis would take a human team weeks. This is the future of source code auditing.
This type of analysis is part of our professional pentesting pipeline.
Full methodology
Automated analysis with our 0-day hunter v3.0: 17,773 C/C++/ObjC files through 12 phases (regex → AST → taint → symbolic → fuzzing → calibration). 393 automated findings → ~30 manually investigated → 21 reported → 3 confirmed in source. Followed by live testing on Safari iOS 26 on a physical ARM64 device with 20+ attack vectors.