
Anthropic's compiler eliminated redundant calls with a direct jump

16.05.2026 19:10 · hackernews

This is a summary and explanation of the Direct Caller Resolution optimization and the challenges involved, followed by a likely resolution of the final problem, which the source text cuts off before describing.

The goal is to eliminate the overhead of calling a "getter" function to retrieve a function pointer and checking capabilities on every call. Instead, the compiler attempts to call the function implementation directly.

  1. Signature Encoding: Every function signature is encoded into a unique 64-bit integer (e.g., 60125 for char* (*)(int, char*, double)).
  2. Fast Call: If the caller and callee signatures match exactly, the caller uses the register-based native calling convention directly.
    • No getter call.
    • No capability check.
    • No thread-local buffer overhead.
  3. Fallback (Thunks): If signatures don't match, or if the target is a "weak" symbol (common in C++), a known target callsite thunk is used. This thunk:
    • Calls the getter to get the real function pointer.
    • Checks the signature.
    • If mismatched or weak, calls the generic entry point (which handles translation and safety checks).
    • If matched, jumps to the fast path.
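
The fallback logic in step 3 can be sketched in plain C. This is only an illustration of the decision the thunk makes, not the compiler's actual runtime: the `function_object` layout, the getter functions, the `SIG_INT_INT_INT` value, and the `generic_calls` counter are all hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature ID for int (*)(int, int); the real encoding is
   not described in the text. */
#define SIG_INT_INT_INT UINT64_C(1001)

typedef int (*fast_fn)(int, int);                     /* native, register-based entry */
typedef int (*generic_fn)(const int *args, size_t n); /* generic entry: arguments in a buffer */

/* Hypothetical function object returned by a getter. */
typedef struct {
    uint64_t signature_id;  /* encoded signature of the implementation */
    int is_weak;            /* nonzero if the definition is a weak symbol */
    fast_fn fast_entry;
    generic_fn generic_entry;
} function_object;

typedef const function_object *(*getter_fn)(void);

int generic_calls = 0;      /* counts how often the slow path ran */

int add_fast(int a, int b) { return a + b; }

int add_generic(const int *args, size_t n) {
    generic_calls++;
    if (n != 2) return -1;  /* stand-in for the generic entry's safety checks */
    return add_fast(args[0], args[1]);
}

const function_object *get_add_strong(void) {
    static const function_object obj = { SIG_INT_INT_INT, 0, add_fast, add_generic };
    return &obj;
}

const function_object *get_add_weak(void) {
    static const function_object obj = { SIG_INT_INT_INT, 1, add_fast, add_generic };
    return &obj;
}

/* Sketch of a known-target callsite thunk: call the getter, check the
   signature, then pick the generic entry or the fast path. */
int call_via_thunk(getter_fn getter, uint64_t expected_sig, int a, int b) {
    const function_object *fo = getter();
    assert(fo != NULL);
    if (fo->is_weak || fo->signature_id != expected_sig) {
        int args[2] = { a, b };
        return fo->generic_entry(args, 2);  /* mismatched or weak: slow path */
    }
    return fo->fast_entry(a, b);            /* matched: fast path */
}
```

In this sketch, `call_via_thunk(get_add_strong, SIG_INT_INT_INT, 2, 3)` takes the fast path and leaves `generic_calls` untouched, while a mismatched signature ID or the weak getter routes the same call through `add_generic`.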

The main difficulty arises from how ELF loaders and linkers handle weak symbols versus strong symbols, particularly with C++ inline functions and COMDAT groups.

The "Weak vs. Weak" Problem:

  • Scenario: A header defines inline int foo(...). This creates a weak symbol. Multiple modules include this header.
  • Linking: The linker keeps only one definition of foo (and its COMDAT group).
  • The Trap: The optimization relies on creating a "known target callsite thunk" to handle mismatches.
    • If the actual implementation of foo is weak, and the thunk is also defined, the loader might pick the thunk as the winner instead of the actual function.
    • If the thunk wins, the function object inside the thunk points to the thunk itself.
  • Result: An infinite loop occurs because calling the function calls the thunk, which calls itself.
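
The self-reference can be reproduced with a toy model in C. This is a deliberately simplified stand-in, not real ELF resolution: the `resolve` rule (the first definition seen wins), the `weak_def` table, and the `MAX_DEPTH` recursion guard that stands in for "loops forever" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8  /* recursion guard standing in for "loops forever" */

typedef int (*entry_fn)(int depth);

/* One weak definition of a name, as the toy loader sees it. */
typedef struct {
    const char *name;
    entry_fn entry;
} weak_def;

/* Definitions visible to the toy loader, in load order. */
const weak_def *g_defs = NULL;
size_t g_def_count = 0;

/* Toy rule: among weak definitions of a name, the first one seen wins. */
entry_fn resolve(const char *name) {
    for (size_t i = 0; i < g_def_count; i++)
        if (strcmp(g_defs[i].name, name) == 0) return g_defs[i].entry;
    return NULL;
}

/* The actual implementation of foo. */
int real_foo(int depth) { (void)depth; return 7; }

/* The thunk looks up "foo" and calls whatever definition won. */
int thunk_foo(int depth) {
    if (depth >= MAX_DEPTH) return -1;  /* report the runaway recursion */
    entry_fn target = resolve("foo");
    assert(target != NULL);
    return target(depth + 1);
}

int run_with(const weak_def *defs, size_t count) {
    g_defs = defs;
    g_def_count = count;
    return thunk_foo(0);
}

/* The thunk is visible under the same name and is seen first: it wins. */
int demo_thunk_visible(void) {
    static const weak_def defs[] = { { "foo", thunk_foo }, { "foo", real_foo } };
    return run_with(defs, 2);
}

/* The thunk is not visible to the loader: only the real foo can win. */
int demo_thunk_hidden(void) {
    static const weak_def defs[] = { { "foo", real_foo } };
    return run_with(defs, 1);
}
```

In this model `demo_thunk_visible()` keeps resolving "foo" to the thunk itself until the guard trips and it returns -1, whereas `demo_thunk_hidden()` reaches `real_foo` and returns 7.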

To fix the infinite loop and ensure correctness across dynamic library boundaries and weak definitions, the text outlines a specific symbol mangling and visibility strategy:

  1. Hidden Visibility for Thunks: Define the "known target callsite thunk" with hidden visibility. The dynamic loader then never sees the thunk, so the thunk cannot accidentally win over the strong implementation.
  2. Split Naming Convention:
    • Implementation Name: Use a distinct name (e.g., pizlonatedFIP60125_foo). This is what the function object points to.
    • Exported Name: Use the standard mangled name (e.g., pizlonatedFI60125_foo).
    • Strong Alias: Only create a strong alias from the Exported Name to the Implementation Name if the function is strongly defined.
    • No Alias for Weak/Closures: If the function is weak (inline) or uses closure features (zcallee), do not create the alias.
  3. Handling COMDATs: Use COMDAT groups to ensure all related symbols (implementation, getter, object) are dropped together.
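
The naming and visibility scheme can be approximated with GCC/Clang attributes on an ELF target. This is a hand-written sketch, not compiler output: the two `pizlonated...` names are the examples from the list above, while the C-level parameter types, the placeholder body, and the `foo_callsite_thunk` name are assumptions. The real thunk would call the getter and check the signature rather than forward directly.

```c
#include <assert.h>
#include <stddef.h>

/* Implementation name: the symbol the function object would point to. */
char *pizlonatedFIP60125_foo(int n, char *s, double d) {
    (void)n;
    (void)d;
    assert(s != NULL);
    return s;  /* placeholder body for illustration */
}

/* Exported name: a strong alias to the implementation. Per the scheme
   above, it is emitted only because foo is strongly defined here. */
char *pizlonatedFI60125_foo(int n, char *s, double d)
    __attribute__((alias("pizlonatedFIP60125_foo")));

/* Hypothetical thunk with hidden visibility: it stays out of the dynamic
   symbol table, so the dynamic loader cannot select it. */
__attribute__((visibility("hidden")))
char *foo_callsite_thunk(int n, char *s, double d) {
    return pizlonatedFI60125_foo(n, s, d);
}
```

A weak or closure-using function would simply omit the alias declaration, leaving only the implementation name defined.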

The text ends with: "But this produces a second problem: COMDAT resolution rules may cause the linker to drop the pizlonatedFIP function we tried..."

Logical Resolution: Since the text cuts off, here is a plausible resolution for this scenario, inferred from the design described above rather than taken from the source:

If the linker drops the strong implementation (pizlonatedFIP...) because another COMDAT group won, but the callsite expects it, the system must gracefully degrade:

  1. No Strong Alias Case: If the function is weak (COMDAT), we do not emit the strong alias. Consequently, the callsite looks for the standard name.
  2. Linker Behavior: If the linker drops the implementation from one module, the "winning" COMDAT provides the code. However, the standard name (pizlonatedFI...) must resolve to the winning implementation.
  3. The Fix: The "known target callsite thunk" is designed to be weak and hidden.
    • If the strong implementation (pizlonatedFIP...) wins, an alias creates pizlonatedFI... -> pizlonatedFIP.... Calls go direct.
    • If the implementation is dropped (unlikely with COMDAT rules usually preserving one, but possible if strict filtering occurs), the callsite falls back to the thunk.
    • Since the thunk is hidden, it doesn't pollute the dynamic loader's view.
    • The thunk acts as the safe bridge, calling the getter, checking the winning weak symbol, and executing it safely, even if the direct name resolution fails or points to a dead symbol (handled by the getter logic).

Summary of the Fix: The system accepts that for weak/inline functions, direct calls are impossible without risking infinite loops or undefined behavior with weak symbols. Therefore, the code always goes through the thunk for weak definitions. The performance penalty is accepted for the rare cases of inline functions that cross module boundaries or are weak, while the vast majority of calls (strongly defined functions, same-module calls) utilize the direct fast path. The "infinite loop" is prevented by ensuring the thunk is hidden and the implementation alias is only created when the implementation is strong.
