VLSI Web

CDC Interview Questions for VLSI Interviews

By Raju Gorla · 8 March 2024 · Updated: 21 March 2026

I’ve compiled 40 CDC (Clock Domain Crossing) interview questions from real tapeouts, bug hunts, and the painful lessons learned when metastability bugs slipped into silicon. CDC is where the sneakiest failures hide—timing looks clean, logic is correct, but when the chip wakes up at 3 AM in production, clock domains are crossing wrong. These questions cover everything from the physics of metastability to formal verification techniques that catch synchronization bugs before fabrication.

💡 Who This Is For: Verification engineers, RTL designers doing cross-domain interfaces, and anyone prepping for roles at Synopsys, Cadence, Mentor, or companies like Intel, Apple, and NVIDIA where multi-clock SoCs are the norm. If you think “just add two flops” is enough CDC knowledge, you need this guide.

Table of Contents

  • Quick Navigation
  • Section 1: CDC Fundamentals (Q1–Q10)
    • Q1. What is a clock domain crossing? Why is it dangerous?
    • Q2. What is metastability? When does it occur? (with timing violation diagram showing setup window)
    • Q3. What is MTBF? Write the formula and explain each term
    • Q4. What does a two-flip-flop synchronizer actually do? (with timing diagram showing sync1, sync2, resolution window)
    • Q5. Why do we need 2 flops and not just 1? How many flops does military/space-grade design use?
    • Q6. What makes a flip-flop good for synchronization? (low tau, metastability-resistant cells)
    • Q7. Single-bit vs multi-bit CDC — what’s the fundamental problem with multi-bit?
    • Q8. What is the difference between a CDC and a timing constraint? (set_false_path vs actual synchronizer)
    • Q9. What are the failure modes of CDC design? (metastability propagation, data coherency, glitch)
    • Q10. What is clock domain crossing in the context of SoC design? (multiple VDD domains, gated clocks)
  • Section 2: CDC Synchronization Techniques (Q11–Q20)
    • Q11. How does a pulse synchronizer work? When should you use it? (req-ack handshake diagram)
    • Q12. Async FIFO for multi-bit CDC — architecture overview (write pointer in wclk, read pointer in rclk)
    • Q13. Why must async FIFO use Gray code pointers? (explain why only 1 bit changes)
    • Q14. How do you calculate async FIFO depth for a given CDC interface?
    • Q15. What is a bus encoding synchronizer? (one-hot vs binary encoding risk)
    • Q16. What is an enable synchronizer (open-loop vs closed-loop)?
    • Q17. How do you synchronize a reset across clock domains? (reset synchronizer with async assert, sync deassert)
    • Q18. Fast-to-slow domain crossing — what special precautions are needed?
    • Q19. Slow-to-fast domain crossing — is a 2-flop synchronizer sufficient?
    • Q20. What happens at CDC boundaries with power gating? How do isolation cells affect sync?
  • Section 3: CDC Verification (Q21–Q30)
    • Q21. How do CDC lint tools work? (SpyGlass CDC, Cadence JasperGold CDC)
    • Q22. What is structural CDC analysis vs functional CDC analysis?
    • Q23. What are common false CDC paths that tools flag? (how to waive correctly)
    • Q24. How do you simulate CDC behavior? (X-propagation model, forced metastability)
    • Q25. What is clock domain identification in CDC analysis?
    • Q26. What is a CDC reconvergence path and why is it dangerous?
    • Q27. How do you verify an async FIFO? (functional checks + formal)
    • Q28. What does a CDC sign-off flow look like? (lint → formal → simulation → review)
    • Q29. How do you handle generated clocks in CDC analysis?
    • Q30. What are black-box CDC issues when integrating third-party IP?
  • Section 4: Advanced CDC (Q31–Q40)
    • Q31. Token-based synchronization protocol for complex multi-bit CDC
    • Q32. CDC in NoC (Network-on-Chip) — how packets handle domain crossings
    • Q33. Multi-clock SoC CDC strategy — who owns CDC verification?
    • Q34. CDC and timing closure interaction — why CDC paths must be false-pathed
    • Q35. CDC with clock mux — challenges when source clock switches dynamically
    • Q36. FIFO-based CDC throughput analysis — when does latency become a problem?
    • Q37. ACE/CHI coherence protocol and CDC — how do snoop responses handle clock domains
    • Q38. CDC in a DDR PHY — how read data crosses from phy clock to system clock
    • Q39. What is the "same edge" vs "opposite edge" launch-capture in CDC context?
    • Q40. CDC whiteboard problem — interviewer gives you: 32-bit bus, 200MHz source, 333MHz destination, what do you do?
  • Interview Cheatsheet: CDC Most-Asked Topics by Company
  • Resources & Further Reading

Quick Navigation

Section 1: CDC Fundamentals (Q1–Q10) | Section 2: CDC Synchronization Techniques (Q11–Q20) | Section 3: CDC Verification (Q21–Q30) | Section 4: Advanced CDC (Q31–Q40) | Interview Cheatsheet

Section 1: CDC Fundamentals (Q1–Q10)

Q1. What is a clock domain crossing? Why is it dangerous?

A clock domain crossing (CDC) occurs when a signal crosses from one clock domain to another—meaning a signal generated in clock domain A must be safely sampled by a flip-flop in clock domain B, where clocks A and B are asynchronous (no defined phase relationship).

It’s dangerous because if you naively pass a signal directly from domain A to domain B without synchronization, the receiving flip-flop will eventually sample the signal while it is changing. If the data transition lands inside the flip-flop’s setup/hold window, the flip-flop may go metastable: it enters a forbidden state where the output hovers between logic levels for an unpredictable time before settling. This metastability can propagate through subsequent logic, causing incorrect data or state corruption. In a real chip, this might manifest as a random hang, data corruption, or a spurious interrupt. CDC bugs are particularly insidious because they’re rare (they strike only when timing aligns badly), hard to simulate (you need to force metastability), and catastrophic when they occur. This is why CDC verification is a separate discipline: you cannot catch these bugs with functional simulation alone; you need specialized structural analysis and formal tools.

📌 Note: CDC bugs are the sneakiest failures I’ve ever debugged. They don’t fail consistently, they don’t appear in simulation at slow clock speeds, and they can take weeks to manifest in the field. Prevention is infinitely better than trying to find and fix a CDC bug post-silicon.

Q2. What is metastability? When does it occur? (with timing violation diagram showing setup window)

Metastability is a transient state in which a flip-flop’s output voltage sits between logic 0 and 1 for an unpredictable time. It occurs when a flip-flop’s input violates setup or hold time: the data changes too close to the clock edge for the flip-flop to resolve cleanly.

In normal operation, setup time requires data to be stable at least T_setup before the clock edge. If data changes within the setup window (the T_setup period before the clock), the flip-flop cannot capture a clean value: the internal cross-coupled latch reaches an unstable equilibrium where Q and Q-bar both sit at an intermediate voltage (around VDD/2). From this balanced point, the output lingers near mid-rail and may wander or glitch. Eventually (after picoseconds to many nanoseconds, depending on noise and process variation), random noise or symmetry-breaking drives the flip-flop to resolve to 0 or 1. This recovery time is called the resolution time; the aperture around the clock edge within which a transition can trigger it is the metastability window. Across a large population of events, different samples resolve differently: some snap to 0, some to 1, some take longer. This is why metastability cannot be designed away; it can only be managed with time: give the flip-flop enough cycles to resolve before using its output.

Timing Violation Diagram (Setup Time Violation):

SAFE REGION (Setup Time Met):
CLK:  ___|‾‾‾‾|___|‾‾‾‾|___|‾‾‾
D:    XXXXX|‾‾‾‾‾‾|XXXXX      <- Data stable T_setup before clock
      <--T_setup-->
      Data is captured cleanly into Q (0 or 1, deterministic)


METASTABILITY REGION (Setup Time Violated):
CLK:  ___|‾‾‾‾|___|‾‾‾‾|___|‾‾‾
D:    XXXXX|~~‾‾|XXXXX         <- Data changes DURING setup window!
           <-Violation->
      Flip-flop output Q oscillates between 0 and 1 (metastable)
      Recovery time = time for Q to settle

Q:    ‾‾‾‾‾|////|‾‾‾‾|______ <- Oscillation, then resolution
          <-Metastable->  Resolves to 0 or 1 (unpredictable)

💡 Tip: The MTBF formula looks scary but the concept is simple: metastability becomes exponentially less likely the longer you give the flip-flop to resolve. Each added synchronizer stage buys a full clock period of extra resolution time and multiplies MTBF by an enormous exponential factor, which is how a single extra flop can take MTBF from seconds into years. Synchronizer flops are cheap insurance.

Q3. What is MTBF? Write the formula and explain each term

MTBF (Mean Time Between Failures) quantifies the likelihood that a synchronizer flip-flop will go metastable and fail to resolve before its output is used, causing a logic error. It’s a statistical measure: on average, how long does the design run between synchronization failures?

The formula is: MTBF = e^(t_r/τ) / (T_W × f_clk × f_data)

Where: t_r = resolution time available to the flip-flop before its output is used (for a two-flop synchronizer, roughly one clock period minus setup time), τ = the flip-flop’s metastability time constant (process-dependent, typically tens to hundreds of picoseconds), T_W = the metastability window (the aperture around the clock edge in which a data transition can trigger metastability), f_clk = frequency of the synchronizing clock, f_data = rate of data transitions on the CDC input. The exponential in the numerator grows with resolution time: add one clock period to t_r and MTBF multiplies by e^(T_clk/τ), an astronomical factor. The denominator shows MTBF falls as clock and data rates rise (more sampling events, more opportunities for metastability). Example with illustrative numbers: f_clk = 400 MHz, f_data = 1 MHz, T_W = 100 ps, τ = 100 ps, and t_r = 2.5 ns (one clock period) gives MTBF ≈ e^25 / (4 × 10^4) ≈ 1.8 × 10^6 seconds, about three weeks, which is marginal. Add a second resolution cycle (t_r = 5 ns) and the exponent doubles to e^50, pushing MTBF past 10^17 seconds. This is why high-bandwidth CDC needs adequate resolution time or special structures like async FIFOs.

📌 Note: MTBF is conservative. In practice, CDC failures are rare even if calculated MTBF seems low, because the formula uses worst-case constants. But in aerospace or medical devices, you calculate MTBF pessimistically and add margin—a 30-year MTBF is not acceptable if the device must run 50 years reliably.
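To make the exponential concrete, here is a small calculator sketch using the standard exponential MTBF model; the constants below are illustrative, not taken from any particular process library.

```python
import math

def mtbf_seconds(t_r_ns, tau_ns, t_w_ns, f_clk_hz, f_data_hz):
    """MTBF = e^(t_r/tau) / (T_W * f_clk * f_data).
    t_r: resolution time available, tau: device time constant,
    T_W: metastability window (all in ns); frequencies in Hz."""
    return math.exp(t_r_ns / tau_ns) / (t_w_ns * 1e-9 * f_clk_hz * f_data_hz)

# One 400 MHz cycle of resolution time (t_r = 2.5 ns, tau = 100 ps)
one_cycle = mtbf_seconds(2.5, 0.1, 0.1, 400e6, 1e6)
# One extra synchronizer stage doubles t_r and squares the exponential term
two_cycles = mtbf_seconds(5.0, 0.1, 0.1, 400e6, 1e6)
```

Doubling the resolution time multiplies MTBF by e^25 (about 7 × 10^10) with these numbers; that factor is the entire argument for multi-stage synchronizers.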

Q4. What does a two-flip-flop synchronizer actually do? (with timing diagram showing sync1, sync2, resolution window)

A two-flip-flop synchronizer (also called a “two-stage synchronizer”) takes an asynchronous input signal from domain A and produces a synchronized output in domain B by chaining two flip-flops in the B clock domain, allowing the first flop to go metastable without affecting the output.

The first flip-flop (sync1) is the “risky” one: it samples the asynchronous input and may go metastable. The second flip-flop (sync2) samples sync1’s output one clock cycle later. By the time sync1 drives sync2’s input, sync1 has had nearly a full clock cycle (the recovery time) to resolve from metastability. The resolution time of modern flip-flops is typically 100–500 ps, well under one clock cycle (e.g., 2.5 ns at 400 MHz). So sync1 settles to a stable 0 or 1 before sync2 samples it, and sync2 sees a clean input. The output of sync2 is metastability-safe and can be used in domain B without further delay. The cost: one to two cycles of latency from input to output. And every input transition is another opportunity for sync1 to go metastable, which is why MTBF degrades as the data rate rises.

Two-Flop Synchronizer Timing Diagram:

async_in:   ______|‾‾‾‾|_______|‾‾‾‾‾|___
                  ^sample1     ^sample2

clk_B:      _|‾|_|‾|_|‾|_|‾|_|‾|_|‾|_|‾|_
            cycle0 1   2   3   4

sync1:      ______|//|‾‾‾‾|___|‾‾‾‾|___
        metastable^   resolves to 1
                   <-recovery->

sync2:      ______________|‾‾‾‾|___|‾‾‾|
                          (sync1 settled, clean sample)

OUTPUT:     ______________|‾‾‾‾|___|‾‾‾|
            (safe for domain B logic)

Latency: 2 cycles (input to sync2 output)

💡 Tip: Two flops is the minimum for reasonable MTBF in commercial designs. Military specs often require three flops. Space applications (radiation hardened) might need four or more to handle SEU (single-event upsets) on top of metastability.
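The two-stage behavior can be sketched as a tiny cycle-based model (Python, purely illustrative: the first flop’s metastable resolution is modeled as a random choice between the old and new value, and sync2 only ever samples a settled bit):

```python
import random

class TwoFlopSync:
    """Cycle-based model of a two-flop synchronizer."""
    def __init__(self):
        self.sync1 = 0
        self.sync2 = 0

    def clock(self, async_in, changed_near_edge=False):
        out = self.sync2            # output is sync2 as of this edge
        self.sync2 = self.sync1     # sync2 samples the settled sync1
        if changed_near_edge:
            # setup/hold violated: sync1 resolves to the old OR new value
            self.sync1 = random.choice([self.sync1, async_in])
        else:
            self.sync1 = async_in
        return out
```

Whatever sync1 resolves to, the output only ever makes clean 0/1 transitions; the worst case is that the event is observed one cycle later.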

Q5. Why do we need 2 flops and not just 1? How many flops does military/space-grade design use?

One flip-flop is insufficient because a single synchronizer reserves no recovery time: its output feeds domain B logic immediately, so any sample taken during the metastability window can pass an unresolved value straight into downstream logic. The resulting MTBF is unacceptably low for commercial electronics (often measured in seconds to hours rather than years).

With two flops, sync1 has nearly one full clock cycle to recover before its output is sampled by sync2. Modern processes at commercial voltages (typical corner) achieve recovery times of 100–300 ps, well under one clock period (2.5 ns at 400 MHz). So two flops provides sufficient margin. However, this assumes: (1) the input doesn’t change every cycle (if it does, MTBF drops); (2) process variation doesn’t push recovery time close to the clock period; (3) voltage and temperature stay in the nominal range. Military and aerospace designs are far more conservative. Military-grade specs often require three flops for MTBF of years, with formal proof of safety margins. Space applications (NASA, ESA) frequently use four or more flops, because cosmic radiation can cause bit flips (SEUs) on top of metastability, and extra flops provide margin. Some space missions also triplicate the synchronizer and majority-vote the copies so a single radiation-induced error cannot corrupt the signal.

📌 Note: The “two flops minimum” rule comes from industrial experience, not theory. In theory, if you wait long enough after sync1 goes metastable, any downstream logic is safe. But in practice, two flops provides a reasonable buffer against process/voltage/temperature variation without excessive latency overhead.

Q6. What makes a flip-flop good for synchronization? (low tau, metastability-resistant cells)

A good synchronizer flip-flop minimizes recovery time (τ) and has design features that help it resolve quickly from metastability. These are typically different from the fastest flip-flops used in datapaths.

Key properties: (1) Low tau (τ) — the internal cross-coupled latch needs high regeneration gain relative to its capacitance (τ ≈ C/g_m), so the stored voltage diverges from the metastable point quickly. (2) Balanced latch design — the two nodes of the latch (storing Q and Q-bar) must be closely matched; asymmetry slows resolution. (3) Output drive strength matters little — resolution is set by the internal latch, not the output driver. A minimum-drive DFF_X1 often has τ as good as or better than a DFF_X4, because its internal loading is lower. (4) Dedicated synchronizer cells — some libraries offer DFFSYNC or similar metastability-hardened flops specifically characterized for CDC with optimized tau. Using a general-purpose fast flop for CDC can actually be worse than a specialized cell. When choosing synchronizer cells, check the liberty file for cells marked as synchronizers or for published tau specifications; if none are available, a minimum-strength (X1) flop is usually the safest default.

💡 Tip: I’ve seen projects spec’d with “fast flops for sync” and then wondered why CDC failure rates were high. Check your library data sheet for tau (recovery time) on candidate cells. A 40 nm process might have tau ~400 ps for a standard flop but only 150 ps for a specialized DFF_SYNC_X1. Use the better cell for CDC.

Q7. Single-bit vs multi-bit CDC — what’s the fundamental problem with multi-bit?

Single-bit CDC (one signal crossing domains) is straightforward: use a two-flop synchronizer, accept one-cycle latency, move on. Multi-bit CDC (an N-bit bus crossing domains) has a critical problem: metastability can resolve differently on different bits, leading to corrupted values.

Example: you send an 8-bit address {A7, A6, … A0} from domain A to domain B. If you naively put a two-flop synchronizer on each bit in parallel, every bit samples and synchronizes independently. Due to clock skew and differing setup/hold margins, one bit may settle to the new value while another still holds the old one, so the receiving domain can see a corrupted address (e.g., 0x42 when 0x45 was sent). Now the bus reads the wrong memory location, corrupting data. This is the fundamental problem: parallel synchronizers don’t guarantee data coherency. Solutions: (1) Encode the value — if it changes by one step at a time (as pointers and counters do), use Gray code so only one bit changes per transition and the receiver can only ever see the old or the new value; (2) Async FIFO — synchronize Gray-coded pointers in each direction and let the multi-bit data sit safely in shared memory; (3) Handshake protocol — send a single-bit “valid” signal, wait for a single-bit “ack” back, and transfer only after both sides agree, holding the data bus stable throughout. Single-bit control signals are always safer than multi-bit buses without a proper CDC architecture.

📌 Note: “Just add two flops to every bit” is a beginner mistake that causes mysterious data corruption bugs. Use an async FIFO or handshake for multi-bit data. Per-bit two-flop synchronizers are only for individual control bits (enable, valid, etc.).
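The coherency hazard is easy to enumerate: with an independent synchronizer per bit, the receiver can observe any mixture of old and new bits. A quick illustration (0x45/0x42 are the hypothetical values from the example above):

```python
from itertools import product

def observable_words(old, new, nbits):
    """Every word a receiver might see if each bit independently
    settles to either the old or the new value of the bus."""
    words = set()
    for pick in product((0, 1), repeat=nbits):
        w = 0
        for i, take_new in enumerate(pick):
            src = new if take_new else old
            w |= src & (1 << i)
        words.add(w)
    return words

naive = observable_words(0x45, 0x42, 8)        # 3 differing bits -> 8 words
gray_step = observable_words(0b010, 0b110, 3)  # 1 differing bit -> old or new only
```

This is exactly why Gray-coded pointers (Q13) are safe: with a single differing bit, the only observable values are the old word and the new word.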

Q8. What is the difference between a CDC and a timing constraint? (set_false_path vs actual synchronizer)

A CDC and a timing constraint are often confused, but they’re opposite approaches: a CDC is a circuit (a real synchronizer) that safely handles asynchronous signals; a timing constraint (set_false_path) tells the tool to ignore a timing path because you don’t care when the signal arrives.

set_false_path says: “this path never matters for timing, so don’t optimize it.” Example: a mode_sel signal that selects between mode A and mode B—only one is active at any time, so the path through the inactive mode is false. The tool ignores this path, potentially leaving it very slow. This works fine for paths that truly never affect functionality. But applying set_false_path to a raw cross-domain path does nothing about metastability: it merely silences the timing report, and the unsynchronized crossing will still fail in silicon. An actual CDC solution (a synchronizer circuit) solves the problem physically: the two-flop chain ensures metastability is resolved before the signal reaches domain B logic. The two are complements, not alternatives: even a properly synchronized crossing typically carries a false_path or set_max_delay exception, because STA cannot meaningfully time paths between asynchronous clocks. The mistake is using the constraint instead of the circuit. For cross-domain signals, always build a real synchronizer, then constrain the path; reserve bare false_path for paths that logically cannot execute (mode selects, test modes).

💡 Tip: Here’s the actual question I was asked at a CDC review: “If I have a signal crossing clock domains, should I use set_false_path or build a synchronizer?” The correct answer is always “synchronizer,” but I’ve seen engineers confused about this for years.

Q9. What are the failure modes of CDC design? (metastability propagation, data coherency, glitch)

CDC design has three distinct failure modes, each with different causes and symptoms:

1. Metastability propagation: A flip-flop goes metastable and its unresolved output drives subsequent logic before settling. Example: sync1 is metastable, and before it settles, a downstream AND gate uses its output, producing glitchy combinational behavior that falsely triggers logic further downstream. The fix: ensure sufficient recovery time between synchronizer stages, and never tap combinational logic off sync1.

2. Data coherency loss: Multiple bits of a bus synchronize to different values due to skew, producing a corrupted word. Example: address 0xAB intended, but bits [7:4] arrive on time while bits [3:0] are still settling, yielding 0xA? (corrupted). The fix: use Gray-coded pointers (only one bit changes per step) or async FIFOs (dedicated pointer synchronizers) for multi-bit data.

3. Glitches during metastability: A metastable output can produce multiple spurious transitions. If downstream edge-sensitive logic samples this (e.g., an edge detector), it may see several edges where only one was intended. The fix: let the signal settle fully through the synchronizer before feeding edge-sensitive logic; pulse detectors may need extra stages.

📌 Note: These failure modes are independent. You can fix metastability propagation but still have data coherency issues, or vice versa. Complete CDC verification requires addressing all three.

Q10. What is clock domain crossing in the context of SoC design? (multiple VDD domains, gated clocks)

In simple terms, a CDC is when a signal crosses between two clock domains. In a real SoC, the problem is far more complex because there are often multiple clock domains, voltage domains, and even gated clocks that add subtle timing dependencies.

Clock domains (obvious): a system clock (clk_sys), a peripheral clock (clk_peri), a DDR clock (clk_ddr). Each has its own frequency and phase; any signal crossing between them is a CDC. Voltage domains (subtle): some blocks run at 0.7V (low power), others at 1.2V (performance). A signal crossing a voltage boundary passes through level shifters and sees different logic thresholds, which shifts transition times and affects setup/hold margins. Gated clocks and power states (very subtle): if a clock can be gated off, a synchronizer’s clock edges may simply stop arriving. If sync1’s clock is gated while the destination keeps running, the destination keeps sampling a stale value, or an event is never captured at all. Combine this with power gating and you also need isolation cells and carefully sequenced enables and resets. Real SoCs have all three: multiple clocks, multiple voltages, multiple power states, and CDC verification in such designs is dramatically more complex. This is why companies like Apple and Qualcomm have dedicated CDC teams reviewing every cross-domain interface before tapeout.

💡 Tip: If you’re interviewing at a company with complex power management (mobile SoCs, AI chips), they’ll grill you on voltage and power domain crossings. Study the interaction between isolation cells, level shifters, and CDC synchronizers—this is where most bugs hide.

Section 2: CDC Synchronization Techniques (Q11–Q20)

Q11. How does a pulse synchronizer work? When should you use it? (req-ack handshake diagram)

A pulse synchronizer is used to safely pass a pulse (a one-cycle or brief event) from domain A to domain B. Simple synchronizers (two flops) introduce latency and miss short pulses, so pulse synchronization uses a handshake protocol: a request signal (req) in domain A, an acknowledgment signal (ack) in domain B, and feedback.

Operation: (1) Domain A asserts req when an event occurs. (2) req passes through a two-flop synchronizer into domain B, producing req_sync. (3) Domain B detects req_sync = 1, processes the event, and asserts ack. (4) ack passes through a two-flop synchronizer back into domain A as ack_back_sync; when domain A sees it, A deasserts req. (5) req_sync then falls in domain B, B deasserts ack, and once domain A sees ack_back_sync fall, the handshake is complete and a new event may be sent. This is a request-acknowledge handshake: neither side releases until the other confirms. Use pulse synchronizers when: (1) events are brief (shorter than a destination clock period), (2) missing an event is unacceptable (e.g., interrupt delivery), (3) you want guaranteed event capture. Downside: latency is ~4–5 cycles per event (and more before the channel is free again), so pulse synchronizers are slow but reliable for sporadic events, not for high-bandwidth continuous data.

Pulse Synchronizer (Request-Ack Handshake):

Domain A Clock:     _|‾|_|‾|_|‾|_|‾|_|‾|_|‾|_|‾|_|‾
Domain B Clock:     _|‾|_|‾|_|‾|_|‾|_|‾|_|‾|_|‾|_|‾ (asynchronous)

req (A):            _____|‾‾‾‾|_____
req_sync (B):       _____|~|‾‾‾‾|___  <- 2-cycle latency
ack (B):            ___________|‾‾‾‾|_
ack_back_sync (A):  _______________|‾‾|_

Time: 0  1  2  3  4  5  6  7  8  9  10 11

Total latency: ~4–5 cycles for a pulse to be recognized and handshaked

📌 Note: Pulse synchronizers are overkill for most designs. For a simple one-time event (like a power-on indication), the latency is acceptable. But for high-frequency events, use an async FIFO instead—it's more efficient.
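The round-trip cost can be counted with a toy lockstep model (a sketch only: both domains are stepped together and each two-flop synchronizer is a plain delay line, so real latency additionally depends on clock ratio and phase; signal names follow the diagram above):

```python
from collections import deque

def handshake_latency(sync_stages=2):
    """Cycles from req assertion until the source sees the acknowledgment."""
    req_pipe = deque([0] * sync_stages)  # req -> req_sync (into domain B)
    ack_pipe = deque([0] * sync_stages)  # ack -> ack_back_sync (into domain A)
    req, ack = 1, 0                      # domain A asserts req at cycle 0
    for cycle in range(1, 100):
        req_pipe.append(req)
        req_sync = req_pipe.popleft()
        ack_pipe.append(ack)
        ack_back = ack_pipe.popleft()
        ack = req_sync                   # B acknowledges what it sees
        if ack_back:                     # A finally sees the acknowledgment
            return cycle
    return None
```

With 2-stage synchronizers and a registered acknowledgment, this model counts 6 cycles before domain A can even begin releasing req; each extra synchronizer stage adds a cycle in each direction.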

Q12. Async FIFO for multi-bit CDC — architecture overview (write pointer in wclk, read pointer in rclk)

An asynchronous FIFO (async FIFO) is a dual-clock FIFO that safely transfers multi-bit data from a write clock domain (wclk) to a read clock domain (rclk) without losing data or introducing corruption.

Architecture: memory storage (typically dual-port SRAM or a register file, N words of M bits each), a write pointer (wptr, incremented in wclk), a read pointer (rptr, incremented in rclk), and two pointer synchronizers: one carrying rptr into the write domain (rptr_sync), one carrying wptr into the read domain (wptr_sync). The write side detects full by comparing wptr against rptr_sync; the read side detects empty by comparing rptr against wptr_sync (empty when they are equal). Each comparison happens locally within one domain, so no cross-domain arithmetic is needed. The pointers are carried across in Gray code (see Q13), so a synchronization error can affect only one bit at a time; at worst the receiving side sees a slightly stale pointer, which makes the full/empty decision conservative rather than wrong. Async FIFOs are the workhorse of multi-bit CDC: they handle high-bandwidth transfer (a word per cycle if needed) with well-defined latency and proven correctness. Almost every SoC with multiple clock domains uses async FIFOs on its data-path interconnects.

Async FIFO Block Diagram:

WRITE DOMAIN (wclk)          READ DOMAIN (rclk)
─────────────────────        ──────────────────
write_data[M-1:0]               read_data[M-1:0]
    |                               ^
    v                               |
    ┌─────────────────────────────┐
    |   SHARED MEMORY (SRAM)      |
    |   Addr: wptr (write)        |
    |   Addr: rptr (read)         |
    └─────────────────────────────┘

wptr_gray ──[2-FF Sync]──> wptr_gray_rclk
rptr_gray ──[2-FF Sync]──> rptr_gray_wclk

full  : wptr_gray compared against rptr_gray_wclk  <- checked in wclk domain
empty : (rptr_gray == wptr_gray_rclk)              <- checked in rclk domain

💡 Tip: Async FIFOs are robust for continuous data transfer, but they have limitations: finite depth (memory size), and latency (you can't know the write state instantly in the read domain). If you need very low latency (< 1 cycle) or very high bandwidth (> 1 word/cycle), async FIFO might not be the right tool—consider pipelining or special handshake protocols.
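The pointer comparisons can be sketched behaviorally with Gray-coded pointers carrying one extra wrap bit, in the style of the classic Cummings full/empty scheme (a model, not RTL; class and parameter names are mine):

```python
def bin2gray(b):
    return b ^ (b >> 1)

class FifoPointers:
    """Behavioral model of async-FIFO full/empty detection.
    Pointers are (addr_bits + 1) wide: the extra MSB distinguishes
    'wrapped around once' (full) from 'caught up' (empty)."""
    def __init__(self, addr_bits):
        self.nbits = addr_bits + 1
        self.wbin = 0  # write pointer (binary), lives in wclk domain
        self.rbin = 0  # read pointer (binary), lives in rclk domain

    def full(self, rbin_sync):
        # full: Gray pointers equal except for the top two bits inverted
        mask = (1 << self.nbits) - 1
        top2 = 0b11 << (self.nbits - 2)
        return bin2gray(self.wbin & mask) == (bin2gray(rbin_sync & mask) ^ top2)

    def empty(self, wbin_sync):
        # empty: synchronized write pointer equals the local read pointer
        return bin2gray(self.rbin) == bin2gray(wbin_sync)
```

Because a synchronized pointer can only ever look stale (the old value), a lagging rptr_sync makes full() assert early: conservative, never unsafe.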

Q13. Why must async FIFO use Gray code pointers? (explain why only 1 bit changes)

Gray code is a binary number system where consecutive values differ in only one bit. This property is essential for async FIFO: when a pointer is synchronized across clock domains, only one bit can be in a metastable state, limiting errors to a single bit position rather than corrupting multiple bits.

Example: standard binary pointers count 0, 1, 2, 3, 4, … (binary: 000, 001, 010, 011, 100, …). Incrementing from 3 (011) to 4 (100) changes three bits simultaneously. If synchronization catches those bits mid-flight, each can independently land on its old or new value, so the receiving domain might see any mixture of the two (011, 001, 110, 100, …): multiple plausible values, causing data coherency corruption. Gray code counts differently: the codes for counts 0 through 7 are 000, 001, 011, 010, 110, 111, 101, 100. Each step changes exactly one bit. Advancing from count 3 (code 010) to count 4 (code 110) flips only the top bit. Even if that bit is metastable during synchronization, the error is bounded: the receiving domain sees either 010 (the old pointer) or 110 (the new pointer), both valid positions; it can never observe an invalid in-between value, only a slightly stale one. Async FIFOs compare Gray-coded pointers to detect full/empty, and this old-or-new error model makes the decision conservative: the FIFO may momentarily look a little fuller or emptier than it really is, but it never corrupts data flow. Gray code is so fundamental to CDC that it's hard to imagine an async FIFO without it.

Binary vs Gray Code Transition (2-bit to 3-bit):

BINARY:   00 -> 01 -> 10 -> 11 -> 100
Changes:      1 bit   2 bits  1 bit 3 bits (risky!)

GRAY:     00 -> 01 -> 11 -> 10 -> 110 -> 111 -> 101 -> 100
Changes:      1 bit  1 bit  1 bit  1 bit  1 bit  1 bit  1 bit (safe!)

If synchronization catches the one changing Gray bit (count 2 -> 3, code 11 -> 10):
  Expected: 10 (count 3)
  Possible: 11 (count 2, old) or 10 (count 3, new) <- both valid, error bounded

If binary has 3 bits changing (011 -> 100):
  Possible: any mixture of old/new bits, 000 through 111 <- multi-bit errors, invalid values!

📌 Note: Gray code is so standard in async FIFOs that if an interviewer asks why a FIFO design didn't use it, the answer should be "it was a mistake." Use Gray code for any pointer or counter that crosses clock domains.
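Binary-to-Gray conversion is a one-liner each way. A small sketch, including a check of the single-bit-change property across the wraparound:

```python
def bin2gray(b):
    # XOR the count with itself shifted right: adjacent counts
    # then differ in exactly one bit
    return b ^ (b >> 1)

def gray2bin(g):
    # fold the XOR chain back down from the MSB
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# every increment of a 4-bit pointer, including the 15 -> 0 wraparound,
# flips exactly one Gray bit
single_bit_safe = all(
    bin(bin2gray(i) ^ bin2gray((i + 1) % 16)).count("1") == 1
    for i in range(16)
)
```

The wraparound case is why FIFO pointer widths are powers of two: the single-bit property of b ^ (b >> 1) only holds across the wrap when the count rolls over at a power-of-two boundary.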

Q14. How do you calculate async FIFO depth for a given CDC interface?

Async FIFO depth must account for the maximum burst of data that can arrive while the read side is slow or stalled. The calculation depends on the clock ratios and data rate imbalance.

Formula: Minimum FIFO Depth = (Max Write Burst Length) + (Latency Margin)

Latency margin accounts for: (1) Pointer synchronization latency — the write side sees rptr_sync 2–3 cycles late, so it briefly underestimates how much the reader has drained; (2) Clock/rate mismatch — the FIFO must absorb the worst-case burst minus whatever the read side drains during it. Example: wclk = 400 MHz, rclk = 100 MHz (4:1 ratio), one word per cycle on each side. During a 40-word write burst (100 ns), the read side drains only 10 words, so the FIFO must absorb 30 words; add ~4 words of synchronization margin to get 34, rounded up to 64 for power-of-2 SRAM sizing. In practice, designers over-provision anyway: underprovisioning (FIFO overflow) is a catastrophic bug; over-provisioning is just wasted silicon. Always err on the side of depth.

💡 Tip: If you're unsure about FIFO depth, run a simulation with worst-case clock patterns (write side maxes out, read side stalls) and measure the peak occupancy. Use that +margin as your depth. Post-silicon, you can observe FIFO depth with debug signals and confirm you didn't run out of space.
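The sizing arithmetic can be wrapped in a quick helper (a conservative sketch; the parameter names are mine, not a standard API):

```python
import math

def min_fifo_depth(burst_words, f_wr_mhz, f_rd_mhz,
                   words_per_rclk=1, sync_margin=4):
    """Depth = burst residue the reader cannot drain during the burst,
    plus synchronization-latency margin, rounded up to a power of two."""
    t_burst_us = burst_words / f_wr_mhz                  # burst duration
    drained = math.floor(t_burst_us * f_rd_mhz * words_per_rclk)
    residue = max(burst_words - drained, 0)
    depth = residue + sync_margin
    return 1 << math.ceil(math.log2(depth))

depth = min_fifo_depth(40, 400, 100)  # 40-word burst, 400 MHz write, 100 MHz read
```

For the 40-word burst at 4:1 clocks this yields 64; treat the result as a floor and over-provision where silicon allows.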

Q15. What is a bus encoding synchronizer? (one-hot vs binary encoding risk)

A bus encoding synchronizer is used when you must cross a multi-bit control bus (like a mode selector or command) where you want to ensure coherent data across all bits after synchronization.

The problem: if you naively synchronize each bit independently, metastability can resolve differently on different bits, resulting in invalid states. Example: you have a 3-bit command {cmd2, cmd1, cmd0}. Valid values are 001, 010, 100 (one-hot encoding). If you synchronize each bit with independent two-flop synchronizers, and metastability affects bits 1 and 2, you might end up with 110 (two hot), which is invalid. The solution: use an encoding where corruption is limited. One-hot encoding (only one bit high) naturally limits errors: if synchronization corrupts one bit, you get at worst two bits high, which is detectable as an error. Gray encoding (single-bit changes between values) similarly limits corruption to a single-bit error. Binary encoding (0, 1, 2, 3, ...) is risky because transitions change multiple bits. For multi-bit control buses, use one-hot or Gray-coded encoding, never plain binary. Then, use a decoder or checker to validate the value after synchronization—if it's invalid (e.g., zero-hot or multi-hot in one-hot encoding), drive the bus to a safe default state (often 001 or 100, depending on the command).

📌 Note: One-hot encoding uses more bits than binary (8 bits for 8 states, versus 3 bits for binary). Gray code is more compact and is preferred for wide buses. But one-hot's error detection property (zero-hot or multi-hot is invalid) makes it appealing for safety-critical interfaces.

Q16. What is an enable synchronizer (open-loop vs closed-loop)?

An enable synchronizer is used to safely pass an asynchronous enable signal (e.g., power-on-enable, clock enable, block enable) from one domain to another. Enable synchronizers can be open-loop (simpler, no feedback) or closed-loop (with acknowledgment).

Open-loop synchronizer: The enable signal is directly synchronized using a two-flop chain. Simple and low-latency (2 cycles). Use when: latency is critical and you're sure the enable signal is stable (doesn't toggle while being synchronized). Downside: if the enable signal changes near a clock edge, the first flop metastabilizes, and there's no acknowledgment that synchronization succeeded. Closed-loop synchronizer: The synchronized enable drives a logic block, which sends back an acknowledgment signal that is synchronized back to the source domain. The source domain waits for acknowledgment before releasing the enable signal, ensuring the destination domain has fully adopted the new state. Use when: you must guarantee the enable has taken effect (e.g., flushing a pipeline before power-gating, ensuring a block is disabled before isolation cells are inserted). Closed-loop adds latency (4+ cycles) but provides certainty. In power management, closed-loop enable synchronizers are standard: you don't power-gate a block until you're 100% sure it's idle and the receiving domain has acknowledged the enable=0 state.
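A closed-loop enable can be sketched as a request/acknowledge loop built from two two-flop synchronizers (names are illustrative):

```verilog
// Closed-loop enable synchronizer sketch (illustrative names).
// The source domain asserts enable_req and holds it until enable_ack_src returns.
module closed_loop_enable (
  input  wire clk_src, clk_dst, rst_n,
  input  wire enable_req,     // request from the source domain (held until ack)
  output wire enable_dst,     // synchronized enable seen by destination logic
  output wire enable_ack_src  // acknowledgment, synchronized back to the source
);
  reg [1:0] req_sync;  // request into the destination domain
  reg [1:0] ack_sync;  // acknowledgment back into the source domain

  always @(posedge clk_dst or negedge rst_n)
    if (!rst_n) req_sync <= 2'b00;
    else        req_sync <= {req_sync[0], enable_req};

  assign enable_dst = req_sync[1];

  // The destination's adopted enable is echoed back as the acknowledgment.
  always @(posedge clk_src or negedge rst_n)
    if (!rst_n) ack_sync <= 2'b00;
    else        ack_sync <= {ack_sync[0], enable_dst};

  assign enable_ack_src = ack_sync[1];
endmodule
```

The source domain holds enable_req steady until enable_ack_src matches it; only then may it change the request again (a four-phase handshake).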

💡 Tip: For power domain transitions (clock gating, isolation cell insertion), always use closed-loop enables. Open-loop is fine for simple clock enable or test mode selection, but power management demands certainty.

Q17. How do you synchronize a reset across clock domains? (reset synchronizer with async assert, sync deassert)

Asynchronous resets (active-low or active-high) must be synchronized very carefully: assertion can be asynchronous (immediate), but deassertion must be synchronized to each clock domain to avoid metastability.

Design: Async assert: When system reset is asserted, all flops across all domains are reset immediately to their initial state (asynchronous reset). This is safe because assertion forces every flop to a known state regardless of clock alignment—no clock edge is needed to enter reset. Sync deassert: When reset is released, each clock domain must see the deassertion synchronized to its own clock edge. A two-flop synchronizer in each clock domain samples the reset release. This ensures each flop waits for a clean clock edge before resuming operation, avoiding metastability from reset deassertion. If you deassert reset asynchronously, flops might see the release too close to a clock edge, violating reset recovery/removal timing and potentially glitching early logic. Architecture: A global async_reset_n signal is asserted immediately by the reset controller. In each clock domain, a two-flop synchronizer (sync1, sync2) with its data input tied to 1 and its flops asynchronously cleared by async_reset_n captures the release. The domain reset is reset_n = sync2: it asserts (goes low) immediately when async_reset_n asserts, but deasserts only after sync2 samples the release on a clean clock edge. This is the universal CDC reset strategy, used in virtually every SoC.

Async Reset Synchronizer (Async Assert, Sync Deassert):

async_reset_n:    ‾‾‾‾\______/‾‾‾‾‾‾‾‾‾‾‾‾  (asserts asynchronously, e.g., from a debounced reset button)
                      ^assert (immediate)

Domain A:
sync1 (clk_A):    ‾‾‾‾\________/‾‾‾‾‾‾‾‾‾‾  (cleared async; sets on 1st clk_A edge after release)
sync2 (clk_A):    ‾‾‾‾\__________/‾‾‾‾‾‾‾‾  (sets on 2nd clk_A edge)
reset_n (clk_A):  ‾‾‾‾\__________/‾‾‾‾‾‾‾‾  (= sync2: async assert, deassert aligned to clk_A)

Domain B:
sync1 (clk_B):    ‾‾‾‾\_________/‾‾‾‾‾‾‾‾‾  (different clock, different sampling instants)
sync2 (clk_B):    ‾‾‾‾\____________/‾‾‾‾‾‾  (sets on 2nd clk_B edge)
reset_n (clk_B):  ‾‾‾‾\____________/‾‾‾‾‾‾  (deassert aligned to clk_B, at a different time than domain A)
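In RTL, the standard structure is a two-flop chain whose data input is tied high and whose flops are asynchronously cleared; a minimal sketch:

```verilog
// Async-assert / sync-deassert reset synchronizer (one instance per clock domain).
module reset_sync (
  input  wire clk,
  input  wire async_reset_n,  // global reset: asserts (low) asynchronously
  output wire reset_n         // domain reset: deasserts on a clean clk edge
);
  reg sync1, sync2;

  always @(posedge clk or negedge async_reset_n)
    if (!async_reset_n) {sync2, sync1} <= 2'b00;         // assert immediately
    else                {sync2, sync1} <= {sync1, 1'b1}; // release through 2 flops

  assign reset_n = sync2;  // low at once on assert; high 2 clk edges after release
endmodule
```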

📌 Note: Reset synchronizers are so critical that most companies have dedicated reset distribution IP blocks. Don't roll your own—use the library's verified reset synchronizer if available.

Q18. Fast-to-slow domain crossing — what special precautions are needed?

Fast-to-slow domain crossing (e.g., 400 MHz to 100 MHz) is simpler than slow-to-fast in terms of metastability resolution—the slow receiving clock gives the synchronizer a long settling window—but it introduces challenges with missed events and overflow.

The fast domain produces data at a high rate; the slow domain receives it at a lower rate. If you just put a simple synchronizer, the slow domain will miss some fast-domain transitions because the synchronizer output might not toggle fast enough for the slow clock to sample every change. Example: fast clock toggles 4 times per slow clock cycle. If you send a pulse from fast to slow, the slow clock might miss it entirely if the pulse is too short. Solution: Coalesce data — either buffer the data (use an async FIFO) or ensure the data signal stays high for at least 2 slow clock cycles so the slow synchronizer is guaranteed to capture it. Validate with MTBF calculations — fast-domain data has more transitions, so MTBF is worse. Check that you still have acceptable MTBF (years to decades) even with high data rate. In practice, fast-to-slow is often handled by an async FIFO (write side in fast domain, read side in slow domain), which naturally handles the rate imbalance and provides backpressure (slow side can tell fast side when it's full).
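One common way to guarantee a short fast-domain event survives the crossing is a toggle (event) synchronizer, which converts each pulse into a level change; a sketch with illustrative names:

```verilog
// Toggle-based pulse synchronizer: a fast-domain pulse becomes a level
// toggle, which the slow domain synchronizes and edge-detects.
module pulse_sync (
  input  wire clk_fast, clk_slow, rst_n,
  input  wire pulse_fast,   // single-cycle pulse in the fast domain
  output wire pulse_slow    // single-cycle pulse in the slow domain
);
  reg       toggle;   // flips on every fast-domain pulse
  reg [2:0] sync;     // two sync flops plus one edge-detect flop

  always @(posedge clk_fast or negedge rst_n)
    if (!rst_n)          toggle <= 1'b0;
    else if (pulse_fast) toggle <= ~toggle;

  always @(posedge clk_slow or negedge rst_n)
    if (!rst_n) sync <= 3'b000;
    else        sync <= {sync[1:0], toggle};

  // Any edge on the synchronized toggle marks one event.
  assign pulse_slow = sync[2] ^ sync[1];
endmodule
```

Caveat: successive fast-domain pulses must be spaced more than roughly two slow clock periods apart, or events merge—if that can't be guaranteed, use an async FIFO instead.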

💡 Tip: If you're crossing from a fast clock to a slow clock, prefer an async FIFO over a simple synchronizer. The FIFO naturally handles rate adaptation and data buffering, preventing data loss.

Q19. Slow-to-fast domain crossing — is a 2-flop synchronizer sufficient?

Slow-to-fast domain crossing (e.g., 100 MHz to 400 MHz) is more forgiving in terms of MTBF but requires careful consideration of data stability and sampling.

The slow domain produces data infrequently; the fast domain must reliably sample it. A two-flop synchronizer in the fast clock domain is sufficient for metastability protection (one fast clock period of resolution time between the two flops). However, there's a subtlety: the data might be held stable for multiple slow clock cycles, and the fast clock will sample it during every cycle. This is fine—the synchronizer just repeatedly captures the same stable value, which is correct. The MTBF of slow-to-fast crossing is typically very good (years to centuries) because the data transition rate is low, so sync1 has ample time to resolve before another transition occurs. The catch: if the slow domain's data changes near a fast clock edge, sync1 might go metastable, and the available recovery time is only one fast clock period—so at very high destination frequencies the per-event resolution margin shrinks. In practice, two flops is almost always sufficient for slow-to-fast CDC. You might use three flops for extra margin at multi-GHz clock rates (where one period leaves little resolution time), but in most cases, slow-to-fast is the "easy" CDC problem.
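The basic two-flop synchronizer discussed above, as a minimal sketch (the ASYNC_REG attribute is a vendor-specific hint, shown as an example):

```verilog
// Standard two-flop synchronizer for a single-bit, slow-to-fast crossing.
module sync_2ff (
  input  wire clk_fast, rst_n,
  input  wire d_async,   // single-bit signal from the slow domain
  output wire q_sync     // safe to use in the fast domain
);
  // Vendor-specific attribute (e.g., Xilinx) asking tools to keep the
  // two flops adjacent for maximum metastability resolution time.
  (* ASYNC_REG = "TRUE" *)
  reg [1:0] sync;

  always @(posedge clk_fast or negedge rst_n)
    if (!rst_n) sync <= 2'b00;
    else        sync <= {sync[0], d_async};

  assign q_sync = sync[1];
endmodule
```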

📌 Note: Slow-to-fast is asymmetrically simpler than fast-to-slow. If you have a choice in your interface direction, pick slow-to-fast when possible. It's more forgiving and less likely to introduce surprises.

Q20. What happens at CDC boundaries with power gating? How do isolation cells affect sync?

Power gating (turning off a clock and/or power to a block) introduces additional CDC complexity: when a block powers down, its flops lose state, and synchronizers to/from that block must be isolated and reset properly.

Scenario: Block A (always-on) communicates with Block B (power-gated). When Block B enters power-gating, its clock is gated off and its power rails are cut. Block A's synchronizers that receive data from Block B no longer see valid transitions (Block B's flops are offline). Isolation cells (special gates) at the power domain boundary hold outputs to safe values (usually 0 or 1) so that Block A doesn't see floating signals. The synchronizers to Block B must be reset or held when Block B powers down, because the synchronized data is stale and no longer reflects Block B's state (which is gone). On power-up, Block B's clock and reset must be re-synchronized in Block A's domain via a reset synchronizer (see Q17) before data communication resumes. This requires careful state machine design: Block A must know when Block B is up, down, or in transition. Synchronizers spanning power domains are substantially more complex than simple clock-domain synchronizers. Many teams use dedicated power-domain crossing IP (like ARM's Power Domain Isolation IP) rather than rolling their own, because the state machine logic is intricate.

💡 Tip: Power domain crossings are a whole category of their own—often called PDC (Power Domain Crossing) separate from CDC. If you're interviewing at a company doing power management (mobile, edge AI), study isolation cells and power domain sequencing carefully. This is beyond basic CDC and is a key differentiator.

Section 3: CDC Verification (Q21–Q30)

Q21. How do CDC lint tools work? (SpyGlass CDC, Cadence JasperGold CDC)

CDC lint tools (like Synopsys SpyGlass CDC, Cadence JasperGold, Siemens Questa CDC) perform static analysis of Verilog/VHDL RTL to identify CDC violations: signals crossing clock domains without proper synchronizers, missing constraints, unsafe reconvergence paths.

Operation: (1) Clock domain identification — the tool identifies all clocks in the design and partitions logic into clock domains based on clock source tracing. (2) CDC path detection — identifies every wire that crosses from one clock domain to another (signals assigned in domain A but sampled in domain B). (3) Synchronizer detection — recognizes standard two-flop synchronizers, async FIFOs, and other CDC structures by pattern matching (consecutive flip-flops with specific configurations). (4) Violation reporting — flags CDC paths that don't have recognized synchronizers. (5) Waiver management — allows engineers to declare safe paths (e.g., false paths, hand-synchronized paths) to avoid false positives. The tools are not perfect: they can miss unsafe reconvergence (two synchronized copies that reconverge) or fail to recognize custom synchronizers. This is why CDC lint is a first-pass filter, not a complete solution—formal CDC verification (Q24, Q28) is required for confidence.

📌 Note: CDC lint tools ship with design rules (like "two-flop synchronizers are OK, single-flop is not"). These rules are conservative but configurable. If you're using specialized synchronizers (e.g., a custom low-latency sync), you might need to add design rules to your lint tool to avoid false warnings.

Q22. What is structural CDC analysis vs functional CDC analysis?

Structural CDC analysis checks the design structure for CDC issues (missing synchronizers, unsafe topology); functional CDC analysis checks whether the application logic is safe across clock boundaries.

Structural: "Is there a two-flop synchronizer between domain A and domain B?" This is what lint tools do—they check the circuit structure. Fast, automated, catches 80% of obvious mistakes. Functional: "Even with a synchronizer, is the data being used correctly?" Example: you have a synchronized counter, but the receiving domain assumes the counter increments by 1 each cycle—in reality, the counter might hold the same value for 2+ cycles while synchronization happens, then jump. If the receiving domain does math assuming monotonic increments, the logic is broken despite having a synchronizer. Functional CDC verification requires understanding the protocol and using formal methods or careful simulation to verify the protocol remains sound across clock domains. Most companies do structural CDC lint first (catches 80% of bugs quickly), then focused functional CDC verification on high-risk interfaces (e.g., pointer comparisons in async FIFOs, request/ack protocols).

💡 Tip: If a lint tool reports "CDC violation" and you know there's a synchronizer, check if it's a functional issue. The synchronizer structure might be correct, but the downstream logic might have incorrect assumptions about data timing. Review the protocol, not just the circuit.

Q23. What are common false CDC paths that tools flag? (how to waive correctly)

False CDC paths are signals that appear to cross clock domains but logically never execute simultaneously in both domains, or are already synchronized elsewhere. Lint tools flag them conservatively, and engineers must waive (declare as safe) to avoid noise in reports.

Example false paths: (1) Mode select: A signal selects between path A (running in clk_A) and path B (running in clk_B). The multiplexer output crosses clocks, but in reality, only one path is ever active, so the unused path's CDC is irrelevant. (2) Test mode: A signal gated by test_en crosses clocks in the test datapath, but test mode is disabled in functional operation (test_en=0 in normal flow). (3) Reset distribution: A reset signal is broadcast to multiple domains—the lint tool might flag it as an unsynced CDC, but in reality, all resets are derived from a single reset controller and can be waived as a known safe pattern. (4) Alias or buffer: A signal is assigned from domain A to domain B locally within a flop already synchronized—the lint tool might flag a second CDC, but it's actually a buffer of an already-synchronized signal. To waive: most tools support pragmas or SDC files to declare a path as safe. Example: set_false_path -from [get_pins mux_sel] -to [get_pins path_b_logic/*] tells the tool this path is false (never executes) and should be ignored. Waiving incorrectly is dangerous (tool stops checking a real CDC bug), so waives must be reviewed and documented.

📌 Note: Never waive a CDC warning you don't fully understand. If the tool flags it and you're not 100% sure it's safe, leave it and add a synchronizer. A false positive (extra synchronizer) is safer than a false waiver (missing synchronizer).

Q24. How do you simulate CDC behavior? (X-propagation model, forced metastability)

Simulating CDC is challenging because metastability is probabilistic and rare—you can't just run functional simulation and expect to hit the metastable corner. Specialized techniques are needed:

X-propagation model: Many CDC simulators inject an 'X' (unknown value) into the first synchronizer flop during simulation, forcing the second flop to sample an undefined value. This models metastability by making the output indeterminate. If the downstream logic breaks with an 'X' input, you've found a bug. This is fast (1 simulation) but can produce false positives (real logic might be OK with X). Forced metastability: Instead of injecting 'X', force the synchronizer's output to oscillate between 0 and 1 for a few nanoseconds (model of true metastability oscillation), then let it resolve to a legal value (0 or 1). This is closer to physics but slower. Randomization: Run Monte-Carlo simulations with random clock phase relationships between domains, trying to hit the metastable window. Formal CDC: Use formal tools (JasperGold, Questa, VCS Formal) to exhaustively prove that the design is safe for all possible clock relationships and data values (see Q28). Practical CDC verification usually combines: (1) lint for structural checks, (2) a few simulation runs with X injection to catch obvious functional bugs, (3) formal CDC on critical interfaces. Full simulation with random clocks is too slow for large designs.
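An X-injection experiment can be scripted directly in the testbench with force/release; a sketch assuming a hypothetical hierarchy dut.u_sync.sync1 and a hypothetical trigger signal data_changed:

```verilog
// Inside the testbench: force the first synchronizer flop to X around a
// crossing, then release it and watch downstream logic for corruption.
// 'dut.u_sync.sync1' and 'data_changed' are hypothetical names.
initial begin
  @(posedge data_changed);          // trigger near a domain crossing
  force dut.u_sync.sync1 = 1'bx;    // model metastability as unknown
  repeat (2) @(posedge clk_dst);
  release dut.u_sync.sync1;         // let the RTL drive it again
end
```

If downstream checkers fire after the release, the design is sensitive to the metastable sample—exactly the class of bug this technique is meant to expose.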

💡 Tip: Don't rely on functional simulation alone for CDC. Run at least one formal CDC check on async FIFOs and critical synchronizers. It's faster and more thorough than simulation.

Q25. What is clock domain identification in CDC analysis?

Clock domain identification is the first step in CDC analysis: partitioning the design into groups of logic that share the same clock, so that we know which signals are crossing domains and which are local.

Method: CDC tools trace every flop's clock input back to a clock source (usually a primary input, a PLL output, or a derived clock). Flops clocked by the same source are in the same domain. Then, any combinational logic fed by flops in domain A and feeding flops in domain B is a potential CDC path. Challenges: (1) Generated clocks — if clk_slow = clk_fast / 2 (generated by a divider), is logic clocked by clk_slow in the same domain as clk_fast? Technically no, they're different clocks, so it's a CDC. But functionally, they might be safe (if the divider is deterministic). Tools require careful specification of clock relationships via create_generated_clock. (2) Gated clocks — if clk_gated = clk AND enable, is it a new clock domain? Tools usually treat gated clocks as the same domain as the source clock (clock gating doesn't create a new domain semantically). (3) Multiplexed clocks — if you mux between clk_A and clk_B and use the output, that's a multi-clock domain boundary requiring special handling. Clock domain identification is critical: if the tool identifies domains wrongly, all downstream CDC analysis is garbage. Always validate tool-generated domain reports against your design intent.

📌 Note: Clock domain identification is often the weakest link in CDC tools. Always manually review and correct the tool's domain partition. If the tool says A and B are the same domain but they're actually asynchronous, CDC will be missed.

Q26. What is a CDC reconvergence path and why is it dangerous?

A CDC reconvergence path occurs when a signal crosses a clock domain boundary, is synchronized, but then later reconverges with another copy of the same unsynchronized signal (or vice versa). This can reintroduce metastability or data corruption.

Example: A control signal crosses from domain A to domain B, is synchronized through two flops to create sig_sync. Later, sig_sync is combined (in combinational logic) with a delayed or processed version of the original unsync signal. If those two arrive near a timing edge, you've reintroduced a CDC problem. More subtle example: sig crosses domain A to B and is synchronized. But sig also fans out to another path that eventually loops back through logic into domain B. If the two copies of sig arrive at a flop with different delays, metastability reappears. Reconvergence is dangerous because it defeats the purpose of the synchronizer: you thought you were safe by using a synchronized copy, but a reconvergence path re-exposes the domain crossing to metastability. CDC lint tools try to detect reconvergence by tracking which signals are synchronized and flagging uses of unsynchronized copies of those signals. But detection is imperfect—complex reconvergence paths can be missed. The fix: always use the synchronized copy, never the raw signal. If you need the raw signal for timing, add another synchronizer to that path too. It's redundant but safe.
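The dangerous pattern is easy to write by accident; a sketch of the bug (names are illustrative):

```verilog
// BUGGY reconvergence sketch (illustrative names). sig_a comes from
// clock domain A; everything below is clocked by clk_b.
reg sig_meta, sig_sync, result;

always @(posedge clk_b) begin
  sig_meta <= sig_a;      // first synchronizer stage
  sig_sync <= sig_meta;   // properly synchronized copy
end

// BAD: combines the raw crossing signal with its synchronized copy.
// The two can disagree for a cycle, and sig_a can still violate
// setup/hold at this flop -- the synchronizer is defeated.
always @(posedge clk_b)
  result <= sig_sync & sig_a;  // fix: use sig_sync only (or synchronize
                               // this second path as well)
```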

💡 Tip: If you're worried about reconvergence, ask yourself: "Are there two paths to the same flop input, one synchronized and one not?" If yes, fix it. Use only the synchronized path, or synchronize both.

Q27. How do you verify an async FIFO? (functional checks + formal)

Async FIFO verification combines functional simulation (checking that data flows correctly) and formal verification (proving that the Gray-code pointer logic and synchronizers are correct).

Functional checks: (1) Write and read pointers increment correctly. (2) Empty and full flags assert at the correct times (FIFO is empty when wptr == rptr, full when the next write pointer equals the synchronized read pointer). (3) Data written at address wptr appears on read_data when address rptr is read. (4) Burst patterns (write at max rate, read at max rate, then reverse). (5) Simultaneous write/read (read and write in the same cycle). Test with constrained-random write and read patterns, checking that no data is lost or duplicated. Formal verification: Use formal tools to prove that: (1) empty and full flags never assert simultaneously (mutual exclusion). (2) The Gray-code comparison is correct (wptr_gray == rptr_gray means FIFO empty, accounting for synchronization latency). (3) Pointer synchronizers correctly contain metastability without data corruption. This requires writing assertions like: always @(posedge rclk) assert (!(empty && read_en)); to check that the design never reads from an empty FIFO. Formal CDC tools can verify these automatically. In practice, most projects do: functional simulation (quick sanity checks), then formal CDC on the pointer logic (comprehensive proof). Async FIFOs are so critical that formal verification is almost mandatory for production designs.
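The protocol checks above can also be written as simple testbench monitors; a plain-Verilog sketch with illustrative signal names (production flows typically use SystemVerilog assertions instead):

```verilog
// Testbench monitors for the async FIFO flag protocol (illustrative names).
always @(posedge rclk)
  if (rst_n && empty && read_en)
    $display("ERROR: read attempted while FIFO empty at %0t", $time);

always @(posedge wclk)
  if (rst_n && full && write_en)
    $display("ERROR: write attempted while FIFO full at %0t", $time);

// Flags must be mutually exclusive for any non-trivial depth.
always @(empty or full)
  if (rst_n && empty && full)
    $display("ERROR: empty and full asserted simultaneously at %0t", $time);
```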

📌 Note: An async FIFO can pass all functional tests and still have a subtle CDC bug (e.g., a reconvergence of wptr and rptr before synchronizers). Formal verification catches these; simulation often misses them.

Q28. What does a CDC sign-off flow look like? (lint → formal → simulation → review)

A complete CDC sign-off flow combines multiple verification techniques in sequence, each catching different classes of bugs:

1. Lint (SpyGlass CDC, JasperGold): Fast, automated. Runs in minutes on million-gate designs. Identifies obvious unsynchronized CDC paths, recognizes standard synchronizers, flags suspicious patterns. Output: report of violations and waivers. Acceptable coverage: 95%+ of CDC paths checked. 2. Formal CDC (JasperGold CDC, Questa CDC): Exhaustive proof of correctness on selected interfaces (async FIFOs, critical synchronizers). Runs overnight for complex modules. Proves that metastability cannot occur or is properly contained. Output: formal proof or trace of bug. 3. Simulation (random/directed): Functional testing with constrained random patterns, X-injection on synchronizers, edge-case tests (power-up/down, clock mux, isolation cells). Catches protocol-level bugs that lint/formal miss. Runs for hours. 4. Design review: Domain experts manually review CDC architecture, synchronizers, and waivers. Check that lint/formal/simulation scope is adequate. Final gate-keeper before tapeout. 5. Documentation: CDC checklist, review sign-off, waivers justified. If a CDC bug appears post-silicon, you must defend why it wasn't caught pre-silicon. Complete flow: weeks for a complex SoC, but mandatory for shipping.

💡 Tip: Don't skip formal CDC. It's the only way to be confident in async FIFOs and critical synchronizers. Lint + simulation alone leave subtle bugs. Budget for formal CDC review—it's one of the best uses of verification time.

Q29. How do you handle generated clocks in CDC analysis?

Generated clocks (divide-by-N, PLL output, muxed clocks) require explicit declaration in your SDC constraints so that CDC tools can identify clock domain relationships correctly.

Example: clk_main is 400 MHz; clk_div4 is generated by a divider (clk_main / 4 = 100 MHz). Are they the same domain or different? Technically different frequencies = different clock domains, so any signal crossing between clk_main and clk_div4 logic is a CDC. But they're not asynchronous—the divider ensures deterministic relationships. CDC tools need to know this via create_generated_clock. If you don't declare it, the tool might either (1) treat them as unrelated (conservative, flags false CDC warnings) or (2) treat them as the same (wrong, misses real CDCs). Use SDC: create_generated_clock -name clk_div4 -source [get_ports clk_main] -divide_by 4 [get_pins divider/Q]. This tells the tool that clk_div4 is derived from clk_main with a 4:1 ratio. The tool can then recognize synchronizers between the two domains as valid (they're special synchronizers for known frequency relationships). Another case: PLL output. If your PLL multiplies a reference clock by 16, declare it: create_generated_clock -multiply_by 16 -source [get_ports ref_clk] [get_pins pll/clk_out]. CDC tools use this info to calculate safe synchronization depths and identify domain relationships.

📌 Note: Missing or incorrect generated_clock declarations are a common CDC mistake. Always validate your clock definitions in the CDC tool's output (report_clocks or equivalent).

Q30. What are black-box CDC issues when integrating third-party IP?

When integrating third-party IP (e.g., memory controller, PHY, PCIe core), you don't have source RTL, so CDC analysis is limited. The IP appears as a black box with documented interfaces, and CDC verification must be done at the integration level.

Challenges: (1) Unknown internal domains — the IP might have multiple internal clock domains, but you only see the interface pins. Signals might cross domains internally without visible synchronizers. (2) Undocumented CDC assumptions — the IP documentation might not specify which ports are CDC boundaries or what synchronization is expected on external interfaces. (3) Clock frequency assumptions — the IP might assume certain clock relationships between input clocks; if you drive it with unexpected clock patterns, internal CDCs might fail. (4) Missing Liberty/timing data — without accurate timing models, CDC tools can't analyze the IP's interface correctly. Mitigation: (1) Request CDC documentation — reputable IP vendors provide CDC analysis reports and clock domain diagrams. (2) Treat IP interfaces as black boxes — assume any interface might be a CDC boundary and add synchronizers on your side. (3) Test clocking assumptions — if the IP assumes clk_A and clk_B have a 2:1 frequency ratio, verify your design maintains that. (4) Review CDC at integration — even if IP is CDC-safe internally, the interfaces to the rest of the SoC must be analyzed. Use lint/formal at the SoC level, not just at IP level. Black-box IP is a risk; many post-silicon CDC bugs trace back to misunderstandings between IP vendor assumptions and SoC integration.

💡 Tip: When evaluating IP, always ask the vendor for CDC documentation. If they can't provide it, red flag. You don't want to integrate an IP with hidden CDC issues.

Section 4: Advanced CDC (Q31–Q40)

Q31. Token-based synchronization protocol for complex multi-bit CDC

Token-based synchronization is an advanced protocol for synchronizing multi-bit data (like complex control commands or status bundles) when neither simple synchronizers nor async FIFOs are suitable.

Protocol: Instead of sending raw data, the sender transmits a "token" (a small identifier, e.g., 2–3 bits) that indexes into a shared table of pre-determined values. The receiver synchronizes the token, looks up the corresponding data in a local table, and uses it. Note that even a 2-bit token has the multi-bit hazard from Q15, so in practice the token values are Gray-ordered or the token is qualified by a synchronized valid toggle. Advantages: (1) Synchronizing 2–3 bits is far safer than synchronizing 32 bits. (2) No FIFO overhead. (3) Works for high-bandwidth scenarios. Disadvantages: (1) Requires a pre-agreed token-to-data mapping (inflexible). (2) Adds state machine complexity. Example: a 32-bit command is too wide to synchronize safely. Instead, define tokens {00, 01, 10, 11} mapped to four common commands {RESET, START, STOP, CONFIG}. The sender sends the appropriate token; the receiver synchronizes it and decodes it to the full command. This is elegant for well-defined interfaces with a small set of possible values. It's less common than async FIFOs but appears in specialized interconnects (e.g., NoC arbitration, where only a few predefined commands are needed). Rarely asked in interviews, but if a deep CDC architecture question comes up, token-based sync shows advanced knowledge.
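A sketch of the receive side of such a protocol, with an illustrative token map (a real implementation would Gray-order the tokens or qualify them with a valid toggle, per the multi-bit hazard in Q15):

```verilog
// Receive side of a token-based CDC (illustrative names and command codes).
localparam [1:0] TOK_RESET  = 2'b00, TOK_START  = 2'b01,
                 TOK_STOP   = 2'b10, TOK_CONFIG = 2'b11;

reg [1:0]  tok_sync1, tok_sync2;  // 2-bit token synchronizer
reg [31:0] cmd;                   // full-width command, looked up locally

always @(posedge clk_dst) begin
  {tok_sync2, tok_sync1} <= {tok_sync1, token_async};
  case (tok_sync2)                // local token-to-command table
    TOK_RESET:  cmd <= 32'h0000_0001;
    TOK_START:  cmd <= 32'h0000_0010;
    TOK_STOP:   cmd <= 32'h0000_0100;
    TOK_CONFIG: cmd <= 32'h0000_1000;
  endcase
end
```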

📌 Note: Token-based sync is a niche technique. Most problems are solved with async FIFOs. But in some ultra-low-latency or power-constrained designs, token-based protocols are optimal.

Q32. CDC in NoC (Network-on-Chip) — how packets handle domain crossings

A NoC (Network-on-Chip) is an on-chip communication fabric (like Ethernet for processors). NoCs often span multiple clock and power domains, and packets must cross domain boundaries safely.

Architecture: A NoC has routers connected by links. Each router is clocked by a local clock (routers in different regions might have different clocks). A packet traveling from Router A (clk_A domain) to Router B (clk_B domain) must cross a clock domain boundary at the link. Methods: (1) Async FIFO on each link — each router-to-router link has a FIFO that buffers packets in the source domain and presents them in the destination domain. This is the standard approach. (2) Clock synchronization — routers near a CDC boundary might include synchronization logic to align to a common clock (less common, adds latency). (3) Virtual channels — some NoCs use virtual channels (independent FIFOs) to prevent deadlock when multiple packets traverse CDC boundaries. Challenges: latency increases (a FIFO adds 2–3 cycles per CDC boundary), and deadlock prevention requires careful protocol design. A NoC spanning 10 clock domains might add 20–30 cycles of latency due to CDCs—significant in high-frequency designs. Advanced interconnect architectures (such as those built on ARM's AMBA protocols) have CDC handling built into the fabric, with explicit rules for synchronization and reset sequencing across domains.

💡 Tip: If you're working on a multi-domain SoC with a custom NoC, budget time for CDC verification on the NoC itself. It's complex, and bugs there propagate throughout the chip.

Q33. Multi-clock SoC CDC strategy — who owns CDC verification?

In a large SoC with dozens of clock domains, CDC verification is massive. The question is: who owns it, and how is it organized?

Ownership models: (1) Centralized CDC team — one team (often verification or design infrastructure) owns all CDC analysis. Teams building blocks run CDC lint on their designs, but the central team does formal CDC, integration-level verification, and sign-off. Advantage: consistent methodology, deep expertise. Disadvantage: bottleneck if the CDC team is small. (2) Distributed ownership — each block team owns CDC for their interface. The integration team validates cross-block CDCs. Advantage: faster (parallel work), teams learn CDC. Disadvantage: inconsistent approaches, risk of missed CDCs. (3) Hybrid — most common in large companies. Each block team does lint and simulation; the CDC team does formal verification and sign-off. Methodology: (1) Define all clock domains early (create_clock, create_generated_clock in the global SDC). (2) Each team provides clock domain documentation (which blocks are in which domains). (3) Run lint across the full design weekly (identifies new CDCs as the design evolves). (4) Formal CDC on critical paths and async FIFOs. (5) A CDC design review before tape-out. (6) Document all CDC decisions and waivers. Organization is critical: a poor CDC strategy (no clear ownership, ad-hoc methodology) leads to pre-silicon bugs or post-silicon failures. Companies with strong CDC cultures (Intel, Apple, Qualcomm) have dedicated CDC infrastructure and strict review gates.

📌 Note: Large SoC projects often discover CDC bugs late (few weeks before tape-out) because they don't run lint regularly. Build CDC verification into the design schedule from day one. Weekly lint runs catch drift early.

Q34. CDC and timing closure interaction — why CDC paths must be false-pathed

CDC paths (unsynchronized signals crossing domains) must be declared as false paths in your SDC constraints. If you don't, the synthesis tool will try to optimize them for timing, which defeats the purpose of CDC.

Reason: The tool sees a signal crossing domains and tries to close timing on it (upsizing gates, shortening routes). But CDC paths are inherently asynchronous—setup/hold analysis assumes a defined launch/capture clock relationship that doesn't exist here, so any "violation" the tool reports on the crossing is meaningless, and effort spent fixing it is wasted. CDC paths also have intentional latency (synchronizers add cycles), so false-pathing tells the tool to leave them alone. SDC declaration: set_false_path -from [get_clocks clk_A] -to [get_clocks clk_B] (or a pin-level version targeting the synchronizer's first flop, e.g., -to [get_pins sync1_flop/D]) tells synthesis and STA to ignore timing on the crossing. The synchronizer flops themselves are then implemented normally. The result: clean separation between CDC and timing optimization. If you forget to false-path CDC signals, the tool burns effort chasing unfixable violations and may restructure logic around the crossing (e.g., via retiming or optimization across the boundary) in ways that undermine the synchronizer.

💡 Tip: CDC paths are the ultimate false paths. Always false-path them. If a path has a synchronizer, set_false_path. Never let timing optimization touch a CDC path.

Q35. CDC with clock mux — challenges when source clock switches dynamically

A clock multiplexer is a circuit that selects between two or more clock sources (e.g., clk_main or clk_backup). When the mux output switches, the selected clock might have a different frequency or phase—this creates a transient CDC problem.

Challenge: When the clock mux output switches from clk_A to clk_B, the output can glitch (a runt pulse stitched together from fragments of both clocks), and signals launched on the old clock may still be in flight when downstream logic starts clocking on the new one. Synchronizers that were verified against one source-clock relationship are momentarily operating under another. This creates a transient window where CDC safety is compromised. Solutions: (1) Glitch-free mux — a special multiplexer that gates off the old clock and enables the new one only at safe points, ensuring a clean transition. (2) Synchronized clock select — the select signal itself is synchronized into each clock domain, so the switch is managed safely. (3) Domain freeze before switch — before switching, halt the affected logic, wait for all pipelines to drain, then switch, then resume. This is conservative (adds latency) but safe. (4) Avoid clock switching in CDC domains — design the system so that clock muxes live inside single-domain regions, not spanning CDC boundaries. Real systems: Some SoCs support dynamic clock switching for power management (e.g., frequency scaling). The CDC challenge is managed by freezing synchronizer stages during clock transitions, using verified glitch-free muxes, or separate CDC synchronizers for clock-switching scenarios. This is an advanced topic, rarely asked in interviews unless the company does dynamic frequency scaling (AMD, Intel) or mobile SoCs.

📌 Note: Clock mux CDC is tricky. If you're working on a design with dynamic frequency scaling, budget time for CDC analysis of the mux region—it's easy to introduce bugs.

Q36. FIFO-based CDC throughput analysis — when does latency become a problem?

An async FIFO introduces latency (typically 2–3 cycles from write to read) and adds area. For some interfaces, this latency is unacceptable, requiring alternative CDC strategies.

Throughput analysis: An async FIFO of depth D can hold D words. If the write side produces at rate R_w (words/cycle) and the read side consumes at rate R_r (words/cycle) with R_w > R_r, the FIFO fills in roughly D / (R_w − R_r) write cycles, so a sustained imbalance always backpressures eventually. With balanced rates, the write-to-read latency is about 2–3 cycles (pointer synchronization overhead). For most use cases this is fine, but some interfaces demand very low latency. Example: a 32-bit address from domain A must reach domain B in < 2 cycles (including synchronization). An async FIFO adds 2–3 cycles, which is too much. Solution: use a synchronizer directly (2 flops = 2-cycle latency), but only if the bus avoids the multi-bit problem (Gray-encoded, or quasi-static and qualified by a synchronized valid). Another example: streaming audio data (48 kHz sample rate) crossing between 400 MHz and 100 MHz clock domains. The bandwidth is tiny (48,000 samples/sec × 24 bits = 1.152 Mbps, far below 100 MHz × 32 bits = 3.2 Gbps), so an async FIFO is overkill; a handshake built on 2-flop synchronizers comfortably keeps up. Latency analysis: always calculate expected throughput and required latency for your interface. If latency is critical, use simple synchronizers or handshakes (cheaper, faster). If you have headroom, use an async FIFO (more robust, better flow control).
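The audio-example arithmetic above is worth sanity-checking with a two-line script. The helper names below are illustrative, not from any library:

```python
def interface_bandwidth_bps(sample_rate_hz: float, bits_per_sample: int) -> float:
    """Payload bandwidth the CDC interface must sustain."""
    return sample_rate_hz * bits_per_sample

def link_capacity_bps(f_clk_hz: float, bus_width_bits: int) -> float:
    """Raw capacity if the receiving domain moves one word per cycle."""
    return f_clk_hz * bus_width_bits

audio = interface_bandwidth_bps(48_000, 24)   # 1,152,000 bps = 1.152 Mbps
link = link_capacity_bps(100e6, 32)           # 3.2e9 bps = 3.2 Gbps
print(f"utilization: {audio / link:.4%}")     # tiny -> a handshake suffices
```

If utilization is a fraction of a percent, as here, the extra area and pointer logic of an async FIFO buys you nothing.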

💡 Tip: Don't default to async FIFO for every CDC. Analyze your throughput and latency budget first. A simple synchronizer is faster, smaller, and sufficient for low-bandwidth interfaces.

Q37. ACE/CHI coherence protocol and CDC — how do snoop responses handle clock domains

ACE (AXI Coherency Extensions) and CHI (Coherent Hub Interface) are ARM's protocols for cache-coherent multi-core systems. In a multi-domain SoC, snoops must cross clock boundaries, requiring special CDC handling.

Scenario: Core A (clk_A) performs a write that might invalidate Core B's cache (clk_B). The interconnect sends a snoop request (a control signal) from Core A to Core B's cache. Core B must respond (acknowledge, data, etc.) back to the interconnect. This snoop request and response are CDC events. Challenge: Snoop responses have deterministic latencies (must respond within N cycles). If CDC synchronizers add unpredictable delay, the snoop response might violate timing. Solution: CHI and ACE protocols define specific synchronization rules for snoop requests and responses, including: (1) Snoop requests are synchronized through dedicated synchronizers (typically 2 flops). (2) Responses are also synchronized. (3) Timeout logic ensures no snoop hangs forever waiting for a response. (4) Isolation cells ensure a powered-down domain doesn't respond to snoops (architectural safety). Coherent interconnects are complex: each port must handle snoops in/out (2 CDCs per port pair). For a system with 4 cores, the CDC complexity explodes. ARM provides coherent interconnect IP (like CCN, CMN) with all CDC logic built-in, so SoC designers don't have to invent it. If you're rolling your own coherent interconnect, CDC is a major verification concern.

📌 Note: Coherent SoCs (multi-core with L1/L2 caches) have CDC complexity that far exceeds simple synchronizers. This is why companies license coherent interconnect IP rather than building it in-house—the CDC is intricate.

Q38. CDC in a DDR PHY — how read data crosses from phy clock to system clock

A DDR (Double Data Rate) PHY translates external DDR interface signals (data, clocks, control) into internal signals synchronized to the system clock. Read data flows from a high-speed phy clock (usually 2x the DDR rate) to the system clock, crossing a domain boundary.

Data path: the DDR input pins receive data at 800 MT/s (DDR3-800: a 400 MHz I/O clock with data captured on both edges). The phy's DLL and capture circuits bring this into phy_clk. Read data is then synchronized to the system clock (e.g., a 100 MHz memory controller clock). The challenge: phy_clk is derived from the external DDR interface (high jitter, high skew), while system clocks are clean (PLL, crystal); crossing between them is risky. Solution: DDR PHYs use specialized synchronizers for read data: (1) Multi-stage synchronizers — 3–4 flops (more conservative than standard 2-flop) to account for large jitter. (2) Delay-matched paths — read data bits are synchronized with matched delays so all bits arrive coherently in the system domain. (3) Calibration logic — during PHY initialization, the number of delay stages is tuned to match phy_clk and system clock phases, minimizing metastability risk. (4) DQS synchronization — the DDR data strobe (DQS) is synchronized separately, and its synchronized version gates the read data strobe for the memory controller. The PHY design is highly specialized. Most teams use vendor PHY IP (Synopsys DDR PHY, Cadence DDR PHY) with built-in CDC handling. Rolling your own DDR PHY is rarely done in-house due to complexity.

💡 Tip: DDR PHY CDC is a specialized domain. If you're working on memory interface design, study vendor PHY documentation to understand synchronization. Don't try to invent your own.

Q39. What is the "same edge" vs "opposite edge" launch-capture in CDC context?

In CDC context, "same edge" and "opposite edge" refer to the timing relationship between the launching and capturing clocks when a signal crosses domains.

Same-edge launch-capture: The launching flip-flop (in domain A) and capturing flip-flop (in domain B) are clocked by edges with a known phase relationship. Example: both on rising edges of synchronous clocks with fixed phase offset (both rise at the same time, within skew). This is actually not CDC—it's a regular synchronous path with a defined setup/hold relationship. Tools can analyze these normally. Opposite-edge launch-capture: The clocks have unknown or random phase relationship (truly asynchronous). The launching flop's Q output is sampled by a capturing flop at a random point in time relative to the launching clock edge. This is true CDC: the sample might violate setup/hold, causing metastability. This is what synchronizers address. In a mixed-domain SoC, you might have some synchronous clock pairs (derived clocks with known ratios) and some truly asynchronous pairs (PLL outputs, external clocks). CDC analysis must distinguish: same-edge paths can use timing constraints (set_false_path if they're safe, or normal timing if they're on a critical path). Opposite-edge paths must use synchronizers—no timing optimization helps. Documentation in SDC matters: use create_generated_clock to specify known relationships, so lint tools know which paths are same-edge vs opposite-edge.

📌 Note: Distinguishing same-edge from opposite-edge is crucial for CDC strategy. A same-edge path might need timing optimization; an opposite-edge path needs a synchronizer. Misclassifying them leads to bugs.

Q40. CDC whiteboard problem — interviewer gives you: 32-bit bus, 200MHz source, 333MHz destination, what do you do?

This is a classic real-interview question. You're given a concrete scenario and asked to design the CDC solution. Walk through your answer step-by-step, showing systematic CDC design thinking.

Problem Statement: A 32-bit address bus must cross from a 200 MHz domain (Address Generator, domain_A) to a 333 MHz domain (Memory Controller, domain_B). Design a safe CDC solution.

Step 1: Understand the data characteristics. The address is a multi-bit bus (32 bits). It's being sent from a slower clock (200 MHz) to a faster clock (333 MHz). Data rate: address might change every cycle or might hold for multiple cycles—this depends on the protocol. Assumption: address can change every cycle (worst-case for CDC). Frequency ratio: 333 MHz / 200 MHz = 1.665x.

Step 2: Identify the danger. A 32-bit bus has the multi-bit problem (Q7): if you synchronize each bit independently, different bits may resolve from metastability at different times, producing a corrupted address. Example: the address changes from 0x12345678 to 0x87654321. Because the transition lands inside the receiving flops' setup window, some bits go metastable and resolve unpredictably. Bits [7:0] resolve to the new value 0x21, but bits [31:8] resolve to the old value 0x123456. The receiving domain sees 0x12345621 (neither the old nor the new address: corrupted).
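The corruption mechanism is easy to model: treat each bit as sampling either the old or the new value. The mask below is a hypothetical resolution pattern chosen to match the example above:

```python
OLD, NEW = 0x12345678, 0x87654321

def sample_with_skew(old: int, new: int, updated_mask: int) -> int:
    """Model a multi-bit CDC sample: bits in updated_mask caught the new
    value; the remaining bits still show the old value."""
    return (old & ~updated_mask) | (new & updated_mask)

seen = sample_with_skew(OLD, NEW, 0x000000FF)  # bits [7:0] updated first
print(hex(seen))  # 0x12345621 -- neither the old nor the new address
```

Any nonzero, non-full mask yields a value that was never driven by the sender, which is exactly why a raw per-bit synchronizer is unsafe here.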

Step 3: Choose a CDC strategy. Options: (1) Simple synchronizer (NO—multi-bit risk). (2) Async FIFO (YES, safe and standard). (3) Gray-coded address with single-bit sync (Maybe, only if address has natural Gray encoding). (4) Handshake protocol (Possible, but overkill for a continuous address bus). Best choice: Async FIFO. Reason: 32-bit multi-bit data, continuous flow, proven solution.
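Option (3) hinges on Gray code's guarantee that consecutive values differ in exactly one bit (which is also what makes Gray pointers safe inside the async FIFO). A quick sketch of the standard binary-reflected mapping verifies the property:

```python
def bin2gray(b: int) -> int:
    """Binary-reflected Gray code: adjacent values differ in one bit."""
    return b ^ (b >> 1)

def gray2bin(g: int) -> int:
    """Inverse mapping: XOR-fold the shifted copies back down."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

for i in range(255):
    # exactly one bit flips between consecutive codes...
    assert bin(bin2gray(i) ^ bin2gray(i + 1)).count("1") == 1
    # ...and the mapping is invertible
    assert gray2bin(bin2gray(i)) == i
```

Because only one bit changes per increment, a mis-sampled Gray pointer is off by at most one count (old or new value), never an arbitrary garbage value.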

Step 4: Design the async FIFO. (a) Write side: clk_200. The Address Generator writes an address into the FIFO when write_en is asserted; the FIFO asserts write_full when full (telling the Address Generator to stall). (b) Read side: clk_333. The Memory Controller reads an address when read_en is asserted; the FIFO asserts read_empty when empty. (c) Pointer synchronizers: the Gray-coded write pointer (wptr) is synchronized into the read domain through a 2-flop synchronizer, and the Gray-coded read pointer (rptr) is synchronized back into the write domain. Full/empty logic (Cummings' scheme): in the read domain, empty when rptr == wptr_sync; in the write domain, full when the next Gray write pointer equals the synchronized read pointer with its top two bits inverted. (d) FIFO depth: estimate traffic. The read side runs at 1.665x the write rate, so it never falls behind in steady state; the FIFO only has to absorb pointer-synchronization lag. The read domain's view of wptr is stale by roughly 2–3 read cycles (6–9 ns), during which the writer (5 ns/cycle) can enqueue at most about 2 new entries, and the symmetric argument bounds the write side's stale view of rptr. A few entries of slack is enough; choose depth = 16 (power-of-2, convenient for Gray pointers and SRAM) for generous margin against read-side stalls.
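The depth reasoning in (d) boils down to one question: how many writes can land while a pointer is still in flight through the synchronizer? A rough sketch, assuming a 2-stage synchronizer:

```python
import math

def fifo_depth_margin(f_wr_hz: float, f_rd_hz: float, sync_stages: int = 2) -> int:
    """Worst-case entries the writer can enqueue while the read side's
    view of the write pointer is stale (pointer still in the synchronizer),
    assuming the writer pushes every cycle."""
    t_sync = sync_stages / f_rd_hz        # pointer sync latency in seconds
    return math.ceil(t_sync * f_wr_hz)    # writes that fit in that window

margin = fifo_depth_margin(200e6, 333e6)  # 2 stages * 3 ns = 6 ns -> 2 entries
print(margin, "entries of slack needed; depth 16 is generous")
```

Real sizing must also cover read-side stalls and burst patterns, which is why the answer rounds up well past this minimum.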

Step 5: Verify correctness. (a) No data corruption: Gray code ensures only one pointer bit changes per transition, so a mis-sampled pointer is off by at most one count (conservative, never garbage). Full/empty logic errs on the safe side. (b) MTBF: pointer synchronizers, 2 flops each. Receiving clock: 333 MHz. Data transitions: the pointer toggles up to 333M times/sec (worst case, every cycle). MTBF ≈ e^(t_r/τ) / (T_0 · f_clk · f_data), where t_r is the time available for the first flop's output to resolve (roughly one receive-clock period minus setup), τ is the flop's metastability time constant, and T_0 its metastability window. With typical deep-submicron values the exponential dominates and the MTBF works out to many thousands of years (very safe). (c) Design review: CDC lint should recognize the async FIFO pattern (Gray pointers, 2-flop synchronizers). Formal CDC should prove full/empty are never asserted incorrectly and the Gray pointer comparison is correct.
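The MTBF estimate in (b) is easy to plug in. The τ and T_0 values below are illustrative deep-submicron ballpark figures, not data from any specific standard-cell library:

```python
import math

def synchronizer_mtbf_s(t_r: float, tau: float, t0: float,
                        f_clk: float, f_data: float) -> float:
    """MTBF = e^(t_r / tau) / (T0 * f_clk * f_data)."""
    return math.exp(t_r / tau) / (t0 * f_clk * f_data)

# Roughly one 333 MHz period available for resolution; assumed tau = 50 ps,
# metastability window T0 = 20 ps, pointer toggling every cycle.
mtbf = synchronizer_mtbf_s(t_r=3.0e-9, tau=50e-12, t0=20e-12,
                           f_clk=333e6, f_data=333e6)
print(f"MTBF ~ {mtbf:.1e} s ({mtbf / 3.15e7:.1e} years)")
```

Note how the exponential term dominates: each extra nanosecond of resolution time multiplies MTBF by e^(1 ns / 50 ps) ≈ 10^8, which is why adding one synchronizer stage is so effective.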

Step 6: Constraints (SDC). (a) False-path the CDC paths (pointer synchronizer inputs are not timing-critical): set_false_path -from [get_pins wptr_gray_reg/Q] -to [get_pins wptr_sync_reg/D]. (b) Clock definitions: define both clocks in SDC so the tool knows about them: create_clock -name clk_200 -period 5.0 [get_ports clk_200] and create_clock -name clk_333 -period 3.0 [get_ports clk_333]. (c) Declare the domains asynchronous so no inter-clock paths are timed: set_clock_groups -asynchronous -group clk_200 -group clk_333 covers every path between the two domains in one constraint. Synthesis should not try to optimize clk_200-to-clk_333 paths; managing that crossing is the FIFO's job, not the timing engine's.

Step 7: Implementation notes. (a) Use library's async FIFO IP if available (Synopsys, Cadence both have templates). (b) If rolling your own, be meticulous: Gray code conversion is error-prone. (c) Test with simulation: burst writes at 200 MHz, reads at various rates (100%, 50%, stalling). (d) Formal CDC on pointer logic. (e) Document the FIFO design (area, latency, depth rationale) for integration team.

Common follow-up questions: What if the address must be delivered within 1 cycle? (FIFO won't work, need a direct synchronizer + careful timing). What if the 333 MHz clock can turn off for power management? (Need power domain handling, isolation cells, reset coordination—much more complex). How do you know when to read? (Design a valid/ready handshake on top of the FIFO). What if only 10% of cycles have new addresses? (FIFO depth can be smaller, or address could be qualified with a valid signal, then only sync when valid—optimization).

💡 Interview Tip: For this problem, the interviewer is testing: (1) Do you recognize multi-bit CDC risk? (2) Do you know async FIFO? (3) Can you reason about FIFO depth and MTBF? (4) Do you consider verification (lint, formal, simulation)? (5) Are you methodical? A complete answer takes 10–15 minutes and shows all five. Partial credit for recognizing the danger, even if you don't implement the full solution.

Interview Cheatsheet: CDC Most-Asked Topics by Company

Company / Team Most-Asked Topics Key Questions
Synopsys (CDC Tools) Lint algorithms, synchronizer detection, formal methods Q21, Q22, Q25, Q28
Intel (Multi-Domain SoCs) Metastability, MTBF, async FIFOs, power domain CDC Q2, Q3, Q12, Q13, Q20, Q35
Apple (Custom Chip Design) Clock tree CDC, voltage domain crossing, advanced synchronizers Q6, Q10, Q18, Q36, Q37
Qualcomm (Mobile SoCs) Power gating, isolation cells, dynamic clock switching Q20, Q33, Q35, Q40
NVIDIA (GPU Design) High-bandwidth CDC, NoC domain crossings, formal verification Q12, Q32, Q33, Q38
Cadence (EDA/Verification) Formal CDC, lint methodology, verification flows Q21, Q22, Q24, Q27, Q28

Resources & Further Reading

  • Cliff Cummings' CDC Papers (Sunburst Design) — The definitive CDC reference. Read "Clock Domain Crossing (CDC) Design & Verification Techniques Using SystemVerilog" and "Simulation and Synthesis Techniques for Asynchronous FIFO Design." Available free online.
  • Synopsys SpyGlass CDC User Guide — If you're using DC + SpyGlass, this is the manual. Focus on design rules and waiver guidelines.
  • Cadence JasperGold CDC Documentation — Formal CDC verification walkthrough. Understanding the proof methodology will deepen your CDC intuition.
  • IEEE Papers on Metastability — Search for "MTBF calculation," "Gray code verification," "synchronizer design" for theoretical depth.
  • ARM Documentation: AMBA ACE, CHI Protocol — If interviewing for coherent systems, these specs define CDC handling in cache-coherent SoCs.
  • On-the-Job Practice — Hands-on CDC lint/formal runs, designing synchronizers, debugging CDC failures. Nothing beats real project experience.

Final CDC Interview Tip: Interviewers want to hear that you respect CDC as a serious, non-obvious problem. Showing that you've debugged a real CDC bug, or that you've run formal verification on a synchronizer, sets you apart. Don't just memorize answers—understand the physics (metastability), the math (MTBF), and the practice (async FIFOs, formal tools). CDC is where silicon engineering separates from theory.

