Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Off-by-one in glasgow.gateware.clockgen.ClockGen #667

Open
mbikovitsky opened this issue Aug 15, 2024 · 6 comments
Open

Off-by-one in glasgow.gateware.clockgen.ClockGen #667

mbikovitsky opened this issue Aug 15, 2024 · 6 comments
Labels
gateware Component: gateware

Comments

@mbikovitsky
Copy link

The ClockGen module has an off-by-one error when deriving a clock using .derive or .calculate. The output clock period is one input clock cycle shorter than necessary. This happens only when the ratio between the input and output frequencies is >= 3.

Amaranth playground link.

Here, the input clock is 4Hz and the output clock is 1Hz. Since the input is evenly divisible by the output, we should get an output clock that is exactly 4 input clock cycles long. And, since the ratio is divisible by 2, we should get a 2 clock high pulse and a 2 clock low pulse on the output. Instead, we get 3 cycles - 1 high, 2 low.

The issue is that calculate computes cyc = round(input_hz // output_hz) - 1, and then the implementation uses this value for the counter if it's >= 2. If cyc == 0 or cyc == 1, then we get into one of the two special cases (same frequency or half frequency), and there everything works fine.

I think the fix should be something like:

--- 
+++ 
@@ -41,33 +41,33 @@
         self.stb_r = Signal()
         self.stb_f = Signal()
 
     def elaborate(self, platform):
         m = Module()
 
-        if self.cyc == 0:
+        if self.cyc == 1:
             # Special case: output frequency equal to input frequency.
             # Implementation: wire.
             m.d.comb += [
                 self.clk.eq(ClockSignal()),
                 self.stb_r.eq(1),
                 self.stb_f.eq(1),
             ]
 
-        if self.cyc == 1:
+        if self.cyc == 2:
             # Special case: output frequency half of input frequency.
             # Implementation: flip-flop.
             m.d.sync += [
                 self.clk.eq(~self.clk),
             ]
             m.d.comb += [
                 self.stb_r.eq(~self.clk),
                 self.stb_f.eq(self.clk),
             ]
 
-        if self.cyc >= 2:
+        if self.cyc >= 3:
             # General case.
             # Implementation: counter.
             counter = Signal(range(self.cyc))
             clk_r   = Signal()
 
             m.d.sync += counter.eq(counter - 1)
@@ -110,14 +110,14 @@
                              .format(output_hz / 1000, input_hz / 1000))
         if min_cyc is not None and output_hz * min_cyc > input_hz:
             raise ValueError("output frequency {:.3f} kHz requires a period smaller than {:d} "
                              "cycles at input frequency {:.3f} kHz"
                              .format(output_hz / 1000, min_cyc, input_hz / 1000))
 
-        cyc = round(input_hz // output_hz) - 1
-        actual_output_hz = input_hz / (cyc + 1)
+        cyc = round(input_hz // output_hz)
+        actual_output_hz = input_hz / cyc
         deviation_ppm = round(1000000 * (actual_output_hz - output_hz) // output_hz)
 
         if max_deviation_ppm is not None and deviation_ppm > max_deviation_ppm:
             raise ValueError("output frequency {:.3f} kHz deviates from requested frequency "
                              "{:.3f} kHz by {:d} ppm, which is higher than {:d} ppm"
                              .format(actual_output_hz / 1000, output_hz / 1000,
@@ -139,15 +139,15 @@
 
         if logger is not None:
             if clock_name is None:
                 clock = "clock"
             else:
                 clock = f"clock {clock_name}"
-            if cyc in (0, 1):
+            if cyc in (1, 2):
                 duty = 50
             else:
                 duty = (cyc // 2) / cyc * 100
             logger.debug("%s in=%.3f req=%.3f out=%.3f [kHz] error=%d [ppm] duty=%.1f%%",
                          clock, input_hz / 1000, output_hz / 1000, actual_output_hz / 1000,
                          deviation_ppm, duty)
 
         return cyc

Simulation with the fix.

On a related note, the special case where the output frequency is half the input frequency has a strobe behaviour different from the normal case. In the special case stb_r and stb_f rise one cycle before their respective clock edges. In the normal case they rise on the same cycle. Not sure which behaviour is the correct one, but I think they should at least be consistent.

In terms of actual impact the whole thing may not be a big deal. Assuming most applets derive clocks that are much slower than the system clock, the off-by-one deviation should be negligible.

@whitequark whitequark added the gateware Component: gateware label Aug 16, 2024
@whitequark
Copy link
Member

Thanks! The whole idea with ClockGen was a little misguided (I can say as the person who introduced it, four years later), so I'm not sure if it even should be used after all. The QSPI controller applet shows what I would consider the best current practice.

@whitequark
Copy link
Member

I think one big concern here is that a bunch of applets use ClockGen.derive without using ClockGen itself. In that case fixing an off-by-1 here could introduce an off-by-1 there, which is why I hesitate to just apply the fix.

@mbikovitsky
Copy link
Author

mbikovitsky commented Aug 16, 2024

I could search the codebase for any uses of derive that don't pass it to ClockGen and - 1 them to keep the original behaviour :) It's not exactly pretty, and if you say ClockGen shouldn't be used at all then maybe it's better to just leave it as-is?

@whitequark
Copy link
Member

It looks like these are the direct uses of ClockGen itself:

software/glasgow/applet/audio/dac/__init__.py:66:        m.submodules.clkgen = clkgen = ClockGen(self.pulse_cyc)
software/glasgow/applet/audio/yamaha_opx/__init__.py:181:        m.submodules.clkgen = clkgen = EnableInserter(self.clkgen_ce)(ClockGen(self.master_cyc))
software/glasgow/applet/interface/spi_controller/__init__.py:106:        m.submodules.clkgen = clkgen = ResetInserter(clkgen_reset)(ClockGen(self.period_cyc))
software/glasgow/applet/sensor/hx711/__init__.py:50:            m.submodules.clkgen = clkgen = ClockGen(self.osc_cyc)

and these are the indirect uses of ClockGen.derive:

software/glasgow/applet/audio/dac/__init__.py:179:            pulse_cyc = self.derive_clock(clock_name="modulation",
software/glasgow/applet/audio/dac/__init__.py:181:        sample_cyc = self.derive_clock(clock_name="sampling",
software/glasgow/applet/audio/yamaha_opx/__init__.py:1129:            master_cyc=self.derive_clock(
software/glasgow/applet/interface/spi_controller/__init__.py:345:            period_cyc=self.derive_clock(input_hz=target.sys_clk_freq,
software/glasgow/applet/interface/spi_controller/__init__.py:351:            delay_cyc=self.derive_clock(input_hz=target.sys_clk_freq,
software/glasgow/applet/interface/uart/__init__.py:175:        max_bit_cyc = self.derive_clock(
software/glasgow/applet/interface/uart/__init__.py:216:        manual_cyc = self.derive_clock(
software/glasgow/applet/program/m16c/__init__.py:335:            baud: self.derive_clock(input_hz=target.sys_clk_freq, output_hz=baud)
software/glasgow/applet/sensor/hx711/__init__.py:158:            osc_cyc = self.derive_clock(clock_name="osc",

@whitequark
Copy link
Member

All in all it's just five applets total, of which only one (UART) uses ClockGen.derive directly. So I guess it's probably fine to go either way.

@mbikovitsky
Copy link
Author

I'll create a PR then.

mbikovitsky added a commit to mbikovitsky/glasgow that referenced this issue Aug 16, 2024
mbikovitsky added a commit to mbikovitsky/glasgow that referenced this issue Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gateware Component: gateware
Projects
None yet
Development

No branches or pull requests

2 participants