
Commit cf0ccef

sweeksBigBlue, jcfarwe, and curtisan-intel authored and committed
Floating point square root
Add a base class for floating point square root.
Implement a basic floating point square root module. Mantissa is variable, but limited to odd numbers from 1 to 51. No rounding implemented.
Implement a fixed point square root for values with three integral bits and up to 51 fractional bits.
Add a fixed point square root method.
Add tests for fixed-point and floating-point square root.

Implements intel#171

Co-authored-by: James Farwell <james.c.farwell@intel.com>
Co-authored-by: Curtis Anderson <curtis.anderson@intel.com>
Signed-off-by: Stephen Weeks <stephen.weeks@intel.com>
Signed-off-by: James Farwell <james.c.farwell@intel.com>
Signed-off-by: Curtis Anderson <curtis.anderson@intel.com>
1 parent f1135e3 commit cf0ccef

10 files changed, +624 -0 lines changed

doc/components/fixed_point.md

+4
@@ -25,3 +25,7 @@ Currently, the FloatToFixed converter, when in lossy mode, is not performing any
 ## Float8ToFixed

 This component converts an 8-bit floating-point (FP8) representation ([FloatingPoint8E4M3Value](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint8E4M3Value-class.html) or [FloatingPoint8E5M2Value](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint8E5M2Value-class.html)) to a signed fixed-point representation. This component allows the same hardware to be used for both FP8 formats. Therefore, both input and output are of type [Logic](https://intel.github.io/rohd/rohd/Logic-class.html) and can be cast from/to floating point/fixed point by the producer/consumer based on the selected `mode`. Infinities and NaNs are not supported. The output width is 33 bits to accommodate [FloatingPoint8E5M2Value](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint8E5M2Value-class.html) without loss.
+
+## FixedPointSqrt
+
+This component computes the square root of a 3.x fixed-point value, returning a result in the same format. The square root value is rounded to the requested number of bits. The integral part must be 3 bits, and the fractional part may be any odd number of bits up to 51. Even fractional widths and integral widths other than 3 are currently not supported.
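
To make the "3.x" format in the FixedPointSqrt description concrete, here is a minimal pure-Dart sketch of how a value and its square root map onto raw integers with `n` fractional bits. It does not use ROHD-HCL; `toRaw`, `fromRaw`, and `referenceSqrtRaw` are hypothetical helper names, and the model truncates, so it may differ from the hardware result in the last bit.

```dart
import 'dart:math' as math;

/// Hypothetical helper: encode a non-negative [value] as an unsigned
/// fixed-point integer with [n] fractional bits, truncating extra bits.
int toRaw(double value, int n) => (value * (1 << n)).floor();

/// Hypothetical helper: decode a raw integer with [n] fractional bits.
double fromRaw(int raw, int n) => raw / (1 << n);

/// Reference square root in the same format, truncated to [n] fractional
/// bits. Doubles are used for brevity, so this is only exact for small [n].
int referenceSqrtRaw(int raw, int n) => toRaw(math.sqrt(fromRaw(raw, n)), n);

void main() {
  const n = 7; // an odd fractional width
  final raw = toRaw(2.25, n); // 2.25 fits within 3 integral bits
  final root = referenceSqrtRaw(raw, n);
  print('raw=$raw root=$root value=${fromRaw(root, n)}');
}
```

A model like this could serve as a quick sanity check for small widths; for widths near 51 an integer-only reference would be needed.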

doc/components/floating_point.md

+11
@@ -83,6 +83,17 @@ Currently, the [FloatingPointAdderSimple](https://intel.github.io/rohd-hcl/rohd_

 A second [FloatingPointAdderRound](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPointAdderRound-class.html) component is available which does perform rounding. It is based on "Delay-Optimized Implementation of IEEE Floating-Point Addition", by Peter-Michael Seidel and Guy Even, using an R-path to process far-apart exponents with rounding and an N-path for exponents within 2 and subtraction, which is exact. If you pass in an optional clock, a pipe stage will be added to help optimize frequency; an optional reset and enable can control the pipe stage.

+## FloatingPointSqrt
+
+A very basic [FloatingPointSqrtSimple] component is available which does not perform any
+rounding. It only operates on mantissas with an odd number of bits (1, 3, 5, etc.), up to a width of 51. It takes one
+[FloatingPoint](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint-class.html) [LogicStructure](https://intel.github.io/rohd/rohd/LogicStructure-class.html) and
+performs a square root on it, returning the [FloatingPoint](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint-class.html) value on the output.
+
+Currently, the [FloatingPointSqrtSimple](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPointSqrtSimple-class.html) is close, but not exact, in accuracy (as it has no rounding) and is not
+optimized for circuit performance, but provides the key functionalities of floating-point square root. Still, this component is a starting point for more realistic
+floating-point components that leverage the logical [FloatingPoint](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint-class.html) and literal [FloatingPointValue](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPointValue-class.html) type abstractions.
+
 ## FloatingPointMultiplier

 A very basic [FloatingPointMultiplierSimple] component is available which does not perform any rounding. It takes two [FloatingPoint](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint-class.html) [LogicStructure](https://intel.github.io/rohd/rohd/LogicStructure-class.html)s and multiplies them, returning a normalized [FloatingPoint](https://intel.github.io/rohd-hcl/rohd_hcl/FloatingPoint-class.html) on the output 'product'.
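
For the FloatingPointSqrt section added in this hunk, the underlying arithmetic can be sketched in plain Dart: halve the unbiased exponent, and when it is odd fold the leftover factor of 2 into the significand before taking the root. This is a software illustration on doubles, not the component's implementation; `sqrtByParts` is a hypothetical name.

```dart
import 'dart:math' as math;

/// Sketch: square root of significand * 2^exponent, where [significand]
/// is in [1, 2) and [exponent] is the unbiased exponent.
double sqrtByParts(double significand, int exponent) {
  // Arithmetic shift right floors toward negative infinity, so negative
  // odd exponents are handled too (for example, -3 >> 1 == -2).
  final halfExp = exponent >> 1;
  final isOdd = (exponent & 1) == 1;
  // An odd exponent leaves a factor of 2, which moves into the significand.
  // The adjusted significand is then in [1, 4).
  final adjusted = isOdd ? significand * 2 : significand;
  return math.sqrt(adjusted) * math.pow(2.0, halfExp);
}

void main() {
  print(sqrtByParts(1.0, 4)); // even exponent
  print(sqrtByParts(1.5, -3)); // odd, negative exponent
}
```

Because the adjusted significand can reach just under 4, a fixed-point square root with a few integral bits is enough to cover it.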

lib/src/arithmetic/arithmetic.dart

+1
@@ -6,6 +6,7 @@ export 'arithmetic_utils.dart';
 export 'carry_save_mutiplier.dart';
 export 'compound_adder.dart';
 export 'divider.dart';
+export 'fixed_sqrt.dart';
 export 'fixed_to_float.dart';
 export 'float_to_fixed.dart';
 export 'floating_point/floating_point.dart';

lib/src/arithmetic/fixed_sqrt.dart

+100
@@ -0,0 +1,100 @@
+// Copyright (C) 2025 Intel Corporation
+// SPDX-License-Identifier: BSD-3-Clause
+//
+// fixed_sqrt.dart
+// An abstract base class defining the API for fixed-point square root.
+//
+// 2025 March 3
+// Authors: James Farwell <james.c.farwell@intel.com>, Stephen
+// Weeks <stephen.weeks@intel.com>
+
+/// An abstract API for fixed point square root.
+library;
+
+import 'package:meta/meta.dart';
+import 'package:rohd/rohd.dart';
+import 'package:rohd_hcl/rohd_hcl.dart';
+
+/// Abstract base class for fixed-point square root.
+abstract class FixedPointSqrtBase extends Module {
+  /// Width of the input and output fields.
+  final int numWidth;
+
+  /// The value [a], named this way to allow for a local variable 'a'.
+  @protected
+  late final FixedPoint a;
+
+  /// getter for the computed output.
+  late final FixedPoint sqrtF = a.clone(name: 'sqrtF')..gets(output('sqrtF'));
+
+  /// Square root a fixed point number [a], returning result in [sqrtF].
+  FixedPointSqrtBase(FixedPoint a,
+      {super.name = 'fixed_point_square_root', String? definitionName})
+      : numWidth = a.width,
+        super(
+            definitionName:
+                definitionName ?? 'FixedPointSquareRoot${a.width}') {
+    this.a = a.clone(name: 'a')..gets(addInput('a', a, width: a.width));
+
+    addOutput('sqrtF', width: numWidth);
+  }
+}
+
+/// Fixed-point square root implementation.
+/// The algorithm is explained here:
+/// https://projectf.io/posts/square-root-in-verilog/
+class FixedPointSqrt extends FixedPointSqrtBase {
+  /// Constructor
+  FixedPointSqrt(super.a) {
+    Logic solution =
+        FixedPoint(signed: a.signed, name: 'solution', m: a.m + 1, n: a.n + 1);
+    Logic remainder =
+        FixedPoint(signed: a.signed, name: 'remainder', m: a.m + 1, n: a.n + 1);
+    Logic subtractionValue =
+        FixedPoint(signed: a.signed, name: 'subValue', m: a.m + 1, n: a.n + 1);
+    Logic aLoc =
+        FixedPoint(signed: a.signed, name: 'aLoc', m: a.m + 1, n: a.n + 1);
+
+    solution = Const(0, width: aLoc.width);
+    remainder = Const(0, width: aLoc.width);
+    subtractionValue = Const(0, width: aLoc.width);
+    aLoc = [Const(0), a, Const(0)].swizzle();
+
+    final outputSqrt = a.clone(name: 'sqrtF');
+    output('sqrtF') <= outputSqrt;
+
+    // loop once through input value
+    for (var i = 0; i < ((numWidth + 2) >> 1); i++) {
+      // append bits from a, two at a time
+      remainder = [
+        remainder.slice(numWidth + 2 - 3, 0),
+        aLoc.slice(aLoc.width - 1 - (i * 2), aLoc.width - 2 - (i * 2))
+      ].swizzle();
+      subtractionValue =
+          [solution.slice(numWidth + 2 - 3, 0), Const(1, width: 2)].swizzle();
+      solution = [
+        solution.slice(numWidth + 2 - 2, 0),
+        subtractionValue.lte(remainder)
+      ].swizzle();
+      remainder = mux(subtractionValue.lte(remainder),
+          remainder - subtractionValue, remainder);
+    }
+
+    // loop again to finish remainder
+    for (var i = 0; i < ((numWidth + 2) >> 1) - 1; i++) {
+      // don't try to append bits from a, they are done
+      remainder =
+          [remainder.slice(numWidth + 2 - 3, 0), Const(0, width: 2)].swizzle();
+      subtractionValue =
+          [solution.slice(numWidth + 2 - 3, 0), Const(1, width: 2)].swizzle();
+      solution = [
+        solution.slice(numWidth + 2 - 2, 0),
+        subtractionValue.lte(remainder)
+      ].swizzle();
+      remainder = mux(subtractionValue.lte(remainder),
+          remainder - subtractionValue, remainder);
+    }
+    solution = solution + 1;
+    outputSqrt <= solution.slice(aLoc.width - 1, aLoc.width - a.width);
+  }
+}
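
The loops in `FixedPointSqrt` follow the digit-by-digit (restoring) square root scheme: bring down two input bits, try to subtract the current root with `01` appended, and shift a 1 or 0 into the root depending on whether the subtraction fits. A minimal integer sketch of that scheme in plain Dart follows; `isqrt` is a hypothetical name, and this software model leaves out the final `+ 1` and output slicing that the hardware version applies.

```dart
/// Digit-by-digit (restoring) integer square root.
/// Returns floor(sqrt(x)) for a non-negative [x] that fits in [width] bits.
/// [width] must be even so the input can be consumed two bits at a time.
int isqrt(int x, int width) {
  var root = 0;
  var remainder = 0;
  for (var i = width - 2; i >= 0; i -= 2) {
    // Bring down the next two bits of x.
    remainder = (remainder << 2) | ((x >> i) & 0x3);
    // Trial subtrahend: the current root with binary "01" appended.
    final trial = (root << 2) | 1;
    if (trial <= remainder) {
      remainder -= trial;
      root = (root << 1) | 1; // subtraction fits: next root bit is 1
    } else {
      root = root << 1; // does not fit: next root bit is 0
    }
  }
  return root;
}

void main() {
  print(isqrt(9, 4));
  // Appending 2k zero bits to the input yields k fractional result bits:
  // here k = 4, so the result is sqrt(2) scaled by 16.
  print(isqrt(2 << 8, 10));
}
```

Appending pairs of zero bits, as the second hardware loop does, extends the same iteration into the fractional part of the result.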

lib/src/arithmetic/floating_point/floating_point.dart

+2
@@ -8,4 +8,6 @@ export 'floating_point_converter.dart';
 export 'floating_point_multiplier.dart';
 export 'floating_point_multiplier_simple.dart';
 export 'floating_point_rounding.dart';
+export 'floating_point_sqrt.dart';
+export 'floating_point_sqrt_simple.dart';
 export 'floating_point_utilities.dart';

lib/src/arithmetic/floating_point/floating_point_sqrt.dart

+83
@@ -0,0 +1,83 @@
+// Copyright (C) 2025 Intel Corporation
+// SPDX-License-Identifier: BSD-3-Clause
+//
+// floating_point_sqrt.dart
+// An abstract base class defining the API for floating-point square root.
+//
+// 2025 March 3
+// Authors: James Farwell <james.c.farwell@intel.com>,
+// Stephen Weeks <stephen.weeks@intel.com>,
+// Curtis Anderson <curtis.anderson@intel.com>
+
+import 'package:meta/meta.dart';
+import 'package:rohd/rohd.dart';
+import 'package:rohd_hcl/rohd_hcl.dart';
+
+/// An abstract API for floating point square root.
+abstract class FloatingPointSqrt<FpType extends FloatingPoint> extends Module {
+  /// Width of the output exponent field.
+  final int exponentWidth;
+
+  /// Width of the output mantissa field.
+  final int mantissaWidth;
+
+  /// The [clk]: if a non-null clock signal is passed in, a pipestage is added
+  /// to the square root to help optimize frequency.
+  @protected
+  late final Logic? clk;
+
+  /// Optional [reset], used only if [clk] is not null to reset the pipeline
+  /// flops.
+  @protected
+  late final Logic? reset;
+
+  /// Optional [enable], used only if [clk] is not null to enable the pipeline
+  /// flops.
+  @protected
+  late final Logic? enable;
+
+  /// The value [a], named this way to allow for a local variable 'a'.
+  @protected
+  late final FpType a;
+
+  /// getter for the computed [FloatingPoint] output.
+  late final FloatingPoint sqrtR = (a.clone(name: 'sqrtR') as FpType)
+    ..gets(output('sqrtR'));
+
+  /// getter for the [error] output.
+  late final Logic error = Logic(name: 'error')..gets(output('error'));
+
+  /// The internal error signal to pass through
+  late final Logic errorSig;
+
+  /// Square root a floating point number [a], returning result in [sqrtR].
+  /// - [clk], [reset], [enable] are optional inputs to control a pipestage
+  /// (only inserted if [clk] is provided)
+  FloatingPointSqrt(FpType a,
+      {Logic? clk,
+      Logic? reset,
+      Logic? enable,
+      super.name = 'floating_point_square_root',
+      String? definitionName})
+      : exponentWidth = a.exponent.width,
+        mantissaWidth = a.mantissa.width,
+        super(
+            definitionName: definitionName ??
+                'FloatingPointSquareRoot_E${a.exponent.width}'
+                    'M${a.mantissa.width}') {
+    this.clk = (clk != null) ? addInput('clk', clk) : null;
+    this.reset = (reset != null) ? addInput('reset', reset) : null;
+    this.enable = (enable != null) ? addInput('enable', enable) : null;
+    this.a = (a.clone(name: 'a') as FpType)
+      ..gets(addInput('a', a, width: a.width));
+
+    addOutput('sqrtR', width: exponentWidth + mantissaWidth + 1);
+    errorSig = Logic(name: 'error');
+    addOutput('error');
+    output('error') <= errorSig;
+  }
+
+  /// Pipelining helper that uses the context for signals clk/enable/reset
+  Logic localFlop(Logic input) =>
+      condFlop(clk, input, en: enable, reset: reset);
+}

lib/src/arithmetic/floating_point/floating_point_sqrt_simple.dart

+102
@@ -0,0 +1,102 @@
+// Copyright (C) 2025 Intel Corporation
+// SPDX-License-Identifier: BSD-3-Clause
+//
+// floating_point_sqrt_simple.dart
+// A simple implementation of floating-point square root.
+//
+// 2025 March 4
+// Authors: James Farwell <james.c.farwell@intel.com>,
+// Stephen Weeks <stephen.weeks@intel.com>,
+// Curtis Anderson <curtis.anderson@intel.com>
+
+import 'package:rohd/rohd.dart';
+import 'package:rohd_hcl/rohd_hcl.dart';
+
+/// A square root module for FloatingPoint values
+class FloatingPointSqrtSimple<FpType extends FloatingPoint>
+    extends FloatingPointSqrt<FpType> {
+  /// Square root one floating point number [a], returning results
+  /// [sqrtR] and [error]
+  FloatingPointSqrtSimple(super.a,
+      {super.clk,
+      super.reset,
+      super.enable,
+      super.name = 'floatingpoint_square_root_simple'})
+      : super(
+            definitionName: 'FloatingPointSquareRootSimple_'
+                'E${a.exponent.width}M${a.mantissa.width}') {
+    final outputSqrt = FloatingPoint(
+        exponentWidth: exponentWidth,
+        mantissaWidth: mantissaWidth,
+        name: 'sqrtR');
+    output('sqrtR') <= outputSqrt;
+
+    // check to see if we do sqrt at all or just return a
+    final isInf = a.isAnInfinity.named('isInf');
+    final isNaN = a.isNaN.named('isNan');
+    final isZero = a.isAZero.named('isZero');
+    final enableSqrt = ~((isInf | isNaN | isZero) | a.sign).named('enableSqrt');
+
+    // debias the exponent
+    final deBiasAmt = (1 << a.exponent.width - 1) - 1;
+
+    // deBias math
+    final deBiasExp = a.exponent - deBiasAmt;
+
+    // shift exponent (arithmetic shift right by 1, keeping the sign bit)
+    final shiftedExp =
+        [deBiasExp[-1], deBiasExp.slice(a.exponent.width - 1, 1)].swizzle();
+
+    // check if exponent was odd
+    final isExpOdd = deBiasExp[0];
+
+    // use fixed sqrt unit
+    final aFixed = FixedPoint(signed: false, m: 3, n: a.mantissa.width);
+    aFixed <= [Const(1, width: 3), a.mantissa.getRange(0)].swizzle();
+
+    // shift left by 1 if the exponent was odd
+    final aFixedAdj = aFixed.clone()
+      ..gets(mux(isExpOdd, [aFixed.slice(-2, 0), Const(0)].swizzle(), aFixed)
+          .named('oddMantissaMux'));
+
+    // mux to choose if we do square root or not
+    final fixedSqrt = aFixedAdj.clone()
+      ..gets(mux(enableSqrt, FixedPointSqrt(aFixedAdj).sqrtF, aFixedAdj)
+          .named('sqrtMux'));
+
+    // convert back to floating point representation
+    final fpSqrt = FixedToFloat(fixedSqrt,
+        exponentWidth: a.exponent.width, mantissaWidth: a.mantissa.width);
+
+    // final calculation results
+    Combinational([
+      errorSig < Const(0),
+      If.block([
+        Iff(isInf & ~a.sign, [
+          outputSqrt < outputSqrt.inf(),
+        ]),
+        ElseIf(isInf & a.sign, [
+          outputSqrt < outputSqrt.inf(negative: true),
+          errorSig < Const(1),
+        ]),
+        ElseIf(isNaN, [
+          outputSqrt < outputSqrt.nan,
+        ]),
+        ElseIf(isZero, [
+          outputSqrt.sign < a.sign,
+          outputSqrt.exponent < a.exponent,
+          outputSqrt.mantissa < a.mantissa,
+        ]),
+        ElseIf(a.sign, [
+          outputSqrt < outputSqrt.nan,
+          errorSig < Const(1),
+        ]),
+        Else([
+          outputSqrt.sign < a.sign,
+          outputSqrt.exponent < (shiftedExp + deBiasAmt),
+          outputSqrt.mantissa < fpSqrt.float.mantissa,
+        ])
+      ])
+    ]);
+  }
+}
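
The `If.block` at the end of `FloatingPointSqrtSimple` encodes a priority order for special inputs. The plain-Dart sketch below mirrors that order on doubles so the intended value and `error` flag for each case are easy to read off. It is a behavioral illustration only; `SqrtResult` and `sqrtModel` are hypothetical names, and returning negative infinity with the error flag for a negative-infinity input follows this diff rather than any external specification.

```dart
import 'dart:math' as math;

/// Hypothetical result type: the returned value plus the error flag.
class SqrtResult {
  final double value;
  final bool error;
  const SqrtResult(this.value, {this.error = false});
}

/// Mirrors the case order of the hardware block on Dart doubles.
SqrtResult sqrtModel(double a) {
  if (a.isInfinite) {
    // +inf passes through; -inf is returned as -inf with the error flag set.
    return SqrtResult(a, error: a.isNegative);
  }
  if (a.isNaN) return const SqrtResult(double.nan);
  if (a == 0) return SqrtResult(a); // zero passes through, keeping its sign
  if (a.isNegative) return const SqrtResult(double.nan, error: true);
  return SqrtResult(math.sqrt(a)); // ordinary positive input
}

void main() {
  for (final x in [4.0, -1.0, 0.0, double.infinity]) {
    final r = sqrtModel(x);
    print('sqrt($x) -> ${r.value}, error=${r.error}');
  }
}
```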

lib/src/arithmetic/signals/fixed_point_logic.dart

+11
@@ -123,6 +123,13 @@ class FixedPoint extends Logic {
     }
   }

+  /// Multiply
+  Logic fpMultiply(dynamic other) {
+    _verifyCompatible(other);
+    final product = Multiply(this, other).out;
+    return FixedPoint.of(product, signed: false, m: product.width - n, n: n);
+  }
+
   /// Greater-than.
   @override
   Logic operator >(dynamic other) => gt(other);
@@ -131,6 +138,10 @@ class FixedPoint extends Logic {
   @override
   Logic operator >=(dynamic other) => gte(other);

+  /// multiply
+  @override
+  Logic operator *(dynamic other) => fpMultiply(other);
+
   @override
   Logic eq(dynamic other) {
     _verifyCompatible(other);
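
As background for the `fpMultiply` method added above, here is a small integer model of unsigned fixed-point multiplication in plain Dart. It only illustrates the general arithmetic: the raw product of two values with `n` fractional bits each carries `2 * n` fractional bits. Whether and where `fpMultiply` rescales is not visible in this hunk, so nothing here should be read as a description of its behavior; `mulRawFull` and `rescale` are hypothetical names.

```dart
/// Multiply two unsigned fixed-point raw values. If each operand has n
/// fractional bits, the returned product has 2 * n fractional bits.
int mulRawFull(int a, int b) => a * b;

/// Bring a full product back to [n] fractional bits by truncation.
int rescale(int fullProduct, int n) => fullProduct >> n;

void main() {
  const n = 4;
  const a = 24; // 1.5 with 4 fractional bits
  const b = 36; // 2.25 with 4 fractional bits
  final full = mulRawFull(a, b); // 3.375 with 8 fractional bits
  final scaled = rescale(full, n); // 3.375 with 4 fractional bits
  print('full=$full scaled=$scaled value=${scaled / (1 << n)}');
}
```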
