Low parametric closed-loop sensitivity realizations using fixed-point and floating-point arithmetic

The Specialized Implicit Form provides a general framework for the analysis and design of digital controller implementations with minimal finite wordlength effects. This paper proposes a measure of the closed-loop transfer function sensitivity to finite wordlength effects that is generalized for both fixed-point and floating-point arithmetic and can be used with the Specialized Implicit Form to analyze the effects of quantization and rounding on the parameters of a digital controller implementation. The measure is computationally tractable and hence amenable to solving the problem of minimizing the parametric sensitivity FWL effect. Furthermore, the sensitivity to the rounding of each individual parameter can be easily obtained. The use of the measure is illustrated with examples.


I. INTRODUCTION
When control systems are implemented in digital hardware, rounding and quantization occur on the variables and constants in the controller because of the finite-precision nature of number storage in the computing device: a number must be represented with a finite wordlength (FWL). This finite precision has two main effects (often known as the FWL effects). The first is the addition of noise into the system resulting from the rounding of variables before and after each arithmetic operation, the so-called "round-off noise". The second is the degradation in performance and/or stability resulting from quantization of the controller coefficients/parameters, known as "parametric sensitivity", with which this paper is concerned. For most low-order controllers, the FWL effects are insignificant, but for higher-order controllers, particularly when fast sampling is used, the FWL effects can become significant. However, it is well known that the FWL effects are dependent upon the controller realization, hence there has been a great deal of work in determining realizations that minimize the FWL effects in some sense, e.g. [1], [2], [3]. It is also well known that the FWL effects are dependent on the choice of dynamic operator. The δ-operator, for example, generally has much better numerical properties than the usual delay operator, q, for control systems with fast sampling, e.g. [4]. Unsurprisingly, the FWL effects are also dependent upon the choice of arithmetic, either fixed-point or floating-point.
Email: thibault.hilaire@irisa.fr, philippe.chevrel@emn.fr and j.f.whidborne@cranfield.ac.uk
The problem of finding the optimal realization for minimal FWL effects is usually addressed in the state space, e.g. [5], [1], [3]. Briefly, if the controller is $K(s) = C(\sigma I - A)^{-1}B + D$, where $\sigma$ is usually the transform of the δ or q-operator, the problem is to search over the set $\{ CT(\sigma I - T^{-1}AT)^{-1}T^{-1}B + D : T \text{ a non-singular matrix} \}$ to find a $T$, and hence a controller realization, that is insensitive to FWL effects. The limitations of this approach are that (i) there are many realizations that cannot be expressed in standard state-space form and (ii) the search is confined to a single operator. The δ-operator is more complex to implement than the q-operator, so in some circumstances it may be best to have a mix of operators. These limitations may be overcome by use of the Specialized Implicit Form for the controller [6], [7], [8].
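The invariance that makes this search possible can be checked numerically. The following is a minimal sketch, assuming the usual change of state coordinates $(T^{-1}AT, T^{-1}B, CT, D)$; the second-order controller and the matrix `T` are hypothetical values chosen for illustration, and exact rational arithmetic is used so that the comparison is not blurred by rounding. The impulse-response samples $D, CB, CAB, \ldots$ determine the transfer function, so equal samples mean an unchanged transfer function.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(T):
    """Exact inverse of a non-singular 2x2 matrix."""
    (a, b), (c, d) = T
    det = a * d - b * c
    if det == 0:
        raise ValueError("T must be non-singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def markov(A, B, C, D, count):
    """First `count` impulse-response samples D, CB, CAB, CA^2B, ..."""
    samples, AkB = [D], B
    for _ in range(count - 1):
        samples.append(matmul(C, AkB))
        AkB = matmul(A, AkB)
    return samples

def transform(A, B, C, D, T):
    """Change of state coordinates: (T^-1 A T, T^-1 B, C T, D)."""
    Ti = inv2(T)
    return matmul(matmul(Ti, A), T), matmul(Ti, B), matmul(C, T), D

# Hypothetical second-order SISO controller and transformation (illustration only).
A = [[F(9, 10), F(1, 5)], [F(0), F(1, 2)]]
B = [[F(1)], [F(2)]]
C = [[F(3), F(-1)]]
D = [[F(1, 4)]]
T = [[F(2), F(1)], [F(1), F(3)]]

original = markov(A, B, C, D, 8)
transformed = markov(*transform(A, B, C, D, T), 8)
```

The coefficients of the transformed realization differ from the original ones, which is exactly why their quantization can have a different effect even though the infinite-precision behaviour is identical.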
This paper proposes a measure of the sensitivity of the closed-loop transfer function to the controller parameter rounding that is based on a measure proposed by [1] and is suitable for use with the specialized implicit form. The measure is generalized for either fixed or floating-point arithmetic and can be easily computed. In the next section, the specialized implicit form is introduced. In section III, the quantization of the controller coefficients is discussed, and in the following section, the sensitivity measure is proposed and the means of its evaluation provided. In section V, the problem of determining a realization with minimal parametric sensitivity is posed and some examples provided.

A. The Specialized Implicit Form
Many controller forms, such as lattice filters and δ-operator controllers, use intermediate variables, and hence cannot be expressed in the traditional state-space form. The Specialized Implicit Form allows a description, in a single equation, of almost any implementation. Furthermore, it provides an explicit description of the parametrization and allows the analysis of the FWL effects, while remaining a macroscopic description. The description takes the form of an implicit state-space system [9] and is given by

$$\begin{pmatrix} J & 0 & 0 \\ -K & I_n & 0 \\ -L & 0 & I_p \end{pmatrix} \begin{pmatrix} T(k+1) \\ X(k+1) \\ Y(k) \end{pmatrix} = \begin{pmatrix} 0 & M & N \\ 0 & P & Q \\ 0 & R & S \end{pmatrix} \begin{pmatrix} T(k) \\ X(k) \\ U(k) \end{pmatrix} \tag{1}$$

where U(k) is the input, Y(k) is the output, and
• T(k+1) is the intermediate variable in the calculations of step k (the column of 0's in the second matrix shows that T(k) is not used for the calculation at step k; this characterizes the concept of an intermediate variable),
• X(k+1) is the stored state-vector (X(k) is effectively stored from one step to the next, in order to compute X(k+1) at step k).

T(k+1) and X(k+1) form the descriptor-vector: X(k+1) is stored from one step to the next, while T(k+1) is computed and used within one time step. It is implicitly assumed throughout the paper that the computations associated with the realization (1) are executed in row order, giving the following algorithm:
[i] J T(k+1) ← M X(k) + N U(k)
[ii] X(k+1) ← K T(k+1) + P X(k) + Q U(k)
[iii] Y(k) ← L T(k+1) + R X(k) + S U(k)
Note that in practice, steps [ii] and [iii] could be exchanged to reduce the computational delay. Also note that because the computations are executed in row order and J is lower triangular with 1's on the main diagonal, there is no need to compute $J^{-1}$.

Equation (1) is equivalent in infinite precision to the classical state-space form

$$X(k+1) = A_Z X(k) + B_Z U(k), \qquad Y(k) = C_Z X(k) + D_Z U(k) \tag{2}$$

with $A_Z = KJ^{-1}M + P$, $B_Z = KJ^{-1}N + Q$, $C_Z = LJ^{-1}M + R$ and $D_Z = LJ^{-1}N + S$. Note that (2) corresponds to a different parametrization than (1). The system transfer function is given by

$$H(z) = C_Z (zI_n - A_Z)^{-1} B_Z + D_Z$$
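The row-order algorithm and its equivalence with the classical state-space form can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes that the three rows of the implicit form read J T(k+1) = M X(k) + N U(k), X(k+1) = K T(k+1) + P X(k) + Q U(k) and Y(k) = L T(k+1) + R X(k) + S U(k), and all numerical values and function names are hypothetical. Forward substitution is used in step [i], so $J^{-1}$ is never formed during simulation.

```python
from fractions import Fraction as F

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(X, Y):
    """Matrix-matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def madd(X, Y):
    """Entrywise matrix sum."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def vadd(*vectors):
    """Entrywise sum of several vectors."""
    return [sum(parts) for parts in zip(*vectors)]

def forward_substitution(J, rhs):
    """Solve J t = rhs for unit lower-triangular J without forming J^-1."""
    t = []
    for i, row in enumerate(J):
        t.append(rhs[i] - sum(row[j] * t[j] for j in range(i)))
    return t

def implicit_step(Z, x, u):
    """One pass through the three rows of the implicit form, in row order."""
    J, K, L, M, N, P, Q, R, S = Z
    t = forward_substitution(J, vadd(matvec(M, x), matvec(N, u)))   # step [i]
    x_next = vadd(matvec(K, t), matvec(P, x), matvec(Q, u))         # step [ii]
    y = vadd(matvec(L, t), matvec(R, x), matvec(S, u))              # step [iii]
    return x_next, y

def solve_columns(J, M):
    """J^-1 M, computed column by column with forward substitution."""
    cols = [forward_substitution(J, list(col)) for col in zip(*M)]
    return [list(row) for row in zip(*cols)]

def explicit_form(Z):
    """Equivalent classical state-space matrices (A, B, C, D)."""
    J, K, L, M, N, P, Q, R, S = Z
    JiM, JiN = solve_columns(J, M), solve_columns(J, N)
    return (madd(matmul(K, JiM), P), madd(matmul(K, JiN), Q),
            madd(matmul(L, JiM), R), madd(matmul(L, JiN), S))

def simulate_implicit(Z, inputs):
    x, outputs = [F(0)] * len(Z[5]), []
    for u in inputs:
        x, y = implicit_step(Z, x, [u])
        outputs.append(y[0])
    return outputs

def simulate_explicit(A, B, C, D, inputs):
    x, outputs = [F(0)] * len(A), []
    for u in inputs:
        y = vadd(matvec(C, x), matvec(D, [u]))
        x = vadd(matvec(A, x), matvec(B, [u]))
        outputs.append(y[0])
    return outputs

# Hypothetical realization: l = 2 intermediate variables, n = 2 states, m = p = 1.
Z = (
    [[F(1), F(0)], [F(1, 2), F(1)]],          # J (unit lower triangular)
    [[F(1, 4), F(0)], [F(0), F(1, 3)]],       # K
    [[F(1), F(-1)]],                          # L
    [[F(1, 2), F(1)], [F(0), F(1, 5)]],       # M
    [[F(1)], [F(2)]],                         # N
    [[F(1, 2), F(0)], [F(1, 10), F(1, 4)]],   # P
    [[F(1)], [F(0)]],                         # Q
    [[F(2), F(1)]],                           # R
    [[F(3)]],                                 # S
)
inputs = [F(1), F(0), F(-2), F(1, 2), F(0), F(0)]
y_implicit = simulate_implicit(Z, inputs)
y_explicit = simulate_explicit(*explicit_form(Z), inputs)
```

With exact rational arithmetic the two output sequences coincide, which is the "equivalent in infinite precision" statement. The point of the implicit form is that, once coefficients are rounded, the nine matrices and the four explicit matrices are no longer rounded in the same way.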

B. Definitions
To complete the framework, the following definitions are required. For further details, see [8], [10].
Definition 1 A realization, $\mathcal{R}$, is defined by the specific set of matrices $J$, $K$, $L$, $M$, $N$, $P$, $Q$, $R$ and $S$ used to describe a realization with the implicit form of (1): $\mathcal{R} := (J, K, L, M, N, P, Q, R, S)$.
Remark 1 $\mathcal{R}$ can also be defined by the matrix $Z \in \mathbb{R}^{(l+n+p)\times(l+n+m)}$ and the dimensions $l$, $m$, $n$ and $p$, so $\mathcal{R}$ could be defined by $\mathcal{R} := (Z, l, m, n, p)$.
Definition 2 $\mathcal{R}_H$ denotes the set of realizations with transfer function $H$. These realizations are said to be equivalent.
In order to encompass realizations with some special structure (q-operator state-space, δ-operator state-space, direct form, cascade, lattice filters, etc.), we define a set of realizations that possess a particular structure.
Definition 3 A structuration S is a set of realizations having a common structure: some coefficients or some dimensions are fixed a priori.
Some examples of common structurations are given in the next section.

Definition 4 $\mathcal{R}_H^{\mathcal{S}}$ is the set of equivalent structured realizations. Realizations from $\mathcal{R}_H^{\mathcal{S}}$ are structured according to $\mathcal{S}$ and have a transfer function $H$. Hence $\mathcal{R}_H^{\mathcal{S}} = \mathcal{R}_H \cap \mathcal{S}$.
Some coefficients of $Z$ are, for example, always set to unity or zero, and are hence not subject to rounding. These are not 'significant coefficients' and hence are not included in the parametrization.
Definition 5 A parametrization of a realization R is the set of coefficients of Z that are significant for the realization.

C. Some examples
The following specialized implicit form describes the realization of a state-space controller using the δ-operator

$$\begin{pmatrix} I_n & 0 & 0 \\ -\Delta I_n & I_n & 0 \\ 0 & 0 & I_p \end{pmatrix} \begin{pmatrix} T(k+1) \\ X(k+1) \\ Y(k) \end{pmatrix} = \begin{pmatrix} 0 & A_\delta & B_\delta \\ 0 & I_n & 0 \\ 0 & C_\delta & D_\delta \end{pmatrix} \begin{pmatrix} T(k) \\ X(k) \\ U(k) \end{pmatrix}$$

So, the δ-structuration $\mathcal{S}_\delta$ is formally defined by

The cascade form is a common realization for filter implementation. It generally has good FWL properties compared to the direct forms. For the cascade form, the filter is decomposed into a number of lower-order (usually first- and second-order) transfer function blocks connected in series. For the next example, we consider two standard q-operator filter blocks connected in series as shown in fig. 1.
If the two state-space realizations $\mathcal{R}_1$ and $\mathcal{R}_2$ are defined by $(A_1, B_1, C_1, D_1)$ and $(A_2, B_2, C_2, D_2)$, then cascading $\mathcal{R}_1$ with $\mathcal{R}_2$ leads to the following realization from which the definition of the structuration $\mathcal{S}$ immediately follows. The output of $\mathcal{R}_1$ is computed in the intermediate variable, and used as the input of $\mathcal{R}_2$.
The main point is that if we consider the equivalent state-space realization, with parameters the parametrization is not the one used in the computations. For a given form, it is generally straightforward to define the structuration. A number of other examples are given in [8] and [6].
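To make the cascade example concrete, the following worked equations give one realization consistent with the description above (the output of the first block is held in the intermediate variable and fed to the second block). This is a sketch derived from that description, not a transcription of the authors' equation; it assumes the implicit form is read row by row as J T(k+1) = M X(k) + N U(k), X(k+1) = K T(k+1) + P X(k) + Q U(k), Y(k) = L T(k+1) + R X(k) + S U(k), with the state stacked as X = (X_1, X_2).

```latex
% Cascade of R_1 = (A_1, B_1, C_1, D_1) followed by R_2 = (A_2, B_2, C_2, D_2),
% with T(k+1) holding the output of R_1 (assumed layout, J = I).
\[
\begin{aligned}
T(k+1) &= \begin{pmatrix} C_1 & 0 \end{pmatrix} X(k) + D_1\, U(k) \\
X(k+1) &= \begin{pmatrix} 0 \\ B_2 \end{pmatrix} T(k+1)
        + \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix} X(k)
        + \begin{pmatrix} B_1 \\ 0 \end{pmatrix} U(k) \\
Y(k) &= D_2\, T(k+1) + \begin{pmatrix} 0 & C_2 \end{pmatrix} X(k)
\end{aligned}
\]
% Eliminating T(k+1) gives the equivalent classical state-space parameters:
\[
A = \begin{pmatrix} A_1 & 0 \\ B_2 C_1 & A_2 \end{pmatrix}, \quad
B = \begin{pmatrix} B_1 \\ B_2 D_1 \end{pmatrix}, \quad
C = \begin{pmatrix} D_2 C_1 & C_2 \end{pmatrix}, \quad
D = D_2 D_1 .
\]
```

The products $B_2 C_1$, $B_2 D_1$, $D_2 C_1$ and $D_2 D_1$ appear only in the eliminated form. They are never stored in the cascade implementation, so rounding them is not the same as rounding the coefficients that are actually stored.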

III. COEFFICIENT QUANTIZATION
The quantization of a coefficient depends on both its value and its representation.
Firstly, if the value of a coefficient is such that it will be quantized without error, then that parameter makes no contribution to the overall coefficient sensitivity. Hence we introduce weighting matrices $W_J$ to $W_S$ (and also $W_Z$), respectively associated with the matrices $J$ to $S$ (and $Z$) of a realization, such that

$$(W_X)_{i,j} \triangleq \begin{cases} 0 & \text{if } X_{i,j} \text{ is exactly implemented} \\ 1 & \text{otherwise} \end{cases}$$

Secondly, different representation schemes may be considered. Here we consider both fixed-point and floating-point representations of coefficients expressed using $\beta$ bits.
A fixed-point coefficient $x$ is represented by , $N$ is an integer coded with $\beta_g$ bits and $\beta_f$ an integer (not stored in the representation) such that $\beta_g + \beta_f + 1 = \beta$. The quantized [ for a normalized floating-point representation) and $e$ is an integer coded with $\beta_e$ bits ($\beta_e + \beta_w + 1 = \beta$). The quantized $x^\dagger$ of $x$ is, in this case, such that x
The choice of $\beta_f$ and $e$ can be unique for each coefficient ($e = \lceil \log_2 |x| \rceil$ and $\beta_f = \beta - 1 - \lceil \log_2 |x| \rceil$, where $\lceil \cdot \rceil$ is the ceiling operator). Alternatively, $\beta_f$ and $e$ are defined for a group of coefficients (in order to reduce the required bit shifts and the subsequent computational cost). This defines the block-fixed-point and block-floating-point schemes. Following [11], we introduce the generalized dynamic range bit length $\beta_r$ ($\beta_r = \beta_g$ or $\beta_e$) and the precision bit length $\beta_p$ ($\beta_p = \beta_f$ or $\beta_w$).
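The difference between per-coefficient and block scaling can be illustrated with a short sketch. It assumes that a fixed-point coefficient is stored as an integer multiple of $2^{-\beta_f}$ and that quantization rounds to the nearest such multiple; both are assumptions of this example, as are the coefficient values and the function names. The rule $\beta_f = \beta - 1 - \lceil \log_2 |x| \rceil$ is the one quoted in the text, and overflow at exact powers of two is not handled here.

```python
import math

def fractional_bits(magnitude, beta):
    """beta_f = beta - 1 - ceil(log2(magnitude)) for a non-zero magnitude."""
    return beta - 1 - math.ceil(math.log2(magnitude))

def quantize(x, beta_f):
    """Round x to the nearest multiple of 2**-beta_f (Python's round, ties to even)."""
    return round(x * 2 ** beta_f) / 2 ** beta_f

def quantize_each(coeffs, beta):
    """Per-coefficient scheme: every coefficient gets its own beta_f."""
    return [quantize(x, fractional_bits(abs(x), beta)) for x in coeffs]

def quantize_block(coeffs, beta):
    """Block scheme: one beta_f for the group, set by the largest magnitude."""
    beta_f = fractional_bits(max(abs(x) for x in coeffs), beta)
    return [quantize(x, beta_f) for x in coeffs]

coeffs = [1.7, 0.3, 0.0123]   # hypothetical non-zero coefficients
beta = 8                      # total wordlength in bits

per_coeff = quantize_each(coeffs, beta)
per_block = quantize_block(coeffs, beta)
err_each = [abs(q - x) for q, x in zip(per_coeff, coeffs)]
err_block = [abs(q - x) for q, x in zip(per_block, coeffs)]
```

In the block scheme the smallest coefficient inherits the scaling of the largest one and loses most of its relative precision, which is the price paid for avoiding per-coefficient bit shifts.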
Usually, the blocks used in block-representation correspond to the matrices $J$ to $S$, but there is no necessity for this, and blocks can be chosen at will. To define the blocks of a realization $\mathcal{R} := (Z, l, m, n, p)$, we introduce the matrix $\eta_Z$ such that This allows a completely general definition of the blocks. Thus there could be just a single unique block, or every block could consist of only one coefficient. For example, denoting $E_{a,b} \in \mathbb{R}^{a \times b}$ as a matrix of 1s and then using a block-representation corresponding to the matrices $J$ to $S$ gives With a single unique block for $Z$ we get and for one block per coefficient we get

Proposition 1 During the quantization process, $Z$ is perturbed to $Z + r_Z \times \Delta$, where $\Delta$ is a matrix dependent on the $\beta_p$ precision bit length, and $\times$ denotes the Schur product. If $\beta_{p\,i,j}$ is the precision bit length of $Z_{i,j}$, then
Proof: The proof comes from the quantized error expressed in (13) and (14).

Remark 2 With this formalism for the different representation schemes, note that the choice of the scale parameter ($e$ or $\beta_f$) is defined for each coefficient ($e_{i,j} = \lceil \log_2 (\eta_Z)_{i,j} \rceil$ and $\beta_{f\,i,j} = \beta - 1 - \lceil \log_2 (\eta_Z)_{i,j} \rceil$) and that it is also possible to define the minimum bit length $\beta$ to code each coefficient without overflow or underflow [8].
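One way to read the block matrix $\eta_Z$ is sketched below. The defining equation is not reproduced here, so this example assumes that entry (i, j) of $\eta_Z$ is the largest magnitude found in the block containing $Z_{i,j}$, which is consistent with Remark 2 using $\log_2 (\eta_Z)_{i,j}$ to set the scale of each coefficient. The matrix `Z`, the label arrays and the function names are hypothetical; the labels "P", "Q", "R", "S" stand for a partition of a realization with no intermediate variable.

```python
import math

def eta(Z, labels):
    """Entry (i, j) is the largest magnitude in the block that Z[i][j] belongs to."""
    peak = {}
    for z_row, l_row in zip(Z, labels):
        for z, label in zip(z_row, l_row):
            peak[label] = max(peak.get(label, 0.0), abs(z))
    return [[peak[label] for label in l_row] for l_row in labels]

def scale_exponents(eta_Z):
    """e_ij = ceil(log2(eta_ij)), following the rule quoted in Remark 2."""
    return [[math.ceil(math.log2(v)) for v in row] for row in eta_Z]

# Hypothetical 3x3 coefficient matrix (two states, one input, one output).
Z = [[0.5, -0.25, 2.0],
     [0.125, 0.75, -1.0],
     [4.0, 0.0625, 0.3]]

by_matrix = [["P", "P", "Q"], ["P", "P", "Q"], ["R", "R", "S"]]   # one block per matrix
single = [["Z"] * 3 for _ in range(3)]                            # one block for all of Z
per_coeff = [[(i, j) for j in range(3)] for i in range(3)]        # one block per entry

eta_matrix = eta(Z, by_matrix)
eta_single = eta(Z, single)
eta_coeff = eta(Z, per_coeff)
exponents = scale_exponents(eta_matrix)
```

The three label arrays correspond to the three block choices mentioned in the text; any other partition can be described in the same way by changing the labels.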

IV. CLOSED-LOOP SENSITIVITY MEASURES
In order to determine the optimal realization, some measure of the closed-loop sensitivity to parametric perturbations is required. A fair number of these have been proposed over the years. Ideally, the chosen measure should be computationally tractable but reasonably representative of the actual perturbations that occur in implementation. Probabilistic measures have been proposed, e.g. [12], as well as small-gain measures [13]. A number of measures of the closed-loop pole sensitivity have been proposed [14], [3], [15], [16]. Here though, we use a closed-loop transfer function sensitivity measure which extends that originally proposed by [1].
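Consider a plant, P, with controller, C, in the standard form shown in fig. 2, where $W(k) \in \mathbb{R}^{p_1}$ and $Z(k) \in \mathbb{R}^{m_1}$ are the exogenous input and output respectively and $U(k) \in \mathbb{R}^{p_2}$ and $Y(k) \in \mathbb{R}^{m_2}$ are the plant control and measured signals respectively.
The plant P is defined by

$$\begin{pmatrix} X_P(k+1) \\ Z(k) \\ Y(k) \end{pmatrix} = \begin{pmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{pmatrix} \begin{pmatrix} X_P(k) \\ W(k) \\ U(k) \end{pmatrix}$$

where $A \in \mathbb{R}^{n_P \times n_P}$, $B_1 \in \mathbb{R}^{n_P \times p_1}$, $B_2 \in \mathbb{R}^{n_P \times p_2}$, $C_1 \in \mathbb{R}^{m_1 \times n_P}$, $C_2 \in \mathbb{R}^{m_2 \times n_P}$, $D_{11} \in \mathbb{R}^{m_1 \times p_1}$, $D_{12} \in \mathbb{R}^{m_1 \times p_2}$, $D_{21} \in \mathbb{R}^{m_2 \times p_1}$, and $D_{22} \in \mathbb{R}^{m_2 \times p_2}$ is assumed to be zero only in order to simplify the expressions. The controller is defined by a realization C := (Z, l, n, m_2, p_2) with transfer function $H$. The closed-loop system $\bar{S}$ is then given by $\bar{S} = F_l(P, H)$, where $F_l(\cdot, \cdot)$ is the well-known lower fractional transform operation [17], and is described by $\bar{A} \in \mathbb{R}^{(n_P+n) \times (n_P+n)}$, $\bar{B} \in \mathbb{R}^{(n_P+n) \times p_1}$, $\bar{C} \in \mathbb{R}^{m_1 \times (n_P+n)}$ and $\bar{D} \in \mathbb{R}^{m_1 \times p_1}$ such that

$$\bar{A} = \begin{pmatrix} A + B_2 D_Z C_2 & B_2 C_Z \\ B_Z C_2 & A_Z \end{pmatrix}, \quad \bar{B} = \begin{pmatrix} B_1 + B_2 D_Z D_{21} \\ B_Z D_{21} \end{pmatrix}, \quad \bar{C} = \begin{pmatrix} C_1 + D_{12} D_Z C_2 & D_{12} C_Z \end{pmatrix}, \quad \bar{D} = D_{11} + D_{12} D_Z D_{21}$$

where $(A_Z, B_Z, C_Z, D_Z)$ are the matrices of the classical state-space form (2) of the controller. The closed-loop transfer function is

$$\bar{H}(z) = \bar{C}(zI - \bar{A})^{-1}\bar{B} + \bar{D}$$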

B. Transfer function sensitivity
In order to evaluate how much the digital approximation of the controller's coefficients (due to FWL implementation) affects the closed-loop transfer function, the transfer function sensitivity $\frac{\partial \bar{H}}{\partial Z}$ can be used. Let $\bar{H}^\dagger \triangleq \bar{H}|_{Z + r_Z \times \Delta}$ denote the closed-loop transfer function $\bar{H}$ perturbed by the quantization process ($Z$ is perturbed to $Z + r_Z \times \Delta$ according to Proposition 1). Then, for the Single Input Single Output (SISO) case, $\forall z \in \mathbb{C}$ and where $\|\cdot\|_2$ is the $L_2$-norm. It is easy to see that and so (29) leads to the following transfer function sensitivity measure:

Definition 6 Consider a realization $\mathcal{R} := (Z, l, m, n, p)$ with representation matrix $r_Z$. The closed-loop transfer function sensitivity with respect to all the non-trivial coefficients of $\mathcal{R}$ is defined in the SISO case by

$$\bar{M}^{W}_{L_2} \triangleq \left\| \frac{\partial \bar{H}}{\partial Z} \times r_Z \right\|_2^2$$

Remark 3 This measure must be linked to the open-loop transfer function sensitivity $\left\| \frac{\partial H}{\partial Z} \times r_Z \right\|_2^2$ previously defined in [7], [18], [10], which derives from Gevers' definition [1].
Remark 4 From (31) and (29), it is possible to ensure that the closed-loop transfer function perturbation is smaller than a certain constant in an L 2 -norm sense. It is also possible to include a frequency weighting to emphasize certain frequency ranges. Furthermore, it is also possible to use an H ∞ -norm to ensure that the closed-loop degradation is constrained over a given frequency range.
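A scalar worked example may help to fix ideas about the size of such a sensitivity. It is an illustration constructed for this purpose, not taken from the paper: the plant is assumed to be first order, $x(k+1) = a\,x(k) + b\,u(k) + w(k)$ with $y = z = x$, and the controller is a static gain $u = s\,y$, so that $s$ is the only controller coefficient.

```latex
% Closed-loop transfer function from w to z and its sensitivity to the gain s
\[
\bar{H}(z) = \frac{1}{z - \bar{a}}, \qquad \bar{a} = a + b\,s, \qquad
\frac{\partial \bar{H}}{\partial s} = \frac{b}{(z - \bar{a})^2}
  = b \sum_{j \ge 1} j\, \bar{a}^{\,j-1} z^{-(j+1)} .
\]
% Squared L2-norm from the impulse response (Parseval), valid for |\bar{a}| < 1
\[
\left\| \frac{\partial \bar{H}}{\partial s} \right\|_2^2
  = b^2 \sum_{j \ge 1} j^2\, \bar{a}^{\,2(j-1)}
  = \frac{b^2 \left( 1 + \bar{a}^2 \right)}{\left( 1 - \bar{a}^2 \right)^3} .
\]
```

The cubic factor in the denominator shows how quickly the measure grows as the closed-loop pole approaches the unit circle, which is the situation created by fast sampling.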
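This measure can be extended to the Multiple Input Multiple Output (MIMO) case. However, it is also useful to be able to consider the contribution of each coefficient to the overall sensitivity. The closed-loop transfer function sensitivity matrix, denoted by $\frac{\delta \bar{H}}{\delta Z}$, is the matrix of the $L_2$-norms of the sensitivity of the transfer function $\bar{H}$ with respect to each coefficient $Z_{i,j}$. It is defined by

$$\left( \frac{\delta \bar{H}}{\delta Z} \right)_{i,j} \triangleq \left\| \frac{\partial \bar{H}}{\partial Z_{i,j}} \right\|_2$$

and allows the evaluation of the overall impact of each coefficient. It can be used to evaluate the overall sensitivity. From the properties of $L_2$-norms, we have

$$\left\| \frac{\partial \bar{H}}{\partial Z} \times r_Z \right\|_2^2 = \left\| \frac{\delta \bar{H}}{\delta Z} \times r_Z \right\|_F^2$$

where $\|\cdot\|_F$ is the Frobenius norm. Definition 6 can now be extended to the general case (MIMO and SISO):

Definition 7 The closed-loop transfer function sensitivity measure is defined by

$$\bar{M}^{W}_{L_2} \triangleq \left\| \frac{\delta \bar{H}}{\delta Z} \times r_Z \right\|_F^2$$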
The transfer function sensitivity $\frac{\partial \bar{H}}{\partial Z}$ is given by the following proposition where $\circledast$ is an operator defined by Vec(·) is the classical operator that vectorizes a matrix, and
Proof: The proof is based on the following lemma:
Lemma 1 Let $X$ be a matrix in $\mathbb{R}^{p \times l}$ and $G$, $H$ be two transfer functions with values in $\mathbb{C}^{m \times p}$ and $\mathbb{C}^{l \times n}$ respectively. $G$ and $H$ are assumed to be independent of $X$. Then

$$\frac{\partial (GXH)}{\partial X} = G \circledast H \tag{41}$$

The proofs can be found in [8].
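The per-coefficient sensitivity matrix and the overall measure can also be approximated numerically, which is a useful cross-check on an analytical implementation. The sketch below is not the analytical expression of the proposition: it uses central finite differences on the closed-loop impulse response and a truncated sum for the $L_2$-norm. The scalar plant, the first-order controller, the arrangement of the controller coefficients as [[a, b], [c, d]] and the all-ones weighting `r_Z` are assumptions made for this illustration.

```python
import math

def closed_loop_impulse(ctrl, plant, steps):
    """Impulse response from w to z for a scalar plant and a first-order controller."""
    a, b, c, d = ctrl          # controller: xc+ = a*xc + b*y, u = c*xc + d*y
    ap, bp = plant             # plant: xp+ = ap*xp + bp*u + w, with y = z = xp
    xp = xc = 0.0
    out = []
    for k in range(steps):
        w = 1.0 if k == 0 else 0.0
        y = xp
        out.append(y)
        u = c * xc + d * y
        xp, xc = ap * xp + bp * u + w, a * xc + b * y
    return out

def l2_norm(samples):
    """L2-norm of a stable transfer function from its (truncated) impulse response."""
    return math.sqrt(sum(s * s for s in samples))

def sensitivity_matrix(ctrl, plant, steps=4000, h=1e-6):
    """L2-norm of the sensitivity to each controller coefficient, by central differences."""
    norms = []
    for i in range(4):
        up, dn = list(ctrl), list(ctrl)
        up[i] += h
        dn[i] -= h
        gp = closed_loop_impulse(up, plant, steps)
        gm = closed_loop_impulse(dn, plant, steps)
        norms.append(l2_norm([(p - m) / (2 * h) for p, m in zip(gp, gm)]))
    # Arranged like the coefficient matrix [[a, b], [c, d]] (assumed layout).
    return [[norms[0], norms[1]], [norms[2], norms[3]]]

def measure(dHdZ, r):
    """Squared Frobenius norm of the entrywise (Schur) product of dHdZ and r."""
    return sum((s * w) ** 2 for s_row, w_row in zip(dHdZ, r) for s, w in zip(s_row, w_row))

PLANT = (1.2, 1.0)                 # hypothetical unstable plant pole at 1.2
CTRL = (0.5, 1.0, -0.2, -0.6)      # hypothetical stabilizing controller
dHdZ = sensitivity_matrix(CTRL, PLANT)
r_Z = [[1.0, 1.0], [1.0, 1.0]]     # every coefficient assumed subject to rounding
total = measure(dHdZ, r_Z)
```

Setting an entry of `r_Z` to zero removes the corresponding coefficient from the measure, which mirrors the role of the weighting matrices for exactly implemented coefficients.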

V. OPTIMAL DESIGN
Since the closed-loop sensitivity measure depends on the realization chosen to numerically realize the controller, it is of interest to find, among the equivalent realizations, those with good closed-loop FWL properties.

A. Equivalent realizations
In order to exploit the potential offered by the specialized implicit form in improving implementations, it is necessary to describe sets of equivalent system realizations. In [10], the Inclusion Principle introduced by Šiljak and Ikeda [19], [20] in the context of decentralized control, is extended to the Specialized Implicit Form in order to characterize equivalent classes of realizations. Although this extension gives the formal description of equivalent classes, it is of practical interest to consider only realizations with the same dimensions, where transformation from one realization to another is only a similarity transformation.

Proposition 3
Consider a realization $\mathcal{R}_0 := (Z_0, l, m, n, p)$. All realizations $\mathcal{R} := (Z, l, m, n, p)$ such that with $U \in \mathbb{R}^{n \times n}$, $Y \in \mathbb{R}^{l \times l}$ and $W \in \mathbb{R}^{l \times l}$ non-singular, are equivalent.
Remark 5 Given a realization $Z_0$ in the cascade form of (10), it is possible to characterize a subset of similarity transformations that preserve the cascade structure. The equivalent realizations with this particular structure are given by the particular similarity transform (specialization of eq. (43)) In the SISO case, the transformation matrix $Y$ is only a scale factor between the first and the second stages of the realization, which can be used to bound coefficients within acceptable limits. Note that in the MIMO case, there is an extra degree of freedom.
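The equivalence claimed in Proposition 3 can be checked by direct substitution. The block-wise action written below is an assumption of this sketch (it is the natural way for non-singular $U$, $Y$ and $W$ of the stated sizes to act on the nine matrices); it is not copied from the authors' equation. The check uses the explicit form obtained by eliminating the intermediate variable, $A = KJ^{-1}M + P$, $B = KJ^{-1}N + Q$, $C = LJ^{-1}M + R$, $D = LJ^{-1}N + S$.

```latex
% Assumed block-wise action of (U, Y, W) on the realization (J_0, ..., S_0)
\[
J = Y J_0 W, \quad K = U^{-1} K_0 W, \quad L = L_0 W, \quad
M = Y M_0 U, \quad N = Y N_0, \quad
P = U^{-1} P_0 U, \quad Q = U^{-1} Q_0, \quad R = R_0 U, \quad S = S_0 .
\]
% Substitution: W and Y cancel inside K J^{-1} M, leaving a similarity by U
\[
\begin{aligned}
K J^{-1} M + P &= U^{-1} K_0 W \, W^{-1} J_0^{-1} Y^{-1} \, Y M_0 U + U^{-1} P_0 U
               = U^{-1} \left( K_0 J_0^{-1} M_0 + P_0 \right) U, \\
K J^{-1} N + Q &= U^{-1} \left( K_0 J_0^{-1} N_0 + Q_0 \right), \\
L J^{-1} M + R &= \left( L_0 J_0^{-1} M_0 + R_0 \right) U, \\
L J^{-1} N + S &= L_0 J_0^{-1} N_0 + S_0 .
\end{aligned}
\]
```

The explicit form is therefore changed only by the state similarity $U$, so the transfer function is preserved, while $Y$ and $W$ provide additional freedom that has no counterpart in the classical state-space search.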

B. Optimal realizations problem
The problem of determining the realization that is best in some sense can be posed as follows:
Problem 1 (Optimal realization problem) Consider a transfer function $H$ and a sensitivity measure $\mathcal{J}$. The optimal design problem is to find the best realization $\mathcal{R}_{opt}$ with transfer function $H$ according to the criterion $\mathcal{J}$, that is

$$\mathcal{R}_{opt} = \arg\min_{\mathcal{R} \in \mathcal{R}_H} \mathcal{J}(\mathcal{R})$$

Due to the size of $\mathcal{R}_H$, this problem generally cannot be solved practically. Hence the following problem is introduced to restrict the search to some particular structurations, each of which is searched using Proposition 3.

Problem 2 (Optimal structured realization problem)
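Consider some structurations $(\mathcal{S}_i)_{1 \le i \le N}$. The problem of finding the optimal structured realization $\mathcal{R}^{\mathcal{S}}_{opt}$ is

$$\mathcal{R}^{\mathcal{S}}_{opt} = \arg\min_{\mathcal{R} \in \bigcup_{1 \le i \le N} \mathcal{R}_H^{\mathcal{S}_i}} \mathcal{J}(\mathcal{R})$$

Since the measure $\mathcal{J}$ could be non-smooth and/or non-convex, the Adaptive Simulated Annealing (ASA) [21], [22] method has been chosen to solve Problem 2. This method has worked well for other optimal realization problems [15].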
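The search over equivalent realizations can be sketched with a plain simulated annealing loop. This is not ASA, whose adaptive schedule is more elaborate, and the cost used here is a deliberately small stand-in rather than the closed-loop measure: it is the open-loop sensitivity of a first-order section $H(z) = cb/(z-a)$ to its input and output coefficients after the state is scaled by $t = e^{s}$, for which the optimum $2|bc|/(1-a^2)$ is known in closed form. All names, values and tuning constants are assumptions of this example.

```python
import math
import random

def sensitivity(s, a=0.5, b=4.0, c=0.25):
    """Toy cost for H(z) = c*b/(z - a) after scaling the state by t = exp(s)."""
    t = math.exp(s)
    b_t, c_t = b / t, c * t            # transformed input and output coefficients
    # ||dH/db_t||_2^2 + ||dH/dc_t||_2^2, each a first-order squared L2-norm
    return (c_t ** 2 + b_t ** 2) / (1.0 - a ** 2)

def anneal(cost, s0, steps=2000, temp0=10.0, cooling=0.995, width=0.5, seed=0):
    """Simulated annealing with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    s, f = s0, cost(s0)
    best_s, best_f = s, f
    temp = temp0
    for _ in range(steps):
        cand = s + rng.gauss(0.0, width)
        f_cand = cost(cand)
        # Always accept improvements; accept a worse point with Boltzmann probability.
        if f_cand <= f or rng.random() < math.exp((f - f_cand) / temp):
            s, f = cand, f_cand
            if f < best_f:
                best_s, best_f = s, f
        temp *= cooling
    return best_s, best_f

initial_cost = sensitivity(0.0)
best_s, best_cost = anneal(sensitivity, 0.0)
analytic_optimum = 2 * abs(4.0 * 0.25) / (1.0 - 0.5 ** 2)
```

In the setting of Problem 2 the scalar `s` would be replaced by the entries of the transformation matrices and `cost` by the chosen sensitivity measure evaluated on the transformed realization; annealing is attractive there because that cost need not be smooth or convex.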

C. Example
The first numerical example, used here to evaluate the closed-loop sensitivity under various parametrizations, is a SISO fluid power control system studied in [23], [13]. The discrete-time (sampled at 2 kHz) plant P is given by (A p , B p , C p ) in (48), and transformed to the standard form.
The initial realization $\mathcal{R}_0 := (Z_0, 0, 4, 1, 1)$ of the controller C is given in the controllable canonical form in equation (49). It is important to note that the coefficients are given with only 4 digits; because of the sensitivity of this example, this may not be sufficient to define the system correctly. Parameters that may be approximated by the quantization process required for implementation are shown in bold font, and the weighting matrix is built accordingly.
The following realizations are considered:
• $Z_0$: Direct Form II (it corresponds to the canonical form)
• $Z_1$: optimal (according to $\bar{M}^{W}_{L_2}$) classical state-space realization
• $Z_2$: Direct Form II with δ-operator ($\Delta = 2^{-5}$)
• $Z_3$: optimal δ-realization
Each realization, and its sensitivity matrix $\frac{\delta \bar{H}}{\delta Z}$, is presented in equations (49) to (52). The values of the sensitivity measures for the various realizations are given in Table I. Only the fixed-point format, with block-representation corresponding to the matrices $J$ to $S$, is considered here.
The results obtained are consistent with existing results on open-loop sensitivity. It is interesting to note that realization $Z_1$ has a greater sensitivity than $Z_2$, which shows the extremely good numerical properties of the δ-operator.
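The numerical advantage of the δ-operator under fast sampling can be illustrated on a single pole. This sketch is a scalar caricature, not a reproduction of the example above: it assumes a first-order section whose shift-operator coefficient is the pole itself, whose δ-operator coefficient is (pole − 1)/Δ, and a fixed-point rounding to the nearest multiple of $2^{-\beta_f}$ with per-coefficient scaling. The pole value and wordlength are hypothetical; Δ = 2^{-5} matches the value quoted for $Z_2$.

```python
import math

def quantize(x, beta):
    """Fixed-point rounding of a non-zero x with its own scaling, beta bits in total."""
    beta_f = beta - 1 - math.ceil(math.log2(abs(x)))
    return round(x * 2 ** beta_f) / 2 ** beta_f

def pole_after_quantization(pole, beta, delta=None):
    """Pole actually implemented once the stored coefficient has been rounded."""
    if delta is None:                                # shift operator: the pole is stored
        return quantize(pole, beta)
    stored = quantize((pole - 1.0) / delta, beta)    # delta operator: (pole - 1)/delta is stored
    return 1.0 + delta * stored

pole, beta, delta = 0.995, 8, 2 ** -5
err_q = abs(pole_after_quantization(pole, beta) - pole)
err_delta = abs(pole_after_quantization(pole, beta, delta) - pole)
```

With the same wordlength, the δ-form spends its bits on the small distance between the pole and 1 rather than on the leading digits shared by all poles near the unit circle, so the implemented pole moves far less.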
The second numerical example is an active control of longitudinal vehicle oscillations studied in [24]. One significant aspect of vehicle driveability is the attenuation of the first torsional mode (resonance in the elastic parts) which produces unpleasant (0 to 10 Hz) longitudinal oscillations of the vehicle, known as shuffle. They can be reduced by means of a controller acting on the engine torque.
The powertrain was modeled in continuous-time form, and a continuous-time $H_\infty$ optimal controller was designed [24]. The discretized model P(z) is given by equations (53) and (54), and a discrete-time realization of the controller is given by (54) and (55); it corresponds to an internally balanced realization.
The different forms studied here are :
The Observer-State-Feedback form gives the controller states a physical meaning, because these states estimate the states of the physical system. Thus it improves the readability of the signals, and the initialization of the controller states can be based on the physical states of the system. The Observer-State-Feedback form is given by and can be written in the Specialized Implicit Form according to equation (56). The transformation from the state-space form to the Observer-State-Feedback form requires the solution of a generalized Riccati equation [25]. The controller poles must be classified in three categories, which are the observation gain, the filter gain and the Youla parameter (static here). This partition (made according to some rules [18]) determines the parameters $K_f$, $K_c$ and $Q$. It is possible to numerically implement equation (47) in various ways, depending on the choice of the partition of the poles. Here, with 20 poles, there exist 184756 partitions, but only 120 are actually possible. The optimal realization is chosen from these realizations. Table II shows the different sensitivity values. Note that the Observer-State-Feedback form is not significantly more sensitive than the state-space form, and requires only a small increase in the number of arithmetic operations.
From Tables I and II it is clear that the optimal realizations, which are fully parametrized and non-sparse, have a higher computational cost. This motivates the search for optimal sparse realizations [1]. Interestingly, the additional computational cost of the δ-realizations is small compared to the very large decrease in sensitivities. This supports the view that δ-realizations are generally superior [4].

VI. CONCLUSIONS
The Specialized Implicit Form provides a general framework for the analysis and design of digital controller implementations with minimal finite wordlength effects. This paper has presented the closed-loop sensitivity measure in this context. In this sense, it generalizes the results of [1]. Moreover, the present development applies to both fixed-point and floating-point arithmetic, enabling one to analyze more precisely the effects of quantization and rounding on the parameters of a digital controller implementation.
As shown by two examples, the measure is computationally tractable and hence amenable to solving the problem of finding a good realization with regard to the parametric sensitivity FWL effect. Furthermore, the sensitivity to the rounding of each individual parameter can be easily obtained, which may be very useful from a methodological point of view.
Although the computational cost has been given for each presented realization, these costs have not been discussed. The way to manage the compromise between the sensitivity measure, the computational effort, the rounding noise and the risk of overflow will be considered in a future paper.