Flexible parametric models, part 1: Introduction

Recall that the Cox model is considered semi-parametric because it leaves the baseline hazard function unspecified. This approach allows the hazard to follow any shape that best fits the observed data, without assuming a specific mathematical form. However, since the baseline hazard is not estimated, the Cox model can only provide relative effects of covariates (e.g., hazard ratios), and cannot be used directly to estimate absolute risks, make survival predictions, or extrapolate beyond the observed follow-up period.

In standard parametric models, the baseline hazard is defined by a limited set of interpretable parameters. For example, the Weibull baseline hazard function is defined based on the scale parameter \(\lambda>0\) and the shape parameter \(\delta>0\). This baseline hazard is monotonic; i.e., it is either strictly increasing or strictly decreasing, depending on the value of the shape parameter \(\delta>0\).

\[ h_0(t) = \lambda \gamma t^{\gamma - 1} \]

Flexible parametric models (FPM) were developed by Royston and Parmar (2002) as an alternative to standard parametric models, while retaining the flexibility of the Cox model in modelling the baseline hazard. Consider a Weibull model for the baseline hazard, fitted on the log cumulative hazard scale.¹

¹ How did we get here? Integrating the Weibull hazard function \(h_0(t) = \lambda \gamma t^{\gamma - 1}\) from 0 to \(t\) gives the Weibull cumulative hazard function \(H_0(t) = \lambda t^\gamma\). Taking the natural log of both sides of the equation gives \(\ln H_0(t) = \ln \lambda + \gamma \ln t\)

\[ \ln H_0(t) = \ln \lambda + \gamma \ln t \notag \]

This result shows that the log cumulative hazard function is a linear function of the log time. Setting, \(\ln\lambda=\gamma_0\) and \(\gamma=\gamma_1\), the Weibull model with covariates \(\mathbf{x}=(x_1, x_2, ..., x_k)\), with corresponding parameters \(\boldsymbol{\beta}=(\beta_1, \beta_2, ..., \beta_k)\) is given below.²

² For example, if there are only two covariates \(x_1\) and \(x_2\), the corresponding parameters will be \(\beta_1\) and \(\beta_2\). The log cumulative hazard function then becomes \(\ln H(t, x_1, x_2) = \ln H_0(t) + x_1\beta_1 + x_2\beta_2\)

\[\begin{align} \ln H(t, \mathbf{x}) &= \gamma_0 + \gamma_1 + \mathbf{x}\boldsymbol{\beta} \notag \\ &= \ln H_0(t) + \mathbf{x}\boldsymbol{\beta} \end{align}\]

Rather than assuming linearity, flexible parametric models use splines for \(\ln t\). Spline functions are used to model non-linear functions of time, and are constructed by dividing the time scales into a number of intervals, and within each interval, assigning a cubic polynomial function of survival time.³ The interval boundaries are known as knots. Typically, the smallest and largest observed survival times serve as boundary knots, while additional internal knots are placed at specific quantiles of the data. For example, if internal knots are placed at quartiles of log survival time, there would be two boundary knots (at the minimum and maximum values) and three internal knots (at the 25th, 50th, and 75th percentiles). This partitions the log survival time axis into four intervals.

³ A polynomial function is a mathematical expression made by adding terms that consist of a variable raised to non-negative integer powers, each multiplied by a constant coefficient. These functions are differentiable, meaning they have a well-defined slope (called the first derivative) at every point, and this slope itself changes smoothly. The second derivative, which represents the rate of change of the slope, describes the curvature of the function and helps determine whether the graph is bending upward or downward. Spline functions of degree 3, also known as cubic splines, are commonly used in flexible parametric models.

⁴ Natural splines and restricted cubic splines both apply linearity constraints beyond the boundary knots, though they do so in slightly different ways. As I understand it, they ultimately yield the same fitted values in a statistical model. However, I’m still unclear on exactly how the methods differ in the way they enforce those constraints.

To ensure smoothness, the individual cubic polynomial functions are constrained to join continuously at the knots. These constraints mean the polynomial pieces cannot vary independently. Instead, the cubic spline is represented as a linear combination of \(m\) basis functions, with \(m\) being the number of internal knots. Each basis function is defined over the entire range of the time variable, with the shape controlled by the placement of the knots. A cubic spline that is additionally constrained to be linear beyond the boundary knots (i.e., before the first knot and after the last knot), is known as a restricted cubic spline (or a natural spline).⁴ This restriction enhances numerical stability toward the minimum and maximum observed values, where data are typically sparse (Royston and Parmar 2002; Collett 2023) . This is where there is often less data and standard cubic splines tend to be sensitive to a few extreme values We nearly always put the boundary knots at the minimum and maximum event times and so the restriction of linearity is not actually within the range of our data, but the linear restrictions help to stabilize the estimated function.

We can therefore extend the baseline function of the Weibull model above, to a spline function of \(\ln{t}\) as follows

\[\begin{align} \ln H(t, \mathbf{x}) = s(\ln{t}, \gamma) + \mathbf{x}\boldsymbol{\beta} \end{align}\]

where the baseline cumulative hazard function \(s(\ln{t;} \gamma)=\ln H_0(t)\) is a restricted cubic spline function. With five knots \(k_1, k_2, ..., k_5\), there are two boundary knots (\(k_1, k_5\)) and three internal knots (\(k_2,k_3, k_4\)). The flexible parametric model is expressed as

\[\begin{align} \ln H(t, \mathbf{x}) = \gamma_0 + \gamma_1 z_1 + \gamma_2 z_2 + \gamma_3 z_3 + \gamma_4 z_4 + \mathbf{x} \boldsymbol{\beta} \notag \end{align}\]

where (\(z_1\) = \(\ln t\)) is the linear term, and (\(z_2, z_3, z_4\)) are restricted cubic spline basis functions of \(\ln t\), nonlinear components capturing curvature around internal knots \(k_2\), \(k_3\), and \(k_4\).⁵

⁵ Here, \(z_1 = x\), \(z_j = (x - k_j)_+^3 - \frac{k_K - k_j}{k_K - k_1}(x - k_1)_+^3 + \frac{k_j - k_1}{k_K - k_1}(x - k_K)_+^3, \quad \text{for } j = 2,3,4\)

\(\text{where } (x - k)_+^3 = \begin{cases} (x - k)^3, & \text{if } x > k \\ 0, & \text{otherwise} \end{cases}\)

Since \(\mathbf{x}\boldsymbol{\beta}\) does not depend on time, this is a proportional cumulative hazards model. To allow non-proportional hazards (time-dependent covariate effects), we include interactions between covariates and restricted cubic splines for log time into the model:

\[\begin{align} \ln H(t, \mathbf{x}) = s(\ln{t}, \gamma) + \mathbf{x}\boldsymbol{\beta} + \sum_{i=1}^{D} s(\ln{t},\gamma_i) x_i \end{align}\]

where \(D\) is the number of time-dependent covariate effects, and \(s(x; \gamma\_i)\) is the restricted cubic spline function for the \(i\)th time-dependent effect.

We can transform the log cumulative hazard functions to the survival scale or the hazard scale [REF]

\[\begin{align} S(t, \mathbf{x}) = \exp\{−\exp [\ln H(t,\mathbf{x})])\} \end{align}\] \[\begin{align} h(t, \mathbf{x}) = \frac{d \, \ln H_{0}(t)}{dt} \exp [\ln H(t,\mathbf{x})] \end{align}\]

Finally, it is also possible to fit FPMs on scales other than the hazard scale (Royston and Parmar 2002). In the proportional odds FPM, the response is the log cumulative odds of failure, that is, \(\log\left(\frac{1 - S(t)}{S(t)}\right)\), and when no internal knots are included, the model reduces to the log-logistic parametric model. In the probit-scale FPM, the response is the inverse of the standard normal cumulative distribution function applied to the survival probability, i.e., \(\Phi^{-1}(S(t))\), where \(\Phi^{-1}\) is the probit function. When no internal knots are used, this model is equivalent to the log-normal parametric survival model.

References

Collett, David. 2023. Modelling Survival Data in Medical Research. Fourth edition. Boca Raton: CRC Press Taylor & Francis Group.

Royston, Patrick, and Mahesh K. B. Parmar. 2002. “Flexible Parametric Proportional-Hazards and Proportional-Odds Models for Censored Survival Data, with Application to Prognostic Modelling and Estimation of Treatment Effects.” Statistics in Medicine 21 (15): 2175–97. https://doi.org/10.1002/sim.1203.