Staying Alive: Uncensored Survival Analysis with Tabular Foundation Models
By Mariana Vargas Vieyra, Rhizome Labs

Introduction
Tabular Foundation Models (TFMs) have emerged as a significant research direction, extending the foundation model paradigm beyond large language models. TFMs deliver impressive benchmark results when applied directly — but not all predictive tasks can be addressed with off-the-shelf TFMs. One such domain is Survival Analysis. This research is the work of Mariana Vargas Vieyra and the team at Rhizome Labs.
Survival Analysis (SA) is a statistical framework for modelling the time until an event of interest occurs. Real-world applications include tracking patient outcomes in clinical trials and predicting user churn in streaming services. Practitioners typically fix a time window for the analysis, which creates a key challenge: for subjects whose event does not occur within that window, the event time is never observed and is known only to exceed the time at which observation stopped. This is called right censoring.
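As a small illustration of what right-censored data looks like, the sketch below turns true event times into the (observed time, event indicator) pairs an analyst actually sees. The function name observe, the 12-month window and the example times are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple

def observe(event_time: Optional[float], window: float) -> Tuple[float, int]:
    """Return (observed_time, event_indicator) for one subject.

    event_time is the true time-to-event, or None if the event never happens.
    """
    if event_time is not None and event_time <= window:
        return event_time, 1   # event seen inside the window
    return window, 0           # right-censored: we only know T > window

true_times = [3.0, 14.5, None, 7.2, 30.0]   # hypothetical, in months
window = 12.0                               # hypothetical study window
observed = [observe(t, window) for t in true_times]
print(observed)
# [(3.0, 1), (12.0, 0), (12.0, 0), (7.2, 1), (12.0, 0)]
```

Three of the five subjects carry only the lower bound of 12 months, which is exactly the information a plain regression target cannot express.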
Right censoring presents a substantial modelling challenge. Classical imputation mechanisms don't apply directly. Traditional SA frameworks such as Cox models and Accelerated Failure Time (AFT) models handle censoring inside the likelihood they optimise during fitting, but TFMs have frozen parameters, so there is no fitting step in which to account for it. The authors address a critical question: "How can we reformulate SA with censored data as a purely predictive task, without having to train a survival model from scratch?"
Their answer: use TFMs to construct AFT models requiring only a single scalar parameter to be fitted. They show that ignoring censored data through Complete Case Analysis biases models toward underestimating survival times, and introduce an iterative in-context imputation method that progressively refines performance.
Survival Analysis Background
For each subject, a time-to-event dataset records a feature vector, an observed time, and an event indicator. The observed time is the smaller of the true event time and the censoring time, so only one of the two is ever seen. When the event indicator equals zero, the subject is censored. The survival function gives the probability that a subject with specific features survives past a given time. In clinical contexts, this is a patient's chance of surviving a specified duration following treatment.
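The authors selected Accelerated Failure Time models, which model the logarithm of the event time as a linear function of the features plus a scaled noise term. Under right censoring, the log-likelihood still has a simple form: each observed event contributes a density term and each censored subject contributes a survival term. It remains a function of the model's scalar parameters, so it can be fitted efficiently with standard optimisation.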
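For reference, the textbook form of the right-censored AFT log-likelihood can be written as follows. The notation is an assumption introduced here, not taken from the paper: \mu(x) is the predicted log-time (x^\top \beta in a classical AFT model), \sigma is the scale, and f_\varepsilon and S_\varepsilon are the density and survival function of the noise (standard normal for Log-Normal AFT, standard minimum extreme-value for Weibull AFT). The likelihood is stated for log-times, which differs from the likelihood of the raw times only by a term that does not depend on the parameters.

```latex
\log T_i = \mu(x_i) + \sigma\,\varepsilon_i, \qquad
y_i = \min(T_i, C_i), \qquad
\delta_i = \mathbb{1}[T_i \le C_i]

z_i = \frac{\log y_i - \mu(x_i)}{\sigma}

\ell = \sum_{i=1}^{n} \Big[ \delta_i \big( \log f_\varepsilon(z_i) - \log \sigma \big)
     + (1 - \delta_i)\, \log S_\varepsilon(z_i) \Big]
```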
A key component of their method is the Buckley-James (BJ) estimator, which replaces each censored outcome with its expected value conditional on the true event time exceeding the censoring time. This expectation is estimated non-parametrically by applying the Kaplan-Meier estimator to the model residuals. The procedure then refits the regression on the resulting dataset, recomputes the imputations, and repeats until the estimates stabilise. The authors replace the ordinary least squares regression component of BJ with in-context TFM predictions, preserving the imputation loop.
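The imputation step of the classical Buckley-James estimator can be written compactly. The symbols below are assumptions made for this illustration: \hat{\mu}(x_i) is the current prediction of the log-time, e_i is the residual, \hat{F} is the Kaplan-Meier estimate of the residual distribution, and w_k is the Kaplan-Meier probability mass placed on the uncensored residual e_k.

```latex
e_i = \log y_i - \hat{\mu}(x_i)

y_i^{*} = \delta_i \log y_i
        + (1 - \delta_i) \left[ \hat{\mu}(x_i)
        + \frac{\sum_{k:\, e_k > e_i} w_k\, e_k}{1 - \hat{F}(e_i)} \right]
```

Uncensored subjects keep their observed log-time, while a censored subject is moved upward by the average residual among those residuals that exceed its own.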
Survival Analysis with TFMs
The method first uses the TFM to estimate log-times, then estimates the scale parameter by maximising the log-likelihood over the full training set, a step that involves only a single scalar parameter. Once the scale is obtained, survival curves are computed by evaluating the survival function on a predefined time grid.
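The two steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the noise is assumed Gaussian (a log-normal AFT model), the predicted log-times mu stand in for TFM outputs, the scale is found by a coarse grid search rather than a proper optimiser, and all function names and data are made up.

```python
import math
import random

def norm_sf(z):
    """Survival function of the standard normal, 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def log_lik(sigma, log_y, mu, events):
    """Censored log-likelihood of the log-times for a Gaussian-noise AFT model."""
    total = 0.0
    for y, m, d in zip(log_y, mu, events):
        z = (y - m) / sigma
        if d == 1:   # observed event: log-density of the residual
            total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        else:        # censored: log-probability of surviving past y
            total += math.log(max(norm_sf(z), 1e-300))
    return total

def fit_sigma(log_y, mu, events, grid_size=300):
    """Coarse grid search over log(sigma) in [-4, 2] (assumed search range)."""
    grid = [math.exp(-4.0 + 6.0 * k / (grid_size - 1)) for k in range(grid_size)]
    return max(grid, key=lambda s: log_lik(s, log_y, mu, events))

def survival_curve(mu_x, sigma, time_grid):
    """S(t | x) = P(log T > log t) for each t in the grid."""
    return [norm_sf((math.log(t) - mu_x) / sigma) for t in time_grid]

# Synthetic check with a known scale of 0.5; every value here is made up.
rng = random.Random(0)
mu = [rng.uniform(1.0, 3.0) for _ in range(500)]          # stand-in for TFM predictions
log_t = [m + 0.5 * rng.gauss(0.0, 1.0) for m in mu]       # true log event times
log_c = [rng.uniform(1.5, 4.0) for _ in mu]               # log censoring times
log_y = [min(t, c) for t, c in zip(log_t, log_c)]
events = [1 if t <= c else 0 for t, c in zip(log_t, log_c)]

sigma_hat = fit_sigma(log_y, mu, events)
curve = survival_curve(mu[0], sigma_hat, [1.0, 2.0, 5.0, 10.0, 20.0, 50.0])
print(round(sigma_hat, 2), [round(s, 3) for s in curve])
```

Because the predictions are held fixed, the only quantity being optimised is sigma, which is why a one-dimensional search is enough.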
Naively disregarding censored observations introduces bias toward underestimating survival times. The authors address this through a Buckley-James-inspired in-context estimator. Since TFM weights are frozen, survival regression must be framed as a purely in-context prediction task.
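The direction of this bias is easy to reproduce on synthetic data. The sketch below is only an illustration with made-up exponential event and censoring times (means 10 and 15), not an experiment from the paper: it compares the true mean event time with the mean over uncensored subjects only, which is what Complete Case Analysis would see.

```python
import random

rng = random.Random(0)
n = 5000

# Hypothetical independent event and censoring times.
event = [rng.expovariate(1.0 / 10.0) for _ in range(n)]    # mean 10
censor = [rng.expovariate(1.0 / 15.0) for _ in range(n)]   # mean 15

observed = [min(t, c) for t, c in zip(event, censor)]
delta = [1 if t <= c else 0 for t, c in zip(event, censor)]

true_mean = sum(event) / n

# Complete Case Analysis: keep only subjects whose event was observed.
complete_cases = [y for y, d in zip(observed, delta) if d == 1]
cca_mean = sum(complete_cases) / len(complete_cases)

# In expectation the complete-case mean is 6 here, against a true mean of 10.
print(f"true mean: {true_mean:.2f}, complete-case mean: {cca_mean:.2f}")
```

Long event times are the ones most likely to be cut off, so dropping censored subjects removes them preferentially and pulls the estimate down.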
TFMs act as non-parametric in-context estimators that iteratively impute survival times as pseudo-targets. The pseudo-targets are initialised with a data-driven warm start based on Kaplan-Meier jackknife estimators. At each iteration, a context is formed by subsampling censored and uncensored subjects, so that every prediction is made out-of-sample and degenerate self-prediction is avoided. The scale parameter is then fitted via maximum likelihood across the full training set. The loop repeats until the targets stabilise or a maximum number of iterations is reached.
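The overall shape of such a loop can be sketched as follows. This is a rough illustration under several loudly stated assumptions, not the authors' code: a k-nearest-neighbour average (knn_predict) stands in for the frozen TFM, the warm start uses the observed log-times instead of Kaplan-Meier jackknife estimates, the scale-fitting step is omitted, and the Buckley-James step normalises by the Kaplan-Meier mass on uncensored residuals beyond the censored one, which is one of several tail conventions. All names, defaults and data are hypothetical.

```python
import random

def knn_predict(ctx_x, ctx_y, query_x, k=5):
    """Placeholder for a frozen TFM: mean target of the k nearest context rows."""
    preds = []
    for q in query_x:
        nearest = sorted(range(len(ctx_x)), key=lambda j: abs(ctx_x[j] - q))[:k]
        preds.append(sum(ctx_y[j] for j in nearest) / len(nearest))
    return preds

def bj_impute(log_y, preds, events):
    """Censored targets become pred + E[residual | residual > own residual],
    with the residual distribution estimated by Kaplan-Meier."""
    res = [y - m for y, m in zip(log_y, preds)]
    order = sorted(range(len(res)), key=lambda i: (res[i], -events[i]))
    surv, at_risk, masses = 1.0, len(res), []
    for i in order:
        if events[i] == 1:
            masses.append((res[i], surv / at_risk))   # KM jump at this residual
            surv -= surv / at_risk
        at_risk -= 1
    out = []
    for y, m, e, d in zip(log_y, preds, res, events):
        tail = [(r, w) for r, w in masses if r > e]
        mass = sum(w for _, w in tail)
        if d == 1 or mass <= 0.0:
            out.append(y)   # uncensored, or no uncensored residual beyond e
        else:
            out.append(m + sum(r * w for r, w in tail) / mass)
    return out

def in_context_bj(x, log_y, events, n_folds=5, context_size=200,
                  max_iter=10, tol=1e-3, seed=0):
    rng = random.Random(seed)
    n = len(x)
    targets = list(log_y)   # simplification: warm start from observed log-times
    preds = [0.0] * n
    for _ in range(max_iter):
        idx = list(range(n))
        rng.shuffle(idx)
        for f in range(n_folds):
            query = idx[f::n_folds]
            held_out = set(query)
            pool = [i for i in idx if i not in held_out]   # never predict a row from itself
            context = rng.sample(pool, min(context_size, len(pool)))
            fold_preds = knn_predict([x[i] for i in context],
                                     [targets[i] for i in context],
                                     [x[i] for i in query])
            for i, p in zip(query, fold_preds):
                preds[i] = p
        new_targets = bj_impute(log_y, preds, events)
        change = max(abs(a - b) for a, b in zip(new_targets, targets))
        targets = new_targets
        if change < tol:
            break
    return targets, preds

# Made-up data: one feature, log T = 1 + 2x + noise, uniform log censoring times.
rng = random.Random(1)
x = [rng.random() for _ in range(300)]
log_t = [1.0 + 2.0 * xi + 0.3 * rng.gauss(0.0, 1.0) for xi in x]
log_c = [rng.uniform(1.0, 4.0) for _ in x]
log_y = [min(t, c) for t, c in zip(log_t, log_c)]
events = [1 if t <= c else 0 for t, c in zip(log_t, log_c)]
targets, preds = in_context_bj(x, log_y, events)
```

Because the context is resampled at every iteration, the targets may never change by less than tol, so the iteration cap is what ends the loop in that case.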
Experiments and Results
The authors benchmarked their method against classical survival models on five publicly available datasets — spanning cardiovascular disease, breast cancer, critical care, and serum biomarker studies — using TabPFN and TabICL as backbones. Censoring rates range from 31.9% to 72.5%.
Baselines include Cox proportional hazards, Weibull AFT, Log-Normal AFT, and Random Survival Forest. The authors' method appears in three variants: TabSA-CCA (complete case analysis), TabSA-PO (Kaplan-Meier jackknife pseudo-observations), and TabSA-BJ (the full iterative Buckley-James-inspired approach).
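To give a feel for what jackknife pseudo-observations are, the sketch below computes them for one particular summary, the restricted mean survival time (the area under the Kaplan-Meier curve up to a horizon tau). Which summary the paper uses is not stated here, so the choice of restricted mean, the horizon and all names are assumptions made for illustration. Each subject's pseudo-observation is n times the full-sample estimate minus (n - 1) times the estimate with that subject left out.

```python
def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier survival curve on [0, tau]."""
    data = sorted(zip(times, events), key=lambda p: (p[0], -p[1]))  # events first at ties
    surv, at_risk, prev_t, area = 1.0, len(data), 0.0, 0.0
    for t, d in data:
        if t > tau:
            break
        area += surv * (t - prev_t)       # curve is flat between observed times
        prev_t = t
        if d == 1:
            surv *= 1.0 - 1.0 / at_risk   # KM drop at an observed event
        at_risk -= 1
    return area + surv * (tau - prev_t)

def pseudo_observations(times, events, tau):
    """Jackknife pseudo-observations: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(times)
    full = km_rmst(times, events, tau)
    out = []
    for i in range(n):
        loo = km_rmst(times[:i] + times[i + 1:], events[:i] + events[i + 1:], tau)
        out.append(n * full - (n - 1) * loo)
    return out

# Hypothetical data: subjects 2 and 5 are censored.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
print([round(p, 3) for p in pseudo_observations(times, events, tau=8.0)])
```

Without any censoring, these pseudo-observations reduce to each subject's own time capped at tau, which is what makes them usable as ordinary regression targets.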
TabSA-BJ improved over simpler variants on Harrell's C-index, with the largest gains on high-censoring datasets. Classical methods remain strong on calibration metrics, though the authors' method frequently matches or exceeds them on discrimination — without any dataset-specific training.
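For readers unfamiliar with the metric, Harrell's C-index is the fraction of comparable subject pairs that a model ranks in the right order. The sketch below is a minimal quadratic-time version that ignores tied observed times; the function name and the toy data are made up for illustration. For an AFT-style model, the negative predicted log-time can serve as the risk score.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs ordered correctly (higher risk, shorter time).

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter observed time than subject j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    return concordant / comparable if comparable else float("nan")

times = [2.0, 4.0, 6.0, 8.0]    # hypothetical observed times
events = [1, 0, 1, 1]           # second subject is censored
risk = [0.9, 0.5, 0.1, 0.6]     # hypothetical model risk scores
print(harrell_c_index(times, events, risk))
# 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Censored subjects can only appear as the longer-lived member of a pair, which is how the metric copes with censoring.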
The fitted scale parameter converges to a stable value across iterations. On several datasets it approaches one, indicating that TFMs capture meaningful log-time variance under AFT models without survival-specific training. Performance tends to increase with the number of iterations, demonstrating that practitioners can benefit from iteratively refining the context fed to the TFM.
Conclusion
The authors introduced a TFM-based mechanism requiring no dataset-specific training beyond fitting a single scalar parameter — enabling zero-shot survival regression. They demonstrated how TFMs can iteratively impute censored data, providing survival algorithms with a complete dataset to work from.
The full workshop paper is available on arXiv, with code available on GitHub.