About Me:

I am a Professor in the Department of Statistics and Operations Research at Tel Aviv University, Israel. My research interests are survival data, non-parametric statistics, biostatistics, and machine learning. My current research focuses on problems related to inference based on Survival Deep Learning, natural experiment methods for causal inference and various challenges of causal inference with survival outcomes.

Open positions:

Postdoctoral Research Fellowship:

Applications are invited for a Postdoctoral Research Fellowship. The Fellowship is available for two years. The Fellow is expected to carry out methodological research. Applicants should provide a cover letter describing their qualifications and long-term career goals, and a Curriculum Vitae including relevant publications. Applicants should also arrange for three reference letters to be sent directly to the same email address.

Editorial Boards:

Associate Editor of JASA Applications and Case Studies (2022 - present)

Associate Editor of EJS (2022 - present)

Associate Editor of Scandinavian Journal of Statistics (2021 - present)

Co-Editor of Biometrics (2017 - 2019)

Associate Editor of Biometrics (2009 - 2016)

Member of Editorial Board of Lifetime Data Analysis (2013 - 2016)

Students:

Current MSc students:
Yaara Saad, Amit Jaffe, Shilo Horev, Omri Wertheimer, Tomer Weiss, Yoav Ben Azar.
Current PhD students:
Lea Katz, Asaf Ben Arie.
Former MSc students:
Nurit Lipshtat, Einat Aviel, Sarit Natan, Reut Klopshtock, Anna Graber-Naidich, Rottem De-Piccoitto, Banjamin Sedacca, Nadia Pograbinsky, Rony Ghebali, Dotan Tzor, Yael Hershkovitz, Ariel Sason, Matan Schlesinger, Nir Keret, Nathalie Hauser, Asaf Ben Arie, Lea Katz, Ron Litman, Shoval Marton, Tal Agassi.
Former PhD students:
Esther Hagai, Gitit Shahaf, Polyna Chodyakov, Michal Ben Noach, Nir Keret.

Contact:

Malka Gorfine
School of Mathematical Sciences
Tel Aviv University
Tel Aviv, Israel
gorfinem[at]tauex.tau.ac.il
Phone (work): (972)3-640-8391



Site design by Shay Yaacoby; Front page photo by Ilana Shkolnik.

CURRICULUM VITAE

Education

1994
MA, Hebrew University, Department of Statistics
1999
PhD, Hebrew University, Department of Statistics
Supervisors: Prof. Benjamin Yakir, Prof. David Zucker

Academic and Professional Experience

2016 - present
Professor, Department of Statistics and Operations Research, Tel Aviv University.
2014 - 2016
Associate Professor, Department of Statistics and Operations Research, Tel Aviv University.
2003 - present
Affiliate Investigator (Affiliate Faculty), Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, USA.
2002 - present
Returning Visiting Scientist, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, USA.
2011 - 2012
Associate Professor, Visiting Faculty Member, Department of Biostatistics and Computational Biology, Harvard School of Public Health, Boston, USA.
2010 - 2014
Associate Professor, Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology.
2005 - 2010
Senior Lecturer, Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology.
2004 - 2007
Senior Lecturer, Mathematics and Statistics Department, Bar Ilan University.
2001 - 2004
Lecturer, Mathematics and Statistics Department, Bar Ilan University.
1999 - 2001
Staff Scientist, Advanced Post Doctoral Position, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, USA.

STRATOS TG8 - Survival Analysis

The Survival Analysis Topic Group is one of the 9 topic groups within the STRengthening Analytical Thinking for Observational Studies (STRATOS) Initiative. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies. The guidance is intended for applied statisticians and other data analysts with varying levels of statistical education, experience and interests.

Aims of Survival Analysis Topic Group

In a large proportion of observational studies, including prospective or retrospective longitudinal cohort studies, the outcome of primary interest is the time to the occurrence of a specific event or endpoint, such as death or hospitalization. Because often the events of interest are observable for only some study participants, specialized analytical methods are required to deal with ‘censored observations’, i.e. those subjects who had no event until the end of their follow-up. The development of statistical methods able to handle such censored time-to-event outcomes is the main focus of survival analysis, which is increasingly applied in longitudinal studies across a broad spectrum of empirical sciences. Whereas some methods developed for other types of outcomes, such as continuous, normally distributed or binary variables, can be easily adapted to the analyses of censored data, several important conceptual and analytical challenges are specific to survival analysis. Accordingly, this is a very active but also a rather specialized area of statistical research. Indeed, the end-users (empirical researchers and data analysts) are often either unaware of new survival analytical methods, typically published in statistical journals, or unable to understand why, when and how these methods should be implemented. As a result, a vast majority of real-life applications of survival analysis use only a few, very popular statistical methods, such as Kaplan-Meier curves, log rank test, or Cox proportional hazards (PH) model (Cox 1972), which is employed in >90% of the multivariable time-to-event analyses of clinical data (Altman et al 1995). Yet many end-users may not understand the important assumptions on which such conventional methods rely and do not recognize the impact of violations of these assumptions, and most do not know what alternative, more robust methods can be employed in such cases.

TG8 aims to improve understanding of the analytical issues frequently encountered in real-life applications of survival analysis, and to provide practical guidance on the validated methods and user-friendly software that can be used to address these issues. To this end, we will draw on both earlier published reviews of the main issues and methods of survival analysis (e.g., Andersen et al 2012, Clark et al 2003, Clayton 1988) and the expertise of the TG8 members.

Members

Chairs: Michal Abrahamowicz, Malka Gorfine, Terry Therneau

Members: Federico Ambrogi, Richard Cook, Pierre Joly, Per Kragh Andersen, Torben Martinussen, Maja Pohar-Perme, Hein Putter, Michael Schell, Jeremy Taylor

Publications

Andersen PK, Perme MP, van Houwelingen HC, Cook RJ, Joly P, Martinussen T, Taylor JMG, Abrahamowicz M, Therneau TM for the STRATOS TG8 topic group (2020): Analysis of time-to-event for observational studies: Guidance to the use of intensity models. Statistics in Medicine. DOI: 10.1002/sim.8757

Andersen PK, Abrahamowicz M, Therneau TM on behalf of STRATOS TG8 (2019): STRengthening Analytical Thinking for Observational Studies (STRATOS): Introducing the Survival Analysis Topic Group (TG8). Biometric Bulletin; 36(3): 12-13.

Presentations

Software:

PyMSM

PyMSM is a Python package for fitting competing risks and multistate models, with a simple API that allows user-defined models, predictions at the level of a single individual or a population sample, statistical summaries and figures. Features include: (1) Fit a competing risks multistate model based on survival analysis (time-to-event) models. (2) Deal with right censoring, competing events, recurrent events, left truncation, and time-dependent covariates. (3) Run Monte Carlo simulations for paths emitted by the trained model and extract various summary statistics and plots. (4) Load or configure a pre-defined model and run path simulations. (5) Modularity and compatibility for different time-to-event models such as Survival Forests and other custom models.

The relevant link can be found here.
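
As a rough illustration of feature (3), the sketch below simulates Monte Carlo paths through a small multistate model in plain Python. It does not use the PyMSM API. The state names, the constant transition hazards, and the function names `simulate_path` and `simulate_paths` are hypothetical, chosen only to show the idea of sampling competing transitions and summarizing the resulting paths.

```python
import random

# Hypothetical constant transition hazards (per day); not estimated from data.
# A state with no outgoing transitions is absorbing.
HAZARDS = {
    "hospitalized": {"critical": 0.05, "discharged": 0.15, "dead": 0.01},
    "critical": {"hospitalized": 0.10, "dead": 0.05},
    "discharged": {},
    "dead": {},
}


def simulate_path(origin, max_time, rng):
    """Simulate one path as a list of (state, entry_time) pairs."""
    state, t = origin, 0.0
    path = [(state, t)]
    while HAZARDS[state]:
        # Competing risks: draw a latent exponential time for every possible
        # transition and follow the one that occurs first.
        draws = {nxt: rng.expovariate(h) for nxt, h in HAZARDS[state].items()}
        nxt = min(draws, key=draws.get)
        t += draws[nxt]
        if t > max_time:
            break  # administratively censored at max_time
        state = nxt
        path.append((state, t))
    return path


def simulate_paths(origin, n_paths, max_time, seed=0):
    """Simulate n_paths independent paths with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_path(origin, max_time, rng) for _ in range(n_paths)]


paths = simulate_paths("hospitalized", n_paths=1000, max_time=60.0)
final_states = [p[-1][0] for p in paths]
share_dead = final_states.count("dead") / len(final_states)
print(f"Estimated probability of death within 60 days: {share_dead:.3f}")
```

In a fitted model the hazards would depend on covariates and on time, so the exponential draws here stand in for sampling from the estimated transition-specific distributions.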

COVID-19 - predicting illness trajectory

This code implements our multi-state survival model. The paper can be found here.

R code for our model, and code for fitting similar models to data from elsewhere, can be found here.

A web app for patient-level predictions can be found here.

frailty-LTRC

This code implements the marginalized frailty-based illness-death model of Gorfine et al. (JASA, 2020). The code consists of two sections: data generation and analysis.

R Package in R code.

HHG

Heller-Heller-Gorfine (HHG) tests are a set of powerful statistical tests of independence between two random vectors of arbitrary dimensions. For the case of testing independence between random variables (rather than vectors), the package also offers implementations of the DDP and ADP tests, which are consistent against all continuous alternatives but are distribution-free and thus much faster to apply.

R Package in CRAN.

frailtySurv - Survival analysis using frailty models

The package frailtySurv implements semi-parametric consistent estimators for a variety of frailty distributions (gamma, log-normal, inverse Gaussian and power variance function), and provides consistent estimators of the standard errors of the parameter estimators. The parameter estimators are asymptotically normally distributed, and therefore statistical inference based on the results of this package (e.g. hypothesis testing, confidence intervals) can be performed using the normal distribution.

R Package in CRAN.
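
Because the estimators are asymptotically normal, Wald-type inference needs only a point estimate and its standard error. The following Python sketch illustrates the calculation. It is not part of frailtySurv (an R package), and the function name `wald_inference` and the numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist


def wald_inference(estimate, std_error, null_value=0.0, level=0.95):
    """Wald z-test and confidence interval based on asymptotic normality.

    Returns (z statistic, two-sided p-value, (lower, upper) interval).
    """
    standard_normal = NormalDist()
    # Critical value, e.g. about 1.96 for a 95% interval.
    z_crit = standard_normal.inv_cdf(0.5 + level / 2.0)
    z_stat = (estimate - null_value) / std_error
    p_value = 2.0 * (1.0 - standard_normal.cdf(abs(z_stat)))
    interval = (estimate - z_crit * std_error, estimate + z_crit * std_error)
    return z_stat, p_value, interval


# Hypothetical regression coefficient estimate and standard error.
z_stat, p_value, interval = wald_inference(estimate=0.5, std_error=0.25)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}, "
      f"95% CI = ({interval[0]:.3f}, {interval[1]:.3f})")
```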

KONPSURV

The package implements our K-sample omnibus non-PH tests for right-censored data.

R Package in R code

Marginal Survival

The package implements our non-parametric estimation of the marginal survival curve based on data from a case-control family study.

R Package in R code

Quantile Regression

Quantile regression for failure-time data with time-dependent covariates R code.

HERRA

Heritability Estimation using a Regularized Regression Approach R code.

Efficient Study Design with Multiple Measurement Instruments

A Shiny App can be found here.

Teaching:

Survival Analysis (0365-4032-01)

syllabus

Topics in Survival Analysis (0365-4080-01)

syllabus

Statistical Computing (0365-2101-01)

syllabus

Statistics Seminar (0365-3344-01)

syllabus

Publications:


Technical Reports:

Comment on "Detecting Novel Associations In Large Data Sets" by Reshef Et Al, Science Dec 16, 2011.
M. Gorfine, R. Heller and Y. Heller (2012).
Abstract: Reshef et al. presented a novel measure of dependence - the maximal information coefficient (MIC) aimed to capture a wide range of associations between pairs of variables, and a statistical test for independence based on MIC. They defined a concept of equitability and claim that non-equitable methods are less practical for data exploration. By simple power comparisons, we show that this conclusion is wrong.

Published Research Papers:

2024:
Discrete-time Competing-Risks Regression with or without Penalization
T. Meir, M. Gorfine (2024)
https://arxiv.org/abs/2303.01186
Abstract: Many studies employ the analysis of time-to-event data that incorporates competing risks and right censoring. Most methods and software packages are geared towards analyzing data that comes from a continuous failure time distribution. However, failure-time data may sometimes be discrete either because time is inherently discrete or due to imprecise measurement. This paper introduces a novel estimation procedure for discrete-time survival analysis with competing events. The proposed approach offers two key advantages over existing procedures: first, it accelerates the estimation process; second, it allows for straightforward integration and application of widely used regularized regression and screening methods. We illustrate the benefits of our proposed approach by conducting a comprehensive simulation study. Additionally, we showcase the utility of our procedure by estimating a survival model for the length of stay of patients hospitalized in the intensive care unit, considering three competing events: discharge to home, transfer to another medical facility, and in-hospital death.
Unveiling Challenges in Mendelian Randomization for Gene-Environment Interaction
M. Gorfine, C. Qu, U. Peters, L. Hsu (2024)
https://arxiv.org/abs/2309.12152
Genetic Epidemiology
Abstract: Many diseases and traits involve a complex interplay between genes and environment, generating significant interest in studying gene-environment interaction through observational data. However, for lifestyle and environmental risk factors, they are often susceptible to unmeasured confounding factors and as a result, may bias the assessment of the joint effect of gene and environment. Recently, Mendelian randomization (MR) has evolved into a versatile method for assessing causal relationships based on observational data to account for unmeasured confounders. This approach utilizes genetic variants as instrumental variables (IVs) and aims to offer a reliable statistical test and estimation of causal effects. MR has gained substantial popularity in recent years largely due to the success of large-scale genome-wide association studies in identifying genetic variants associated with lifestyle and environmental factors. Many methods have been developed for MR; however, little work has been done for evaluating gene-environment interaction. In this paper, we focus on two primary IV approaches: the 2-stage predictor substitution (2SPS) and the 2-stage residual inclusion (2SRI), and extend them to accommodate gene-environment interaction under both the linear and logistic regression models for the continuous and binary outcomes, respectively. Extensive simulation and analytical derivations show that finding solutions in the linear regression model setting is relatively straightforward; however, the logistic regression model is significantly more complex and demands additional effort.
2023:
Unlocking Retrospective Prevalent Information in EHRs--a Pairwise Pseudolikelihood Approach
N. Keret, M. Gorfine (2023b)
https://arxiv.org/abs/2309.01128
Abstract: Typically, electronic health record data are not collected towards a specific research question. Instead, they comprise numerous observations recruited at different ages, whose medical, environmental and oftentimes also genetic data are being collected. Some phenotypes, such as disease-onset ages, may be reported retrospectively if the event preceded recruitment, and such observations are termed "prevalent". The standard method to accommodate this "delayed entry" conditions on the entire history up to recruitment, hence the retrospective prevalent failure times are conditioned upon and cannot participate in estimating the disease-onset age distribution. An alternative approach conditions just on survival up to recruitment age, plus the recruitment age itself. This approach allows incorporating the prevalent information but brings about numerical and computational difficulties. In this work we develop consistent estimators of the coefficients in a regression model for the age-at-onset, while utilizing the prevalent data. Asymptotic results are provided, and simulations are conducted to showcase the substantial efficiency gain that may be obtained by the proposed approach. In particular, the method is highly useful in leveraging large-scale repositories for replicability analysis of genetic variants. Indeed, analysis of urinary bladder cancer data reveals that the proposed approach yields about twice as many replicated discoveries compared to the popular approach.
Optimal Cox Regression Subsampling Procedure with Rare Events
N. Keret, M. Gorfine (2023)
https://arxiv.org/abs/2012.02122
Journal of the American Statistical Association
Abstract: Massive sized survival datasets are becoming increasingly prevalent with the development of the healthcare industry. Such datasets pose computational challenges unprecedented in traditional survival analysis use-cases. A popular way for coping with massive datasets is downsampling them to a more manageable size, such that the computational resources can be afforded by the researcher. Cox proportional hazards regression has remained one of the most popular statistical models for the analysis of survival data to-date. This work addresses the settings of right censored and possibly left truncated data with rare events, such that the observed failure times constitute only a small portion of the overall sample. We propose Cox regression subsampling-based estimators that approximate their full-data partial-likelihood-based counterparts, by assigning optimal sampling probabilities to censored observations, and including all observed failures in the analysis. Asymptotic properties of the proposed estimators are established under suitable regularity conditions, and simulation studies are carried out to evaluate the finite sample performance of the estimators. We further apply our procedure on UK-biobank colorectal cancer genetic and environmental risk factors.
Shared Frailty Methods for Complex Survival Data: A Review of Recent Advances.
M. Gorfine, D.M. Zucker (2023)
https://arxiv.org/abs/2205.05322
Annual Review of Statistics and Its Application
Abstract: Dependent survival data arise in many contexts. One context is clustered survival data, where survival data are collected on clusters such as families or medical centers. Dependent survival data also arise when multiple survival times are recorded for each individual. Frailty models is one common approach to handle such data. In frailty models, the dependence is expressed in terms of a random effect, called the frailty. Frailty models have been used with both Cox proportional hazards model and the accelerated failure time model. This paper reviews recent developments in the area of frailty models in a variety of settings. In each setting we provide a detailed model description, assumptions, available estimation methods, and R packages.
An Accelerated Failure Time Regression Model for Illness-Death Data: A Frailty Approach
L. Katz, M. Gorfine (2023)
https://arxiv.org/abs/2205.03954
Biometrics
Abstract: This work presents a new model and estimation procedure for the illness-death survival data where the hazard functions follow accelerated failure time (AFT) models. A shared frailty variate induces positive dependence among failure times of a subject for handling the unobserved dependency between the non-terminal and the terminal failure times given the observed covariates. Semi-parametric maximum likelihood estimation procedure is developed via a kernel smoothed-aided EM algorithm, and variances are estimated by weighted bootstrap. The model is presented in the context of existing frailty-based illness-death models, emphasizing the contribution of the current work. The breast cancer data of the Rotterdam tumor bank are analyzed using the proposed and existing illness-death models. The results are contrasted and evaluated based on a new graphical goodness-of-fit procedure. Simulation results and data analysis nicely demonstrate the practical utility of the shared frailty variate with the AFT regression model under the illness-death framework.
Revisiting the Cumulative Incidence Function With Competing Risks Data
D.M. Zucker, M. Gorfine (2023)
https://arxiv.org/abs/2202.11743
Journal of the Royal Statistical Society Series C: Applied Statistics
Abstract: We consider estimation of the cumulative incidence function (CIF) in the competing risks Cox model. We study three methods. Methods 1 and 2 are existing methods while Method 3 is a newly-proposed method. Method 3 is constructed so that the sum of the CIF's across all event types at the last observed event time is guaranteed, assuming no ties, to be equal to 1. The performance of the methods is examined in a simulation study, and the methods are illustrated on a data example from the field of computer code comprehension. The newly-proposed Method 3 exhibits performance comparable to that of Methods 1 and 2 in terms of bias, variance, and confidence interval coverage rates. Thus, with our newly-proposed estimator, the advantage of having the end-of-study total CIF equal to 1 is achieved with no price to be paid in terms of performance.
2022:
PyDTS: A Python Package for Discrete Time Survival Analysis with Competing Risks
T. Meir, R. Gutman, M. Gorfine (2022)
https://arxiv.org/abs/2204.05731
Abstract: Time-to-event analysis (survival analysis) is used when the outcome or the response of interest is the time until a pre-specified event occurs. Time-to-event data are sometimes discrete either because time itself is discrete or due to grouping of failure times into intervals or rounding off measurements. In addition, the failure of an individual could be one of several distinct failure types; known as competing risks (events) data. This work focuses on discrete-time regression with competing events. We emphasize the main difference between the continuous and discrete settings with competing events, develop a new estimation procedure, and present PyDTS, an open source Python package which implements our estimation procedure and other tools for discrete-time-survival analysis with competing risks.
Covid-19 mRNA vaccination: age and immune status and its association with axillary lymph node PET/CT uptake
M. Eifer, N. Tau, Y. Alhoubani, N. Kanana, L. Domachevsky, J. Shams, N. Keret, M. Gorfine, Y. Eshet (2022)
https://jnm.snmjournals.org/content/63/1/134.abstract
The Journal of Nuclear Medicine
Abstract: With hundreds of millions of coronavirus disease 2019 (COVID-19) messenger RNA (mRNA)–based vaccine doses planned to be delivered worldwide in the upcoming months, it is important to recognize PET/CT findings in recently vaccinated immunocompetent or immunocompromised patients. We aimed to assess PET/CT uptake in the deltoid muscle and axillary lymph nodes of patients who received a COVID-19 mRNA-based vaccine and to evaluate its association with patient age and immune status. Methods: All consecutive adults who underwent PET/CT scans with any radiotracer at our center during the first month of a national COVID-19 vaccination rollout (between December 23, 2020, and January 27, 2021) and had received the vaccination were included. Data on clinical status, laterality, and time from vaccination were prospectively collected, retrospectively analyzed, and correlated with deltoid muscle and axillary lymph node uptake. Results: Of 426 eligible subjects (median age, 67 ± 12 y; 49% female), 377 (88%) underwent PET/CT with 18F-FDG, and positive axillary lymph node uptake was seen in 45% of them. Multivariate logistic regression analysis revealed a strong inverse association between positive 18F-FDG uptake in ipsilateral lymph nodes and patient age (odds ratio [OR], 0.57; 95% CI, 0.45–0.72; P < 0.001), immunosuppressive treatment (OR, 0.37; 95% CI, 0.20–0.64; P = 0.003), and presence of hematologic disease (OR, 0.44; 95% CI, 0.24–0.8; P = 0.021). No such association was found for deltoid muscle uptake. The number of days from the last vaccination and the number of vaccine doses were also significantly associated with increased odds of positive lymph node uptake. Conclusion: After mRNA-based COVID-19 vaccination, a high proportion of patients showed ipsilateral lymph node axillary uptake, which was more common in immunocompetent patients. This information will help with the recognition of PET/CT pitfalls and may hint about the patient’s immune response to the vaccine.
Causal inference for semi-competing risks data.
D. Nevo, M. Gorfine (2022)
https://arxiv.org/abs/2010.04485
Biostatistics
Abstract: The causal effects of APOE on late-onset Alzheimer's disease (AD) and death are complicated to define because AD may occur under one intervention but not under the other, and because AD occurrence may affect age of death. In this paper, this dual outcome scenario is studied using the semi-competing risks framework for time-to-event data. Two event times are of interest: a non-terminal event time (age at AD diagnosis), and a terminal event time (age at death). AD diagnosis time is observed only if it precedes death, which may occur before or after AD. We propose new estimands for capturing the causal effect of APOE on AD and death. Our proposal is based on a stratification of the population with respect to the order of the two events. We present a novel assumption utilizing the time-to-event nature of the data, which is more flexible than the often-invoked monotonicity assumption. We derive results on partial identifiability, suggest a sensitivity analysis approach, and give conditions under which full identification is possible. Finally, we present and implement non-parametric and semi-parametric estimation methods under right-censored semi-competing risks data for studying the complex effect of APOE on AD and death.
PyMSM: Python package for Competing Risks and Multi-State models for Survival Data.
H. Rossman, A. Keshet, M. Gorfine (2022)
Journal of Open Source Software
Abstract: Multi-state survival data are common, and can be used to describe trajectories in diverse applications such as a patient’s health progression through disease states, pickups during the workday of a taxi driver, or a website browsing trajectory to name a few. When faced with such data, a researcher or clinician might seek to characterize the possible transitions between states, their occurrence probabilities, or to predict the trajectory of future observations given various baseline and time-varying individual covariates (features). By fitting a multi-state model, we can learn the hazard for each specific transition, which would later be used to predict future paths. Predicting paths can be used at a single individual level, for example predicting how long a cancer patient will be relapse-free given his current health status, or at what probability will a patient end a trajectory at any of the possible states. At the population level paths predictions can be used, for example, to estimate how many patients which arrive at the emergency-room will need to be admitted, given their covariates. Moreover, their expected hospitalization duration can also be inferred, and provide planners with anticipated patients load.
The role of 68Ga-FAPI PET/CT in detection of metastatic lobular breast cancer.
Y. Eshet, N. Tau, S. Apter, N. Nissan, K. Levanon, R. Bernstein-Molho, O. Globus, A. Itay, T. Shapira, C. Oedegaard, M. Gorfine, M. Eifer, T. Davidson, E. Gal-Yam, L. Domachevsky
Clinical Nuclear Medicine
The Global Retinoblastoma Outcome Study: a prospective, cluster-based analysis of 4064 patients from 149 countries.
Fabian et al. (2022)
The Lancet Global Health
Abstract: Retinoblastoma is the most common intraocular cancer worldwide. There is some evidence to suggest that major differences exist in treatment outcomes for children with retinoblastoma from different regions, but these differences have not been assessed on a global scale. We aimed to report 3-year outcomes for children with retinoblastoma globally and to investigate factors associated with survival.
2021:
COVID-19 dynamics following a broad national immunization program in Israel.
H. Rossman, S. Shilo, T. Meir, M. Gorfine*, U. Shalit*, E. Segal* (2021)
https://doi.org/10.1038/s41591-021-01337-2
*Joint last authors and corresponding authors.
Nature Medicine.
Abstract: Studies on the real-life effect of the BNT162b2 vaccine for Coronavirus Disease 2019 (COVID-19) prevention are urgently needed. In this study, we conducted a retrospective analysis of data from the Israeli Ministry of Health collected between 28 August 2020 and 24 February 2021. We studied the temporal dynamics of the number of new COVID-19 cases and hospitalizations after the vaccination campaign, which was initiated on 20 December 2020. To distinguish the possible effects of the vaccination on cases and hospitalizations from other factors, including a third lockdown implemented on 8 January 2021, we performed several comparisons: (1) individuals aged 60 years and older prioritized to receive the vaccine first versus younger age groups; (2) the January lockdown versus the September lockdown; and (3) early-vaccinated versus late-vaccinated cities. A larger and earlier decrease in COVID-19 cases and hospitalization was observed in individuals older than 60 years, followed by younger age groups, by the order of vaccination prioritization. This pattern was not observed in the previous lockdown and was more pronounced in early-vaccinated cities. Our analysis demonstrates the real-life effect of a national vaccination campaign on the pandemic dynamics.
Efficient study design with multiple measurement instruments.
M. Talitman, M. Gorfine, L. Rosen, and D.M. Steinberg (2021)
DOI: 10.1002/sim.9032
Statistics in Medicine.
Abstract: Outcomes from studies assessing exposure often use multiple measurements. In previous work, using a model first proposed by Buonoccorsi (1991), we showed that combining direct (e.g. biomarkers) and indirect (e.g. self-report) measurements provides a more accurate picture of true exposure than estimates obtained when using a single type of measurement. In this article, we propose a valuable tool for efficient design of studies that include both direct and indirect measurements of a relevant outcome. Based on data from a pilot or preliminary study, the tool, which is available online as a Shiny app, can be used to compute: (1) the sample size required for a statistical power analysis, while optimizing the percent of participants who should provide direct measures of exposure (biomarkers) in addition to the indirect (self-report) measures provided by all participants; (2) the ideal number of replicates; and (3) the allocation of resources to intervention and control arms. In addition we show how to examine the sensitivity of results to underlying assumptions. We illustrate our analysis using studies of tobacco smoke exposure and nutrition. In these examples, a near-optimal allocation of the resources can be found even if the assumptions are not precise.
Predicting illness trajectory and hospital resource utilization of COVID-19 hospitalized patients - a nationwide study.
M. Roimi, R. Gutman, J. Somer, A. Ben Arie, I. Calman, Y. Bar-Lavie, A. Ziv, D. Eytan, M. Gorfine*, U. Shalit* (2021)
DOI: 10.1093/jamia/ocab005. Epub ahead of print. PMID: 33479727; PMCID: PMC7928913.
*Joint last authors and joint corresponding authors.
Journal of the American Medical Informatics Association
Abstract: Importance: The spread of COVID-19 has led to a severe strain on hospital capacity in many countries. There is a need for a model to help planners assess expected COVID-19 hospital resource utilization. Objective: Provide publicly available tools for predicting future hospital-bed utilization given a succinct characterization of the status of currently hospitalized patients and scenarios for future incoming patients. Design: Retrospective cohort study following the day-by-day clinical status of all hospitalized COVID-19 patients in Israel from March 1st to May 2nd, 2020. Patient clinical course was modelled with a machine learning approach based on a set of multistate Cox regression-based models with adjustments for right censoring, recurrent events, competing events, left truncation, and time-dependent covariates. The model predicts the patient's entire disease course in terms of clinical states, from which we derive the patient's hospital length-of-stay, length-of-stay in critical state, risk of in-hospital mortality, and overall hospital-bed utilization. Accuracy assessed over 8 cross-validation cohorts of size 330, using per-day Mean Absolute Error (MAE) of predicted hospital utilization over time; and area under the receiver operating characteristics (AUROC) for individual risk of critical illness and in-hospital mortality, assessed on the first day of hospitalization. We present predicted hospital utilization under hypothetical incoming patient scenarios. Setting: 27 Israeli hospitals. Participants: During the study period, 2,703 confirmed COVID-19 patients were hospitalized in Israel for 1 day or more; 28 were excluded due to missing age or sex; the remaining 2,675 patients were included in the analysis. Main Outcomes and Measures: Primary outcome: per-day estimate of total number of hospitalized patients and number of patients in critical state; secondary outcome: risk of a single patient experiencing critical illness or in-hospital mortality. 
Results: For random validation samples of 330 patients, the per-day MAEs for total hospital-bed utilization and critical-bed utilization, averaged over 64 days, were 4.72 ± 1.07 and 1.68 ± 0.40 respectively; the AUROCs for prediction of the probabilities of critical illness and in-hospital mortality were 0.88 ± 0.04 and 0.96 ± 0.04, respectively. We further present the impact of several scenarios of patient influx on healthcare system utilization, demonstrating the ability to accurately plan ahead how to allocate healthcare resources. Conclusions and Relevance: We developed a model that, given basic easily obtained data as input, accurately predicts total and critical care hospital utilization. The model enables evaluating the impact of various patient influx scenarios on hospital utilization. Accurate predictions are also given for individual patients' probability of in-hospital mortality and critical illness. We further provide an R software package and a web-application for the model.
Holocaust experience and mortality patterns: 4-decades follow-up in a population-based cohort.
I. Youssim, M. Gorfine, R. Calderon-Margalit, O. Manor, O. Paltiel, D.S. Siscovick, Y. Friedlander, H. Hochner (2021)
DOI: 10.1093/aje/kwab021
American Journal of Epidemiology.
Abstract: Cancer and coronary heart disease are prominent causes of death in older populations. Previous studies did not find excess in late-life all-cause mortality among Holocaust survivors living in Israel, yet long-term cause-specific mortality was not investigated. Using detailed cause-specific mortality data (spanning the period 1960’s – 2016) on 22,671 individuals from a population-based cohort, we found excess mortality due to some (e.g. cancer), but not other causes of death. These hazards varied by participants’ sex and were independent of several personal sociodemographic and health-related characteristics. Our findings are in agreement with the recent discoveries on surplus cancer- and other chronic morbidity among Holocaust survivors. Study of mortality associated with the Holocaust increases understanding of health resilience and vulnerability over life span of survivors of extreme traumatic events and can inform health interventions. As man-made mass-killings continue to happen, our findings are also important for populations of survivors in more-recent genocides.
Hospital load and increased COVID-19 related mortality - a nationwide study in Israel.
H. Rossman, T. Meir, J. Somer, S. Shilo, R. Gutman, A. Ben Arie, E. Segal, U. Shalit*, M. Gorfine* (2021)
DOI: 10.1038/s41467-021-22214-z. PMID: 33771988; PMCID: PMC7997985.
*Joint last authors and joint corresponding authors.
Nature Communications. 12(1): 1904.
Abstract: The spread of Coronavirus disease 19 (COVID-19) has led to many healthcare systems being overwhelmed by the rapid emergence of new cases within a short period of time. We explore the ramifications of hospital load due to COVID-19 morbidity on COVID-19 hospitalized patient mortality. We address this question with a nationwide study based on the records of all 19,336 COVID-19 patients hospitalized in Israel from mid-July 2020 to early January 2021. We show that even under moderately heavy patient load (>500 countrywide hospitalized severely-ill patients; the Israeli Ministry of Health defined 800 severely-ill patients as the maximum capacity allowing adequate treatment), in-hospital mortality rate of patients with COVID-19 significantly increased compared to periods of lower patient load (250-500 severely-ill patients); we further show this higher mortality rate cannot be attributed to changes in the patient population during periods of heavier load.
2019 - 2020:
Marginalized Frailty-Based Illness-Death Model: Application to the UK-Biobank Survival Data.
M. Gorfine, N. Keret, A. Ben Arie, D.M. Zucker, L. Hsu (2020)
DOI: 10.1080/01621459.2020.1831922
Journal of the American Statistical Association.
Abstract: The UK Biobank is a large-scale health resource comprising genetic, environmental and medical information on approximately 500,000 volunteer participants in the UK, recruited at ages 40–69 during the years 2006–2010. The project monitors the health and well-being of its participants. This work demonstrates how these data can be used to estimate in a semi-parametric fashion the effects of genetic and environmental risk factors on the hazard functions of various diseases, such as colorectal cancer. An illness-death model is adopted, which inherently is a semi-competing risks model, since death can censor the disease, but not vice versa. Using a shared-frailty approach to account for the dependence between time to disease diagnosis and time to death, we provide a new illness-death model that assumes Cox models for the marginal hazard functions. The recruitment procedure used in this study introduces delayed entry to the data. An additional challenge arising from the recruitment procedure is that information coming from both prevalent and incident cases must be aggregated. Lastly, we do not observe any deaths prior to the minimal recruitment age, 40. In this work we provide an estimation procedure for our new illness-death model that overcomes all the above challenges.
Estimating the Intervention Effect in Calibration Sub-Studies
M. Talitman, M. Gorfine, D. M. Steinberg (2020)
Statistics in Medicine. 39(3): 239--251.
Abstract: Exposure assessment is often subject to measurement errors. We consider here the analysis of studies aimed at reducing exposure to potential health hazards, in which exposure is the outcome variable. In these studies, the intervention effect may be estimated using either biomarkers or self-report data, but it is not common to combine these measures of exposure. Bias in the self-reported measures of exposure is a well-known fact; however, only a few studies attempt to correct it. Recently, Keogh et al. (2016) addressed this problem, presenting a model for measurement error in this setting and investigating how self-report and biomarker data can be combined. Keogh et al. find the maximum likelihood estimate for the intervention effect in their model via direct numerical maximization of the likelihood. Here, we exploit an alternative presentation of the model that leads us to a closed formula for the MLE and also for its variance, when the number of biomarker replicates is the same for all subjects in the substudy. The variance formula enables efficient design of such intervention studies. When the number of biomarker replicates is not constant, our approach can be used along with the EM-algorithm to quickly compute the MLE. We compare the MLE to Buonaccorsi's method (Buonaccorsi, 1996) and find that they have similar efficiency when most subjects have biomarker data, but that the MLE has clear advantages when only a small fraction of subjects has biomarker data. This conclusion extends the findings of Keogh et al. (2016) and has practical importance for efficiently designing studies.
K-Sample Omnibus Non-Proportional Hazards Tests Based on Right-Censored Data
M. Gorfine, M. Schlesinger, L. Hsu (2020)
Statistical Methods in Medical Research. 29(10): 2830--2850.
Abstract: This work presents novel and powerful tests for comparing non-proportional hazard functions, based on sample-space partitions. Right censoring introduces two major difficulties which make the existing sample-space partition tests for uncensored data non-applicable: (i) the actual event times of censored observations are unknown; and (ii) the standard permutation procedure is invalid in case the censoring distributions of the groups are unequal. We overcome these two obstacles, introduce invariant tests, and prove their consistency. Extensive simulations reveal that under non-proportional alternatives, the proposed tests are often of higher power compared with existing popular tests for non-proportional hazards. Efficient implementation of our tests is available in the R package KONPsurv.
https://cran.r-project.org/web/packages/KONPsurv/index.html
Practical Implementation of Frailty Models in Mendelian Risk Prediction
T. Huang, M. Gorfine, L. Hsu, G. Parmigiani, D. Braun (2020)
Genetic Epidemiology. 44: 546--578.
When and Where Do Patients with Bone Metastases Actually Break Their Femur? A CT-Based Finite Element Analysis of Patients.
A. Sternheim, Y. Kollender, T. Traub, N. Trabelsi, D. Solomon, G. Ariel, D. Nikomarov, Y. Gortzak, M. Gorfine, Z. Yosibash (2020)
The Bone & Joint Journal. 102-B(5): 638--645.
An Improved Fully Nonparametric estimator of the Marginal Survival Function Based on Case-Control Clustered Data
D.M. Zucker, M. Gorfine (2019)
Electronic Journal of Statistics. 13: 5415--5453.
Abstract: A case-control family study is a study where individuals with a disease of interest (case probands) and individuals without the disease (control probands) are randomly sampled from a well-defined population. Possibly right-censored age at onset and disease status are observed for both probands and their relatives. Correlation among the outcomes within a family is induced by factors such as inherited genetic susceptibility, shared environment, and common behavior patterns. For this setting, we present a nonparametric estimator of the marginal survival function, based on local linear estimation of conditional survival functions. Asymptotic theory for the estimator is provided, making this paper the first to present for this data setting a fully nonparametric estimator with proven consistency. Simulation results are presented showing that the method performs well. The method is illustrated on data from a prostate cancer study.
2018:
On Estimation of the Hazard Function from Population-based Case-Control Studies
L. Hsu, M. Gorfine, D. M. Zucker (2018)
Journal of the American Statistical Association. Vol. 113: 560–570.
Abstract: The population-based case-control study design has been widely used for studying the etiology of chronic diseases. It is well established that the Cox proportional hazards model can be adapted to the case-control study and hazard ratios can be estimated by a (conditional) logistic regression model with time as either a matched set or a covariate (Prentice and Breslow, 1978). However, the baseline hazard function, a critical component in absolute risk assessment, is unidentifiable, because the ratio of cases and controls is controlled by the investigators and does not reflect the true disease incidence rate in the population. In this paper we propose a simple and innovative approach, which makes use of routinely collected family history information, to estimate the baseline hazard function for any logistic regression model that is fit to the risk factor data collected on cases and controls. We establish that the proposed baseline hazard function estimator is consistent and asymptotically normal and show via simulation that it performs well in finite samples. We illustrate the proposed method by a population-based case-control study of prostate cancer where the association of various risk factors is assessed and the family history information is used to estimate the baseline hazard function.
General Semiparametric Shared Frailty Model: Estimation and Simulation with frailtySurv
J.V. Monaco, M. Gorfine, L. Hsu (2018)
Journal of Statistical Software. 86, DOI: 10.18637/jss.v086.i04
Abstract: The R package frailtySurv for simulating and fitting semi-parametric shared frailty models is introduced. The package implements semi-parametric consistent estimators for a variety of frailty distributions (gamma, log-normal, inverse Gaussian and power variance function), and provides consistent estimators of the standard errors of the parameters’ estimators. The parameters’ estimators are asymptotically normally distributed, and therefore statistical inference based on the results of this package (e.g. hypothesis testing, confidence intervals) can be performed using the normal distribution. Extensive simulations demonstrate the flexibility and correct implementation of the estimator. Two case studies are performed with publicly-available datasets, including the Diabetic Retinopathy Study and a large hard drive failure dataset where failure times are thought to be clustered by the hard drive manufacturer and model.
The Effect of Bedtime Meal on Fasting Hyperglycemia by Continuous Glucose Monitoring System in Type 2 Diabetes Mellitus Patients
J. Ilany, N. Konvalina, N. Bordo, G. Shlomai, O. Cohen, M. Gorfine (2018)
J Diabetes Treat: JDBT-161, DOI: 10.29011/2574-7568.000061
Abstract: Objective: Fasting hyperglycemia is a significant abnormality in patients with diabetes mellitus and pre-diabetes. The dawn phenomenon is a major contributor to fasting hyperglycemia. We hypothesized that a meal just before bedtime might attenuate the dawn phenomenon and lower fasting hyperglycemia by increasing early morning insulin secretion in these patients. Design: We investigated the effect of different compositions of bedtime meals on the dawn phenomenon and morning glucose level, using a continuous glucose monitoring system. Results: We did not find any significant difference in morning glucose levels between eating any kind of food and not eating at bedtime, for the whole group of 11 patients. However, two patients showed a consistent lowering of fasting blood glucose in response to a bedtime snack, raising the possibility of individual response. Conclusions: This work focuses on the bedtime meal as a simple tool for lowering fasting hyperglycemia in patients with Type 2 Diabetes. Our study implies that most patients would not show a favorable response. However, the study consisted of only 11 participants, and pointed to a possibility of individual response. Therefore, we encourage testing this simple tool of bedtime meal in patients with Type 2 Diabetes, and conducting a larger clinical study.
Nonparametric Adjustment for Measurement Error in Time to Event Data: Application to Risk Prediction Models
D. Braun, M. Gorfine, H. Katki, A. Ziogas, G. Parmigiani (2018)
Journal of the American Statistical Association. 113: 14-25
Abstract: Mismeasured time to event data used as a predictor in risk prediction models will lead to inaccurate predictions. This arises in the context of self-reported family history, a time to event predictor often measured with error, used in Mendelian risk prediction models. Using validation data, we propose a method to adjust for this type of error. We estimate the measurement error process using a nonparametric smoothed Kaplan-Meier estimator, and use Monte Carlo integration to implement the adjustment. We apply our method to simulated data in the context of both Mendelian and multivariate survival prediction models. Simulations are evaluated using measures of mean squared error of prediction (MSEP), area under the receiver operating characteristic curve (ROC-AUC), and the ratio of observed to expected number of events. These results show that our method mitigates the effects of measurement error mainly by improving calibration and total accuracy. We illustrate our method in the context of Mendelian risk prediction models focusing on misreporting of breast cancer, fitting the measurement error model on data from the University of California at Irvine, and applying our method to counselees from the Cancer Genetics Network. We show that our method improves overall calibration, especially in low risk deciles.
2017:
Heritability Estimation using a Regularized Regression Approach (HERRA): Applicable to Continuous, Dichotomous or Age-at-Onset Outcome
M. Gorfine, S. I. Berndt, J. Chang-Claude, M. Hoffmeister, L. Le Marchand, J. Potter, M. L. Slattery, N. Keret, U. Peters, L. Hsu (2017)
PLoS ONE. 12(8): e0181269.
Abstract: The popular Genome-wide Complex Trait Analysis (GCTA) software uses the random-effects models for estimating the narrow-sense heritability based on GWAS data of unrelated individuals without knowing and identifying the causal loci. Many methods have since extended this approach to various situations. However, since the proportion of causal loci among the variants is typically very small and GCTA uses all variants to calculate the similarities among individuals, the estimation of heritability may be unstable, resulting in a large variance of the estimates. Moreover, if the causal SNPs are not genotyped, GCTA sometimes greatly underestimates the true heritability. We present a novel narrow-sense heritability estimator, named HERRA, using well-developed ultra-high dimensional machine-learning methods, applicable to continuous or dichotomous outcomes, as other existing methods. Additionally, HERRA is applicable to time-to-event or age-at-onset outcome, which, to our knowledge, no existing method can handle. Compared to GCTA and LDAK for continuous and binary outcomes, HERRA often has a smaller variance, and when causal SNPs are not genotyped, HERRA has a much smaller empirical bias. We applied GCTA, LDAK and HERRA to a large colorectal cancer dataset using dichotomous outcome (4,312 cases, 4,356 controls, genotyped using Illumina 300K), the respective heritability estimates of GCTA, LDAK and HERRA are 0.068 (SE=0.017), 0.072 (SE=0.021) and 0.110 (SE=5.19 x 10−3). HERRA yields over 50% increase in heritability estimate compared to GCTA or LDAK.
Propensity scores with misclassified treatment assignment: a likelihood-based approach
D. Braun, M. Gorfine, G. Parmigiani, N. D. Arvold, F. Dominici and C. Zigler (2017)
Biostatistics. Vol. 18, pp. 695-710
Abstract: Propensity score methods are widely used in comparative effectiveness research using claims data. In this context, the inaccuracy of procedural or billing codes in claims data frequently misclassifies patients into treatment groups, that is, the treatment assignment (T) is often measured with error. In the context of a validation data where treatment assignment is accurate, we show that misclassification of treatment assignment can impact three distinct stages of a propensity score analysis: 1) propensity score estimation; 2) propensity score implementation (e.g., weighting or matching); and 3) outcome analysis conducted conditional on the estimated propensity score and its implementation. We examine how the error in T impacts each stage in the context of three common propensity score implementations: subclassification, matching, and inverse probability of treatment weighting (IPTW). Using validation data, we propose a two-step likelihood-based approach which fully adjusts for treatment misclassification bias when subclassifying based on the propensity score. We use simulation studies to assess the performance of the adjustment when using subclassification, and also investigate the method’s performance when using matching or IPTW. We apply the methods to Medicare Part A hospital claims data to estimate the effect of resection versus biopsy on 1-year mortality among 10,284 Medicare beneficiaries diagnosed with brain tumors. The ICD9 billing codes from Medicare Part A inaccurately reflect surgical treatment, but SEER-Medicare data are available as validation data with more accurate information.
A fully nonparametric estimator of the marginal survival function based on case-control clustered age-at-onset data
M. Gorfine, N. Bordo and L. Hsu (2017)
Biostatistics. Vol. 18 (1) , pp. 76-90.
Abstract: Consider a popular case-control family study where individuals with a disease under study (case probands) and individuals who do not have the disease (control probands) are randomly sampled from a well defined population. Possibly right-censored age at onset and disease status are observed for both probands and their relatives. For example, case probands are men diagnosed with prostate cancer, control probands are men free of prostate cancer, and the prostate cancer history of the fathers of the probands is also collected. Inherited genetic susceptibility, shared environment, and common behavior lead to correlation among the outcomes within a family. In this work, a novel nonparametric estimator of the marginal survival function is provided. The estimator is defined in the presence of intra-cluster dependence, and is based on consistent smoothed kernel estimators of conditional survival functions. By simulation, it is shown that the proposed estimator performs very well in terms of bias. The utility of the estimator is illustrated by the analysis of case-control family data of early onset prostate cancer. To our knowledge, this is the first work that provides a fully nonparametric marginal survival estimator based on case-control clustered age at onset data.
A quantile regression model for failure-time data with time-dependent covariates
M. Gorfine, Y. Goldberg and Y. Ritov (2017)
Biostatistics. Vol. 18 (1) , pp. 132-146.
Abstract: Since survival data occur over time, often important covariates that we wish to consider also change over time. Such covariates are referred to as time-dependent covariates. Quantile regression offers flexible modeling of survival data by allowing the covariates to vary with quantiles. This paper provides a novel quantile regression model accommodating time-dependent covariates, for analyzing survival data subject to right censoring. Our simple estimation technique assumes the existence of instrumental variables. In addition, we present a doubly-robust estimator in the sense of Robins and Rotnitzky (1992). The asymptotic properties of the estimators are rigorously studied. Finite-sample properties are demonstrated by a simulation study. The utility of the proposed methodology is demonstrated using the Stanford heart transplant dataset.
2016:
Simulation of gross solids generation and transport in an Israeli sewer system
M. Schutze, R. Penn, E. Friedler, M. Gorfine, J. Alex (2016)
Urban Water Journal.
Abstract: Together with significant water saving, due to lower wastewater discharges, greywater reuse (GWR) may also increase the impacts of faecal matter and other gross solids (GS) on sewer systems (sedimentation, blocking). Modelling and assessments of these effects require detailed description of individual domestic wastewater streams. A novel stochastic methodology to generate such streams was developed and validated. The generator was developed while considering dependencies between in-house water-use events, and their varying degree of elasticity. These generated streams served as input to a sewer model, where scenarios of GWR were simulated and their effects on GS transport were examined. Extensive GWR decreases GS movement mostly in upstream links. Nevertheless, transient high flows, characterising these links, may move stationary GS and prevent blockages. The developed generator can be adopted to other sewer-related studies. Intelligent implementation of the results can assist in introducing GWR, or other water saving measures to the urban sector.
Change-point detection for infinite horizon dynamic treatment regimes
Y. Goldberg, M. Pollak, A. Mitelpunkt, M. Orlovsky, A. Weiss-Meilik, M. Gorfine (2016)
Statistical Methods in Medical Research.
Abstract: A dynamic treatment regime is a set of decision rules for how to treat a patient at multiple time-points. At each time-point, a treatment decision is made depending on the patient’s medical history up to that point. We consider the infinite-horizon setting in which the number of decision points is very large. Specifically, we consider long trajectories of patients’ measurements recorded over time. At each time-point, the decision whether to intervene or not is conditional on whether or not there was a change in the patient’s trajectory. We present change-point detection tools and show how to use them in defining dynamic treatment regimes. The performance of these regimes is assessed using an extensive simulation study. We demonstrate the utility of the proposed change-point detection approach using two case studies: detection of sepsis in preterm infants in the intensive care unit, and detection of a change in glucose levels of a diabetic patient.
Consistent distribution-free K-sample and independence tests for univariate random variables
R. Heller, Y. Heller, S. Kaufman, B. Brill and M. Gorfine (2016)
Journal of Machine Learning Research. Vol. 17, 1-54.
Abstract: A popular approach for testing if two univariate random variables are statistically independent consists of partitioning the sample space into bins, and evaluating a test statistic on the binned data. The partition size matters, and the optimal partition size is data dependent. While for detecting simple relationships coarse partitions may be best, for detecting complex relationships a great gain in power can be achieved by considering finer partitions. We suggest novel consistent distribution-free tests that are based on summation or maximization aggregation of scores over all partitions of a fixed size. We show that our test statistics based on summation can serve as good estimators of the mutual information. Moreover, we suggest regularized tests that aggregate over all partition sizes, and prove those are consistent too. We provide polynomial-time algorithms, which are critical for computing the suggested test statistics efficiently. We show that the power of the regularized tests is excellent compared to existing tests, and almost as powerful as the tests based on the optimal (yet unknown in practice) partition size, in simulations as well as on a real data example.
2015:
Function of cancer associated genes revealed by modern univariate and multivariate association tests.
M. Gorfine, B. Goldstein, A. Fishman, R. Heller, Y. Heller, A. T. Lamm (2015)
PLoS ONE. Vol. 10 (5) , pp. e0126544.
Abstract: Copy number variation (CNV) plays a role in pathogenesis of many human diseases, especially cancer. Several whole genome CNV association studies have been performed for the purpose of identifying cancer associated CNVs. Here we undertook a novel approach to whole genome CNV analysis, with the goal being identification of associations between CNV of different genes (CNV-CNV) across 60 human cancer cell lines. We hypothesize that these associations point to the roles of the associated genes in cancer, and can be indicators of their position in gene networks of cancer-driving processes. Recent studies show that gene associations are often non-linear and non-monotone. In order to obtain a more complete picture of all CNV associations, we performed omnibus univariate analysis by utilizing dCov, MIC, and HHG association tests, which are capable of detecting any type of association, including non-monotone relationships. For comparison we used Spearman and Pearson association tests, which detect only linear or monotone relationships. Application of dCov, MIC and HHG tests resulted in identification of twice as many associations compared to those found by Spearman and Pearson alone. Interestingly, most of the new associations were detected by the HHG test. Next, we utilized dCov's and HHG's ability to perform multivariate analysis. We tested for association between genes of unknown function and known cancer-related pathways. Our results indicate that multivariate analysis is much more effective than univariate analysis for the purpose of ascribing biological roles to genes of unknown function. We conclude that a combination of multivariate and univariate omnibus association tests can reveal significant information about gene networks of disease-driving processes. These methods can be applied to any large gene or pathway dataset, allowing more comprehensive analysis of biological processes.
The impact of covariate measurement error on risk prediction
P. Khudyakov, M. Gorfine, D. Zucker and D. Spiegelman (2015)
Statistics in Medicine, Vol. 34 (15) , pp. 2353-2367.
Abstract: In the development of risk prediction models, predictors are often measured with error. In this paper, we investigate the impact of covariate measurement error on risk prediction. We compare the prediction performance using a costly variable measured without error, along with error-free covariates, to that of a model based on an inexpensive surrogate along with the error-free covariates. We consider continuous error-prone covariates with homoscedastic and heteroscedastic errors, and also a discrete misclassified covariate. Prediction performance is evaluated by the area under the receiver operating characteristic curve (AUC), the Brier score (BS), and the ratio of the observed to the expected number of events (calibration). In an extensive numerical study, we show that (i) the prediction model with the error-prone covariate is very well calibrated, even when it is mis-specified; (ii) using the error-prone covariate instead of the true covariate can reduce the AUC and increase the BS dramatically; (iii) adding an auxiliary variable, which is correlated with the error-prone covariate but conditionally independent of the outcome given all covariates in the true model, can improve the AUC and BS substantially. We conclude that reducing measurement error in covariates will improve the ensuing risk prediction, unless the association between the error-free and error-prone covariates is very high. Finally, we demonstrate how a validation study can be used to assess the effect of mismeasured covariates on risk prediction. These concepts are illustrated in a breast cancer risk prediction model developed in the Nurses' Health Study.
A Novel Host-Proteome Signature for Distinguishing between Acute Bacterial and Viral Infections.
K. Oved, A. Cohen, O. Boico, R. Navon, T. Friedman, L. Etshtein, O. Kriger, E. Bamberger, Y. Fonar, R. Yacobov, R. Wolchinsky, G. Denkberg, Y. Dotan, A. Hochberg, Y. Reiter, M. Grupper, I. Srugo, P. Feigin, M. Gorfine, I. Chistyakov, R. Dagan, A. Klein, I. Potasman and E. Eden (2015)
PLoS ONE. Vol. 10 (3) , pp. e0120012.
Abstract: Bacterial and viral infections are often clinically indistinguishable, leading to inappropriate patient management and antibiotic misuse. Bacterial-induced host proteins such as procalcitonin, C-reactive protein (CRP), and Interleukin-6, are routinely used to support diagnosis of infection. However, their performance is negatively affected by inter-patient variability, including time from symptom onset, clinical syndrome, and pathogens. Our aim was to identify novel viral-induced host proteins that can complement bacterial-induced proteins to increase diagnostic accuracy. Initially, we conducted a bioinformatic screen to identify putative circulating host immune response proteins. The resulting 600 candidates were then quantitatively screened for diagnostic potential using blood samples from 1002 prospectively recruited patients with suspected acute infectious disease and controls with no apparent infection. For each patient, three independent physicians assigned a diagnosis based on comprehensive clinical and laboratory investigation including PCR for 21 pathogens yielding 319 bacterial, 334 viral, 112 control and 98 indeterminate diagnoses; 139 patients were excluded based on predetermined criteria. The best performing host-protein was TNF-related apoptosis-inducing ligand (TRAIL) (area under the curve [AUC] of 0.89; 95% confidence interval [CI], 0.86 to 0.91), which was consistently up-regulated in viral infected patients. We further developed a multi-protein signature using logistic-regression on half of the patients and validated it on the remaining half. The signature with the highest precision included both viral- and bacterial-induced proteins: TRAIL, Interferon gamma-induced protein-10, and CRP (AUC of 0.94; 95% CI, 0.92 to 0.96). The signature was superior to any of the individual proteins (P<0.001), as well as routinely used clinical parameters and their combinations (P<0.001). 
It remained robust across different physiological systems, times from symptom onset, and pathogens (AUCs 0.87-1.0). The accurate differential diagnosis provided by this novel combination of viral- and bacterial-induced proteins has the potential to improve management of patients with acute infections and reduce antibiotic misuse.
2013 - 2014:
Misreported Family Histories and Underestimation of Risk
D. Braun, M. Gorfine and G. Parmigiani (2014)
Journal of Clinical Oncology. Vol. 32 (32) , pp. 3682-3.
Calibrated predictions for multivariate competing risks models
M. Gorfine, L. Hsu, D. M. Zucker and G. Parmigiani (2014)
Lifetime Data Analysis. Vol. 20 (2) , pp. 234-251.
Abstract: Prediction models for time-to-event data play a prominent role in assessing the individual risk of a disease, such as cancer. Accurate disease prediction models provide an efficient tool for identifying individuals at high risk, and provide the groundwork for estimating the population burden and cost of disease and for developing patient care guidelines. We focus on risk prediction of a disease in which family history is an important risk factor that reflects inherited genetic susceptibility, shared environment, and common behavior patterns. In this work family history is accommodated using frailty models, with the main novel feature being allowing for competing risks, such as other diseases or mortality. We show through a simulation study that naively treating competing risks as independent right censoring events results in non-calibrated predictions, with the expected number of events overestimated. Discrimination performance is not affected by ignoring competing risks. Our proposed prediction methodologies correctly account for competing events, are very well calibrated, and easy to implement.
Frailty Models for Familial Risk With Application to Breast Cancer
M. Gorfine, L. Hsu and G. Parmigiani (2013)
Journal of the American Statistical Association. Vol. 108 (504) , pp. 1205-1215.
Abstract: In evaluating familial risk for disease we have two main statistical tasks: assessing the probability of carrying an inherited genetic mutation conferring higher risk; and predicting the absolute risk of developing diseases over time, for those individuals whose mutation status is known. Despite substantial progress, much remains unknown about the role of genetic and environmental risk factors, about the sources of variation in risk among families that carry high-risk mutations, and about the sources of familial aggregation beyond major Mendelian effects. These sources of heterogeneity contribute substantial variation in risk across families. In this paper we present simple and efficient methods for accounting for this variation in familial risk assessment. Our methods are based on frailty models. We implemented them in the context of generalizing Mendelian models of cancer risk, and compared our approaches to others that do not consider heterogeneity across families. Our extensive simulation study demonstrates that when predicting the risk of developing a disease over time conditional on carrier status, accounting for heterogeneity results in a substantial improvement in the area under the curve of the receiver operating characteristic. On the other hand, the improvement for carriership probability estimation is more limited. We illustrate the utility of the proposed approach through the analysis of BRCA1 and BRCA2 mutation carriers in the Washington Ashkenazi Kin-Cohort Study of Breast Cancer.
A Regularization Corrected Score Method for Nonlinear Regression Models with Covariate Error
D. M. Zucker, M. Gorfine, Y. Li, M. G. Tadesse and D. Spiegelman (2013)
Biometrics. Vol. 69 (March) , pp. 80-90.
Abstract: Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer.
2012:
Conditional and marginal estimates in case-control family data - extensions and sensitivity analyses
M. Gorfine, R. De-Picciotto and L. Hsu (2012)
Journal of Statistical Computation and Simulation. Vol. 82 (10) , pp. 1449-1470.
Abstract: This work considers two specific estimation techniques for the family specific proportional hazards model and for the population-averaged proportional hazards model. So far, these two estimation procedures were presented and studied under the gamma frailty distribution mainly because of its simple interpretation and mathematical tractability. Modifications of both procedures for other frailty distributions, such as inverse Gaussian, positive stable and a specific case of discrete distribution, are presented. By extensive simulations, it is shown that under the family specific proportional hazards model, the gamma frailty model appears to be robust to frailty distribution misspecification in both bias and efficiency loss in the marginal parameters. The population-averaged proportional hazards model is found to be robust under the gamma frailty model misspecification only under moderate or weak dependency within cluster members.
A class of multivariate distribution-free tests of independence based on graphs
R. Heller, M. Gorfine and Y. Heller (2012)
Journal of Statistical Planning and Inference. Vol. 142 (12) , pp. 3097-3106.
Abstract: A class of distribution-free tests is proposed for the independence of two subsets of response coordinates. The tests are based on the pairwise distances across subjects within each subset of the response. A complete graph is induced by each subset of response coordinates, with the sample points as nodes and the pairwise distances as the edge weights. The proposed test statistic depends only on the rank order of edges in these complete graphs. The response vector may be of any dimensions. In particular, the number of samples may be smaller than the dimensions of the response. The test statistic is shown to have a normal limiting distribution with known expectation and variance under the null hypothesis of independence. The exact distribution-free null distribution of the test statistic is given for a sample of size 14, and its Monte-Carlo approximation is considered for larger sample sizes. We demonstrate in simulations that this new class of tests has good power properties for very general alternatives. © 2012 Elsevier B.V.
A consistent multivariate test of association based on ranks of distances
R. Heller, Y. Heller and M. Gorfine (2012)
Biometrika. Vol. 100 (2) , pp. 503-510.
Abstract: We consider the problem of detecting associations between random vectors of any dimension. Few tests of independence exist that are consistent against all dependent alternatives. We propose a powerful test that is applicable in all dimensions and consistent against all alternatives. The test has a simple form, is easy to implement, and has good power.
Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data
J. Jeon, L. Hsu and M. Gorfine (2012)
Biostatistics. Vol. 13 (3) , pp. 384-397.
Abstract: Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
2011:
Frailty-Based Competing Risks Model for Multivariate Survival Data
M. Gorfine and L. Hsu (2011)
Biometrics. Vol. 67 (2) , pp. 415-426.
Abstract: In this work, we provide a new class of frailty-based competing risks models for clustered failure times data. This class is based on expanding the competing risks model of Prentice et al. (1978, Biometrics 34, 541-554) to incorporate frailty variates, with the use of cause-specific proportional hazards frailty models for all the causes. Parametric and nonparametric maximum likelihood estimators are proposed. The main advantages of the proposed class of models, in contrast to the existing models, are: (1) the inclusion of covariates; (2) the flexible structure of the dependency among the various types of failure times within a cluster; and (3) the unspecified within-subject dependency structure. The proposed estimation procedures produce the most efficient parametric and semiparametric estimators and are easy to implement. Simulation studies show that the proposed methods perform very well in practical situations.
Missing genetic information in case-control family data with general semi-parametric shared frailty model
A. Graber-Naidich, M. Gorfine, K. E. Malone and L. Hsu (2011)
Lifetime Data Analysis. Vol. 17 (2) , pp. 175-194.
Abstract: Case-control family data are now widely used to examine the role of gene-environment interactions in the etiology of complex diseases. In these types of studies, exposure levels are obtained retrospectively and, frequently, information on most risk factors of interest is available on the probands but not on their relatives. In this work we consider correlated failure time data arising from population-based case-control family studies with missing genotypes of relatives. We present a new method for estimating the age-dependent marginalized hazard function. The proposed technique has two major advantages: (1) it is based on the pseudo full likelihood function rather than a pseudo composite likelihood function, which usually suffers from substantial efficiency loss; (2) the cumulative baseline hazard function is estimated using a two-stage estimator instead of an iterative process. We assess the performance of the proposed methodology with simulation studies, and illustrate its utility on a real data example.
Glucose homeostasis abnormalities assessed by an OGTT in coronary artery disease patients during admission and follow-up at ambulation
J. Ilany, L. Michael, O. Cohen, S. Matetzky, M. Gorfine, H. Hod and A. Karasik (2011)
Experimental and Clinical Endocrinology and Diabetes. Vol. 119 (8) , pp. 463-466.
Abstract: Most non-diabetic patients admitted with acute coronary syndrome (ACS) demonstrate an abnormality in glucose homeostasis. It was claimed that an oral glucose tolerance test (OGTT) undertaken during the admission is a good indicator of the patient's glycemic status.
Combining longitudinal discriminant analysis and partial area under the ROC curve to predict non-response to treatment for hepatitis C virus
E. Lukasiewicz, M. Gorfine, A. U. Neumann and L. S. Freedman (2011)
Statistical Methods in Medical Research. Vol. 20 (3) , pp. 275-289.
Abstract: A longitudinal discriminant analysis is applied to build predictive models based on repeated measurements of serum hepatitis C virus RNA. These models are evaluated through the partial area under the receiver operating characteristic curve index (PA index), and the final selection of the best model is based on cross-validated estimates of the PA index. Models are compared by building a 95% bootstrap confidence interval for the difference in PA index between two models. Data from a randomised trial, in which chronic HCV patients were enrolled, are used to illustrate the application of the proposed method to predict treatment outcome.
Sensitivity analysis for complex ecological models - A new approach
V. Makler-Pick, G. Gal, M. Gorfine, M. R. Hipsey and Y. Carmel (2011)
Environmental Modelling and Software. Vol. 26 (2) , pp. 124-134.
Abstract: A strategy for global sensitivity analysis of a multi-parameter ecological model was developed and used for the hydrodynamic-ecological model (DYRESM-CAEDYM, DYnamic REservoir Simulation Model-Computational Aquatic Ecosystem Dynamics Model) applied to Lake Kinneret (Israel). Two different methods of sensitivity analysis, RPART (Recursive Partitioning And Regression Trees) and GLM (General Linear Model) were applied in order to screen a subset of significant parameters. All the parameters which were found significant by at least one of these methods were entered as input to a GBM (Generalized Boosted Modeling) analysis in order to provide a quantitative measure of the sensitivity of the model variables to these parameters. Although the GBM is a general and powerful machine learning algorithm, it has substantial computational costs in both storage requirements and CPU time. Employing the screening stage reduces this cost. The results of the analysis highlighted the role of particulate organic material in the lake ecosystem and its impact on the overall lake nutrient budget. The GBM analysis established, for example, that parameters such as particulate organic material diameter and density were particularly important to the model outcomes. The results were further explored by lumping together output variables that are associated with sub-components of the ecosystem. The variable lumping approach suggested that the phytoplankton group is most sensitive to parameters associated with the dominant phytoplankton group, dinoflagellates, and with nanoplankton (Chlorophyta), supporting the view of Lake Kinneret as a bottom-up system. The study demonstrates the effectiveness of such procedures for extracting useful information for model calibration and guiding further data collection. © 2010 Elsevier Ltd.
2009 - 2010:
Prediction of nonSVR to therapy with pegylated interferon-α2a and ribavirin in chronic hepatitis C genotype 1 patients after 4, 8 and 12 weeks of treatment
E. Lukasiewicz, M. Gorfine, L. S. Freedman, J. M. Pawlotsky, S. W. Schalm, C. Ferrari, S. Zeuzem and A. U. Neumann (2010)
Journal of Viral Hepatitis. Vol. 17 (5) , pp. 345-351.
Abstract: In patients with chronic hepatitis C genotype 1, the current algorithm for treatment discontinuation is based on no early virological response (<2 log decline in hepatitis C virus (HCV)-RNA) at 12 weeks. It is important to determine whether prediction of nonsustained virological response (NR) before 12 weeks can be robustly obtained by statistical methods. We used longitudinal discriminant analysis (LDA) to build and cross-validate models including baseline patient characteristics and measurements of serum HCV-RNA in the first 4, 8 or 12 weeks of treatment. The performance of each model was evaluated by the partial AUC (PA) index, exploring the accuracy of prediction in the range of high negative predictive values. Models were compared by computing 95% confidence intervals for the difference in PA indices. NR was best predicted before week 12 by a single HCV-RNA measurement at week 8 taken together with gender, BMI and age (W8 model, PA index=0.857). This model was not inferior to models that included a measurement at week 12 (PA index=0.831). The best model obtained with LDA within the first 4 weeks, which included measurements at days 4, 8 and at week 4, was found to be inferior to the week 8 model (PA index=0.796). These results indicate that lack of sustained viral response is best predicted after 8 weeks of treatment and that waiting until 12 weeks does not improve the prediction.
Case-control survival analysis with a general semiparametric shared frailty model: A pseudo full likelihood approach
M. Gorfine, D. M. Zucker and L. Hsu (2009)
Annals of Statistics. Vol. 37 (3) , pp. 1489-1517.
Abstract: In this work we deal with correlated failure time (age at onset) data arising from population-based case-control studies, where case and control probands are selected by population-based sampling and an array of risk factor measures is collected for both cases and controls and their relatives. Parameters of interest are effects of risk factors on the hazard function of failure times and within-family dependencies of failure times after adjusting for the risk factors. Due to the retrospective nature of sampling, a large sample theory for existing methods has not been established. We develop a novel estimation technique for estimating these parameters under a general semiparametric shared frailty model. We also present a simple, easily computed, and non-iterative nonparametric estimator for the cumulative baseline hazard function. A rigorous large sample theory for the proposed estimators of these parameters is given, along with simulations and a real data example illustrating the utility of the proposed method.
Glucose homeostasis abnormalities in cardiac intensive care unit patients
J. Ilany, I. Marai, O. Cohen, S. Matetzky, M. Gorfine, I. Erez, H. Hod and A. Karasik (2009)
Acta Diabetologica. Vol. 46 (3) , pp. 209-216.
Abstract: The aim of this study was to characterize the abnormalities in glucose homeostasis in intensive care unit patients following an acute coronary event. The study population included all non-diabetic patients ages 20-80 years that were admitted to a coronary intensive unit. Glucose, insulin and C-peptide levels during an oral glucose tolerance test (OGTT) were measured during the acute admission. From January to September 2003, 277 patients were admitted to the coronary unit. Of these, 127 patients underwent an OGTT. Of these, only 29 patients (23%) exhibited normal glucose metabolism. The remainder had type 2 diabetes (32%), impaired glucose tolerance (37%) or isolated impaired fasting glucose (8%; 100-125 mg/dl). Based on homeostasis model assessment (HOMA) calculations, diabetic patients had impaired beta-cell function and patients with elevated fasting glucose levels were insulin resistant. Beta-cell dysfunction during the acute stress seems to contribute to the glucose abnormalities. Most patients who experience an acute coronary event demonstrate abnormal glucose metabolism. Post glucose-load abnormalities are more common than abnormal fasting glucose level in this situation. It is postulated that the acute stress of a coronary event may contribute to the dysglycemia.
2007 - 2008:
Antigen-driven selection in germinal centers as reflected by the shape characteristics of immunoglobulin gene lineage trees: A large-scale simulation study
G. Shahaf, M. Barak, N. S. Zuckerman, N. Swerdlin, M. Gorfine and R. Mehr (2008)
Journal of Theoretical Biology. Vol. 255 (2) , pp. 210-222.
Abstract: During the immune response, the generation of memory B lymphocytes in germinal centers involves affinity maturation of the cells' antigen receptors, based on somatic hypermutation of receptor genes and antigen-driven selection of the resulting mutants. Affinity maturation is vital for immune protection, and is the basis of humoral immune learning and memory. Lineage trees of somatically hypermutated immunoglobulin genes often serve to qualitatively illustrate claims concerning the dynamics of affinity maturation in germinal centers. Here, we derive the quantitative relationships between parameters characterizing affinity maturation dynamics (proliferation, differentiation and mutation rates, initial affinity of the Ig to the antigen, and selection thresholds) and the mathematical properties of lineage trees, using a computer simulation which combines mathematical models for all mature B cell populations, stochastic models of hypermutation and selection, lineage tree generation and measurement of graphical tree characteristics. We identified seven key lineage tree properties, and found correlations of these with initial clone affinity and with the selection threshold. These two parameters were found to be the main factors affecting lineage tree shapes in both primary and secondary response trees. The results also confirm that recycling from centrocytes back to centroblasts is highly likely. © 2008 Elsevier Ltd. All rights reserved.
Pseudo-full likelihood estimation for prospective survival analysis with a general semiparametric shared frailty model: Asymptotic theory
D. M. Zucker, M. Gorfine and L. Hsu (2008)
Journal of Statistical Planning and Inference. Vol. 138 (7) , pp. 1998-2016.
Abstract: In this work we present a simple estimation procedure for a general frailty model for analysis of prospective correlated failure times. Earlier work showed this method to perform well in a simulation study. Here we provide rigorous large-sample theory for the proposed estimators of both the regression coefficient vector and the dependence parameter, including consistent variance estimators. © 2007 Elsevier B.V. All rights reserved.
Linear measurement error models with restricted sampling
M. Gorfine, N. Lipshtat, L. S. Freedman and R. L. Prentice (2007)
Biometrics. Vol. 63 (1) , pp. 137-142.
Abstract: The relationship between nutrient consumption and chronic disease risk is the focus of a large number of epidemiological studies where food frequency questionnaires (FFQ) and food records are commonly used to assess dietary intake. However, these self-assessment tools are known to involve substantial random error for most nutrients, and probably important systematic error as well. Study subject selection in dietary intervention studies is sometimes conducted in two stages. At the first stage, FFQ-measured dietary intakes are observed and at the second stage another instrument, such as a 4-day food record, is administered only to participants who have fulfilled a prespecified criterion that is based on the baseline FFQ-measured dietary intake (e.g., only those reporting percent energy intake from fat above a prespecified quantity). Performing analysis without adjusting for this truncated sample design and for the measurement error in the nutrient consumption assessments will usually provide biased estimates for the population parameters. In this work we provide a general statistical analysis technique for such data with the classical additive measurement error that corrects for the two sources of bias. The proposed technique is based on multiple imputation for longitudinal data. Results of a simulation study along with a sensitivity analysis are presented, showing the performance of the proposed method under a simple linear regression model.
On robustness of marginal regression coefficient estimates and hazard functions in multivariate survival analysis of family data when the frailty distribution is mis-specified
L. Hsu, M. Gorfine and K. Malone (2007)
Statistics in Medicine. Vol. 26 (25) , pp. 4657-4678.
Abstract: The shared frailty model is an extension of the Cox model to correlated failure times and, essentially, a random effects model for failure time outcomes. In this model, the latent frailty shared by individual members in a cluster acts multiplicatively as a factor on the hazard function and is typically modelled parametrically. One commonly used distribution is gamma, where both shape and scale parameters are set to be the same to allow for unique identification of baseline hazard function. It is popular because it is a conjugate prior, and the posterior distribution possesses the same form as gamma. In addition, the parameter can be interpreted as a time-independent cross-ratio function, a natural extension of odds ratio to failure time outcomes. In this paper, we study the effect of frailty distribution mis-specification on the marginal regression estimates and hazard functions under assumed gamma distribution with an application to family studies. The simulation results show that the biases are generally 10% and lower, even when the true frailty distribution deviates substantially from the assumed gamma distribution. This suggests that the gamma frailty model can be a practical choice in real data analyses if the regression parameters and marginal hazard function are of primary interest and individual cluster members are exchangeable with respect to their dependencies.
2005 - 2006:
Prospective survival analysis with a general semiparametric shared frailty model: A pseudo full likelihood approach
M. Gorfine, D. M. Zucker and L. Hsu (2006)
Biometrika. Vol. 93 (3) , pp. 735-741.
Abstract: In this work we provide a simple estimation procedure for a general frailty model for analysis of prospective correlated failure times. Rigorous large-sample theory for the proposed estimators of both the regression coefficient vector and the dependence parameter is given, including consistent variance estimators. In a simulation study under the widely used gamma frailty model, our proposed approach was found to have essentially the same efficiency as the EM-based estimator considered by other authors, with negligible difference between the standard errors of the two estimators. The proposed approach, however, provides a framework capable of handling general frailty distributions with finite moments and yields an explicit consistent variance estimator.
Multivariate survival analysis for case-control family data
L. Hsu and M. Gorfine (2006)
Biostatistics. Vol. 7 (3) , pp. 387-398.
Abstract: Multivariate survival data arise from case-control family studies in which the ages at disease onset for family members may be correlated. In this paper, we consider a multivariate survival model with the marginal hazard function following the proportional hazards model. We use a frailty-based approach in the spirit of Glidden and Self (1999) to account for the correlation of ages at onset among family members. Specifically, we first estimate the baseline hazard function nonparametrically by the innovation theorem, and then obtain maximum pseudolikelihood estimators for the regression and correlation parameters plugging in the baseline hazard function estimator. We establish a connection with a previously proposed generalized estimating equation-based approach. Simulation studies and an analysis of case-control family data of breast cancer illustrate the methodology's practical utility.
Feedback inhibition of gonadotropins by testosterone in men with hypogonadotropic hypogonadism: comparison to the intact pituitary-testicular axis in primary hypogonadism
I. Shimon, A. Lubina, M. Gorfine and J. Ilany (2006)
Journal of Andrology. Vol. 27 (3) , pp. 358-64.
Abstract: Men with hypogonadotropic hypogonadism (HH) due to hypothalamic-pituitary disease present with low serum testosterone levels combined with undetectable, low, or normal gonadotropin levels. Treatment consists of testosterone replacement to reverse the symptoms of androgen deficiency. The aim of this study was to examine the dynamics and feedback inhibition of follicle-stimulating hormone (FSH) and luteinizing hormone (LH) in relation to testosterone in 38 men with HH treated with testosterone. Findings were compared with 11 men with primary hypogonadism (PH). Testosterone replacement led to a suppression of FSH levels from 2.8 IU/L at baseline to 1.1 IU/L and to a suppression of LH levels from 2.3 to 0.8 IU/L. There was a linear correlation between levels of FSH and LH (after natural log transformation for both) and testosterone levels in both the HH and PH groups. However, the differences in intercepts and slopes between the groups were significant. To determine whether nonsuppressed FSH or LH during testosterone replacement reduces the probability of eugonadism, as reflected by normal testosterone levels, gonadotropin levels were measured and categorized as low (<0.5 IU/L), medium (0.5-2 IU/L), and high levels (>2 IU/L). The higher FSH or LH levels were found to significantly decrease the chance for achieving eugonadism. In conclusion, in men with HH due to hypothalamic-pituitary disease or injury, the pituitary-testicular hormonal axis maintains its physiological negative feedback between testosterone and gonadotropins. Thus, gonadotropin levels in men with HH might be useful, together with testosterone concentrations, for assessing the adequacy of androgen replacement.
2004:
Statistical concerns about the GSEA procedure
D. Damian and M. Gorfine (2004)
Nature Genetics. Vol. 36 (7) , pp. 663; author reply 663.
Nonparametric correction for covariate measurement error in a stratified Cox model
M. Gorfine, L. I. Hsu and R. L. Prentice (2004)
Biostatistics. Vol. 5 (1) , pp. 75-87.
Abstract: Stratified Cox regression models with a large number of strata and small stratum size are useful in many settings, including matched case-control family studies. In the presence of measurement error in covariates and a large number of strata, we show that extensions of existing methods fail either to reduce the bias or to correct the bias under nonsymmetric distributions of the true covariate or the error term. We propose a nonparametric correction method for the estimation of regression coefficients, and show that the estimators are asymptotically consistent for the true parameters. Small sample properties are evaluated in a simulation study. The method is illustrated with an analysis of Framingham data.
Semiparametric estimation of marginal hazard function from case-control family studies
L. Hsu, L. Chen, M. Gorfine and K. Malone (2004)
Biometrics. Vol. 60 (December) , pp. 936-944.
Abstract: Estimating marginal hazard function from the correlated failure time data arising from case-control family studies is complicated by noncohort study design and risk heterogeneity due to unmeasured, shared risk factors among the family members. Accounting for both factors in this article, we propose a two-stage estimation procedure. At the first stage, we estimate the dependence parameter in the distribution for the risk heterogeneity without obtaining the marginal distribution first or simultaneously. Assuming that the dependence parameter is known, at the second stage we estimate the marginal hazard function by iterating between estimation of the risk heterogeneity (frailty) for each family and maximization of the partial likelihood function with an offset to account for the risk heterogeneity. We also propose an iterative procedure to improve the efficiency of the dependence parameter estimate. The simulation study shows that both methods perform well under finite sample sizes. We illustrate the method with a case-control family study of early onset breast cancer.
Germ-line ATM gene alterations are associated with susceptibility to sporadic T-cell acute lymphoblastic leukemia in children
E. Liberzon, S. Avigad, B. Stark, J. Zilberstein, L. Freedman, M. Gorfine, H. Gavriel, I. J. Cohen, Y. Goshen, I. Yaniv and R. Zaizov (2004)
Genes, Chromosomes & Cancer. Vol. 39 (2) , pp. 161-166.
Abstract: A major feature of ataxia-telangiectasia (A-T) is an increased risk of cancer, particularly of lymphoid malignancies. We studied ATM gene involvement in leukemic cells derived from 39 pediatric T-cell acute lymphoblastic leukemias (ALLs). Two types of sequence changes--truncating and missense--were identified in 8 T-cell ALL samples: 3 truncating changes, all previously identified in A-T (R35X, -30del215, 2284delCT), and 3 missense variants (V410A, F582L, F1463C) were found, none associated with loss of heterozygosity (LOH). In all patients studied, the mutation was present in the germ-line. A-T carriers, defined by the finding of truncating mutations, were found to be 12.9 times more frequent than in the normal population (P = 0.004). A normally ethnically matched population was screened for the 3 missense variants, and their frequency was significantly more prevalent (4.9-fold excess) than in the normal population (P = 0.03). Our data suggest there is some evidence of an association between missense alterations in the ATM gene and T-cell ALL. A significant difference in the mean age at diagnosis of T-cell ALL was noted between patients harboring an ATM sequence change and those with no change, 5.4 years and 9.7 years, respectively (P = 0.001). No ATM alterations were identified in relapse samples, indicating that ATM does not play a role in disease progression. The high prevalence of germ-line truncating and missense ATM gene alterations among children with sporadic T-cell ALL suggests an association with susceptibility to T-cell acute leukemia and supports the model of predisposition to cancer in A-T heterozygotes.
2001 - 2003:
Estimation of dependence between paired correlated failure times in the presence of covariate measurement error
M. Gorfine, L. Hsu and R. L. Prentice (2003)
Journal of the Royal Statistical Society. Series B: Statistical Methodology. Vol. 65 (3) , pp. 643-661.
Abstract: In many biomedical studies, covariates are subject to measurement error. Although it is well known that the regression coefficients estimators can be substantially biased if the measurement error is not accommodated, there has been little study of the effect of covariate measurement error on the estimation of the dependence between bivariate failure times. We show that the dependence parameter estimator in the Clayton–Oakes model can be considerably biased if the measurement error in the covariate is not accommodated. In contrast with the typical bias towards the null for marginal regression coefficients, the dependence parameter can be biased in either direction. We introduce a bias reduction technique for the bivariate survival function in copula models while assuming an additive measurement error model and replicated measurement for the covariates, and we study the large and small sample properties of the dependence parameter estimator proposed.
Maximum Likelihood Estimator and Likelihood Ratio Test in Complex Models: An Application to B Lymphocyte Development
M. Gorfine, L. Freedman, G. Shahaf and R. Mehr (2003)
Bulletin of Mathematical Biology. Vol. 65 (6) , pp. 1131-1139.
Abstract: In this paper we introduce a simple framework which provides a basis for estimating parameters and testing statistical hypotheses in complex models. The only assumption that is made in the model describing the process under study is that the deviations of the observations from the model have a multivariate normal distribution. The application of the statistical techniques presented in this paper may have considerable utility in the analysis of a wide variety of complex biological and epidemiological models. To our knowledge, the model and methods described here have not previously been published in the area of theoretical immunology. © 2003 Society for Mathematical Biology. Published by Elsevier Ltd. All rights reserved.
Survivor function estimators under group sequential monitoring based on the logrank statistic
M. Gorfine (2003)
Lifetime Data Analysis. Vol. 9 (2) , pp. 175-193.
Abstract: In this paper we investigate a group sequential analysis of censored survival data with staggered entry, in which the trial is monitored using the logrank test while comparisons of treatment and control Kaplan-Meier curves at various time points are performed at the end of the trial. We concentrate on two-sample tests under local alternatives. We describe the relationship of the asymptotic bias of Kaplan-Meier curves between the two groups. We show that even if the asymptotic bias of the Kaplan-Meier curve is negligible relative to the true survival, this is not the case for the difference between the curves of the two arms of the trial. A corrected estimator for the difference between the survival curves is presented and by simulations we show that the corrected estimator reduced the bias dramatically and has a smaller variance. The methods of estimation are applied to the Beta-Blocker Heart Attack Trial (1982), a well-known group sequential trial.
Differences between estimated caloric requirements and self-reported caloric intake in the Women's Health Initiative
J. R. Hebert, R. E. Patterson, M. Gorfine, C. B. Ebbeling, S. T. St. Jeor and R. T. Chlebowski (2003)
Annals of Epidemiology. Vol. 13 (03) , pp. 629-637.
Abstract: Purpose: To compare energy intake derived from a food frequency questionnaire (FFQ) with estimated energy expenditure in postmenopausal women participating in a large clinical study. Methods: A total of 161,856 women aged 50 to 79 years enrolled in the Women's Health Initiative (WHI) Observational Study (OS) or Clinical Trial (CT) [including the Diet Modification (DM) component] completed the WHI FFQ, from which energy intake (FFQEI) was derived. Population-adjusted total energy expenditure (PATEE) was calculated according to the Harris-Benedict equation weighted by caloric intakes derived from the National Health and Nutrition Examination Survey. Stepwise regression was used to examine the influence of independent variables (e.g., demographic, anthropometric) on FFQEI-PATEE. Race, region, and education were forced into the model; other variables were retained if they increased model explanatory ability by more than 1%. Results: On average, FFQEI was approximately 25% lower than PATEE. Regression results (intercept = -799 kcal/d) indicated that body mass index (b = -23 kcal/day/kg·m⁻²); age (b = 15 kcal/day/year of age); and study arm (relative to women in the OS, for DM women b = 169 kcal/d, indicating better agreement with PATEE) increased model partial R² > .01. Results for CT women not eligible for DM were similar to those of women in the OS (b = 14 kcal/d). There also were apparent differences by race (b = -152 kcal/d in Blacks) and education (b = -67 kcal/d in women with <high school). Conclusion: This large, carefully studied population confirms previous observations regarding underestimates in self-reported caloric intake relative to estimates of metabolic need in younger women, and those with higher weight, with less education, and in Blacks. These differences, along with effects related to intervention assignment, underline the need for additional research to enhance understanding of errors in dietary measurement. © 2003 Elsevier Inc. All rights reserved.
Nonparametric Analysis of Longitudinal Binary Data: An Application to the Intergroup Prisoner's Dilemma Game
R. Nirel and M. Gorfine (2003)
Experimental Economics. Vol. 6 (3) , pp. 327-341.
Abstract: The intergroup prisoner's dilemma game was suggested by Bornstein (1992, Journal of Personality and Social Psychology. 7, 597–606) for modelling intergroup conflicts over continuous public goods. We analyse data of an experiment in which the game was played for 150 rounds, under three matching conditions. The objective is to study differences in the investment patterns of players in the different groups. A repeated measures analysis was conducted by Goren and Bornstein (1999, Games and Human Behaviour: Essays in Honor of Amnon Rapoport, pp. 299–314), involving data aggregation and strong distributional assumptions. Here we introduce a nonparametric approach based on permutation tests. Two new measures, the cumulative investment and the normalised cumulative investment, provide additional insight into the differences between groups. The proposed tests are based on the area under the investment curves. They identify an overall difference between the groups as well as pairwise differences. A simultaneous confidence band for the mean difference curve is used to detect the games which account for any pairwise difference.
Estimation of a secondary parameter in a group sequential clinical trial.
M. Gorfine (2001)
Biometrics. Vol. 57 (2) , pp. 589-597.
Abstract: In this article, we investigate estimation of a secondary parameter in group sequential tests. We study the model in which the secondary parameter is the mean of the normal distribution in a subgroup of the subjects. The bias of the naive secondary parameter estimator is studied. It is shown that the sampling proportions of the subgroup have a crucial effect on the bias: As the sampling proportion of the subgroup at or just before the stopping time increases, the bias of the naive subgroup parameter estimator increases as well. An unbiased estimator for the subgroup parameter and an unbiased estimator for its variance are derived. Using simulations, we compare the mean squared error of the unbiased estimator to that of the naive estimator, and we show that the differences are negligible. As an example, the methods of estimation are applied to an actual group sequential clinical trial, The Beta-Blocker Heart Attack Trial.
2000 & Earlier:
An open trial of plant-source derived PS for treatment of age-related cognitive decline
S. Schreiber, O. Kampf-Sherf, M. Gorfine, D. Kelly, Y. Oppenheim and B. Lerer (2000)
Isr J Psychiatry Relat Sci. Vol. 37 (4) , pp. 302-7.
Abstract: We assessed whether the efficacy of plant-source derived phosphatidylserine (one of the phospholipids which play an important functional role in membrane-related processes in the brain) for treatment of age-related cognitive decline is consistent with previous (placebo controlled) positive findings with bovine derivative of PS (BC-PS). Eighteen healthy elderly volunteers meeting Age Associated Memory Impairment inclusion and exclusion criteria were treated for 12 weeks with plant-source derived phosphatidylserine (PS) (100 mg x 3/day p.o.) and evaluated at baseline, after 6 weeks of treatment and at the end of the trial. Fifteen concluded the study. All but two outcome measures elicited a significant drug over time effect. Post-hoc paired t-tests showed that the significant effect was attributable to an improvement from baseline to week 6 and that effect was maintained at week 12. These results are encouraging. However, they await double-blind controlled verification in a large sample before suggesting that this may be a viable approach to the treatment of age-related cognitive decline, without exposing the patients to possible hazards involved in the treatment with bovine derivative of PS (BC-PS).
5-HT(1A) Receptor function in normal subjects on clinical doses of fluoxetine: Blunted temperature and hormone responses to ipsapirone challenge
B. Lerer, Y. Gelfin, M. Gorfine, B. Allolio, K. P. Lesch and M. E. Newman (1999)
Neuropsychopharmacology. Vol. 20 (6) , pp. 628-639.
Abstract: Serotonergic receptors of the 5-HT(1A) subtype have been suggested to play a pivotal role in the mechanism of action of antidepressant drugs, including specific serotonin reuptake inhibitors (SSRIs). We examined the effect of clinical doses of the SSRI, fluoxetine, on 5-HT(1A) receptor function in 15 normal volunteers. Hypothermic and hormone responses to the 5-HT(1A) receptor agonist, ipsapirone (0.3 mg per kg, per os) were examined after two weeks of placebo and again, after the subjects had been receiving fluoxetine for four weeks. On fluoxetine, the hypothermic response to ipsapirone was significantly blunted, as were ACTH, cortisol and growth hormone release. Ipsapirone plasma levels were significantly increased by fluoxetine but a pharmacokinetic effect could not have accounted for the observed blunting of 5-HT(1A) receptor mediated effects. These findings confirm and extend previous observations in rodents and humans and indicate that both post-synaptic 5-HT(1A) receptors in the hypothalamus, which mediate hormone responses to 5-HT(1A) agonists, and pre-synaptic 5-HT(1A) receptors which (putatively) mediate the hypothermic response, are rendered subsensitive by chronic SSRI administration. Since fluoxetine did not have significant effects on mood and other psychological variables in these subjects, alterations in 5-HT(1A) receptor function induced by SSRIs may have psychotropic relevance only in the context of existing perturbations of serotonergic function which underlie the psychopathological states in which these drugs are therapeutically effective. Copyright (C) 1999 American College of Neuropsychopharmacology.
Social adjustment and self-esteem in remitted patients with unipolar and bipolar affective disorder: A case-control study
B. Shapira, J. Zislin, Y. Gelfin, Y. Osher, M. Gorfine, D. Souery, J. Mendlewicz and B. Lerer (1999)
Comprehensive Psychiatry. Vol. 40 (1) , pp. 24-30.
Abstract: To evaluate social adjustment and self-esteem in patients with unipolar (UP) and bipolar (BP) affective disorder and to examine demographic and clinical correlates of these variables, outpatients with UP and BP disorder in remission for at least 12 months were consecutively recruited and individually matched to control subjects with no personal or family history of psychiatric illness (UP-control matched pairs, n = 23; BP-control matched pairs, n = 27). Subjects completed the Rosenberg Self-Esteem scale (SES) and the self-report version of the Social Adjustment Scale (SAS). UP patients reported significantly worse overall social adjustment than their matched controls (P= .009), specifically in the area of social and leisure activities (P = .0003) and poorer self-esteem (P = .02). When separated by gender, only the female UP group manifested significant findings on the SAS. BP patients reported poorer self-esteem than their controls (P = .04), but were not significantly different on the SAS. Although the patients were not clinically depressed, a worse social adjustment was significantly associated with a higher score on the Hamilton Depression Scale (HAM-D) in both groups. In the UP group, this association was absent when the analysis was limited to patients receiving antidepressant pharmacotherapy. The findings indicate that (1) UP patients, particularly women, experience substantial difficulties in social adjustment, primarily in social and leisure activities, even during stable clinical remission, and (2) in both UP and BP patients, adjustment problems are related to depressive symptoms even though these are minimal in severity.
Effect of clinical doses of fluoxetine on psychological variables in healthy volunteers
Y. Gelfin, M. Gorfine and B. Lerer (1998)
American Journal of Psychiatry. Vol. 155 (2) , pp. 290-292.
Abstract: OBJECTIVE: The authors sought to determine the effect of clinically equivalent doses of fluoxetine on mood and other psychological variables in normal subjects. METHOD: Fifteen healthy volunteers received placebo for 2 weeks; fluoxetine, 10 mg/day, for 1 week; fluoxetine, 20 mg/day, for 5 weeks; and then an additional 2 weeks of placebo in the context of a single-blind study. The subjects were evaluated with a series of self- and observer-rated instruments. RESULTS: No significant effects attributable to fluoxetine were observed on any of the psychological variables examined. Minimal adverse effects were reported. CONCLUSIONS: Significant mood-elevating and other psychological effects of fluoxetine would appear to be induced only when symptomatic targets exist. (Am J Psychiatry 1998; 155:290–292)
Age-related changes in brain perfusion of normal subjects detected by 99mTc-HMPAO SPECT
Y. Krausz, O. Bonne, M. Gorfine, H. Karger, B. Lerer and R. Chisin (1998)
Neuroradiology. Vol. 40 (7) , pp. 428-434.
Abstract: Previous functional imaging data generally show impairment in global cerebral blood flow (CBF) with age. Conflicting data, however, concerning age-related changes in regional CBF (rCBF) have been reported. We examined the relative rCBF in a sample of healthy subjects of various ages, to define and localize any age-related CBF reduction. Twenty-seven healthy subjects (17 male, 10 female; mean age 49 +/- 15, range 26-71, median 47 years) were studied by 99mTc-HMPAO brain SPECT. The younger age group consisted of subjects below, the older group above 47 years of age, respectively. Analysis was performed by applying three preformed templates, each containing delineated regions of interest (ROIs), to three transaxial brain slices at approximately 4, 6, and 7 cm above the orbitomeatal line (OML). The average number of counts for each ROI was normalized to mean uptake of the cerebellum and of the whole brain slice. Globally, 99mTc-HMPAO uptake ratio normalized to cerebellum was significantly decreased in older subjects, affecting both hemispheres. A slight left-to-right asymmetry was observed in HMPAO uptake of the whole study group. It did not, however, change with age. Regionally, both cortical and subcortical structures of older subjects were involved: uptake ratio to cerebellum was significantly lower (after correction for multiple testing) in the left basal ganglia and in the left superior temporal, right frontal and bilateral occipital cortices at 4 cm above the OML. At 6 cm above the OML, reduced uptake ratios were identified in the left frontal and bilateral parietal areas. At 7 cm, reduced uptake was detected in the right frontal and left occipital cortices. Most of these differences were reduced when uptake was normalized to whole slice, whereas an increase in uptake ratios was observed in the cingulate cortex of the elderly. An inverse correlation between age and HMPAO uptake ratios normalized to cerebellum was observed in a number of brain regions. These findings suggest that advancing age has a differential effect on cerebral perfusion reflected in brain 99mTc-HMPAO uptake.
Schizophrenia, chronic hospitalization and the 5-HT2C receptor gene.
R. H. Segman, R. P. Ebstein, U. Heresco-Levy, M. Gorfine, M. Avnon, E. Gur, L. Nemanov and B. Lerer (1997)
Psychiatric genetics. Vol. 7 (2) , pp. 75-78.
Abstract: Frequency of a polymorphism in the coding region of the 5-hydroxytryptamine2C (5-HT2C) receptor gene (HTR2C Xq24) was not significantly different in 122 unrelated Israeli schizophrenia patients compared with 180 control subjects matched for gender and ethnicity. However, proportion of time spent in hospital since the first admission was significantly greater in patients hemi- or homozygous for the 5-HT2Cser allele than in patients carrying other genotypes (p = 0.006). The 5-HT2Cser genotype conferred a 3.3-fold increased risk for lifetime hospitalization exceeding 10 years. Genetically determined variation in the 5-HT2C receptor may influence the clinical course and phenotypic expression of schizophrenia.
Increased cerebral blood flow in depressed patients responding to electroconvulsive therapy.
O. Bonne, Y. Krausz, B. Shapira, M. Bocher, H. Karger, M. Gorfine, R. Chisin and B. Lerer (1996)
Journal of nuclear medicine : official publication, Society of Nuclear Medicine. Vol. 37 (7) , pp. 1075-1080.
Abstract: Considerable data support the existence of impaired regional cerebral blood flow (rCBF) in major depression. We compare rCBF in depressed patients before and after electroconvulsive therapy (ECT) to define whether the impairment is a "state"-related property or a trait phenomenon. METHODS: Twenty patients with a major depressive disorder were studied by 99mTc-HMPAO brain SPECT, 2-4 days before and 5-8 days after a course of ECT. Three transaxial brain slices delineating anatomically defined regions of interest at approximately 4, 6 and 7 cm above the orbitomeatal line were used, with the average number of counts for each region of interest normalized to the area of maximal cerebellar uptake. RESULTS: Technetium-99m-HMPAO uptake significantly increased in patients who responded to ECT but remained unchanged in patients who did not respond to the treatment (response defined as a reduction of at least 60% on the Hamilton Depression Rating Scale). An inverse correlation was observed between severity of depression and HMPAO uptake, and clinical improvement was positively correlated with the increase in tracer uptake. CONCLUSIONS: These findings imply that reduced rCBF in depression, as reflected in brain 99mTc-HMPAO uptake, is a "state"-related property and is reversible by successful treatment. Technetium-99m-HMPAO uptake may serve as an objective state marker for depression, as an indicator of the severity of depression and as an objective means of evaluating response to treatment.
Cerebral hypoperfusion in medication resistant, depressed patients assessed by Tc99m HMPAO SPECT
O. Bonne, Y. Krausz, M. Gorfine, H. Karger, Y. Gelfin, B. Shapira, R. Chisin and B. Lerer (1996)
Journal of Affective Disorders. Vol. 41 (3) , pp. 163-171.
Abstract: Functional imaging studies generally show decreased cerebral metabolism and perfusion in depressed patients relative to normal controls, although the location of the deficits varies. We used Tc99m HMPAO SPECT to compare cerebral blood flow in medication resistant, depressed patients and a normal control group. HMPAO uptake ratios (adjusted for age) were significantly lower in the depressed patients in the transaxial slices 4 cm and 6 cm above the orbitomeatal line (OML) on the left side. Examining individual regions of interest (corrected for age and multiple testing), we found significantly lower perfusion in the left superior temporal, right parietal and bilateral occipital regions in the patient group. These findings are in limited agreement with previous HMPAO SPECT studies. Methodological differences between studies, particularly variability in adjusting data for age, lead to a divergence in findings. Future research should seek to standardize protocols and data analysis in order to generate comparable results.
Interrelationship of age, depression, and central serotonergic function: evidence from fenfluramine challenge studies.
B. Lerer, D. Gillon, P. Lichtenberg, M. Gorfine, Y. Gelfin and B. Shapira (1996)
International psychogeriatrics / IPA. Vol. 8 (1) , pp. 83-102.
Abstract: The purpose of this study was to examine the relationship between age-associated changes in central serotonergic function and abnormalities associated with major depression. Under randomized double-blind conditions, prolactin and cortisol responses to the serotonin-releasing agent d,l-fenfluramine hydrochloride (60 mg orally) and placebo were examined in 30 normal subjects (15 men, 15 women; age range 21-84 years) and 39 patients with major depressive disorder, endogenous subtype (14 men, 25 women; age range 29-72 years). In the normal subjects, a significant Age x Challenge x Time interaction was observed in the prolactin response (p = .03). This was primarily due to the elevated prolactin responses of the younger healthy women. Peak minus baseline (delta) prolactin responses were negatively correlated with age (women, p = .004; men, p = .06). In the depressed patients there was no age-related decline in prolactin response to fenfluramine. When depressed and healthy younger subjects were compared, delta prolactin responses to fenfluramine were significantly blunted in young patients with depression (p = .003) irrespective of the significant effect of gender (p = .01), but not in older depressed patients. Cortisol responses to fenfluramine did not reveal consistent effects of age, gender, or diagnosis. Age-related decline in central serotonergic function may make older individuals more vulnerable to depression and possibly render depressive episodes more frequent, more severe, and less amenable to treatment.
Electroconvulsive therapy and resistant depression: clinical implications of seizure threshold
B. Shapira, D. Lidsky, M. Gorfine and B. Lerer (1996)
The Journal of clinical psychiatry. Vol. 57 (1) , pp. 32-38.
Abstract: BACKGROUND: Patients with major depressive disorder (MDD) were treated with electroconvulsive therapy (ECT) to determine (1) variability of initial seizure threshold, (2) factors that influence seizure threshold, (3) change in seizure threshold during the ECT course, and (4) relationship of seizure threshold to antidepressant effects. METHOD: Seizure threshold was measured by a stimulus titration technique during the first, eighth, and final ECT of medication-free patients who had MDD, endogenous subtype based on Research Diagnostic Criteria and were randomly assigned to three-times-weekly, bilateral, brief pulse ECT (N = 24) or twice-weekly ECT plus one simulated treatment per week (N = 23). Subsequent to the first ECT, stimulus intensity was 1.3 to 1.8 (median = 1.5) times threshold. The Hamilton Rating Scale for Depression (HAM-D) was the primary clinical outcome measure. RESULTS: Initial seizure threshold varied by 594%. Gender (p = .03), total strength of pre-ECT pharmacotherapy trials (p = .02), and age (p = .12) accounted for 23.9% of the variance. Threshold increased by 42% +/- 26% (p = .0001) from the first to the final ECT, and seizure duration decreased by 33% +/- 28% (p = .0001). Seizure duration and mean stimulus intensity were negatively associated over all treatments (r = -.49, p = .0003). Change in HAM-D score was related to duration of the current depressive episode (r = -.39, p = .006) and total strength of pre-ECT pharmacotherapy trials (r = -.39, p = .008), but not to seizure threshold or duration. CONCLUSION: (1) Initial seizure threshold for pulse bilateral ECT is highly variable and not yet amenable to accurate prediction. (2) Stimulus titration allows threshold to be determined on an individual basis and dosage for subsequent treatments to be defined. (3) Seizure duration is of limited value as a sole criterion for the adequacy of treatment when initial threshold is unknown and/or electrical doses that substantially exceed threshold are used. (4) With moderately suprathreshold bilateral ECT, a relationship of seizure threshold to antidepressant response is not demonstrable.
Complex effects of age and gender on hypothermic, adrenocorticotrophic hormone and cortisol responses to ipsapirone challenge in normal subjects
Y. Gelfin, B. Lerer, K. P. Lesch, M. Gorfine and B. Allolio (1995)
Psychopharmacology Berl. Vol. 120 (3) , pp. 356-364.
Abstract: The effects of a challenge dose of the 5-HT1A agonist, ipsapirone (0.3 mg per kg body weight), or placebo on body temperature and on adrenocorticotrophic hormone (ACTH) and cortisol release, were examined in 30 normal subjects (14 males, 19-74 years and 16 females, 22-69 years) using a randomized, double blind design. Irrespective of age or gender, ipsapirone induced a significant reduction in body temperature relative to placebo and a significant increase in ACTH and cortisol release. Maximal temperature reduction by ipsapirone was significantly blunted in older subjects and was inversely related to age. There was no gender difference in the hypothermic response to ipsapirone. ACTH and cortisol responses showed an opposite impact of aging in males and females. Whereas both responses diminished with age in male subjects, they increased with age in females. The cortisol response of older females was significantly larger than that of all the other subjects. Adverse effects of ipsapirone were also more marked in elderly females and were correlated with ACTH and cortisol responses. These findings should be taken into consideration in the use of ipsapirone and other 5-HT1A agonists as challenge procedures for studying central serotonergic function in depression and other disorders. Careful matching of control and experimental subjects is indicated so as to avoid spurious results which reflect the effects of age and gender rather than the pathophysiology of the disorders being investigated.
Onset and time course of antidepressant action: psychopharmacological implications of a controlled trial of electroconvulsive therapy
R. H. Segman, M. Gorfine, B. Lerer and B. Shapira (1995)
Psychopharmacology. Vol. 119 (4) , pp. 440-448.
Abstract: Onset and time course of antidepressant effect were examined in 47 patients with major depressive disorder who had been randomly assigned to twice weekly bilateral, brief pulse electroconvulsive therapy plus one simulated treatment per week (ECTx2) or to a three times weekly schedule of administration (ECTx3). Rapid improvement was observed in the ECTx3 group in whom the number of real ECTs to 30% reduction on the Hamilton Depression Scale (HAM-D) was 3.2 +/- 1.90, administered over 7.3 +/- 4.43 days and to 60% reduction, 5.9 +/- 3.09 real ECTs over 13.7 +/- 7.21 days. Among the responders in both groups combined, 24.3 +/- 29.58% of the overall improvement in HAM-D was contributed by the first real ECT, 60.9 +/- 28.13% by the first four real ECTs and 91.6 +/- 25.82% by the first eight. Although 85.3% of the responders had reached 60% HAM-D improvement after eight ECTs, a clinically significant minority (14.7%) responded later in the course (ECT 9-12). However, response was predictable on the basis of symptomatic improvement (30% on the HAM-D) by the sixth real ECT. Thirty-three out of 34 responders would have been correctly identified by this criterion and only 2 out of 13 non-responders mis-identified (P < 0.000001). Once achieved, the antidepressant effect was stable, without continuation pharmacotherapy, until 1 week after the last treatment and on lithium carbonate (Li) or Li plus clomipramine for a further 3 weeks. These findings confirm the clinical impression that ECT is a rapidly effective treatment for major depression with a shorter latency than generally reported for antidepressant drugs.
A prospective study of lithium continuation therapy in depressed patients who have responded to electroconvulsive therapy.
B. Shapira, M. Gorfine and B. Lerer (1995)
Convulsive therapy. Vol. 11 (2) , pp. 80-85.
Abstract: Twenty-eight of 34 patients with major depression who completed a course of electroconvulsive therapy (ECT) and were classified as responders were administered lithium carbonate (Li) continuation therapy in the context of an open, prospective study. Twenty-four patients were followed for 6 months or until relapse; four patients dropped out of follow-up while still in remission. The probability of completing 6 months without relapse (by survival analysis, including the patients who dropped out as censored observations) was 65%. The eight patients who relapsed into depression all did so within 13 weeks. They were characterized by a shorter duration of their index depressive episode, a greater likelihood of having suffered an additional depressive episode in the preceding 12 months, and failure of an adequate trial of antidepressant medication before the ECT course. Novel pharmacological strategies may be needed in the post-ECT continuation therapy of patients who have a prior history of relapse and are demonstrably resistant to antidepressant medication.

Working Papers:

Extending mendelian risk prediction models to handle misreported family history
D. Braun, M. Gorfine, H.A. Katki, A. Ziogas, H. Anton-Culver and G. Parmigiani
Abstract: Mendelian risk prediction models calculate the probability of a proband being a mutation carrier based on family history and known mutation prevalence and penetrance. Family history in this setting is self-reported and is often reported with error. Various studies in the literature have evaluated misreporting of family history. Using a validation data set which includes both error-prone self-reported family history and error-free validated family history, we propose a method to adjust for misreporting of family history. We estimate the measurement error process in a validation data set (from University of California at Irvine (UCI)) using nonparametric smoothed Kaplan-Meier estimators, and use Monte Carlo integration to implement the adjustment. In this paper, we extend BRCAPRO, a Mendelian risk prediction model for breast and ovarian cancers, to adjust for misreporting in family history. We apply the extended model to data from the Cancer Genetics Network (CGN).
Adjustment for mismeasured exposure using validation data and propensity scores
D. Braun, M. Gorfine, C. Zigler, F. Dominici and G. Parmigiani
Abstract: Propensity score methods are widely used to analyze observational studies in which patient characteristics might not be balanced by treatment group. These methods assume that exposure, or treatment assignment, is error-free, but in reality these variables can be subject to measurement error. This arises in the context of comparative effectiveness research, using observational administrative claims data in which accurate procedural codes are not always available. When using propensity score based methods, this error affects both the exposure variable directly, as well as the propensity score. We propose a two step maximum likelihood approach using validation data to adjust for the measurement error. First, we use a likelihood approach to estimate an adjusted propensity score. Using the adjusted propensity score, we then use a likelihood approach on the outcome model to adjust for measurement error in the exposure variable directly. In addition, we show the bias introduced when using error-prone treatment in the inverse probability weighting (IPW) estimator and propose an approach to eliminate this bias. Simulations show our proposed approaches reduce the bias and mean squared error (MSE) of the treatment effect estimator compared to using the error-prone treatment assignment.