GRAND RAPIDS, Mich. (Jan. 2, 2025) - A multi-institutional team of scientists has developed a free, publicly accessible resource to aid in the classification of patient tumor samples based on distinct molecular features identified by The Cancer Genome Atlas (TCGA).

The resource comprises classifier models that can accelerate the design of cancer subtype-specific test kits for use in clinical trials and cancer diagnosis. This is an important advance because tumors belonging to different subtypes may vary in their response to cancer therapies.  

The resource is the first of its kind to bridge the gap between TCGA’s immense data library and clinical implementation.

A paper describing the tools was published online today in Cancer Cell.

“TCGA defined molecular subtypes for each major type of cancer. With this resource, we aimed to provide the clinical and scientific communities with the tools to assign a newly diagnosed tumor to one of these established subtypes,” said Peter W. Laird, the Peter and Emajean Cook Endowed Chair in Epigenetics at Van Andel Institute and the study’s lead corresponding author. “Our new resource will be a powerful asset for creating clinical assays based on the diverse molecular variations between cancers.”

TCGA was a decade-long, National Cancer Institute-led effort to create detailed molecular maps of 33 cancer types. Unlike traditional approaches that define cancers based on the organ or tissue in which they arise, TCGA identified nuanced genomic, epigenomic, proteomic and transcriptomic characteristics that more precisely describe cancer subtypes.

Cherniack and Ellrott are also corresponding authors of the paper, which represents a collaborative effort among scientists from more than a dozen research organizations.

“Since many TCGA molecular subtypes were generated using hundreds or thousands of features from multiple data types, scientists and physicians have asked us for help subtyping their samples,” Cherniack said. “Our resource greatly simplifies this process.”

The team created the new resource by leveraging data from 8,791 TCGA cancer samples that represented 26 cancer cohorts and 106 cancer subtypes. They then used existing machine learning tools to develop and test nearly half a million models across six categories — gene expression, DNA methylation, miRNA, copy number, mutation calls and multi-omics — and selected those that performed best for inclusion in the online resource.   

In total, the resource contains 737 ready-to-use models, which represent the top models from each of the 26 cancer cohorts, the five training algorithms and six data types.  

“A major element of this effort was working to ensure that these models could be deployed by other groups onto new datasets,” Ellrott said. “All too often this type of work is difficult to replicate or apply to new samples.”

The resource may be accessed at .

Co-first authors of the study include Christopher K. Wong of University of California, Santa Cruz, Christina Yau of University of California, San Francisco, and Buck Institute for Research on Aging, Mauro A. A. Castro of the Federal University of Paraná, Jordan E. Lee of Oregon Health and Science University, Brian J. Karlberg of Oregon Health and Science University, Jasleen K. Grewal of BC Cancer, Vincenzo Lagani of JADBio Gnosis DA and Ilia State University, and Bahar Tercan of the Institute for Systems Biology.

Other authors include Verena Friedl, Vladislav Uzunangelov and Joshua M. Stewart of University of California, Santa Cruz; Toshinori Hinoue of Van Andel Institute; Lindsay Westlake and Xavier Loinaz of the Broad Institute of MIT and Harvard; Ina Felau, Peggy I. Wang, Anab Kemal, Samantha J. Cesar-Johnson and Jean C. Zenklusen of the National Cancer Institute; Ilya Shmulevich of the Institute for Systems Biology; Alexander J. Lazar of the University of Texas MD Anderson Cancer Center; Ioannis Tsamardinos of JADBio Gnosis DA and University of Crete; Katherine A. Hoadley of Lineberger Comprehensive Cancer Center at University of North Carolina at Chapel Hill; The Cancer Genome Atlas Analysis Network; A. Gordon Robertson of BC Cancer; Theo A. Knijnenburg of the Institute for Systems Biology; and Christopher C. Benz of Buck Institute for Research on Aging.

Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under award nos. U24CA264029 (Cherniack), U24CA264023 (Laird), U24CA264007 (Ellrott), U24CA264021 (Hoadley) and U24CA264009 (Stewart and Benz). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other funders.

###

ABOUT VAN ANDEL INSTITUTE
Van Andel Institute (VAI) is committed to improving the health and enhancing the lives of current and future generations through cutting-edge biomedical research and innovative educational offerings. Established in Grand Rapids, Michigan, in 1996 by the Van Andel family, VAI is now home to more than 500 scientists, educators and support staff, who work with a growing number of national and international collaborators to foster discovery. The Institute’s scientists study the origins of cancer, Parkinson’s and other diseases and translate their findings into breakthrough prevention and treatment strategies. Our educators develop inquiry-based approaches for K–12 education to help students and teachers prepare the next generation of problem-solvers, while our Graduate School offers a rigorous, research-intensive Ph.D. program in molecular and cellular biology. Learn more at .

 

New Resource Available to Help Scientists Better Classify Cancer Subtypes

Credit: Van Andel Institute

Caption: Dr. Peter W. Laird

CITATIONS

Cancer Cell