
Inocras Unveils Cancer Foundation Model Trained on Thousands of Whole Genomes

New AI model introduces learnable tokenization and genome-wide feature aggregation, advancing the use of whole genome data in precision oncology.

Inocras, a bioinformatics-led company harnessing the power of whole genome data and proprietary analytics to deliver curated insights that advance precision health, today announced a groundbreaking cancer foundation model trained on 2,882 whole genomes across diverse cancer types, marking a major advance in applying AI to precision oncology. Developed in collaboration with the Korea Advanced Institute of Science and Technology (KAIST), the model couples a novel learnable tokenization method with a cancer-aware embedding framework that integrates diverse genomic features into patient-level representations, capturing the molecular patterns that drive tumor biology and clinical outcomes.

At the core of this effort is DNAChunker, a proprietary model that tokenizes genomic sequences adaptively through an H-Net–based hierarchical dynamic chunking architecture. Unlike conventional methods that process DNA in fixed-size units, DNAChunker allocates finer-grained tokens to regions of high biological signal and compresses low-information regions into fewer tokens, improving both precision and computational efficiency.
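To make the contrast with fixed-unit tokenization concrete, the following is a minimal toy sketch, not DNAChunker's actual implementation, which the announcement does not describe in detail. It assumes each base already has a vector representation (hand-written here; learned in a real model) and starts a new chunk wherever adjacent vectors are dissimilar. The function names, the toy vectors, and the 0.5 threshold are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fixed_chunks(seq, k):
    """Conventional tokenization: cut the sequence into fixed k-base units."""
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def dynamic_chunks(seq, vectors, threshold=0.5):
    """Toy content-dependent chunking (hypothetical, for illustration only).

    A boundary score p = 0.5 * (1 - cosine(current, previous)) is computed
    for each position; a new chunk starts wherever p >= threshold. Runs of
    similar positions therefore collapse into one chunk, while positions
    that keep changing are split finely.
    """
    if len(seq) != len(vectors):
        raise ValueError("need exactly one vector per base")
    if not seq:
        return []
    chunks, start = [], 0
    for i in range(1, len(seq)):
        p = 0.5 * (1.0 - cosine(vectors[i], vectors[i - 1]))
        if p >= threshold:
            chunks.append(seq[start:i])
            start = i
    chunks.append(seq[start:])
    return chunks


# Hand-written stand-ins for learned per-base representations (assumption).
TOY_VECTORS = {"A": [1.0, 0.0], "T": [-1.0, 0.0], "G": [0.0, 1.0], "C": [0.0, -1.0]}

sequence = "AAAAAAGCGT"
vectors = [TOY_VECTORS[base] for base in sequence]

fixed = fixed_chunks(sequence, 3)
adaptive = dynamic_chunks(sequence, vectors)
```

In this toy input the six-base run of A collapses into a single chunk while the varied tail is split base by base, whereas the fixed scheme cuts every three bases regardless of content.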

The result is a biologically informed model that achieves higher accuracy with fewer parameters and lower computational cost. In benchmark evaluations across genomic representation learning and functional prediction tasks, it achieved state-of-the-art performance, outperforming leading DNA foundation models such as Nucleotide Transformer and DNABERT-2. When tested for clinical relevance by aggregating mutation-level embeddings generated by DNAChunker across nearly 3,000 cancer genomes, the model reached 98% accuracy for homologous recombination deficiency (HRD) prediction and 84% accuracy for PAM50 subtype classification on independent datasets, using DNA data alone. This establishes one of the first direct links between a DNA-based foundation model and clinically relevant cancer biology.

“This represents a pivotal leap toward clinically interpretable, AI-native cancer genomics,” said Jehee Suh, CEO of Inocras. “We have moved from sequencing genomes to understanding them. That’s the inflection point AI brings to oncology. This is where data becomes diagnosis, and where genome insights start to truly shape patient care.”

Inocras’s research will be featured next week at the EMBL Cancer Genomics Conference in Heidelberg, Germany. Young Seok Ju, Ph.D., Co-founder of Inocras and one of this year’s organizers, will present “A Cancer Foundation Model from 1,364 Breast Cancer Whole Genomes for Patient Stratification” on November 11 at 16:15 CET, highlighting findings from a 1,364-patient cohort and how the DNAChunker-powered Genomic Foundation Model enables AI-driven patient stratification and molecular subtype discovery. The full conference program is available here.

The Inocras cancer foundation model marks a major step toward embedding-based patient stratification from genomic data, strengthening the company’s foundation for next-generation precision diagnostics.

About Inocras

Inocras is a bioinformatics-led company redefining precision health through whole genome data and proprietary analytics. Our oncology and rare disease platforms integrate comprehensive whole genome data with advanced automation to deliver curated, actionable insights at scale, accelerating discovery and diagnostics to improve patient care. Inocras operates a CLIA/CAP-certified laboratory and partners with leading hospitals, pharmaceutical companies, and research institutions worldwide. For more information, please visit inocras.com and follow the Inocras LinkedIn page.
