What is the GenomeIndia Project?
GenomeIndia (GI) is a pan-India initiative funded by the Department of Biotechnology (DBT), Ministry of Science & Technology, Government of India, aiming to create a comprehensive catalogue of genetic variations across India’s diverse population.
When was the GenomeIndia Project conceptualised and when was it launched?
The project was conceptualised in late 2017, followed by two years of intensive brainstorming and preparation by the GenomeIndia consortium under the leadership of Prof. Vijayalakshmi Ravindranath. It was officially sanctioned in late 2019 and launched in January 2020.
What is the constitution of the GenomeIndia Consortium?
The GI Consortium comprises institutions with the following responsibilities:
For details on the consortium members, see https://genomeindia.in/institute.php
Who coordinates the GenomeIndia project?
The GI project is coordinated by the Centre for Brain Research (CBR), Bengaluru (https://cbr-iisc.ac.in/). Prof. Vijayalakshmi Ravindranath of CBR served as the founding national coordinator of the GI project until 2022. Since then, the project has been jointly coordinated by Prof. Y. Narahari, IISc, Bengaluru and Prof. K. Thangaraj, CSIR-CCMB, Hyderabad. Dr. Suchita Ninawe, Senior Adviser and Scientist-H, DBT, is the Scientific Coordinator. Dr. Richi Mahajan, Scientist-D, DBT, is the Administrative Scientist. Details of the scientists associated with the GI project are available at https://genomeindia.in/people.php.
What are the goals of the GenomeIndia Project?
How many samples have been collected and sequenced?
A total of 20,195 samples have been collected and archived in the Biobank at CBR, Bengaluru. All samples come from individuals who self-declared as healthy. So far, 10,074 samples have been sequenced and 13,242 samples have undergone GWAS (genome-wide association studies).
How many populations/ethnic groups are covered by the GenomeIndia Project?
Samples span 83 distinct Indian populations (also known as ethnic groups), ensuring a balanced representation of anthropological, sociocultural, and ethnolinguistic diversity. In particular, the study covers Tibeto-Burman, Indo-European, Dravidian, and Austro-Asiatic speakers, encompassing both tribal and non-tribal groups.
Where is the genomic data stored?
Sequencing data are archived at the Indian Biological Data Centre (IBDC), Faridabad. The following data are archived: (a) FASTQ data for 9,772 samples, (b) gVCF files for the same samples, (c) joint call files, and phenotype data for about 9,000 samples.
What guidelines govern data sharing?
The project follows the Biotech-PRIDE Guidelines (2021) and the FeED Protocol (Framework for Exchange of Data) (2025) for responsible, equitable data access. These documents can be accessed from Biotech-PRIDE Guidelines (2021) and FeED Protocol (2025), respectively.
Why is India important for global genomics?
India contains one of the largest and most diverse human populations in the world, shaped by thousands of years of migration, cultural diversity, and endogamy (marriage within communities).
However, Indian populations have historically been under-represented in global genomic databases, which are dominated by European samples. This gap limits the accuracy of genetic studies and medical predictions for people from South Asia.
GenomeIndia helps address this imbalance by providing a large, population-aware genomic dataset from India.
What did the study discover?
The project identified around 130 million genetic variants, including over 44 million previously unknown variants that were not present in global databases.
The study also found:
These findings provide a more detailed picture of how demographic history has shaped genetic diversity in India.
Why do many populations show strong genetic signatures?
India has a long history of social stratification, linguistic diversity, and community-based marriage practices. Over time, these practices have created genetically distinct populations.
Some groups have experienced founder effects, where a small ancestral population gave rise to many descendants. This can sometimes increase the frequency of certain inherited diseases.
Understanding these patterns helps researchers identify population-specific disease risks.
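The link between founder effects and inherited disease can be illustrated with a small calculation. The sketch below assumes a recessive condition and Hardy-Weinberg proportions within each group, which is a textbook simplification and not an analysis from the GenomeIndia study. The allele frequencies are hypothetical values chosen only for illustration.

```python
def recessive_incidence(q: float) -> float:
    """Expected fraction of people carrying two copies of the allele (q squared)."""
    return q * q


def carrier_frequency(q: float) -> float:
    """Expected fraction of people carrying exactly one copy (2q(1 - q))."""
    return 2 * q * (1 - q)


# Hypothetical allele frequencies, not taken from GenomeIndia data.
Q_GENERAL = 0.001  # allele is rare in a large, mixed population
Q_FOUNDER = 0.02   # same allele after drifting upward in a founder group

fold_increase = recessive_incidence(Q_FOUNDER) / recessive_incidence(Q_GENERAL)

print(f"Carriers in general population: {carrier_frequency(Q_GENERAL):.4f}")
print(f"Carriers in founder group:      {carrier_frequency(Q_FOUNDER):.4f}")
print(f"Recessive condition is about {fold_increase:.0f}x more frequent in the founder group")
```

Because incidence scales with the square of the allele frequency, a 20-fold rise in frequency translates into roughly a 400-fold rise in the expected number of affected individuals under these assumptions.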
How could this research improve healthcare in India?
The GenomeIndia dataset provides several tools that could improve health research and clinical care:
Better disease gene discovery
Improved drug response prediction
Population-specific diagnostics
Better risk prediction models
This dataset will help develop more accurate medical tools tailored for Indian populations.
Why do European genetic risk scores not work well for Indians?
Many genetic risk prediction tools were developed using data from European populations. Because genetic variation differs across populations, these models often perform poorly in people from South Asia.
GenomeIndia demonstrates this limitation and highlights the need for population-specific genomic resources.
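One reason such scores transfer poorly can be shown with a toy example. The sketch below uses a generic additive risk score (a weighted sum of risk-allele counts), which is a common textbook form and not necessarily the model evaluated by GenomeIndia. The variant names, effect sizes, and allele frequencies are all hypothetical, and the example covers only allele-frequency differences, not other factors that also affect transferability.

```python
from typing import Dict


def polygenic_score(effect_sizes: Dict[str, float], dosages: Dict[str, int]) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2) over the variants in the model."""
    return sum(beta * dosages.get(variant, 0) for variant, beta in effect_sizes.items())


def expected_score(effect_sizes: Dict[str, float], allele_freqs: Dict[str, float]) -> float:
    """Population mean score: each person carries 2 * p risk alleles on average."""
    return sum(beta * 2 * allele_freqs.get(variant, 0.0) for variant, beta in effect_sizes.items())


# Hypothetical effect sizes, imagined as estimated in a European cohort.
EFFECTS = {"var_a": 0.30, "var_b": 0.20, "var_c": 0.10}

# Hypothetical risk-allele frequencies in two populations.
FREQ_POP_1 = {"var_a": 0.40, "var_b": 0.25, "var_c": 0.10}
FREQ_POP_2 = {"var_a": 0.10, "var_b": 0.25, "var_c": 0.50}

person = {"var_a": 1, "var_b": 0, "var_c": 2}  # hypothetical individual

print(f"Individual score:        {polygenic_score(EFFECTS, person):.2f}")
print(f"Mean score, population 1: {expected_score(EFFECTS, FREQ_POP_1):.2f}")
print(f"Mean score, population 2: {expected_score(EFFECTS, FREQ_POP_2):.2f}")
```

The same individual score sits at a different position relative to each population's average, so a "high risk" cut-off calibrated in one population does not carry the same meaning in another.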
Will this lead to new treatments?
The project itself does not directly develop treatments. However, it provides a foundational dataset that researchers can use to:
In the long term, such data can support drug discovery and targeted therapies.
Does this research change our understanding of Indian populations?
The study confirms and refines earlier insights from anthropology and population genetics. It shows how migration, geography, language, and social structure have shaped India's genetic landscape.
Importantly, the research does not support simplistic notions of biological divisions between communities. Instead, it shows that most populations share substantial ancestry and are connected through historical admixture.
Could this research inadvertently stigmatize communities?
No. The project was designed carefully to avoid any possibility of stigmatisation. Key principles include:
The goal is to improve scientific understanding and healthcare—not to label or rank populations.
What scientific assets has the GenomeIndia project created?
The project has built several national research assets:
These resources will support many future studies beyond this project.
How does GenomeIndia compare with other national genomics initiatives?
Many countries have launched large-scale genomic projects, such as:
GenomeIndia differs in several ways:
Focus on population diversity
Representation of small and isolated groups
Integration with social and linguistic context
This makes GenomeIndia one of the most detailed population genomics studies of a single country.
What comes next for GenomeIndia?
The current dataset represents an initial phase. Future work may include:
Will the data be available to researchers?
Yes. The project aims to make data available to researchers through controlled-access frameworks that protect participant privacy.
Such datasets can accelerate research in:
How will ordinary people benefit from this project?
The benefits will emerge gradually through research and healthcare applications:
Ultimately, the project helps ensure that advances in precision medicine benefit people in India as well.
Why is representation in genomic research important?
If certain populations are missing from genomic datasets:
By improving representation, GenomeIndia helps ensure that global genomic medicine becomes more equitable.
Where are the key findings of the GenomeIndia Project published?
A marker paper on the GenomeIndia project was published in Nature Genetics: Mapping genetic diversity with the GenomeIndia project. Nature Genetics, Volume 57, April 2025, pages 767-773. Springer.
A detailed manuscript on the findings of the GenomeIndia project is available on medRxiv: An Atlas of Indian Genetic Diversity. 20 March 2026.
Where do I get more information about the GenomeIndia Project?
The GenomeIndia Digest (published in February 2024) is a coffee table book that provides a rich source of information on all aspects of the GI project. It is available at https://dbtindia.gov.in/ebook/feed-protocols-genomeindia-digest
A 7-minute documentary video captures the essential details of the GenomeIndia project.
On February 27, 2024, the Department of Biotechnology, Ministry of Science & Technology, organized an event in New Delhi to mark the completion of sequencing of 10,000 genomes. The meeting was addressed by Dr. Jitendra Singh-Ji, the Hon’ble Union Minister of State of Science & Technology (IC). A video recording of the event is available on YouTube.
On January 9, 2025, Hon’ble Prime Minister Shri Narendra Modi-Ji addressed the delegates of the “Genomics Data Conclave and Release of GenomeIndia Data”, organized at Vigyan Bhawan, New Delhi, by the Department of Biotechnology, Ministry of Science & Technology. Dr. Jitendra Singh-Ji, the Hon’ble Union Minister of State of Science & Technology (IC), also addressed the delegates. During the event, the IBDC portal was launched to enable access to the GenomeIndia data. A video recording of the event is available on YouTube.
The GenomeIndia website provides up-to-date information on the project and can be accessed at https://genomeindia.in/index.php