Microhaplotype Database (MHBase)

To support the systematic selection of the best microhaplotypes (MHs) for different purposes, our team established the microhaplotype database (MHBase). First, we screened the 26 global populations of the 1000 Genomes Project (Phase 3) dataset (GRCh37 reference genome assembly) for all MHs containing two or more SNPs and/or InDels separated by less than 50 bp. We then estimated the allele frequencies of all MHs in each population using the PHASE software and calculated the effective number of alleles (Ae), polymorphism information content (PIC), and observed heterozygosity (Het).
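As an illustration of the three statistics named above, the following minimal Python sketch computes Ae, PIC, and observed heterozygosity from phased haplotype pairs. It is not the MHBase implementation: the function name `mh_summary` and the input layout (one pair of haplotype strings per individual) are assumptions made for this example, and the formulas are the standard textbook definitions (Ae = 1 / Σp², PIC as defined by Botstein et al.).

```python
from collections import Counter


def mh_summary(haplotype_pairs):
    """Summarize one MH locus in one population.

    haplotype_pairs: list of (hap_a, hap_b) tuples, one per individual
    (hypothetical input layout, e.g. PHASE output parsed by the caller).
    """
    n_individuals = len(haplotype_pairs)

    # Haplotype (allele) frequencies, counting both haplotypes per individual.
    counts = Counter(h for pair in haplotype_pairs for h in pair)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]

    sum_sq = sum(p * p for p in freqs)

    # Effective number of alleles: Ae = 1 / sum(p_i^2).
    ae = 1.0 / sum_sq

    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = 1.0 - sum_sq - cross

    # Observed heterozygosity: share of individuals with two different haplotypes.
    het = sum(a != b for a, b in haplotype_pairs) / n_individuals

    return {"Ae": ae, "PIC": pic, "Het": het}
```

For example, `mh_summary([("AC", "AC"), ("AC", "GT"), ("GT", "GT"), ("AC", "GT")])` describes a locus with two equally frequent haplotypes, half of whose carriers are heterozygous.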

The MHBase website provides a user-friendly interface that lets researchers quickly retrieve MHs from MHBase, or variant (SNP or InDel) information from multiple public resources, according to their search conditions. It also integrates the PHASE software to enable sufficiently accurate reconstruction of the haplotypes observed in natural populations, using either the 1000 Genomes (Phase 3) data or population data uploaded by researchers. A further bioinformatics tool calculates ancestry-informative statistics for MH markers. The main purpose of the website is to provide an easy-to-use web-based tool with which researchers can access the original MH pools and select markers appropriate to their research, in forensic or medical applications such as individualization, biogeographic ancestry inference, mixture deconvolution, kinship analysis, and non-invasive parental testing.

Before searching, users can learn about the MHBase website and its functions through the website tutorials.

Analysis Modules

MHBase
Users can search and query the MH database according to the input retrieval conditions.
Published MHs
Users can search the information of published MHs according to the input retrieval conditions.
PHASE calculation
Users can run the PHASE calculation on 1000 Genomes Project (Phase 3) data or user-defined input data.
AIM calculation
Users can calculate Fst/Gst and In of MHs among populations based on 1000 Genomes Project (Phase 3) data.

Publications

Please cite the most recent paper:
Xue, J., Tan, M., Qu, S. et al. Genome-wide microhaplotype database construction and preliminary research in forensic investigation. Unpublished
Xue, J., Qu, S., Tan, M. et al. An overview of SNP-SNP microhaplotypes in the 26 populations of the 1000 Genomes Project. Int J Legal Med 136, 1211–1226 (2022). https://doi.org/10.1007/s00414-022-02820-2

Others:
[1] Zhang R., Xue J., Tan M. et al. An MPS-Based 50plex Microhaplotype Assay for Forensic DNA Analysis. Genes 14(4) (2023) 865. https://doi.org/10.3390/genes14040865
[2] Tan M., Xue J., Zhang R. et al. An NGS-based microhaplotype system with high polymorphism for forensic DNA mixtures analysis. Forensic Science International: Genetics Supplement Series (2022). https://doi.org/10.1016/j.fsigss.2022.10.079
[3] Chen P., Zhu W., Tong F. et al. Identifying novel microhaplotypes for ancestry inference (2018).
[4] Zhu J., Chen P., Qu S. et al. Genotyping microhaplotype markers through massively parallel sequencing. Forensic Science International: Genetics Supplement Series 6 (2017) e314-e316. https://doi.org/10.1016/j.fsigss.2017.09.128
[5] Zhu J., Zhou N., Jiang Y. et al. FLfinder: A novel software for the microhaplotype marker. Forensic Science International: Genetics Supplement Series 5 (2015) e622-e624. https://doi.org/10.1016/j.fsigss.2015.10.002

Copyright © 2023 Lin Zhang & Weibo Liang Lab. All rights reserved.


Result


Note. The results include all MHs containing two or more SNPs and/or InDels separated by less than 50 bp in the 26 global populations of the 1000 Genomes Project (Phase 3).
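The screening criterion in the note can be sketched as a simple grouping of variant positions. This is an illustrative reading, not the MHBase pipeline: the function name `cluster_variants` is hypothetical, and the sketch assumes that "separation distance" means the distance between adjacent variants on the same chromosome (an alternative reading would bound the total span of the cluster instead).

```python
def cluster_variants(positions, max_gap=50):
    """Group variant positions on one chromosome into candidate MH clusters.

    Assumption: adjacent variants belong to the same cluster when they are
    less than max_gap bp apart. Clusters with fewer than two variants are
    dropped, since an MH needs two or more SNPs and/or InDels.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        # A gap of max_gap or more closes the current cluster.
        if current and pos - current[-1] >= max_gap:
            if len(current) >= 2:
                clusters.append(current)
            current = []
        current.append(pos)
    # Flush the final cluster.
    if len(current) >= 2:
        clusters.append(current)
    return clusters
```

With positions `[100, 120, 160, 300, 500, 549]`, the variants at 100, 120, and 160 form one candidate, 300 is left out as a singleton, and 500 and 549 form a second candidate.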

Result
Go to the NCBI Variation Viewer.


Note. The output table also shows the allele frequencies of variants within 200 bp upstream and downstream of the input variants.


FASTA reference sequence (Legend)


Note. The reference sequence (GRCh37) covering all of the variants listed is presented above; each “R” in red marks a variant.




Result
Go to ALFRED.

Result




Basic Statistics
Note. He: Expected Heterozygosity; PIC: Polymorphism Information Content; Ae: Effective number of alleles; DP: Discrimination Power; PE: Power of Exclusion.
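The forensic parameters in the note can be illustrated with a short Python sketch. The exact formulas used by the website are not stated, so this example assumes Hardy-Weinberg genotype proportions, the common definition DP = 1 - Σ(genotype frequency)², and the Brenner and Morris form PE = h²(1 - 2hH²), with h the heterozygosity and H the homozygosity. The function name `forensic_stats` is hypothetical.

```python
def forensic_stats(freqs):
    """Compute He, DP, and PE from haplotype frequencies that sum to 1.

    Assumptions (not confirmed by the source): Hardy-Weinberg genotype
    proportions; DP = 1 - sum(G_k^2); PE = h^2 * (1 - 2 * h * H^2).
    """
    n = len(freqs)

    # Expected homozygosity and heterozygosity.
    homo = sum(p * p for p in freqs)
    he = 1.0 - homo

    # Expected genotype frequencies: homozygotes, then heterozygotes.
    genotypes = [p * p for p in freqs]
    genotypes += [
        2 * freqs[i] * freqs[j] for i in range(n) for j in range(i + 1, n)
    ]

    # Discrimination power: chance that two random individuals differ.
    dp = 1.0 - sum(g * g for g in genotypes)

    # Power of exclusion, using expected heterozygosity for h.
    pe = he ** 2 * (1.0 - 2.0 * he * homo ** 2)

    return {"He": he, "DP": dp, "PE": pe}
```

For two equally frequent haplotypes, `forensic_stats([0.5, 0.5])` gives He = 0.5, DP = 0.625, and PE = 0.1875.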



Result



Chosen populations


Note. In: Rosenberg’s informativeness for assignment; Fst: Fixation index; Gst: Gene differentiation coefficient.
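Two of the statistics in the note can be sketched in Python from per-population haplotype frequencies. This is an illustration under stated assumptions rather than the website's code: populations are weighted equally, no sample-size correction is applied, In follows Rosenberg's definition with natural logarithms, and Gst follows Nei's (Ht - Hs) / Ht. The function name `informativeness_and_gst` and the input layout are hypothetical.

```python
import math


def informativeness_and_gst(pop_freqs):
    """Compute In and Gst for one MH locus.

    pop_freqs: list of per-population frequency lists, all in the same
    haplotype order and each summing to 1 (hypothetical input layout).
    """
    k = len(pop_freqs)
    n_haps = len(pop_freqs[0])

    # Mean frequency of each haplotype across populations (equal weights).
    mean = [sum(pop[j] for pop in pop_freqs) / k for j in range(n_haps)]

    def plogp(p):
        # Convention: 0 * log(0) = 0.
        return p * math.log(p) if p > 0 else 0.0

    # In = sum_j ( -pbar_j log pbar_j + (1/K) sum_i p_ij log p_ij ).
    i_n = sum(
        -plogp(mean[j]) + sum(plogp(pop[j]) for pop in pop_freqs) / k
        for j in range(n_haps)
    )

    # Gst = (Ht - Hs) / Ht, with Hs the mean within-population heterozygosity
    # and Ht the heterozygosity of the pooled (mean) frequencies.
    hs = sum(1.0 - sum(p * p for p in pop) for pop in pop_freqs) / k
    ht = 1.0 - sum(p * p for p in mean)
    gst = (ht - hs) / ht if ht > 0 else 0.0

    return {"In": i_n, "Gst": gst}
```

Two populations fixed for different haplotypes, `[[1.0, 0.0], [0.0, 1.0]]`, give the maximum for two populations (In = ln 2, Gst = 1), while identical populations give 0 for both.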



Lin Zhang & Weibo Liang Lab

Department of Forensic Genetics, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu 610041, Sichuan, China
Fax: +86 28 85401825
E-mail address: liangweibo@scu.edu.cn (W. Liang), zhanglin@scu.edu.cn (L. Zhang).


Like most people these days, we get far too much email, and because of this we sometimes miss important messages. Please accept our apologies if we don't respond to you in a timely manner, and feel free to resend your message.

