lanctools

Tools for working with phased local ancestry data stored in the .lanc file format, as defined by Admix-kit [Hou et al., 2024].

lanctools is designed to provide fast local ancestry queries and convenient conversion from external formats (e.g., FLARE [Browning et al., 2023] and RFMix [Maples et al., 2013]). It focuses on efficient access to .lanc data and is not intended to replace the full functionality of Admix-kit.

Features

  • Efficient random access to phased local ancestry data
  • Local ancestry-masked genotype queries
  • Conversion from FLARE and RFMix output to .lanc format
  • Python API and command-line interface (CLI)

Installation

pip install lanctools

Quickstart

Querying Local Ancestry Data

lanctools is primarily intended for fast queries of local ancestry and genotype data. Examples are provided below.

import numpy as np
from lanctools import LancData

ld = LancData(
    plink_prefix="chr1",
    lanc_file="chr1.lanc",
    ancestries=["YRI", "CEU"]
)

idx_var = np.arange(100, dtype=np.uint32)

# Get phased local ancestry: shape (N, 100, 2)
lanc = ld.get_lanc(idx_var)

# Get phased genotypes: shape (N, 100, 2)
geno = ld.get_geno(idx_var)

# Get ancestry-masked genotypes: shape (N, 100, len(ancestries))
lanc_geno = ld.get_lanc_geno(idx_var)
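To make the relationship between the three array shapes above concrete, here is a minimal NumPy-only sketch of one plausible definition of ancestry masking. It is an illustration, not the lanctools implementation. It assumes that local ancestry is stored as an integer index into the ancestries list, that phased genotypes are 0/1 allele counts per haplotype, and that the masked value for ancestry k is the number of counted alleles carried on haplotypes assigned to k. The helper name mask_genotypes_by_ancestry is hypothetical.

```python
import numpy as np

def mask_genotypes_by_ancestry(geno, lanc, n_ancestries):
    """Hypothetical helper (not part of lanctools).

    geno, lanc: integer arrays of shape (N, M, 2), one entry per haplotype.
    Returns an array of shape (N, M, n_ancestries) holding, for each
    ancestry, the allele count carried on haplotypes of that ancestry.
    """
    geno = np.asarray(geno)
    lanc = np.asarray(lanc)
    if geno.shape != lanc.shape:
        raise ValueError("geno and lanc must have the same shape")
    # One-hot encode ancestry along a new last axis: (N, M, 2, K)
    onehot = lanc[..., None] == np.arange(n_ancestries)
    # Keep each allele only under its haplotype's ancestry, then sum haplotypes
    return (geno[..., None] * onehot).sum(axis=2)

# Toy data (assumed encoding): 2 individuals, 3 variants, 2 haplotypes
geno = np.array([[[1, 0], [1, 1], [0, 0]],
                 [[0, 1], [1, 0], [1, 1]]], dtype=np.uint8)
lanc = np.array([[[0, 1], [0, 0], [1, 1]],
                 [[1, 1], [0, 1], [1, 0]]], dtype=np.uint8)

masked = mask_genotypes_by_ancestry(geno, lanc, n_ancestries=2)
print(masked.shape)  # (2, 3, 2)
```

Under this definition, summing the masked array over its last axis recovers the unphased genotype dosage, which is a convenient sanity check when comparing against the output of get_lanc_geno.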

Converting FLARE or RFMix Files to .lanc

lanctools also includes C++ code for converting RFMix2 .msp.tsv or FLARE .vcf.gz files to the .lanc file format. The conversion is exposed through the Python function convert_to_lanc.

from lanctools import convert_to_lanc

convert_to_lanc(
    file="chr1.anc.vcf.gz",
    file_fmt="FLARE",
    plink_prefix="chr1",
    output="chr1.lanc"
)

Command-Line Interface

The same file format conversion is available as a command-line utility.

lanctools convert --input chr1.anc.vcf.gz --plink chr1 --format FLARE --output chr1.lanc

lanctools also provides a CLI command for combining multiple .lanc files (e.g., across chromosomes) into a single .lanc file.

lanctools merge --input chr1.lanc --input chr2.lanc --input chr3.lanc --output chr1_3.lanc
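When merging many chromosomes, typing one --input flag per file becomes tedious. The sketch below builds the argument list programmatically, using only the merge flags shown above (--input repeated once per file, and --output). The helper name build_merge_command is hypothetical and not part of lanctools.

```python
import shlex

def build_merge_command(lanc_files, output):
    """Hypothetical helper: assemble the argv list for `lanctools merge`."""
    if not lanc_files:
        raise ValueError("at least one .lanc file is required")
    cmd = ["lanctools", "merge"]
    for path in lanc_files:
        # One --input flag per file, in the order given
        cmd += ["--input", path]
    cmd += ["--output", output]
    return cmd

# Chromosomes 1 to 3, matching the example above
cmd = build_merge_command([f"chr{c}.lanc" for c in range(1, 4)], "chr1_3.lanc")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once lanctools is installed and the input files exist.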

References

  • Hou, K. et al. Admix-kit: an integrated toolkit and pipeline for genetic analyses of admixed populations. Bioinformatics 40, btae148 (2024).
  • Browning, S. R., Waples, R. K. & Browning, B. L. Fast, accurate local ancestry inference with FLARE. Am J Hum Genet 110, 326–335 (2023).
  • Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am J Hum Genet 93, 278–288 (2013).