qdiv.sequences.sequence_distance_matrix

qdiv.sequences.sequence_distance_matrix(obj, *, savename='SeqDistMat', path='', band_width=12, save=True, use_numba=True)

Compute pairwise Levenshtein distances between sequences. With use_numba=True, a parallelized, Numba-accelerated banded Wagner–Fischer algorithm is used; otherwise the function falls back to a pure Python implementation.

Parameters:
  • obj (MicrobiomeData or dict) – Must provide a DataFrame as obj.seq or obj['seq'], indexed by sequence ID, with a column containing the sequences (default name: 'seq').

  • savename (str, optional) – Base filename for the CSV outputs. Default 'SeqDistMat'.

  • path (str, optional) – Directory (absolute or relative) where the output is saved. Default "", which means the current working directory.

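  • band_width (int, optional) – Half-width of the Sakoe–Chiba band. It is widened automatically to |len1 - len2| when two sequences differ in length by more than the band. Larger values bring the result closer to the exact, unbanded dynamic-programming distance but are slower. Default 12.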

  • save (bool, optional) – If True, writes two CSV files: one with the edit distances and one with the normalized distances. Default True.

  • use_numba (bool, optional) – If True, uses the Numba-accelerated path; otherwise uses the pure Python implementation. Default True.
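
To make the effect of band_width concrete, here is a minimal pure Python sketch of a banded Wagner–Fischer computation. It is an illustration written for this page, not qdiv's actual implementation: the function name banded_levenshtein and its internals are assumptions. Cells farther than the band from the main diagonal are skipped, so a narrow band can overestimate the true distance.

```python
def banded_levenshtein(a: str, b: str, band_width: int = 12) -> int:
    """Levenshtein distance restricted to a Sakoe-Chiba band (illustrative sketch)."""
    n, m = len(a), len(b)
    # Widen the band so the final cell (n, m) is always reachable.
    k = max(band_width, abs(n - m))
    inf = n + m + 1  # larger than any possible edit distance
    # Row 0: turning "" into b[:j] costs j insertions (inside the band only).
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo, hi = max(0, i - k), min(m, i + k)
        if lo == 0:
            cur[0] = i  # turning a[:i] into "" costs i deletions
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,         # deletion
                cur[j - 1] + 1,      # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = cur
    return prev[m]


if __name__ == "__main__":
    print(banded_levenshtein("kitten", "sitting"))
    # A shifted sequence: a zero-width band only sees the diagonal.
    print(banded_levenshtein("abcdefgh", "habcdefg", band_width=0))
    print(banded_levenshtein("abcdefgh", "habcdefg", band_width=1))
```

In the shifted example, a band of 0 counts every position as a mismatch, while a band of 1 already recovers the exact distance. The same trade-off motivates the default half-width of 12.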

Returns:

  {
    'edits': pd.DataFrame,        # int distances
    'normalized': pd.DataFrame,   # float in [0, 1]
    'meta': {'backend': 'numba' | 'python'}
  }

Return type:

Dict[str, Any]
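
A possible call, sketched from the signature and parameter descriptions above. The sequence IDs and sequences are made up for illustration, and passing a plain dict with a 'seq' DataFrame assumes the dict form described under obj. The call is wrapped in a function so that qdiv and pandas are only imported when it is actually run.

```python
def run_example():
    """Call sequence_distance_matrix on three toy sequences (hypothetical data)."""
    import pandas as pd
    from qdiv.sequences import sequence_distance_matrix

    # DataFrame indexed by sequence ID with a 'seq' column, as described for obj.
    seqs = pd.DataFrame(
        {"seq": ["ACGTACGT", "ACGTTCGT", "ACGACGT"]},
        index=["ASV1", "ASV2", "ASV3"],  # hypothetical IDs
    )

    result = sequence_distance_matrix(
        {"seq": seqs},
        band_width=12,
        save=False,       # skip writing the two CSV files
        use_numba=False,  # use the pure Python path
    )

    print(result["meta"]["backend"])
    print(result["edits"])       # integer edit distances
    print(result["normalized"])  # floats in [0, 1]
    return result
```

Setting save=False keeps the example free of side effects; with save=True the two CSV files would be written under path using savename as the base filename.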