2. SCIV API

Download (.dl)

Data download interface, used to download single-cell data and trait files.

sciv.dl.download_sc_atac_file(is_force: bool = False) → None

Download the scATAC-seq file from the remote server to the local cache.

Parameters

is_force (bool, optional): If True, force a re-download even if the file already exists. Default is False.

Examples

>>> sciv.dl.download_sc_atac_file()
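
The is_force flag shared by the download functions follows a common cache-or-fetch pattern. The block below is a minimal sketch of that pattern only, not sciv's implementation: ensure_cached, fake_fetch and the file name example.h5ad are hypothetical, and the remote request is replaced by a local stand-in so the sketch runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(target: Path, fetch: Callable[[], bytes], is_force: bool = False) -> bool:
    """Write fetch() to target unless a cached copy exists; return True if fetched."""
    if target.exists() and not is_force:
        return False  # the cached copy is reused and nothing is fetched
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fetch())
    return True


# Hypothetical stand-in for a remote request; it records how often it is called.
calls = []


def fake_fetch() -> bytes:
    calls.append(1)
    return b"payload"


cache_dir = Path(tempfile.mkdtemp())
target = cache_dir / "example.h5ad"  # hypothetical file name

first = ensure_cached(target, fake_fetch)                 # file missing: fetched
second = ensure_cached(target, fake_fetch)                # file present: skipped
third = ensure_cached(target, fake_fetch, is_force=True)  # forced: fetched again
```

With this pattern, is_force=True is only needed when the cached copy is suspected to be stale or corrupted.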

sciv.dl.download_trait_file(is_force: bool = False) → None

Download the trait file from the remote server to the local cache.

Parameters

is_force (bool, optional): If True, force a re-download even if the file already exists. Default is False.

Examples

>>> sciv.dl.download_trait_file()

sciv.dl.download_trs_file(is_force: bool = False) → None

Download the TRS file from the remote server to the local cache.

Parameters

is_force (bool, optional): If True, force a re-download even if the file already exists. Default is False.

Examples

>>> sciv.dl.download_trs_file()

sciv.dl.download_trs_score_file(is_force: bool = False) → None

Download the TRS score file from the remote server to the local cache.

Parameters

is_force (bool, optional): If True, force a re-download even if the file already exists. Default is False.

Examples

>>> sciv.dl.download_trs_score_file()

sciv.dl.read_sc_atac_file() → AnnData

Read the scATAC-seq file from the local cache.

Returns

AnnData: The scATAC-seq data.

Examples

>>> adata = sciv.dl.read_sc_atac_file()

sciv.dl.read_trait_file() → Tuple[dict, DataFrame]

Read the trait files from the local cache.

Returns

Tuple[dict, DataFrame]: The trait data.

Examples

>>> variants, trait_info = sciv.dl.read_trait_file()

sciv.dl.read_trs_file() → AnnData

Read the TRS file from the local cache.

Returns

AnnData: The TRS data.

Examples

>>> trs = sciv.dl.read_trs_file()

sciv.dl.read_trs_score_file() → AnnData

Read the TRS score file from the local cache.

Returns

AnnData: The TRS score data.

Examples

>>> trs_score = sciv.dl.read_trs_score_file()

File (.fl)

File read-write interface, used for processing single-cell ATAC data, H5AD, H5 and other format files.

sciv.fl.barcodes_add_anno(annotation_file: str | Path, cell_anno: DataFrame, clusters: str | None = None) → DataFrame

Add user inputted cell information to the cell annotation data.

Parameters

annotation_file (path): User-supplied file with additional cell information. Must contain a 'barcodes' column.
cell_anno (DataFrame): Cell annotation data read from the scATAC-seq data.
clusters (str, optional): Column name for cell clusters or cell types (in most cases this can be omitted). Only this column is checked for NA values: any NA values are replaced with 'unknown'; otherwise the column is left unchanged.

Returns

DataFrame: Complete cell annotation data, including the user-supplied cell information.
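
The behavior described above (join on the barcodes column, then replace missing values in the clusters column with 'unknown') can be sketched with plain Python dictionaries. This is an illustration under assumptions, not sciv's code: add_annotation is a hypothetical helper, rows are modeled as dicts rather than a DataFrame, and the barcodes and cell types are made up.

```python
from typing import Optional


def add_annotation(cells: list, annotation: list, clusters: Optional[str] = None) -> list:
    """Left-join annotation rows onto cell rows using the 'barcodes' key."""
    by_barcode = {row["barcodes"]: row for row in annotation}
    merged = []
    for cell in cells:
        row = {**cell, **by_barcode.get(cell["barcodes"], {})}
        # Only the clusters column is checked for missing values.
        if clusters is not None and row.get(clusters) is None:
            row[clusters] = "unknown"
        merged.append(row)
    return merged


cells = [{"barcodes": "AAAC-1"}, {"barcodes": "AAAG-1"}]
annotation = [{"barcodes": "AAAC-1", "cell_type": "T cell"}]  # made-up values
result = add_annotation(cells, annotation, clusters="cell_type")
```

The second cell has no annotation row, so its cell_type is filled with 'unknown'.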

sciv.fl.read_barcodes_file(barcodes_file: str | Path, clusters: str | None = None, barcode_split_character: str = '-', annotation_file: str | Path | None = None) → DataFrame

Read barcodes file.

Parameters

barcodes_file (path): Barcodes file.
clusters (str, optional): Column name for cell clusters or cell types (in most cases this can be omitted). Only this column is checked for NA values: any NA values are replaced with 'unknown'; otherwise the column is left unchanged.
barcode_split_character (str, default='-'): Character used to split barcode information (for metadata).
annotation_file (path, optional): User-supplied file with additional cell information. Must contain a 'barcodes' column.

Returns

DataFrame: Cell annotation data, including any user-supplied cell information.

sciv.fl.read_h5(file: str | Path, is_close: bool = False)

Read AnnData data from an h5 file.

Parameters

file (path): Path to the h5 file.
is_close (bool, default=False): If True, close the file.

Returns

AnnData data: The AnnData data loaded from the h5 file.

sciv.fl.read_h5ad(file: str | Path, is_verbose: bool = True) → AnnData

Read AnnData from an h5ad file.

Parameters

file (path): Path to the h5ad file.
is_verbose (bool, default=True): If True, print log information.

Returns

AnnData: The loaded AnnData object.

sciv.fl.read_pkl(file: str | Path, is_verbose: bool = True)

Read data from a pickle file.

Parameters

file (path): Path to the pickle file.
is_verbose (bool, default=True): If True, print log information.

Returns

Python object: The data loaded from the pickle file.

sciv.fl.read_sc_atac(resource: str | Path | None = None, is_transpose: bool = True, barcode_split_character: str = '-', on_barcode_split_character: str | None = None, annotation_file: str | Path | None = None, clusters: str | None = None, peak_split_character: Tuple = (':', '-')) → AnnData

Read scATAC-seq data and return it in AnnData format.

Parameters

resource (path, optional): Input data source. Default is None. Can be one of the following:
1. Path to a directory containing the matrix, bed file, etc. (output from cell-ranger)
2. H5 file obtained through cell-ranger
3. A comprehensive h5ad file
4. A table file with cell or peak columns and indexes, where the content is fragment counts
is_transpose (bool, default=True): Whether transpose is required to read the matrix file.
barcode_split_character (str, default='-'): Character used to split barcode information (for metadata).
on_barcode_split_character (str, optional): Character used to split barcode information (for matrix). If None, uses barcode_split_character. Default is None.
annotation_file (path, optional): File containing additional cell information. Must contain a 'barcodes' column. Default is None.
clusters (str, optional): Column name for cell clusters or cell types. If NA values exist in this column, they will be assigned as 'unknown'. Default is None.
peak_split_character (Tuple, default=(':', '-')): Characters used to split peak information (chromosome, start, end). First element splits chromosome from start, second splits start from end.

Returns

AnnData: scATAC-seq data in AnnData format with cell and peak annotations.
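
The role of the two elements of peak_split_character can be illustrated with a small parser. This is a sketch of the splitting rule as documented above, not sciv's own parsing code; split_peak is a hypothetical helper and the peak names are made up.

```python
from typing import Tuple


def split_peak(peak: str, peak_split_character: Tuple[str, str] = (":", "-")) -> Tuple[str, int, int]:
    """Split a peak name into (chromosome, start, end)."""
    chrom_sep, pos_sep = peak_split_character
    # The first separator splits the chromosome from the start position.
    chrom, _, span = peak.partition(chrom_sep)
    # The second separator splits the start position from the end position.
    start, _, end = span.partition(pos_sep)
    return chrom, int(start), int(end)


parsed_default = split_peak("chr1:100-200")
parsed_custom = split_peak("chr2_300_450", ("_", "_"))
```

Because each separator is applied only once, from the left, this sketch would misparse chromosome names that themselves contain the first separator.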

sciv.fl.read_sc_atac_10x_h5(file: str | Path, clusters: str | None = None, barcode_split_character: str = '-', annotation_file: str | Path | None = None, peak_split_character: Tuple = (':', '-')) → AnnData

Read hdf5 file from Cell Ranger v3 or later versions.

Parameters

file (path): An H5 file obtained through cell-ranger.
clusters (str, optional): Column name for cell clusters or cell types (in most cases this can be omitted). Only this column is checked for NA values: any NA values are replaced with 'unknown'; otherwise the column is left unchanged.
barcode_split_character (str, default='-'): Character used to split barcode information (for metadata).
annotation_file (path, optional): File containing additional cell information. Must contain a 'barcodes' column.
peak_split_character (tuple, default=(':', '-')): Characters used to split peak information (chromosome, start, end).

Returns

AnnData: scATAC-seq data.

Read variant file set.

Parameters

base_path (path, optional): Path where the mutation trait data is stored. Each file must contain the following column names: chr, position, rsId, pp, where ID represents the trait name. Default is None.
files (collection, optional): Collection of mutation trait data file paths. Default is None.
labels (dict, optional): Classification labels for each trait or disease. Default is None.
column_map (dict, optional): Mapping used to rename the columns of the mutation file to the required column names. For example: {0: 'chr', 1: 'position', 2: 'rsId', 3: 'pp'}. Default is None.
repeat_symbol (str, default='_#'): Symbol used to distinguish duplicate trait names. If two files share the same name abbreviation, this symbol and a number are appended to one of the abbreviations.

Returns

dict: Dictionary containing AnnData objects for each trait or disease, where keys are trait names and values are AnnData objects with variant information.
DataFrame: Annotated information on traits or diseases, including summary statistics such as pp_sum, pp_mean, count, and filename.
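
To make the column_map example and the pp_sum, pp_mean and count summaries concrete, the following sketch names the columns of a headerless tab-separated table by position and then summarizes the pp column. It is an illustration under assumptions: read_variant_rows and summarize are hypothetical helpers, the tab-separated layout is assumed, and the variants are made up. How sciv actually parses and summarizes these files is not shown in this reference.

```python
import csv
import io


def read_variant_rows(text: str, column_map: dict) -> list:
    """Parse headerless tab-separated text, naming columns by position."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        rows.append({name: fields[index] for index, name in column_map.items()})
    return rows


def summarize(rows: list) -> dict:
    """Compute per-trait summary statistics over the 'pp' column."""
    pps = [float(row["pp"]) for row in rows]
    return {"pp_sum": sum(pps), "pp_mean": sum(pps) / len(pps), "count": len(pps)}


column_map = {0: "chr", 1: "position", 2: "rsId", 3: "pp"}
text = "chr1\t1000\trs0001\t0.25\nchr2\t2000\trs0002\t0.75\n"  # made-up variants
rows = read_variant_rows(text, column_map)
summary = summarize(rows)
```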

sciv.fl.save_h5(data: dict, save_file: str | Path, group_name: str = 'matrix') → None

Save data to an H5 file.

Parameters

data (dict): Input data to save.
save_file (path): Path of the file to save to.
group_name (str, default='matrix'): The group name.

Returns

None: Writes the data to the H5 file at save_file.

sciv.fl.save_h5ad(data: AnnData, file: str | Path) → AnnData

Save AnnData data to h5ad file.

Parameters

data (AnnData): Input AnnData object to save.
file (path): Path of the file to save to.

Returns

AnnData: The input AnnData object.

sciv.fl.save_pkl(data, save_file: str | Path, is_verbose: bool = False) → None

Save data to a pickle (pkl) file.

Parameters

data (any): Input data to save.
save_file (path): Path of the file to save to.
is_verbose (bool, default=False): If True, print log information.

Returns

None: Writes the data to the pickle file at save_file.
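
For orientation, a pickle round trip with the Python standard library looks like the sketch below. It is assumed, not confirmed by this reference, that save_pkl and read_pkl wrap similar calls; save_pickle, read_pickle and the sample data are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(data, save_file) -> None:
    """Serialize any picklable Python object to save_file."""
    with open(save_file, "wb") as handle:
        pickle.dump(data, handle)


def read_pickle(file):
    """Load a Python object back from a pickle file."""
    with open(file, "rb") as handle:
        return pickle.load(handle)


path = Path(tempfile.mkdtemp()) / "example.pkl"
original = {"trait": "T1", "scores": [0.1, 0.9]}  # made-up data
save_pickle(original, path)
restored = read_pickle(path)
```

Pickle files can execute arbitrary code when loaded, so only files from trusted sources should be read this way.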

sciv.fl.to_fragments(adata: AnnData, fragments: str, layer: str | None = None, batch_size: int = 100000, is_sort: bool = True, is_gz: bool = True, is_keep: bool = False) → None

Convert AnnData format data into fragments format file.

Parameters

adata (AnnData): Input AnnData object containing single-cell data.
fragments (str): Output file path for the fragments file.
layer (str, optional): The layer of data to use for generating fragments file. If None, uses the main data matrix (adata.X).
batch_size (int, default=100000): Batch size for processing data. Larger values reduce memory consumption.
is_sort (bool, default=True): Whether to sort the output by chromosome and start position. Sorts chromosomes in natural order (chr1, chr2, …, chrX, chrY, chrM).
is_gz (bool, default=True): Whether to compress the output file using gzip. Uses pysam.tabix_compress for compression.
is_keep (bool, default=False): Whether to keep the uncompressed fragments file after compression. Only effective when is_gz is True. If False, the uncompressed file is deleted after successful compression.

Returns

None: Writes fragments file to the specified path.

Note

To export results processed by SnapATAC2, please use snapatac2.ex.export_fragments directly. Using this function is not recommended.
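
The natural chromosome order mentioned for is_sort (chr1, chr2, …, chrX, chrY, chrM) differs from plain string sorting, which would place chr10 before chr2. The sketch below shows one way to build such a sort key. It is not taken from sciv: chrom_sort_key is a hypothetical helper, the records are made up, and the placement of other contigs after chrM is an assumption.

```python
def chrom_sort_key(chrom: str):
    """Sort key giving chr1, chr2, ..., chrX, chrY, chrM, then other contigs."""
    name = chrom[3:] if chrom.startswith("chr") else chrom
    special = {"X": 0, "Y": 1, "M": 2}
    if name.isdigit():
        return (0, int(name), "")      # numbered chromosomes, in numeric order
    if name in special:
        return (1, special[name], "")  # X, Y, M in that fixed order
    return (2, 0, name)                # anything else last, alphabetically


# Made-up fragment records: (chromosome, start, end).
records = [
    ("chr10", 5, 50), ("chr2", 7, 90), ("chrM", 1, 20), ("chrX", 3, 30),
    ("chr2", 1, 40), ("chrY", 9, 99), ("chr1", 2, 8),
]
# Sort by chromosome first, then by start position.
ordered = sorted(records, key=lambda r: (chrom_sort_key(r[0]), r[1]))
```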

sciv.fl.to_meta(adata: AnnData, dir_path: str | Path, layer: str | None = None, feature_name: str = 'peaks.bed', field: Literal['real', 'complex', 'pattern', 'integer'] | None = None) → None

Convert AnnData object into metadata directory containing matrix, feature files, etc.

This function exports single-cell data into standard 10x Genomics format, including:
- matrix.mtx: Sparse matrix file in Matrix Market format
- annotation.txt: Cell annotation information
- barcodes.tsv: Cell barcodes list
- peaks.bed or specified feature file: Genomic feature information

Parameters

adata (AnnData): Input AnnData object containing single-cell data.
dir_path (path): Output directory path for storing generated metadata files.
layer (str, optional): The layer of data used to form the meta files. If None, uses adata.X as the main data matrix.
feature_name (str, default='peaks.bed'): Output name for the feature file. If it starts with 'peaks', feature indices will be parsed by chromosome position into BED format.
field (Literal['real', 'complex', 'pattern', 'integer'], optional): Matrix data type field. If None, it is determined automatically from the data type. Available values:
- 'real': Real numbers
- 'complex': Complex numbers
- 'pattern': Pattern matrix (no values)
- 'integer': Integer values

Returns

None: Writes the generated files into dir_path.
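
The field values correspond to the data type declared in a Matrix Market header. As a rough illustration of what a matrix.mtx file contains, the sketch below writes a small sparse matrix in coordinate format by hand. It is not how sciv writes the file: to_matrix_market is a hypothetical helper, the entries are made up, and the 'complex' field is left out for brevity.

```python
def to_matrix_market(entries, n_rows: int, n_cols: int, field: str = "integer") -> str:
    """Render (row, col, value) entries with 0-based indices as Matrix Market text."""
    if field not in ("integer", "real", "pattern"):
        raise ValueError("this sketch only handles integer, real and pattern")
    lines = [
        f"%%MatrixMarket matrix coordinate {field} general",  # header line
        f"{n_rows} {n_cols} {len(entries)}",                   # rows, columns, non-zeros
    ]
    for row, col, value in entries:
        if field == "pattern":
            lines.append(f"{row + 1} {col + 1}")  # pattern matrices store no values
        else:
            lines.append(f"{row + 1} {col + 1} {value}")  # indices are 1-based
    return "\n".join(lines) + "\n"


entries = [(0, 0, 2), (1, 2, 1)]  # made-up fragment counts
text = to_matrix_market(entries, n_rows=2, n_cols=3)
pattern_text = to_matrix_market(entries, n_rows=2, n_cols=3, field="pattern")
```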

Model (.ml)

Core model interface, providing functions for cell type association analysis and causal variant identification.

sciv.ml.association_score(adata: AnnData, score_name: str = 'association_score', layer: str = 'trs_source', axis: Literal[0, 1] = 0) → None

Calculate association score for traits or diseases. This function calculates the association score for traits or diseases based on the TRS (Trait Relevance Score) data in the input AnnData object.

Parameters

adata (AnnData): Input AnnData object containing TRS data.
score_name (str, optional): Name of the score column in the AnnData object. Default is 'association_score'.
layer (str, optional): Layer name in the AnnData object containing TRS data. Default is 'trs_source'.
axis (Literal[0, 1], optional): Axis to calculate the score (0 for traits, 1 for diseases). Default is 0.

Returns

None: No value is returned.

sciv.ml.core(adata: AnnData, variants: dict, trait_info: DataFrame, cell_rate: float | None = None, peak_rate: float | None = None, max_epochs: int = 500, lr: float = 1e-05, batch_size: int = 128, eps: float = 1e-08, early_stopping: bool = True, early_stopping_patience: int = 50, strategy: str = 'ddp_notebook_find_unused_parameters_true', batch_key: str | None = None, resolution: float = 0.5, k: int = 30, or_k: int = 10, weight: float = 0.5, kernel: Literal['laplacian', 'gaussian'] = 'gaussian', local_k: int = 10, kernel_gamma: float | str | list | set | Tuple | ndarray | None = None, epsilon: float = 1e-05, max_steps: int = 300, gamma: float = 0.05, enrichment_gamma: float = 0.05, p: int = 2, n_jobs: int = -1, min_seed_cell_rate: float = 0.01, max_seed_cell_rate: float = 0.05, credible_threshold: float = 0, diff_peak_value: Literal['emp_effect', 'bayes_factor', 'emp_prob1', 'all'] = 'emp_effect', enrichment_threshold: Literal['golden', 'half', 'e', 'pi', 'none'] | float = 'golden', is_ablation: bool = False, model_dir: str | Path | None = None, save_path: str | Path | None = None, is_simple: bool = True, is_save_random_walk_model: bool = False, is_file_exist_loading: bool = False, filename_dict: dict | None = None, block_size: int = -1) → AnnData

The core algorithm of sciv. It runs the complete workflow of all algorithms, including plotting and saving data. Throughout the algorithm, samples are placed in rows and traits or diseases in columns. Traits or diseases do not interact with each other, which keeps the results stable.

Meaning of main variables:

overlap_adata, (obs: peaks, var: traits/diseases) Peaks-traits/diseases data obtained by overlaying variant data with peaks.
da_peaks, (obs: clusters (Leiden), var: peaks) Differential peak data of cell clustering, used for weight correction of cells.
init_score, (obs: cells, var: traits/diseases) This is the initial TRS data.
cc_data, (obs: cells, var: cells) Cell similarity data.
random_walk, RandomWalk class.
trs, (obs: cells, var: traits/diseases) This is the final TRS data.

Parameters

adata (AnnData)

scATAC-seq data.

variants (dict)

Variant data. It is recommended to obtain this data by executing the fl.read_variants method.

trait_info (DataFrame)

Variant annotation file information.

cell_rate (Optional[float], default None)

The percentage of the total cell count used for cell removal. Only takes effect when the min_cells parameter is None.

peak_rate (Optional[float], default None)

The percentage of the total peak count used for peak removal. Only takes effect when the min_peaks parameter is None.

max_epochs (int, default 500)

The maximum number of epochs for PoissonVI training.

lr (float, default 1e-5)

Learning rate for optimization.

batch_size (int, default 128)

Minibatch size to use during training.

eps (float, default 1e-08)

Optimizer eps.

early_stopping (bool, default True)

Whether to perform early stopping with respect to the validation set.

early_stopping_patience (int, default 50)

How many epochs to wait for improvement before early stopping.

strategy (str, default 'ddp_notebook_find_unused_parameters_true')

DDP strategy.

batch_key (Optional[str], default None)

Batch information in scATAC-seq data.

resolution (float, default 0.5)

Resolution of the Leiden clustering. Recommended values are 0.4, 0.9, 1.3, or 1.5.

k (int, default 30)

When building an mKNN network, the number of nodes connected by each node (and operation).

or_k (int, default 10)

When building an mKNN network, the number of nodes connected by each node (or operation).

weight (float, default 0.5)

The weight of interactions or operations.

kernel (Literal['laplacian', 'gaussian'], default 'gaussian')

The kernel function to use.

local_k (int, default 10)

The number of neighbors for the adaptive kernel.

kernel_gamma (Optional[Union[float, str, collection]], default None)

When kernel is 'laplacian' and this value is None, it is set to the reciprocal of the latent representation dimension of the cell. When kernel is 'gaussian' and this value is None, it defaults to an adaptive value obtained from local information controlled by the local_k parameter. Otherwise, it should be strictly positive.

epsilon (float, default 1e-05)

Conditions for stopping in random walk.

max_steps (int, default 300)

Maximum number of steps in a random walk with restart.

gamma (float, default 0.05)

Reset weight for random walk.

enrichment_gamma (float, default 0.05)

Reset weight for random walk for enrichment.

p (int, default 2)

Distance used for loss {1: Manhattan distance, 2: Euclidean distance}.

n_jobs (int, default -1)

The maximum number of concurrently running jobs.

min_seed_cell_rate (float, default 0.01)

The minimum percentage of seed cells in all cells.

max_seed_cell_rate (float, default 0.05)

The maximum percentage of seed cells in all cells.

credible_threshold (float, default 0)

The threshold for determining the credibility of enriched cells in the context of enrichment, i.e. the threshold for judging enriched cells.

diff_peak_value (difference_peak_optional, default 'emp_effect')

The value used for peak correction based on differences between cluster types. One of {'emp_effect', 'bayes_factor', 'emp_prob1', 'all'}.

enrichment_threshold (Union[enrichment_optional, float], default 'golden')

Threshold applied to the standardized output TRS to obtain the enriched portion of the results. Supports the strings {'golden', 'half', 'e', 'pi', 'none'} or a valid float within the range (0, log1p(1)).

is_ablation (bool, default False)

If True, the results of the ablation experiment are obtained. This parameter is limited by the is_simple parameter and only takes effect when is_simple is set to False.

model_dir (Optional[path], default None)

Folder in which the trained model is saved. If a trained model file (model.pt) already exists in this path, it is loaded automatically and training of the PoissonVI model is skipped.

save_path (Optional[path], default None)

Save path for process files and result files.

is_simple (bool, default True)

If True, unnecessary intermediate variables are not added and only the final result is kept. When set to True, the is_ablation parameter has no effect; is_ablation only takes effect when is_simple is set to False.

is_save_random_walk_model (bool, default False)

Whether to save the random walk model. When set to True, ensure sufficient storage, as the saved pkl file is relatively large.

is_file_exist_loading (bool, default False)

By default, existing files are overwritten. When set to True, if the file exists, the step is skipped and the file is read directly as the result.

filename_dict (Optional[dict], default None)

The names of the existing files. Default: {'sc_atac': 'sc_atac.h5ad', 'da_peaks': 'da_peaks.h5ad', 'atac_overlap': 'atac_overlap.h5ad', 'init_score': 'init_score.h5ad', 'cc_data': 'cc_data.h5ad', 'random_walk': 'random_walk.h5ad', 'trs': 'trs.h5ad'}

block_size (int, default -1)

Block size used for block-wise matrix multiplication, which trades computation time for lower memory consumption. If the value is less than or equal to zero, no blocking is performed.
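
For readers unfamiliar with the two kernel options, the sketch below uses the conventional definitions: the Laplacian kernel exp(-gamma * L1 distance) and the Gaussian kernel exp(-gamma * squared Euclidean distance). These formulas are an assumption, since this reference does not state the exact expressions sciv uses. Only the documented Laplacian fallback (gamma equal to the reciprocal of the dimension) is reproduced; the adaptive Gaussian gamma based on local_k is not sketched. The function names and vectors are hypothetical.

```python
import math
from typing import Optional, Sequence


def laplacian_kernel(x: Sequence[float], y: Sequence[float], gamma: Optional[float] = None) -> float:
    """exp(-gamma * L1 distance); gamma defaults to 1 / dimension."""
    if gamma is None:
        gamma = 1.0 / len(x)
    return math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))


def gaussian_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """exp(-gamma * squared Euclidean distance); gamma must be strictly positive."""
    if gamma <= 0:
        raise ValueError("gamma must be strictly positive")
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


# Two made-up two-dimensional latent representations of cells.
cell_a = [0.0, 0.0]
cell_b = [1.0, 1.0]
laplacian_similarity = laplacian_kernel(cell_a, cell_b)           # gamma = 1 / 2
gaussian_similarity = gaussian_kernel(cell_a, cell_b, gamma=0.5)
self_similarity = gaussian_kernel(cell_a, cell_a, gamma=0.5)
```

Both kernels map identical points to a similarity of 1 and decay toward 0 as the distance grows, with gamma controlling how fast.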

Returns

AnnData: AnnData object containing TRS (Trait Relevance Score) results. (obs: cells, var: traits/diseases) This is the final TRS data.
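
The gamma, epsilon and max_steps parameters map naturally onto a standard random walk with restart. The sketch below assumes the textbook update p_next = (1 - gamma) * W * p + gamma * p_seed, with gamma as the reset weight, epsilon as the convergence tolerance and max_steps as the iteration cap. Whether sciv's RandomWalk class uses exactly this update is not stated here; the function name and the three-cell graph are hypothetical.

```python
def random_walk_with_restart(transition, seed, gamma=0.05, epsilon=1e-05, max_steps=300):
    """Iterate p <- (1 - gamma) * W p + gamma * seed until the L1 change is below epsilon.

    transition[i][j] is the probability of moving from node j to node i,
    so every column of the matrix sums to 1. Assumes max_steps >= 1.
    """
    n = len(seed)
    current = list(seed)
    for step in range(1, max_steps + 1):
        updated = [
            (1 - gamma) * sum(transition[i][j] * current[j] for j in range(n))
            + gamma * seed[i]
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(updated, current))
        current = updated
        if delta < epsilon:
            break  # converged before reaching max_steps
    return current, step


# Made-up path graph of three cells: 0 - 1 - 2, column-normalized.
transition = [
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.5, 0.0],
]
seed = [1.0, 0.0, 0.0]  # cell 0 is the only seed cell
scores, steps = random_walk_with_restart(transition, seed)
```

With a column-normalized transition matrix the scores keep summing to 1, and the seed cell retains a higher score than the equally connected cell at the other end of the path.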

sciv.ml.knock(trs: AnnData, sc_atac: AnnData, da_peaks: AnnData, cc_data: AnnData, knock_trait: str, knock_info: dict[str, Union[str, list, set, Tuple, numpy.ndarray]], knock_value: float = 0, is_add_control: bool = False) → AnnData

Perform gene knockdown or knockout analysis on a specific trait.

This function simulates the effect of knocking down or knocking out specific variants associated with a trait, and re-runs the random walk algorithm to compute the resulting TRS (Trait Relevance Score) changes.

Parameters

trs (AnnData): TRS result data from ml.core, containing parameters, variants, trait_info and trs_source.
sc_atac (AnnData): scATAC-seq data used in the original analysis.
da_peaks (AnnData): Differential accessibility peaks data from the original analysis.
cc_data (AnnData): Cell-cell similarity network data from the original analysis.
knock_trait (str): The trait ID to perform knockdown/knockout on.
knock_info (dict[str, Union[str, collection]]): Dictionary mapping knock group names to variant IDs (rsId) to be knocked down. Each key is a group name, and each value is either a single variant ID (str) or a collection of variant IDs to knock down together.
knock_value (float, default 0): The value to set for knocked-down variants. Default is 0 (complete knockout). Values >= 1e-3 are not recommended as they may not achieve the desired effect.
is_add_control (bool, default False): Whether to add control experiments (knocking out background variants).

Returns

AnnData: AnnData object containing TRS results after knockdown/knockout. Includes knock parameters in .uns['params'] and knock-specific metadata.
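
Because knock_info values may be either a single rsId string or a collection of rsIds, it can help to normalize them before inspecting a knock plan. The sketch below is a hypothetical pre-processing helper, not part of sciv, and the group names and rsIds are made up.

```python
def normalize_knock_info(knock_info: dict) -> dict:
    """Turn every value into a sorted list of unique variant IDs."""
    normalized = {}
    for group, variants in knock_info.items():
        if isinstance(variants, str):
            variants = [variants]  # a lone rsId becomes a one-element group
        normalized[group] = sorted(set(variants))
    return normalized


knock_info = {
    "single": "rs0001",                      # one variant knocked out alone
    "pair": ["rs0002", "rs0003", "rs0002"],  # variants knocked out together
}
normalized = normalize_knock_info(knock_info)
```

The explicit string check matters because a Python string is itself iterable: passing 'rs0001' straight to set() would split it into single characters.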

Plot (.pl)

Visual interface, including multiple chart types for data analysis and presentation.

Graph

Network diagram visualization function.

sciv.pl.communities_graph(adata: AnnData, labels: list | set | Tuple | ndarray, layer: str | None = None, groupby: str = 'clusters', x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, node_size: float = 2.0, line_widths: float = 0.001, start_color_index: int = 0, color_step_size: int = 0, output: str | Path | None = None, show: bool = True, close: bool = False)

Plot a cell-cell network diagram with community detection coloring.

This function visualizes a network graph where nodes represent cells and edges represent connections between cells. Nodes are colored based on their community assignments.

Parameters

adata (AnnData): Annotated data matrix with observations (cells) and variables (genes).
labels (collection): Community labels for grouping nodes. Each community is a collection of node indices.
layer (str, optional): Name of the layer in adata to use for adjacency matrix. If None, uses adata.X.
groupby (str, default='clusters'): Column name in adata.obs containing cluster information for color assignment.
x_name (str, optional): Label for the x-axis.
y_name (str, optional): Label for the y-axis.
title (str, optional): Title of the plot.
width (float, default=2): Width of the figure in inches.
height (float, default=2): Height of the figure in inches.
bottom (float, default=0): Bottom margin adjustment.
node_size (float, default=2.0): Size of the nodes in the network.
line_widths (float, default=0.001): Width of the node edges and network edges.
start_color_index (int, default=0): Starting index for color selection from the color palette.
color_step_size (int, default=0): Step size for selecting colors from the palette for different communities.
output (path, optional): Path to save the figure.
show (bool, default=True): Whether to display the figure.
close (bool, default=False): Whether to close the figure after display.

Returns

None: The function displays and/or saves the network plot.

Plot a graph from an adjacency matrix.

Parameters

data (matrix_data): Adjacency matrix representing the graph connections.
labels (collection, optional): Labels for each node in the graph.
node_size (int, default=50): Size of the nodes in the plot.
name (str, optional): Name of the graph.
x_name (str, optional): Label for the x-axis.
y_name (str, optional): Label for the y-axis.
title (str, optional): Title of the plot.
width (float, default=2): Width of the figure in inches.
height (float, default=2): Height of the figure in inches.
bottom (float, default=0): Bottom margin adjustment.
is_font (bool, default=False): Whether to display node labels.
output (path, optional): Path to save the figure.
show (bool, default=True): Whether to display the figure.
close (bool, default=False): Whether to close the figure after display.

sciv.pl.network_two_types(data_pairs: list, type1_scores: dict, type2_scores: dict, type1_node_size: dict | list | float | None = 50, type2_node_size: dict | list | float | None = 50, label_nodes: list | None = None, width: float = 4, height: float = 3, k: float | None = None, iterations: int = 50, scale: float = 1, radius: float = 0.35, type1_node_shape: str = 'o', type2_node_shape: str = 's', type1_bar_label: str = 'Score', type2_bar_label: str = 'Score', type1_cmap_str: str = 'winter', type2_cmap_str: str = 'YlOrRd', node_alpha: float = 0.8, edge_alpha: float = 0.8, is_fluctuate: bool = True, layout_type: str = 'spring', output: str | Path | None = None, show: bool = True, close: bool = False)

Plot a bipartite network graph with two types of nodes.

This function visualizes a network where nodes are divided into two distinct types (e.g., genes and variations), with edges representing connections between them. Each node type can have different sizes, colors, and shapes based on their scores.

Parameters

data_pairs (list): List of tuples representing edges between type1 and type2 nodes.
type1_scores (dict): Dictionary mapping type1 node names to their score values for color mapping.
type2_scores (dict): Dictionary mapping type2 node names to their score values for color mapping.
type1_node_size (Union[dict, list, float], default=50): Size of type1 nodes. Can be a single value, list, or dict mapping nodes to sizes.
type2_node_size (Union[dict, list, float], default=50): Size of type2 nodes. Can be a single value, list, or dict mapping nodes to sizes.
label_nodes (list, optional): List of node names to display labels for.
width (float, default=4): Width of the figure in inches.
height (float, default=3): Height of the figure in inches.
k (float, optional): Optimal distance between nodes for spring layout. If None, uses default.
iterations (int, default=50): Number of iterations for spring layout optimization.
scale (float, default=1): Scale factor for the layout positions.
radius (float, default=0.35): Radius for positioning connected nodes around their parent nodes in custom layouts.
type1_node_shape (str, default='o'): Matplotlib marker shape for type1 nodes.
type2_node_shape (str, default='s'): Matplotlib marker shape for type2 nodes.
type1_bar_label (str, default='Score'): Label for the color bar of type1 nodes.
type2_bar_label (str, default='Score'): Label for the color bar of type2 nodes.
type1_cmap_str (str, default='winter'): Colormap name for type1 node colors.
type2_cmap_str (str, default='YlOrRd'): Colormap name for type2 node colors.
node_alpha (float, default=0.8): Transparency level for nodes (0-1).
edge_alpha (float, default=0.8): Transparency level for edges (0-1).
is_fluctuate (bool, default=True): Whether to add random fluctuation to node positions in custom layouts.
layout_type (str, default='spring'): Layout algorithm to use. Options: 'spring', 'kamada_kawai', 'circular', 'shell', 'circular_type1', 'circular_type2', 'square_type1', 'square_type2'.
output (path, optional): Path to save the figure.
show (bool, default=True): Whether to display the figure.
close (bool, default=False): Whether to close the figure after display.

Returns

None: The function displays and/or saves the network plot.
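
The three main inputs of network_two_types (data_pairs, type1_scores, type2_scores) can be assembled from a flat list of scored edges. The sketch below is a hypothetical helper, not part of sciv. It assumes each tuple in data_pairs is ordered as (type1 node, type2 node), which this reference does not state explicitly, and the gene names, rsIds and scores are made up.

```python
def build_network_inputs(records):
    """records: iterable of (type1_name, type1_score, type2_name, type2_score)."""
    data_pairs, type1_scores, type2_scores = [], {}, {}
    for type1_name, type1_score, type2_name, type2_score in records:
        data_pairs.append((type1_name, type2_name))  # assumed (type1, type2) order
        type1_scores[type1_name] = type1_score
        type2_scores[type2_name] = type2_score
    return data_pairs, type1_scores, type2_scores


records = [
    ("GENE_A", 0.9, "rs0001", 0.4),
    ("GENE_A", 0.9, "rs0002", 0.7),
    ("GENE_B", 0.2, "rs0002", 0.7),
]
data_pairs, type1_scores, type2_scores = build_network_inputs(records)

# Sanity check: every node that appears in an edge should have a score.
missing = [
    pair for pair in data_pairs
    if pair[0] not in type1_scores or pair[1] not in type2_scores
]
```

The resulting objects could then be passed as the first three arguments of sciv.pl.network_two_types.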

Heatmap

Heatmap visualization function.

sciv.pl.heatmap(adata: AnnData, layer: str | None = None, title: str | None = None, width: float = 4, height: float = 4, bottom: float = 0, annot: bool = False, square: bool = True, is_cluster: bool = False, cmap: str = 'Oranges', line_widths: float = 1, fmt: str = '.2f', rotation: float = 65, x_name: str | None = None, y_name: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Generate a simple heatmap using seaborn.

Parameters

adata (AnnData): Input AnnData object containing the data matrix.
layer (str, default None): Layer name in adata.layers to use for plotting. If None, uses adata.X.
title (Optional[str], default None): Title of the figure.
width (float, default 4): Width of the figure in inches.
height (float, default 4): Height of the figure in inches.
bottom (float, default 0): Bottom margin of the figure.
annot (bool, default False): Whether to annotate each cell with its numeric value.
square (bool, default True): Whether to make cells square-shaped.
is_cluster (bool, default False): Whether to perform hierarchical clustering (uses clustermap instead of heatmap).
cmap (str, default 'Oranges'): Colormap for the heatmap.
line_widths (float, default 1): Width of the lines that divide cells.
fmt (str, default '.2f'): String formatting code for annotations.
rotation (float, default 65): Rotation angle for x-axis labels.
x_name (str, default None): Label for the x-axis.
y_name (str, default None): Label for the y-axis.
output (path, default None): File path to save the figure. If None, figure is not saved.
show (bool, default True): Whether to display the figure.
close (bool, default False): Whether to close the figure after saving.
**kwargs (Any): Additional keyword arguments passed to seaborn heatmap or clustermap.

Returns

None: Displays or saves the heatmap figure.

sciv.pl.heatmap_annotation(adata: AnnData, layer: str | None = None, width: float = 4, height: float = 4, title: str | None = None, label: str = 'value', row_name: str | None = None, col_name: str | None = None, row_names: str | None = None, col_names: str | None = None, row_anno_label: bool = False, col_anno_label: bool = False, row_anno_text: bool = False, col_anno_text: bool = False, row_legend: bool = False, col_legend: bool = False, row_show_names: bool = False, col_show_names: bool = False, row_cluster: bool = False, col_cluster: bool = False, cluster_method: str = 'average', cluster_metric: str = 'correlation', row_names_side: str = 'left', col_names_side: str = 'bottom', bottom: float = 0.01, label_size: float = 9, fontsize: float = 9, level_bar_height: float | None = None, anno_specific_labels: list | None = None, x_label_rotation: float = 245, y_label_rotation: float = 0, row_color_start_index: int = 0, col_color_start_index: int = 10, row_split: int | Series | None = None, col_split: int | Series | None = None, row_split_order: list | str | None = None, col_split_order: list | str | None = None, row_split_gap: float = 0.5, col_split_gap: float = 0.2, frac: float = 0.2, relpos: Tuple = (0, 1), anno_label_height: float | None = None, selected_anno_label_height: float = 2.5, category_height: float | None = 2.5, x_name: str | None = None, y_name: str | None = None, row_score_name: str = 'association_score', cmap: str = 'Oranges', is_sort: bool = True, show: bool = True, close: bool = False, output: str | Path | None = None, **kwargs) → None

Generate a heatmap with row and column annotations.

Parameters

adataAnnData: Input AnnData object containing the data matrix and metadata.
layerOptional[str], default None: Layer name in adata.layers to use for plotting. If None, uses adata.X.
widthfloat, default 4: Width of the figure in inches.
heightfloat, default 4: Height of the figure in inches.
titleOptional[str], default None: Title of the figure.
labelstr, default “value”: Label for the heatmap color bar.
row_nameOptional[str], default None: Column name in adata.obs for row annotations.
col_nameOptional[str], default None: Column name in adata.var for column annotations.
row_namesOptional[str], default None: Column name in adata.obs to use as row index labels.
col_namesOptional[str], default None: Column name in adata.var to use as column index labels.
row_anno_labelbool, default False: Whether to display merged labels for row annotations.
col_anno_labelbool, default False: Whether to display merged labels for column annotations.
row_anno_textbool, default False: Whether to display text labels on row annotation bars.
col_anno_textbool, default False: Whether to display text labels on column annotation bars.
row_legendbool, default False: Whether to show legend for row annotations.
col_legendbool, default False: Whether to show legend for column annotations.
row_show_namesbool, default False: Whether to display row names (index labels) on the heatmap.
col_show_namesbool, default False: Whether to display column names (index labels) on the heatmap.
row_clusterbool, default False: Whether to perform hierarchical clustering on rows.
col_clusterbool, default False: Whether to perform hierarchical clustering on columns.
cluster_methodstr, default “average”: Linkage method for hierarchical clustering (e.g., “average”, “single”, “complete”).
cluster_metricstr, default “correlation”: Distance metric for hierarchical clustering (e.g., “correlation”, “euclidean”).
row_names_sidestr, default “left”: Side to display row names (“left” or “right”).
col_names_sidestr, default “bottom”: Side to display column names (“top” or “bottom”).
bottomfloat, default 0.01: Bottom margin of the figure.
label_sizefloat, default 9: Font size for row and column name labels.
fontsizefloat, default 9: Font size for axis titles.
level_bar_heightOptional[float], default None: Height of the association score bar plot annotation.
anno_specific_labelsOptional[list], default None: List of specific row labels to highlight in the annotation.
x_label_rotationfloat, default 245: Rotation angle for x-axis labels (column names).
y_label_rotationfloat, default 0: Rotation angle for y-axis labels (row names).
row_color_start_indexint, default 0: Starting index in the color palette for row annotations.
col_color_start_indexint, default 10: Starting index in the color palette for column annotations.
row_splitOptional[Union[int, pd.Series]], default None: Number of clusters or grouping series for splitting rows.
col_splitOptional[Union[int, pd.Series]], default None: Number of clusters or grouping series for splitting columns.
row_split_orderOptional[Union[list, str]], default None: Order for row splits, or ‘cluster_between_groups’ for auto-clustering.
col_split_orderOptional[Union[list, str]], default None: Order for column splits, or ‘cluster_between_groups’ for auto-clustering.
row_split_gapfloat, default 0.5: Gap size between row splits in mm.
col_split_gapfloat, default 0.2: Gap size between column splits in mm.
fracfloat, default 0.2: Fraction parameter for annotation label positioning.
relposTuple, default (0, 1): Relative position for annotation labels.
anno_label_heightOptional[float], default None: Height of the annotation label bar.
selected_anno_label_heightfloat, default 2.5: Height of the selected annotation label bar.
category_heightOptional[float], default 2.5: Height of the category annotation bar.
x_nameOptional[str], default None: Label for the x-axis.
y_nameOptional[str], default None: Label for the y-axis.
row_score_namestr, default “association_score”: Column name in adata.obs for the association score bar plot.
cmapstr, default “Oranges”: Colormap for the heatmap.
is_sortbool, default True: Whether to sort rows and columns before plotting.
showbool, default True: Whether to display the figure.
closebool, default False: Whether to close the figure after saving.
outputpath, default None: File path to save the figure. If None, figure is not saved.
**kwargs: Additional keyword arguments passed to ClusterMapPlotter.

Returns

None: Displays or saves the heatmap figure.

Scatter

Scatter chart visualization function.

sciv.pl.manhattan_causal_variant(df: DataFrame, y: str = 'pp', chr_name: str = 'chr', label: str = 'rsId', size: int = 30, labels: list | None = None, colors: list | None = None, width: float = 8, height: float = 2, bottom: float = 0, title: str | None = None, is_sort: bool = True, line_width: float = 0.5, y_round: int = 3, x_name: str | None = 'Chromosome', y_name: str | None = 'pp', y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a Manhattan plot for causal variant visualization across chromosomes.

Parameters

dfDataFrame: Input data containing variant information with chromosome and position data
ystr, default “pp”: Column name for y-axis values (typically posterior probability or p-value)
chr_namestr, default “chr”: Column name for chromosome identifiers
labelstr, default “rsId”: Column name for variant labels/identifiers
sizeint, default 30: Size of scatter points
labelsOptional[list], optional: List of specific variant labels to annotate on the plot
colorsOptional[list], optional: Custom color palette for different chromosomes
widthfloat, default 8: Figure width in inches
heightfloat, default 2: Figure height in inches
bottomfloat, default 0: Bottom margin adjustment
titlestr, optional: Plot title
is_sortbool, default True: Whether to sort data by chromosome
line_widthfloat, default 0.5: Width of separator lines between chromosomes and grid lines
y_roundint, default 3: Number of decimal places for y-value annotations
x_nameOptional[str], default “Chromosome”: Label for x-axis
y_nameOptional[str], default “pp”: Label for y-axis
y_limitTuple[float, float], default (0, 1): Y-axis limits for the plot
outputpath, optional: Output file path
showbool, default True: Whether to display the plot
closebool, default False: Whether to close the figure after saving
**kwargsAny: Additional arguments passed to ax.axvline
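
Examples

The is_sort option orders the data by chromosome before plotting. The block below is a minimal sketch of one such ordering and of picking variants to pass as labels. The helper chromosome_key, the sample rows, and the 0.5 cut-off are illustrative assumptions, not part of sciv, and sciv's own sorting may differ.

```python
import re

def chromosome_key(chrom: str) -> tuple:
    """Sort key: numbered chromosomes in numeric order, then X, Y, M, etc."""
    name = re.sub(r"^chr", "", chrom, flags=re.IGNORECASE)
    if name.isdigit():
        return (0, int(name), "")
    return (1, 0, name.upper())

# Hypothetical fine-mapping rows: chromosome, rsId, posterior probability (pp).
rows = [
    {"chr": "chr10", "rsId": "rs_c", "pp": 0.12},
    {"chr": "chr2", "rsId": "rs_a", "pp": 0.91},
    {"chr": "chrX", "rsId": "rs_d", "pp": 0.40},
    {"chr": "chr2", "rsId": "rs_b", "pp": 0.05},
]

# sorted() is stable, so variants on the same chromosome keep their input order.
sorted_rows = sorted(rows, key=lambda r: chromosome_key(r["chr"]))
chrom_order = [r["chr"] for r in sorted_rows]

# Variants to annotate: those at or above an arbitrary pp cut-off.
top_labels = [r["rsId"] for r in rows if r["pp"] >= 0.5]

def plot_with_sciv(rows, labels):
    """Not executed here; requires pandas and sciv to be installed.

    The table must also carry position data in the column sciv expects;
    that column is not named in this section.
    """
    import pandas as pd
    import sciv
    df = pd.DataFrame(rows)
    sciv.pl.manhattan_causal_variant(df, y="pp", chr_name="chr", label="rsId", labels=labels)
```

Plain string sorting would place "chr10" before "chr2", which is why the sketch uses a numeric key.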

sciv.pl.pseudo_time_score(df: DataFrame, x: str, y: str, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 1.2, bottom: float = 0, alpha: float = 0.65, line_width: float = 1.5, step_length: int = 5, polyorder: int = 1, size: float | list | set | Tuple | ndarray = 1.0, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a scatter plot showing pseudo-time scores with a smoothed trend line.

Parameters

dfDataFrame: Input data containing pseudo-time and score values
xstr: Column name for pseudo-time values (x-axis)
ystr: Column name for score values (y-axis)
x_namestr, optional: Label for x-axis
y_namestr, optional: Label for y-axis
titlestr, optional: Plot title
widthfloat, default 2: Figure width in inches
heightfloat, default 1.2: Figure height in inches
bottomfloat, default 0: Bottom margin adjustment
alphafloat, default 0.65: Transparency of scatter points
line_widthfloat, default 1.5: Width of the smoothed trend line
step_lengthint, default 5: Step length for determining Savitzky-Golay filter window size
polyorderint, default 1: Polynomial order for Savitzky-Golay filter
sizeUnion[float, collection], default 1.0: Size of scatter points
outputpath, optional: Output file path
showbool, default True: Whether to display the plot
closebool, default False: Whether to close the figure after saving
**kwargsAny: Additional arguments passed to ax.scatter
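
Examples

The trend line is smoothed with a Savitzky-Golay filter. With polyorder=1 (the default), such a filter fits a straight line inside each window and evaluates it at the window center, which for a symmetric window equals the window mean. The pure-Python sketch below shows this for interior points only. The window length of 5 and the sample scores are assumptions for illustration; how sciv derives the window from step_length is not stated in this section.

```python
from statistics import mean

def linear_fit_at_center(window):
    """Least-squares line through a window, evaluated at the center point."""
    half = len(window) // 2
    offsets = range(-half, half + 1)  # centered x positions; they sum to 0
    slope = sum(t * y for t, y in zip(offsets, window)) / sum(t * t for t in offsets)
    intercept = mean(window)          # valid because the offsets sum to 0
    return intercept + slope * 0      # fitted value at the center (t = 0)

def smooth_interior(values, window_length=5):
    """First-order Savitzky-Golay-style smoothing, interior points only."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    half = window_length // 2
    return [
        linear_fit_at_center(values[i - half:i + half + 1])
        for i in range(half, len(values) - half)
    ]

scores = [0.0, 1.0, 0.0, 4.0, 5.0, 2.0, 6.0]
smoothed = smooth_interior(scores, window_length=5)

# The same result as a plain centered moving average.
moving_average = [mean(scores[i - 2:i + 3]) for i in range(2, len(scores) - 2)]
```

A higher polyorder follows local curvature more closely, at the cost of a less smooth trend line.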

sciv.pl.scatter_3d(df: DataFrame, x: str, y: str, z: str, hue: str | None = None, x_name: str | None = None, y_name: str | None = None, z_name: str | None = None, title: str | None = None, width: float = 7, height: float = 7, elev: float = 30, azim: float = -60, is_add_legend: bool = True, cmap: str | ListedColormap = 'tab20', font_size: int = 14, edge_color: str | None = None, size: float | list | set | Tuple | ndarray = 0.1, legend_name: str | None = None, is_add_max_label: bool = False, text_left_offset: float = 0.5, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a 3D scatter plot with customizable aesthetics.

Parameters

dfDataFrame: Input data containing x, y, z coordinates
xstr: Column name for x-axis values
ystr: Column name for y-axis values
zstr: Column name for z-axis values
huestr, optional: Column name for color grouping
x_namestr, optional: Label for x-axis
y_namestr, optional: Label for y-axis
z_namestr, optional: Label for z-axis
titlestr, optional: Plot title
widthfloat, default 7: Figure width in inches
heightfloat, default 7: Figure height in inches
elevfloat, default 30: Elevation angle for 3D view
azimfloat, default -60: Azimuth angle for 3D view
is_add_legendbool, default True: Whether to add legend
cmapUnion[str, ListedColormap], default ‘tab20’: Colormap for coloring
font_sizeint, default 14: Font size for labels and title
edge_colorstr, optional: Edge color for scatter points
sizeUnion[float, collection], default 0.1: Size of scatter points
legend_namestr, optional: Title for legend
is_add_max_labelbool, default False: Whether to add label for maximum z value point
text_left_offsetfloat, default 0.5: Horizontal offset for max value label
outputpath, optional: Output file path
showbool, default True: Whether to display the plot
closebool, default False: Whether to close the figure after saving
**kwargsAny: Additional arguments passed to ax.scatter

sciv.pl.scatter_atac(adata: AnnData, columns: Tuple[str, str] = ('UMAP1', 'UMAP2'), groupby: str = 'clusters', hue_order: list | None = None, width: float = 2, height: float = 2, x_name: str | None = None, y_name: str | None = None, start_color_index: int = 0, color_step_size: int = 0, type_colors: list | set | Tuple | ndarray | None = None, edge_color: str | None = None, size: float = 1.0, text_fontsize: float = 7, legend_fontsize: float = 7, is_text: bool = False, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a scatter plot for ATAC-seq data with cluster coloring.

Parameters

adataAnnData: AnnData object containing observations and coordinates
columnsTuple[str, str], default (“UMAP1”, “UMAP2”): Column names for x and y coordinates in adata.obs
groupbystr, default “clusters”: Column name for cluster labels in adata.obs
hue_orderlist, optional: Order of clusters for legend
widthfloat, default 2: Figure width in inches
heightfloat, default 2: Figure height in inches
x_namestr, optional: Label for x-axis
y_namestr, optional: Label for y-axis
start_color_indexint, default 0: Starting index in color palette
color_step_sizeint, default 0: Step size for color selection
type_colorscollection, optional: Custom color palette
edge_colorstr, optional: Edge color for scatter points
sizefloat, default 1.0: Size of scatter points
text_fontsizefloat, default 7: Font size for annotation text
legend_fontsizefloat, default 7: Font size for legend text
is_textbool, default False: Whether to add text annotations
outputpath, optional: Output file path
showbool, default True: Whether to display the plot
closebool, default False: Whether to close the figure after saving
**kwargsAny: Additional arguments passed to scatter_base

Create a base scatter plot with customizable aesthetics.

Parameters

dfDataFrame: Input data containing x, y coordinates and optional hue values
xstr: Column name for x-axis values
ystr: Column name for y-axis values
huestr, optional: Column name for color grouping
hue_orderlist, optional: Order of hue categories for legend
x_namestr, optional: Label for x-axis
y_namestr, optional: Label for y-axis
titlestr, optional: Plot title
bar_labelstr, optional: Label for colorbar when number=True
cmapstr, default “Oranges”: Colormap for continuous coloring
widthfloat, default 2: Figure width in inches
heightfloat, default 2: Figure height in inches
rightfloat, default 0.9: Position for legend anchor
bottomfloat, default 0: Bottom margin adjustment
text_fontsizefloat, default 7: Font size for annotation text
legend_fontsizefloat, default 7: Font size for legend text
start_color_indexint, default 0: Starting index in color palette
color_step_sizeint, default 0: Step size for color selection
type_colorscollection, optional: Custom color palette
edge_colorstr, optional: Edge color for scatter points
sizeUnion[float, collection], default 1.0: Size of scatter points
legenddict, optional: Mapping to rename hue categories
numberbool, default False: Whether to use continuous color scale
is_textbool, default False: Whether to add text annotations
outputpath, optional: Output file path
showbool, default True: Whether to display the plot
closebool, default False: Whether to close the figure after saving
**kwargsAny: Additional arguments passed to sns.scatterplot

Create scatter plots of trait data.

Parameters

trait_adataAnnData: AnnData object containing trait/disease scores and cell metadata
titlestr, optional: Title prefix for the plot
bar_labelstr, optional: Label for colorbar when number=True
trait_namestr, default “All”: Name of trait/disease to plot, or “All” to plot all traits
layersUnion[None, collection], optional: List of layer names to plot from trait_adata.layers
columnsTuple[str, str], default (“UMAP1”, “UMAP2”): Column names for x and y coordinates in trait_adata.obs
cmapstr, default “viridis”: Colormap for continuous coloring
widthfloat, default 2: Figure width in inches
heightfloat, default 2: Figure height in inches
rightfloat, default 0.9: Position for legend anchor
x_namestr, optional: Label for x-axis
y_namestr, optional: Label for y-axis
numberbool, default True: Whether to use continuous color scale for trait scores
edge_colorstr, optional: Edge color for scatter points
sizeUnion[float, collection], default 1.0: Size of scatter points
text_fontsizefloat, default 7: Font size for annotation text
legend_fontsizefloat, default 7: Font size for legend text
start_color_indexint, default 0: Starting index in color palette
color_step_sizeint, default 0: Step size for color selection
type_colorscollection, optional: Custom color palette
is_textbool, default False: Whether to add text annotations
legenddict, optional: Mapping to rename hue categories
outputpath, optional: Output directory path for saving plots
showbool, default True: Whether to display the plot
closebool, default False: Whether to close the figure after saving
**kwargsAny: Additional arguments passed to scatter_base

sciv.pl.volcano_base(df: DataFrame, x: str = 'Log2(Fold change)', y: str = '-Log10(P value)', hue: str = 'type', size: int = 3, palette: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, y_min: float = 0, axh_value: float = np.float64(3.0), axv_left_value: float = -1, axv_right_value: float = 1, title: str | None = None, x_name: str | None = None, y_name: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a volcano plot.

Parameters

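dfDataFrame: Input data containing the columns named by x, y, and hue.
xstr, default “Log2(Fold change)”: Column name for x-axis values.
ystr, default “-Log10(P value)”: Column name for y-axis values.
huestr, default “type”: Column name for color grouping.
sizeint, default 3: Size of scatter points.
paletteOptional[list], optional: Color palette for the hue categories.
widthfloat, default 2: Figure width in inches.
heightfloat, default 2: Figure height in inches.
bottomfloat, default 0: Bottom margin adjustment.
y_minfloat, default 0: Lower limit of the y-axis.
axh_valuefloat, default 3.0: Y value of the horizontal reference line.
axv_left_valuefloat, default -1: X value of the left vertical reference line.
axv_right_valuefloat, default 1: X value of the right vertical reference line.
titlestr, optional: Plot title.
x_nameOptional[str], optional: Label for x-axis.
y_nameOptional[str], optional: Label for y-axis.
outputpath, optional: Output file path.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments passed to sns.scatterplot.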

Returns

None
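
Examples

The default columns read by volcano_base are “Log2(Fold change)”, “-Log10(P value)”, and “type”. The block below is a minimal sketch of preparing those three columns from fold changes and P values. It assumes that axh_value, axv_left_value, and axv_right_value act as significance and fold-change cut-offs (3.0 corresponds to P = 0.001); the category labels “Up”, “Down”, and “NS” and the sample records are hypothetical.

```python
import math

def classify(log2_fc, neg_log10_p, axh_value=3.0,
             axv_left_value=-1.0, axv_right_value=1.0):
    """Label a point relative to the three reference lines (assumed cut-offs)."""
    if neg_log10_p < axh_value:
        return "NS"    # below the horizontal line: not significant
    if log2_fc >= axv_right_value:
        return "Up"
    if log2_fc <= axv_left_value:
        return "Down"
    return "NS"        # significant, but fold change too small

# Hypothetical (log2 fold change, P value) records.
records = [(2.0, 1e-5), (-1.5, 1e-4), (0.3, 1e-6), (3.0, 0.05)]

rows = []
for log2_fc, p_value in records:
    neg_log10_p = -math.log10(p_value)
    rows.append({
        "Log2(Fold change)": log2_fc,
        "-Log10(P value)": neg_log10_p,
        "type": classify(log2_fc, neg_log10_p),
    })

def plot_with_sciv(rows):
    """Not executed here; requires pandas and sciv to be installed."""
    import pandas as pd
    import sciv
    sciv.pl.volcano_base(pd.DataFrame(rows), x="Log2(Fold change)",
                         y="-Log10(P value)", hue="type")
```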

Violin

Violin chart visualization function.

sciv.pl.violin_base(df: DataFrame, value: str = 'value', x_name: str | None = None, y_name: str = 'value', kind: Literal['strip', 'swarm', 'box', 'violin', 'boxen', 'point', 'bar', 'count'] = 'violin', groupby: str = 'clusters', palette: Tuple | list | None = None, hue: str | None = None, width: float = 2, height: float = 2, bottom: float = 0.3, rotation: float = 65, line_width: float = 0.5, title: str | None = None, split: bool = False, is_sort: bool = True, order_names: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a violin plot, or another categorical plot selected by kind.

Parameters

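dfDataFrame: Input data.
valuestr, default “value”: Column name containing the values to plot.
x_namestr, optional: Label for the x-axis.
y_namestr, default “value”: Label for the y-axis.
kind_Kind, default “violin”: Kind of categorical plot; one of “strip”, “swarm”, “box”, “violin”, “boxen”, “point”, “bar”, or “count”.
groupbystr, default “clusters”: Column name containing cluster assignments.
paletteUnion[Tuple, list], optional: Color palette for the plot.
huestr, optional: Column name for color grouping.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0.3: Bottom margin of the figure.
rotationfloat, default 65: Rotation angle for x-axis labels in degrees.
line_widthfloat, default 0.5: Width of the plot lines.
titlestr, optional: Title of the plot.
splitbool, default False: Whether to split the violins when using hue.
is_sortbool, default True: Whether to sort clusters by median value.
order_nameslist, optional: Custom order for cluster names.
outputpath, optional: File path to save the figure.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments.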

Returns

None

sciv.pl.violin_trait(trait_df: DataFrame, trait_name: str | list = 'All', trait_column_name: str = 'id', value: str = 'value', groupby: str = 'clusters', kind: Literal['strip', 'swarm', 'box', 'violin', 'boxen', 'point', 'bar', 'count'] = 'violin', x_name: str | None = None, y_name: str = 'value', palette: Tuple | None = None, width: float = 2, height: float = 2, rotation: float = 65, line_width: float = 0.1, bottom: float = 0.3, split: bool = False, is_sort: bool = True, order_names: list | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create violin plots for trait data.

This function creates violin plots (or other categorical plots) for trait data, allowing visualization of trait distributions across different clusters.

Parameters

trait_dfDataFrame: Input trait data containing trait information and values.
trait_nameUnion[str, list], optional: Name(s) of the trait(s) to plot. Use “All” to plot all traits.
trait_column_namestr, optional: Column name in trait_df that contains trait identifiers.
valuestr, optional: Column name containing the values to plot.
groupbystr, optional: Column name containing cluster assignments.
kind_Kind, optional: Type of categorical plot to create (e.g., “violin”, “box”, “strip”).
x_namestr, optional: Label for the x-axis.
y_namestr, optional: Label for the y-axis.
paletteTuple, optional: Color palette for the plot.
widthfloat, optional: Width of the figure in inches.
heightfloat, optional: Height of the figure in inches.
rotationfloat, optional: Rotation angle for x-axis labels in degrees.
line_widthfloat, optional: Width of the plot lines.
bottomfloat, optional: Bottom margin of the figure.
splitbool, optional: Whether to split the violin plot when using hue.
is_sortbool, optional: Whether to sort clusters by median value.
order_nameslist, optional: Custom order for cluster names.
titlestr, optional: Title prefix for the plot.
outputpath, optional: Directory path to save the output files.
showbool, optional: Whether to display the plot.
closebool, optional: Whether to close the figure after saving.
kwargsAny, optional: Additional keyword arguments passed to violin_base.

Returns

None

Box

Box plot visualization functions.

sciv.pl.box_base(df: DataFrame, x: str = 'clusters', y: str = 'value', x_name: str | None = None, y_name: str = 'value', palette: Tuple | list | None = None, width: float = 2, height: float = 2, bottom: float = 0.3, line_width: float = 0.3, marker_size: float = 0.2, rotation: float = 65, orient: str | None = None, title: str | None = None, whis: float = 1.5, show_fliers: bool = True, is_sort: bool = True, order_names: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a box plot with customizable styling options.

Parameters

dfDataFrame: Input data containing the values to plot.
xstr, default “clusters”: Column name for the x-axis categorical variable.
ystr, default “value”: Column name for the y-axis numerical variable.
x_namestr, optional: Custom label for the x-axis. If None, uses the x column name.
y_namestr, default “value”: Custom label for the y-axis.
paletteUnion[Tuple, list], optional: Color palette for the boxes. If None and “color” column exists, uses that.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0.3: Bottom margin adjustment for the plot.
line_widthfloat, default 0.3: Width of lines in the plot (box edges, whiskers, etc.).
marker_sizefloat, default 0.2: Size of outlier markers.
rotationfloat, default 65: Rotation angle for x-axis tick labels in degrees.
orientstr, optional: Orientation of the plot (“v” for vertical, “h” for horizontal).
titlestr, optional: Title of the plot.
whisfloat, default 1.5: Proportion of the IQR past the low and high quartiles to extend the whiskers.
show_fliersbool, default True: Whether to display outlier points beyond the whiskers.
is_sortbool, default True: Whether to sort boxes by median value in descending order.
order_nameslist, optional: Custom order for x-axis categories. Only used if is_sort is False.
outputpath, optional: File path to save the plot. If None, plot is not saved.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after displaying.
**kwargsAny: Additional keyword arguments passed to seaborn.boxplot.
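
Examples

Two options above lend themselves to a small worked example: is_sort orders boxes by median in descending order, and whis sets how far the whiskers may reach as a multiple of the interquartile range (IQR). The standard-library sketch below computes both for made-up data. The cluster names and values are hypothetical, and the quartile method used here (statistics.quantiles with method="inclusive") is an assumption that may differ slightly from what the plotting backend uses.

```python
from statistics import median, quantiles

# Hypothetical values per cluster.
values_by_cluster = {
    "B cell": [1.0, 2.0, 3.0, 4.0, 5.0],
    "T cell": [4.0, 5.0, 6.0, 7.0, 30.0],
    "NK": [2.0, 2.5, 3.5, 4.0, 4.5],
}

# Order clusters by median, largest first (the is_sort=True behavior described above).
order = sorted(values_by_cluster,
               key=lambda cluster: median(values_by_cluster[cluster]),
               reverse=True)

def whisker_limits(values, whis=1.5):
    """Return the (low, high) limits beyond which points count as fliers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whis * iqr, q3 + whis * iqr

low, high = whisker_limits(values_by_cluster["T cell"], whis=1.5)
fliers = [v for v in values_by_cluster["T cell"] if v < low or v > high]
```

Whiskers are drawn to the most extreme data points that still fall inside these limits; with show_fliers=False the points outside them are hidden rather than removed from the data.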

sciv.pl.box_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', value: str = 'value', groupby: str = 'clusters', x_name: str | None = None, y_name: str = 'value', palette: Tuple | list | None = None, orient: str | None = None, width: float = 2, height: float = 2, line_width: float = 0.1, marker_size: float = 0.5, bottom: float = 0.3, rotation: float = 65, whis: float = 1.5, show_fliers: bool = True, is_sort: bool = True, order_names: list | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create box plots for trait/disease data across different clusters.

This function generates box plots for each trait or a specific trait from the input dataframe. It filters data by trait and creates individual box plots using the box_base function.

Parameters

trait_dfDataFrame: Input data containing trait/disease information and values to plot.
trait_namestr, default “All”: Name of the trait/disease to plot. Use “All” to plot all traits.
trait_column_namestr, default “id”: Column name in trait_df that contains trait/disease identifiers.
valuestr, default “value”: Column name for the numerical values to be plotted on y-axis.
groupbystr, default “clusters”: Column name for the cluster categories to be plotted on x-axis.
x_namestr, optional: Custom label for the x-axis. If None, uses the clusters column name.
y_namestr, default “value”: Custom label for the y-axis.
paletteUnion[Tuple, list], optional: Color palette for the boxes.
orientstr, optional: Orientation of the plot (“v” for vertical, “h” for horizontal).
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
line_widthfloat, default 0.1: Width of lines in the plot.
marker_sizefloat, default 0.5: Size of outlier markers.
bottomfloat, default 0.3: Bottom margin adjustment for the plot.
rotationfloat, default 65: Rotation angle for x-axis tick labels in degrees.
whisfloat, default 1.5: Proportion of the IQR to extend the whiskers.
show_fliersbool, default True: Whether to display outlier points beyond the whiskers.
is_sortbool, default True: Whether to sort boxes by median value.
order_nameslist, optional: Custom order for x-axis categories.
titlestr, optional: Base title for the plots. Trait name will be appended.
outputpath, optional: Directory path to save the plots. If None, plots are not saved.
showbool, default True: Whether to display the plots.
closebool, default False: Whether to close the figure after displaying.
**kwargsAny: Additional keyword arguments passed to box_base function.
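
Examples

As described above, box_trait filters the input by trait and draws one box plot per trait. The sketch below mimics that selection step on plain dictionaries so the expected long format (one row per trait, cluster, and value) is visible. The helper split_by_trait and the sample rows are illustrative assumptions, not sciv internals.

```python
from collections import defaultdict

def split_by_trait(rows, trait_name="All", trait_column_name="id"):
    """Group long-format rows by trait; keep one trait or all of them."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[trait_column_name]].append(row)
    if trait_name == "All":
        return dict(groups)
    if trait_name not in groups:
        raise KeyError(f"trait {trait_name!r} not found in column {trait_column_name!r}")
    return {trait_name: groups[trait_name]}

# Hypothetical long-format trait table using the default column names.
rows = [
    {"id": "height", "clusters": "B cell", "value": 0.2},
    {"id": "height", "clusters": "T cell", "value": 0.7},
    {"id": "asthma", "clusters": "B cell", "value": 0.9},
    {"id": "asthma", "clusters": "T cell", "value": 0.4},
    {"id": "asthma", "clusters": "NK", "value": 0.1},
]

all_traits = split_by_trait(rows)                       # one entry per trait
only_asthma = split_by_trait(rows, trait_name="asthma") # a single trait

def plot_with_sciv(rows):
    """Not executed here; requires pandas and sciv to be installed."""
    import pandas as pd
    import sciv
    sciv.pl.box_trait(pd.DataFrame(rows), trait_name="asthma",
                      trait_column_name="id", value="value", groupby="clusters")
```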

KDE

Kernel density estimation (KDE) plot visualization function.

sciv.pl.kde(adata: AnnData, layer: str | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 4, height: float = 2, bottom: float = 0.3, axis: Literal[-1, 0, 1] = -1, sample_number: int = 1000000, is_legend: bool = True, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Plot Kernel Density Estimation (KDE) for single-cell data.

Parameters

adataAnnData: Annotated data matrix with observations (rows) and variables (columns).
layerstr, optional: Which layer of adata to use. If None, uses adata.X.
x_namestr, optional: Label for the x-axis.
y_namestr, optional: Label for the y-axis.
titlestr, optional: Title of the plot.
widthfloat, default 4: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0.3: Bottom margin of the figure.
axisLiteral[-1, 0, 1], default -1: Axis along which to compute the KDE: -1 flattens all data and computes a single KDE; 0 computes a KDE for each column (variable); 1 computes a KDE for each row (observation).
sample_numberint, default 1000000: Maximum number of samples to use for KDE computation. If data exceeds this, random downsampling is applied.
is_legendbool, default True: Whether to display legend when axis is 0 or 1.
outputpath, optional: Path to save the figure. If None, figure is not saved.
showbool, default True: Whether to display the figure.
closebool, default False: Whether to close the figure after displaying.
**kwargsAny: Additional keyword arguments passed to seaborn.kdeplot.
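
Examples

The axis and sample_number options decide which value series are handed to the density estimator. The sketch below reproduces that selection on a small nested list standing in for the data matrix (rows are observations, columns are variables). The helper kde_inputs, the series names, and the choice to downsample each series separately are assumptions for illustration; sciv may apply the limit differently.

```python
import random

def kde_inputs(matrix, axis=-1, sample_number=1_000_000, seed=0):
    """Return the named value series a KDE would be computed on."""
    if axis == -1:
        # Flatten everything into a single series.
        series = {"all": [v for row in matrix for v in row]}
    elif axis == 0:
        # One series per column (variable).
        series = {f"var_{j}": [row[j] for row in matrix] for j in range(len(matrix[0]))}
    elif axis == 1:
        # One series per row (observation).
        series = {f"obs_{i}": list(row) for i, row in enumerate(matrix)}
    else:
        raise ValueError("axis must be -1, 0, or 1")
    rng = random.Random(seed)
    # Randomly downsample any series longer than sample_number.
    return {
        name: rng.sample(values, sample_number) if len(values) > sample_number else values
        for name, values in series.items()
    }

matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 observations x 3 variables

flat = kde_inputs(matrix, axis=-1)
per_variable = kde_inputs(matrix, axis=0)
per_observation = kde_inputs(matrix, axis=1)
downsampled = kde_inputs(matrix, axis=-1, sample_number=4)
```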

Line

Line chart visualization function.

Base line plot function for visualizing data trends over time or categories.

This function creates a line plot from either AnnData or DataFrame objects, supporting grouped data visualization with customizable colors, legends, and styling.

Parameters

dataUnion[AnnData, DataFrame]: Input data object, can be either AnnData (single-cell data) or pandas DataFrame.
xstr: Column name to use for x-axis values.
ystr: Column name to use for y-axis values.
layerOptional[str], default None: Specific layer to use from AnnData.layers when data is AnnData.
widthfloat, default 2: Figure width in inches.
heightfloat, default 2: Figure height in inches.
bottomfloat, default 0: Bottom margin adjustment for the plot.
titleOptional[str], default None: Title of the plot.
x_nameOptional[str], default None: Label for x-axis. If None, uses x column name.
y_nameOptional[str], default None: Label for y-axis. If None, uses y column name.
labelOptional[str], default None: Column name used for grouping data (creates separate lines).
legendOptional[str], default None: Title for the legend. If None and label is provided, uses “category”.
legend_listlist, default None: List of specific group values to include in the plot.
start_color_indexint, default 0: Starting index for color selection from the color palette.
color_step_sizeint, default 0: Step size for selecting colors from the palette.
color_typestr, default “set”: Type of color palette to use (key from plot_color_types).
colorslist, default None: Custom list of colors to use for the plot.
line_widthfloat, default 1.5: Width of the lines in the plot.
x_name_rotationfloat, default 65: Rotation angle for x-axis tick labels (in degrees).
x_ticksOptional[Union[int, collection]], default None: Custom tick positions or number of ticks for x-axis.
y_limitTuple[float, float], default (0, 1): Y-axis limits as (min, max) tuple.
outputOptional[path], default None: File path to save the figure. If None, figure is not saved.
is_strbool, default True: Whether to treat x-axis values as strings (affects tick formatting).
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after display.
**kwargsAny: Additional keyword arguments passed to seaborn.lineplot.

Returns

None: The function displays and/or saves the plot but does not return any value.

Bar

Bar chart visualization function.

Create a simple bar chart with optional value labels.

This function generates a bar plot (vertical or horizontal) with customizable appearance and automatically adds numerical value labels on each bar.

Parameters

ax_xcollection: Categories or labels for the x-axis (or y-axis if horizontal).
ax_ycollection: Numerical values for the bar heights (or widths if horizontal).
x_namestr, optional: Label for the x-axis. Default is None.
y_namestr, optional: Label for the y-axis. Default is None.
titlestr, optional: Title of the plot. Default is None.
colorstr, default “#70b5de”: Color of the bars.
text_colorstr, default “#000205”: Color of the value labels on bars.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0: Bottom margin adjustment.
text_left_movefloat, default 0.1: Horizontal adjustment for text position on bars.
directionLiteral[‘vertical’, ‘horizontal’], default “vertical”: Orientation of the bars.
outputpath, optional: File path to save the figure. Default is None.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments passed to matplotlib’s bar/barh function.

Returns

None: The function displays and/or saves the plot but does not return any value.

sciv.pl.bar_significance(df: DataFrame, x: str, y: str, hue: str, x_name: str | None = None, y_name: str | None = None, anchor: str | None = None, legend: str | None = None, legend_list: list | None = None, hue_order: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, legend_gap: float = 1.15, line_width: float = 0.5, capsize: float = 0.1, errcolor: str = 'k', start_color_index: int = 0, color_step_size: int = 0, color_type: str = 'set', test: str = 't-test_ind', ci: str | float = 'sd', x_rotation: float = 0, x_deviation: float = 0.02, y_deviation: float = 0.02, y_limit: Tuple[float, float] = (0, 1), anno: bool = False, anno_fontsize: float = 7, line_height: float = 0.01, line_offset: float = 0.01, colors: list | dict | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a bar chart with statistical significance annotations relative to an anchor group.

This function generates a grouped bar plot with error bars and performs pairwise statistical significance testing between an anchor group and other groups within each category. It supports custom color palettes, legend positioning, and various statistical tests.

Parameters

dfDataFrame: Input DataFrame containing the data to plot.
xstr: Column name for x-axis categories.
ystr: Column name for y-axis values.
huestr: Column name for grouping bars by color.
x_namestr, optional: Label for x-axis. Default is None.
y_namestr, optional: Label for y-axis. Default is None.
anchorstr, optional: Reference group name for pairwise significance testing. If provided, statistical comparisons will be made between this group and all other groups within each x category.
legendstr, optional: Legend title. Default is None, in which case “category” is used.
legend_listlist, optional: Subset of hue values to include in the plot. If provided, only these values will be plotted. Default is None.
hue_orderlist, optional: Order of hue categories for plotting and legend. Default is None.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0: Bottom margin adjustment.
legend_gapfloat, default 1.15: Vertical gap between plot and legend, specified as a ratio of the y-axis height.
line_widthfloat, default 0.5: Width of error bars and significance annotation lines.
capsizefloat, default 0.1: Width of the error bar caps.
errcolorstr, default “k”: Color of the error bars.
start_color_indexint, default 0: Starting index in the color palette for the first hue category.
color_step_sizeint, default 0: Step size when cycling through the color palette for subsequent hue categories.
color_typestr, default “set”: Name of the seaborn color palette to use. Must be a key in plot_color_types.
teststr, default “t-test_ind”: Statistical test for pairwise comparisons. Options include: {“t-test_ind”, “t-test_welch”, “t-test_paired”, “Mann-Whitney”, “Mann-Whitney-gt”, “Mann-Whitney-ls”, “Levene”, “Wilcoxon”, “Kruskal”, “Brunner-Munzel”}.
ciUnion[str, float], default “sd”: Confidence interval type or value for error bars. Can be “sd” for standard deviation or a float for confidence interval percentage.
x_rotationfloat, default 0: Rotation angle for x-axis tick labels in degrees.
x_deviationfloat, default 0.02: Horizontal offset for bar value annotations.
y_deviationfloat, default 0.02: Vertical offset adjustment for bar value annotations.
y_limitTuple[float, float], default (0, 1): Y-axis limits for the plot.
annobool, default False: Whether to annotate bars with their numerical values.
anno_fontsizefloat, default 7: Font size for bar value annotations.
line_heightfloat, default 0.01: Height of significance annotation lines as a fraction of y-axis range.
line_offsetfloat, default 0.01: Vertical offset for significance annotation lines from the bar tops.
colorsUnion[list, dict], optional: Custom color list or dictionary mapping hue values to colors. If provided, overrides the default color palette. Default is None.
titlestr, optional: Title of the plot. Default is None.
outputpath, optional: File path to save the figure. Default is None.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments passed to seaborn’s barplot function.

Returns

None: The function displays and/or saves the plot but does not return any value.
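
The anchor comparison described above can be sketched in plain Python. This is an illustration only, not sciv's implementation: the helper `anchor_pairs`, the row dictionaries, and the column names `trait`, `method`, and `score` are all hypothetical. It lists, for each x category, the pairs formed by the anchor group and every other hue group, which is the set of comparisons the selected test would be applied to.

```python
from collections import defaultdict


def anchor_pairs(rows, x, hue, anchor):
    """Return ((category, anchor), (category, other)) pairs per x category."""
    groups = defaultdict(list)  # x category -> hue values in first-seen order
    for row in rows:
        if row[hue] not in groups[row[x]]:
            groups[row[x]].append(row[hue])
    pairs = []
    for category, hues in groups.items():
        if anchor not in hues:
            continue  # no anchor group in this category, nothing to compare
        pairs.extend(
            ((category, anchor), (category, other))
            for other in hues
            if other != anchor
        )
    return pairs


# Hypothetical long-format data: one row per observation.
rows = [
    {"trait": "T1", "method": "A", "score": 0.8},
    {"trait": "T1", "method": "B", "score": 0.6},
    {"trait": "T1", "method": "C", "score": 0.5},
    {"trait": "T2", "method": "A", "score": 0.7},
    {"trait": "T2", "method": "B", "score": 0.4},
]
pairs = anchor_pairs(rows, x="trait", hue="method", anchor="A")
```

With `anchor="A"`, category `T1` yields two comparisons (A vs. B, A vs. C) and `T2` yields one (A vs. B); groups are never compared with each other unless one of them is the anchor.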

sciv.pl.bar_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', value: str = 'rate', groupby: str = 'clusters', x_name: str = 'Cell type', y_name: str = 'Enrichment ratio', color: Tuple = ('#2e6fb7', '#f7f7f7'), legend: Tuple = ('Enrichment', 'Conservative'), text_color: str = '#000205', groupby_sort: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, title: str | None = None, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create stacked bar charts for multiple traits or a specific trait.

This function generates enrichment bar plots for traits (e.g., diseases, gene sets) in the input DataFrame. It can plot all traits or a specific trait based on the trait_name parameter. Each trait’s enrichment data is visualized using the class_bar function, with results saved to individual files.

Parameters

trait_dfDataFrame: Input DataFrame containing trait enrichment data. Must include columns for trait identifiers, cluster labels, and enrichment values.
trait_namestr, default “All”: The specific trait to plot. If “All”, plots bar charts for all unique traits in the trait_column_name column.
trait_column_namestr, default “id”: Column name in trait_df that contains trait identifiers.
valuestr, default “rate”: Column name containing the numerical enrichment values to plot.
groupbystr, default “clusters”: Column name containing cluster or cell type labels.
x_namestr, default “Cell type”: Label for the x-axis.
y_namestr, default “Enrichment ratio”: Label for the y-axis.
colorTuple, default (“#2e6fb7”, “#f7f7f7”): Colors for the two bar segments (enrichment color, conservative color).
legendTuple, default (“Enrichment”, “Conservative”): Labels for the legend corresponding to the two bar segments.
text_colorstr, default “#000205”: Color of the value labels on bars.
groupby_sortOptional[list], default None: Custom order for clusters. If provided, clusters will be sorted according to this list. If None, clusters are sorted by enrichment value.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0: Bottom margin adjustment.
rotationfloat, default 65: Rotation angle for x-axis tick labels in degrees.
titlestr, optional: Base title of the plot. The trait name will be appended to this title. Default is None.
text_left_movefloat, default 0.15: Horizontal adjustment for text position on bars.
y_limitTuple[float, float], default (0, 1): The y-axis limits for the plot.
outputpath, optional: Directory path to save the figures. If provided, each trait’s plot will be saved as a PDF file in this directory. Default is None.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments passed to the class_bar function.

Returns

None: The function displays and/or saves the plots but does not return any value.
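
The effect of `trait_name` can be pictured with a small sketch. It is not taken from sciv's source: `select_traits` is a hypothetical helper, and the one-file-per-trait naming shown at the end is an assumption (the reference only states that each trait's plot is saved as a PDF in the output directory).

```python
from pathlib import Path


def select_traits(trait_ids, trait_name="All"):
    """Return every unique trait for "All", otherwise only the requested one."""
    unique = list(dict.fromkeys(trait_ids))  # de-duplicate, keep first-seen order
    if trait_name == "All":
        return unique
    if trait_name not in unique:
        raise ValueError(f"trait {trait_name!r} not found")
    return [trait_name]


ids = ["T1", "T1", "T2", "T3", "T2"]  # values of the trait identifier column
all_traits = select_traits(ids)
one_trait = select_traits(ids, "T2")

# Assumed naming scheme: one PDF per trait inside the output directory.
targets = [Path("figures") / f"{trait}.pdf" for trait in all_traits]
```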

sciv.pl.class_bar(df: DataFrame, value: str = 'rate', by: str = 'value', groupby: str = 'clusters', color: Tuple = ('#2e6fb7', '#f7f7f7'), x_name: str = 'Cell type', y_name: str = 'Enrichment ratio', legend: Tuple = ('Enrichment', 'Conservative'), text_color: str = '#000205', groupby_sort: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, title: str | None = None, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a stacked bar chart for enrichment analysis with two categories.

This function filters a DataFrame by a binary column, sorts the data by clusters, and generates a stacked bar plot using the two_bar function. It is typically used to visualize enrichment ratios where one category represents enriched values and the other represents conservative values.

Parameters

dfDataFrame: Input DataFrame containing the data to plot.
valuestr, default “rate”: Column name containing the numerical values to plot.
bystr, default “value”: Column name used to filter the DataFrame into two categories (typically binary: 0 and 1).
groupbystr, default “clusters”: Column name containing the cluster labels or categories.
colorTuple, default (“#2e6fb7”, “#f7f7f7”): Colors for the two bar segments (enrichment color, conservative color).
x_namestr, default “Cell type”: Label for the x-axis.
y_namestr, default “Enrichment ratio”: Label for the y-axis.
legendTuple, default (“Enrichment”, “Conservative”): Labels for the legend corresponding to the two bar segments.
text_colorstr, default “#000205”: Color of the value labels on bars.
groupby_sortOptional[list], default None: Custom order for clusters. If provided, clusters will be sorted according to this list. If None, clusters will be sorted by value in descending order.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0: Bottom margin adjustment.
rotationfloat, default 65: Rotation angle for x-axis tick labels in degrees.
titlestr, optional: Title of the plot. Default is None.
text_left_movefloat, default 0.15: Horizontal adjustment for text position on bars.
y_limitTuple[float, float], default (0, 1): The y-axis limits for the plot.
outputpath, optional: File path to save the figure. Default is None.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments passed to the two_bar function.

Returns

None: The function displays and/or saves the plot but does not return any value.
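
The filtering and ordering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: it assumes that rows whose `by` column equals 1 carry the enrichment rate, and `order_clusters` and the sample rows are hypothetical.

```python
def order_clusters(rows, value="rate", by="value", groupby="clusters",
                   groupby_sort=None):
    """Pick the enriched rows and order clusters for plotting."""
    # Assumption: rows with by == 1 hold the enrichment rate of each cluster.
    enriched = {r[groupby]: r[value] for r in rows if r[by] == 1}
    if groupby_sort is not None:
        order = [c for c in groupby_sort if c in enriched]  # custom order
    else:
        order = sorted(enriched, key=enriched.get, reverse=True)  # descending
    return order, [enriched[c] for c in order]


rows = [
    {"clusters": "B cell", "value": 1, "rate": 0.2},
    {"clusters": "B cell", "value": 0, "rate": 0.8},
    {"clusters": "T cell", "value": 1, "rate": 0.7},
    {"clusters": "T cell", "value": 0, "rate": 0.3},
    {"clusters": "NK", "value": 1, "rate": 0.5},
    {"clusters": "NK", "value": 0, "rate": 0.5},
]
default_order, default_rates = order_clusters(rows)
custom_order, custom_rates = order_clusters(
    rows, groupby_sort=["NK", "B cell", "T cell"]
)
```

Without `groupby_sort`, the clusters come out as T cell, NK, B cell (highest rate first); with it, the supplied order is kept.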

sciv.pl.rate_bar_plot(adata: AnnData, layer: str | None = None, trait_name: str = 'All', dir_name: str = 'feature', column: str = 'value', groupby: str = 'clusters', color: Tuple = ('#2e6fb7', '#f7f7f7'), legend: Tuple = ('Enrichment', 'Conservative'), x_name: str = 'Cell type', y_name: str = 'Enrichment ratio', groupby_sort: list | None = None, text_color: str = '#000205', width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, title: str | None = None, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), plot_output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Generate a bar plot showing enrichment ratios for trait-cluster combinations.

This function calculates the completion ratio using the complete_ratio function and visualizes the results as a bar plot. It handles directory creation for output files and passes appropriate parameters to the bar_trait plotting function.

Parameters

adataAnnData: Input AnnData object containing the data to be visualized.
layerstr, optional: Specify the layer of the matrix to be processed. If None, uses the main matrix.
trait_namestr, default “All”: The name of the trait being analyzed, used for filtering data.
dir_namestr, default “feature”: Folder name for generating and saving bar plot outputs.
columnstr, default “value”: The column name containing the binary enrichment values.
groupbystr, default “clusters”: The column name in adata.obs that defines the cell clusters.
colorTuple, default (“#2e6fb7”, “#f7f7f7”): Color tuple for the bar plot (enrichment color, conservative color).
legendTuple, default (“Enrichment”, “Conservative”): Legend labels for the two categories in the plot.
x_namestr, default “Cell type”: The label for the x-axis.
y_namestr, default “Enrichment ratio”: The label for the y-axis.
groupby_sortOptional[list], optional: Custom order for clusters. If None, uses default sorting.
text_colorstr, default “#000205”: Color for text annotations in the plot.
widthfloat, default 2: The width of the output figure in inches.
heightfloat, default 2: The height of the output figure in inches.
bottomfloat, default 0: Bottom margin adjustment for the plot.
rotationfloat, default 65: Rotation angle for x-axis labels in degrees.
titlestr, optional: The title of the plot. If None, no title is displayed.
text_left_movefloat, default 0.15: Horizontal adjustment for text position.
y_limitTuple[float, float], default (0, 1): The y-axis limits for the plot.
plot_outputpath, optional: Directory path for saving output files. If None, figures are not saved.
showbool, default True: If True, display the figure interactively.
closebool, default False: If True, close the figure after saving.
**kwargsAny: Additional keyword arguments passed to bar_trait function.

Returns

None: This function does not return any value. Outputs are saved to files or displayed.
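
As a rough picture of the quantity being plotted, the sketch below computes a per-cluster ratio from a binary column. It assumes the ratio is the fraction of cells in a cluster whose binary value is 1; how `complete_ratio` actually computes it is not shown in this reference, and `enrichment_ratio` is a hypothetical name.

```python
from collections import Counter


def enrichment_ratio(clusters, flags):
    """Fraction of cells per cluster whose binary flag equals 1."""
    total = Counter(clusters)
    hits = Counter(c for c, f in zip(clusters, flags) if f == 1)
    return {c: hits[c] / total[c] for c in total}


# Hypothetical per-cell data: cluster label and binary enrichment value.
clusters = ["T", "T", "T", "T", "B", "B"]
flags = [1, 1, 1, 0, 0, 1]
ratios = enrichment_ratio(clusters, flags)
```

Under this assumption, three of the four `T` cells are enriched (0.75) and one of the two `B` cells (0.5), so both values fall inside the default `y_limit` of (0, 1).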

sciv.pl.two_bar(ax_x: list | set | Tuple | ndarray, ax_y: Tuple, x_name: str | None = None, y_name: str | None = None, legend: Tuple = ('1', '2'), color: Tuple = ('#2e6fb7', '#f7f7f7'), text_color: str = '#000205', width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a stacked bar chart with two categories.

This function generates a stacked bar plot where two sets of values are displayed as stacked bars. It automatically adds numerical value labels on the first bar segment and includes a legend for the two categories.

Parameters

ax_xcollection: Categories or labels for the x-axis.
ax_yTuple: A tuple containing two collections of numerical values for the two bar segments. The second segment will be stacked on top of the first.
x_namestr, optional: Label for the x-axis. Default is None.
y_namestr, optional: Label for the y-axis. Default is None.
legendTuple, default (“1”, “2”): Labels for the legend corresponding to the two bar segments.
colorTuple, default (“#2e6fb7”, “#f7f7f7”): Colors for the two bar segments (first segment, second segment).
text_colorstr, default “#000205”: Color of the value labels on bars.
widthfloat, default 2: Width of the figure in inches.
heightfloat, default 2: Height of the figure in inches.
bottomfloat, default 0: Bottom margin adjustment.
rotationfloat, default 65: Rotation angle for x-axis tick labels in degrees.
text_left_movefloat, default 0.15: Horizontal adjustment for text position on bars.
y_limitTuple[float, float], default (0, 1): The y-axis limits for the plot.
titlestr, optional: Title of the plot. Default is None.
outputpath, optional: File path to save the figure. Default is None.
showbool, default True: Whether to display the plot.
closebool, default False: Whether to close the figure after saving.
**kwargsAny: Additional keyword arguments passed to matplotlib’s bar function.

Returns

None: The function displays and/or saves the plot but does not return any value.

Barcode

Barcode visualization function.

sciv.pl.barcode_base(df: DataFrame, groupby_list: list, sort_column: str = 'value', column: str = 'clusters', width: float = 1, height: float = 3, trait_column_name: str = 'id', title: str | None = None, cmap: str = 'Oranges', bar_label: str = 'TRS', is_ticks: bool = True, colors: list | None = None, ground_true: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False) → None

Plot a barcode plot.

Parameters

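dfDataFrame: Input data.
groupby_listlist: List of clusters to plot.
sort_columnstr, optional: Column name used for sorting values. Default is “value”.
columnstr, optional: Column name containing cluster labels. Default is “clusters”.
widthfloat, optional: Width of the figure. Default is 1.
heightfloat, optional: Height of the figure. Default is 3.
trait_column_namestr, optional: Column name containing trait identifiers. Default is “id”.
titlestr, optional: Title of the plot. Default is None.
cmapstr, optional: Colormap name. Default is “Oranges”.
bar_labelstr, optional: Bar label. Default is “TRS”.
is_ticksbool, optional: Whether to show ticks. Default is True.
colorslist, optional: Custom color list. Default is None.
ground_truelist, optional: Ground-truth cluster labels. Default is None.
outputpath, optional: Output path for saving the figure. Default is None.
showbool, optional: Whether to display the plot. Default is True.
closebool, optional: Whether to close the figure after display. Default is False.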

sciv.pl.barcode_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', sort_column: str = 'value', groupby: str = 'clusters', cmap: str = 'viridis', width: float = 1, height: float = 3, is_ticks: bool = True, colors: list | None = None, ground_true: list | None = None, title: str | None = None, suffix: str = 'pdf', output: str | Path | None = None, show: bool = True, close: bool = False) → None

Plot barcode plots for traits/diseases.

This function generates barcode visualizations for specified traits or all traits in the dataset. It creates individual plots for each trait showing the distribution of trait scores across different clusters.

Parameters

trait_dfDataFrame: Input DataFrame containing trait scores and cluster information.
trait_namestr, optional: Name of the trait/disease to plot. Use “All” to plot all traits. Default is “All”.
trait_column_namestr, optional: Column name in the DataFrame that contains trait identifiers. Default is “id”.
sort_columnstr, optional: Column name used for sorting values in the barcode plot. Default is “value”.
groupbystr, optional: Column name in the DataFrame that contains cluster assignments. Default is “clusters”.
cmapstr, optional: Colormap name for the value heatmap. Default is “viridis”.
widthfloat, optional: Width of the figure in inches. Default is 1.
heightfloat, optional: Height of the figure in inches. Default is 3.
is_ticksbool, optional: Whether to display colorbar ticks. Default is True.
colorslist, optional: Custom color list for cluster visualization. If None, uses default colors. Default is None.
ground_truelist, optional: Ground truth cluster labels for ordering. Default is None.
titlestr, optional: Base title for the plots. Trait name will be appended. Default is None.
suffixstr, optional: File extension for output plots (e.g., “pdf”, “png”). Default is “pdf”.
outputpath, optional: Directory path for saving output files. Default is None.
showbool, optional: Whether to display the plots interactively. Default is True.
closebool, optional: Whether to close the figure after display. Default is False.
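
The idea behind a barcode plot can be sketched without any plotting code. The example below is an assumption-laden illustration rather than sciv's implementation: it assumes cells are ranked by score in descending order and that each cluster gets one row marking where its cells fall in that ranking. `barcode_rows` and the sample data are hypothetical.

```python
def barcode_rows(scores, clusters, groupby_list):
    """For each cluster, mark (1/0) which ranked positions belong to it."""
    # Rank cell indices by score, highest first (assumed sort direction).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {
        group: [1 if clusters[i] == group else 0 for i in order]
        for group in groupby_list
    }


scores = [0.9, 0.1, 0.7, 0.4]      # values of the sort column
clusters = ["T", "B", "T", "B"]    # values of the cluster column
rows = barcode_rows(scores, clusters, ["T", "B"])
```

Here the `T` cells occupy the two top-ranked positions and the `B` cells the two lowest, which is the pattern a barcode plot makes visible as bands of stripes.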

Pie

Pie chart visualization function.

sciv.pl.base_pie(values: list, labels: list, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, pct_distance: float = 0.6, label_distance: float = 1.1, colors: list | None = None, autopct: str = '%1.2f%%', output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a basic pie chart with customizable parameters.

This function generates a simple pie chart using matplotlib, with support for custom colors, labels, and various display options.

Parameters

valueslist: The values to be plotted in the pie chart.
labelslist: The labels corresponding to each value in the pie chart.
x_namestr, optional: The label for the x-axis. Default is None.
y_namestr, optional: The label for the y-axis. Default is None.
titlestr, optional: The title of the pie chart. Default is None.
widthfloat, optional: The width of the figure in inches. Default is 2.
heightfloat, optional: The height of the figure in inches. Default is 2.
bottomfloat, optional: The bottom margin of the figure. Default is 0.
pct_distancefloat, optional: The distance of the percentage labels from the center of the pie. Default is 0.6.
label_distancefloat, optional: The distance of the labels from the center of the pie. Default is 1.1.
colorslist, optional: A list of colors to use for the pie slices. If None, default colors will be used. Default is None.
autopctstr, optional: The format string for the percentage labels. Default is ‘%1.2f%%’.
outputpath, optional: The file path to save the figure. If None, the figure will not be saved. Default is None.
showbool, optional: Whether to display the figure. Default is True.
closebool, optional: Whether to close the figure after displaying. Default is False.
**kwargsAny: Additional keyword arguments passed to matplotlib’s pie function.
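
To see what the default `autopct` string produces, the following snippet applies the same printf-style format to each slice's share of the total, which is how matplotlib treats a string `autopct`. The sample values are made up for illustration.

```python
values = [30, 50, 20]   # hypothetical slice values
fmt = "%1.2f%%"         # default autopct: two decimals plus a literal percent sign

total = sum(values)
# Each slice's percentage of the total, formatted with the autopct string.
pct_labels = [fmt % (100 * v / total) for v in values]
```

The result is `['30.00%', '50.00%', '20.00%']`; a format such as `'%1.0f%%'` would drop the decimals.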

sciv.pl.pie_label(df: DataFrame, map_groupby: str | list | set | Tuple | ndarray, value: str = 'value', groupby: str = 'clusters', x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, radius: float = 0.6, fontsize: float = 17, pct_distance: float = 0.6, label_distance: float = 1.1, colors: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create a donut-style pie chart showing cluster label distribution.

This function generates a pie chart with a central hole (donut chart) to visualize the distribution of predicted cluster labels against true labels. The chart displays the percentage of correctly predicted labels in the center.

Parameters

dfDataFrame: The input data containing cluster and value information.
map_groupbyUnion[str, collection]: The mapping of clusters, can be a column name or a collection of cluster labels.
valuestr, optional: The column name for values in the DataFrame. Default is “value”.
groupbystr, optional: The column name for cluster labels in the DataFrame. Default is “clusters”.
x_namestr, optional: The label for the x-axis. Default is None.
y_namestr, optional: The label for the y-axis. Default is None.
titlestr, optional: The title of the pie chart. Default is None.
widthfloat, optional: The width of the figure in inches. Default is 2.
heightfloat, optional: The height of the figure in inches. Default is 2.
bottomfloat, optional: The bottom margin of the figure. Default is 0.
radiusfloat, optional: The radius of the inner white circle to create donut effect. Default is 0.6.
fontsizefloat, optional: The font size for the percentage text in the center. Default is 17.
pct_distancefloat, optional: The distance of the percentage labels from the center of the pie. Default is 0.6.
label_distancefloat, optional: The distance of the labels from the center of the pie. Default is 1.1.
colorslist, optional: A list of colors to use for the pie slices. If None, default colors will be used. Default is None.
outputpath, optional: The file path to save the figure. If None, the figure will not be saved. Default is None.
showbool, optional: Whether to display the figure. Default is True.
closebool, optional: Whether to close the figure after displaying. Default is False.
**kwargsAny: Additional keyword arguments passed to matplotlib’s pie function.

sciv.pl.pie_trait(trait_df: DataFrame, trait_groupby_map: dict, trait_name: str = 'All', groupby: str = 'clusters', trait_column_name: str = 'id', value: str = 'value', x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, radius: float = 0.6, fontsize: float = 17, pct_distance: float = 0.6, label_distance: float = 1.1, colors: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Create pie charts for trait/disease cluster distribution analysis.

This function generates donut-style pie charts to visualize the distribution of trait-specific scores across different cell clusters. It supports batch processing for multiple traits or single trait analysis.

Parameters

trait_dfDataFrame: The input data containing trait information, cluster labels, and values.
trait_groupby_mapdict: A dictionary mapping trait names to their corresponding cluster mappings. Keys are trait names, values are cluster label mappings.
trait_namestr, optional: The specific trait to plot. Use “All” to plot all traits in the data. Default is “All”.
groupbystr, optional: The column name for cluster labels in the DataFrame. Default is “clusters”.
trait_column_namestr, optional: The column name for trait identifiers in the DataFrame. Default is “id”.
valuestr, optional: The column name for values/scores in the DataFrame. Default is “value”.
x_namestr, optional: The label for the x-axis. Default is None.
y_namestr, optional: The label for the y-axis. Default is None.
titlestr, optional: The base title for the pie charts. Trait name will be appended if provided. Default is None.
widthfloat, optional: The width of the figure in inches. Default is 2.
heightfloat, optional: The height of the figure in inches. Default is 2.
radiusfloat, optional: The radius of the inner white circle to create donut effect. Default is 0.6.
fontsizefloat, optional: The font size for the percentage text in the center. Default is 17.
pct_distancefloat, optional: The distance of the percentage labels from the center of the pie. Default is 0.6.
label_distancefloat, optional: The distance of the labels from the center of the pie. Default is 1.1.
colorslist, optional: A list of colors to use for the pie slices. If None, default colors will be used. Default is None.
outputpath, optional: The directory path to save the figures. If None, figures will not be saved. Default is None.
showbool, optional: Whether to display the figure. Default is True.
closebool, optional: Whether to close the figure after displaying. Default is False.
**kwargsAny: Additional keyword arguments passed to the pie_label function.

Bubble

Bubble chart visualization function.

Create a bubble plot using seaborn’s relplot.

Parameters

dfDataFrame: Input data structure.
xstr: Column name for x-axis values.
ystr: Column name for y-axis values.
huestr, optional: Column name for color encoding.
sizestr, optional: Column name for size encoding.
x_namestr, optional: Custom label for x-axis.
y_namestr, optional: Custom label for y-axis.
titlestr, optional: Plot title.
widthfloat, default=2: Figure width in inches.
heightfloat, default=2: Figure height in inches.
bottomfloat, default=0: Bottom margin adjustment.
outputpath, optional: File path to save the figure.
showbool, default=True: Whether to display the plot.
closebool, default=False: Whether to close the figure after display.
**kwargsAny: Additional arguments passed to seaborn.relplot.

Radar

Radar visualization function.

sciv.pl.base_radar(df: DataFrame, ax_x: str, ax_y: str, hue: str, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 4, height: float = 4, bottom: float = 0, colors: list | set | Tuple | ndarray | None = None, line_width: float = 0.5, y_limit: Tuple = (0, 1), bbox_to_anchor: Tuple = (1.3, 1.1), is_fill: bool = True, fill_alpha: float = 0.2, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) → None

Plot a radar chart with multiple groups.

Parameters

dfDataFrame: Input data containing the values to plot.
ax_xstr: Column name for category labels (x-axis categories).
ax_ystr: Column name for values to plot (y-axis values).
huestr: Column name for grouping different lines.
x_namestr, optional: Label for the x-axis.
y_namestr, optional: Label for the y-axis.
titlestr, optional: Title of the chart.
widthfloat, optional: Width of the chart figure.
heightfloat, optional: Height of the chart figure.
bottomfloat, optional: Bottom margin adjustment.
colorscollection, optional: Colors for each group line.
line_widthfloat, optional: Width of the radar lines.
y_limitTuple, optional: Y-axis limit range.
bbox_to_anchorTuple, optional: Position for the legend box.
is_fillbool, optional: Whether to fill the radar area.
fill_alphafloat, optional: Transparency level for the filled area.
outputpath, optional: Output path to save the figure.
showbool, optional: Whether to display the figure.
closebool, optional: Whether to close the figure after display.
kwargsAny, optional: Additional keyword arguments for plotting.

Returns

None
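
A radar chart places each category at an evenly spaced angle and closes the outline by returning to the first point. The sketch below shows that common construction; it is a general illustration and not necessarily how sciv lays out its axes. `radar_angles` and the sample values are hypothetical.

```python
import math


def radar_angles(n):
    """Evenly spaced angles (radians) for n categories, closed on the first."""
    angles = [2 * math.pi * i / n for i in range(n)]
    return angles + angles[:1]  # repeat the first angle to close the polygon


categories = ["A", "B", "C", "D"]
values = [0.2, 0.8, 0.5, 0.9]

angles = radar_angles(len(categories))
closed_values = values + values[:1]  # repeat the first value as well
```

The angle and value lists both have one more entry than there are categories, so a line drawn through them ends where it started.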

Plot a radar chart.

Parameters

ax_xcollection: Category labels for the radar chart.
ax_ycollection: Data values for each category.
x_namestr, optional: Label for the x-axis.
y_namestr, optional: Label for the y-axis.
titlestr, optional: Title of the chart.
colorscollection, optional: Colors for the radar chart.
widthfloat, optional: Width of the chart.
heightfloat, optional: Height of the chart.
bottomfloat, optional: Bottom margin adjustment.
center_textstr, optional: Center text for the chart.
rotationfloat, optional: Rotation angle in degrees.
value_topfloat, optional: Vertical offset for value labels above bars.
text_topfloat, optional: Radial position for category labels.
is_fixedbool, optional: If True, value labels are placed at a fixed position.
is_anglebool, optional: If True, rotate labels to align with radar angles.
y_limitTuple, optional: Y-axis limits as (min, max).
y_axis_scaleTuple, optional: Scale range for y-axis ticks as (min, max).
outputpath, optional: Output path to save the figure.
showbool, optional: Whether to display the figure.
closebool, optional: Whether to close the figure after display.
kwargsAny, optional: Additional keyword arguments for plotting.

Returns

None

sciv.pl.radar_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', value: str = 'rate', clusters: str = 'clusters', color: list | set | Tuple | ndarray | str | None = None, clusters_sort: list | None = None, width: float = 4, height: float = 4, rotation: float = 65, title: str | None = None, value_top: float = 0.1, text_top: float = 1.2, is_fixed: bool = False, is_angle: bool = True, y_limit: Tuple = (-0.5, 1), y_axis_scale: Tuple = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Plot radar charts for trait enrichment analysis.

This function creates radar charts to visualize trait/disease enrichment scores across different clusters. It can plot either a single trait or all traits in the dataset.

Parameters

trait_dfDataFrame: Input dataframe containing trait enrichment data.
trait_namestr, optional: Name of the trait to plot. Use “All” to plot all traits. Default is “All”.
trait_column_namestr, optional: Column name in trait_df that contains trait identifiers. Default is “id”.
valuestr, optional: Column name containing the enrichment values to plot. Default is “rate”.
clustersstr, optional: Column name containing cluster identifiers. Default is “clusters”.
colorUnion[collection, str], optional: Colors for the radar chart bars. Can be a column name (str) or a collection of colors.
clusters_sortOptional[list], optional: Custom order for clusters. If None, clusters are sorted by value in descending order.
widthfloat, optional: Width of the figure in inches. Default is 4.
heightfloat, optional: Height of the figure in inches. Default is 4.
rotationfloat, optional: Rotation angle for text labels in degrees. Default is 65.
titlestr, optional: Base title for the plot. Trait name will be appended if provided.
value_topfloat, optional: Vertical offset for value labels above bars. Default is 0.1.
text_topfloat, optional: Radial position for category labels. Default is 1.2.
is_fixedbool, optional: If True, value labels are placed at a fixed position. Default is False.
is_anglebool, optional: If True, rotate labels to align with radar angles. Default is True.
y_limitTuple, optional: Y-axis limits as (min, max). Default is (-0.5, 1).
y_axis_scaleTuple, optional: Scale range for y-axis ticks as (min, max). Default is (0, 1).
outputpath, optional: Directory path to save output PDF files. If None, files are not saved.
showbool, optional: Whether to display the plot. Default is True.
closebool, optional: Whether to close the figure after display. Default is False.
kwargsAny, optional: Additional keyword arguments passed to the radar function.

Returns

None: The function saves plots to files and/or displays them based on parameters.

sciv.pl.rate_circular_bar_plot(adata: AnnData, layer: str | None = None, trait_name: str = 'All', dir_name: str = 'feature', column: str = 'value', groupby: str = 'clusters', color: list | set | Tuple | ndarray | str | None = None, groupby_sort: list | None = None, width: float = 2, height: float = 2, rotation: float = 25, title: str | None = None, value_top: float = 0.1, text_top: float = 1.2, is_fixed: bool = False, is_angle: bool = True, y_limit: Tuple = (-0.5, 1), y_axis_scale: Tuple = (0, 1), plot_output: str | Path | None = None, show: bool = True, close: bool = False) → None

Generate a circular bar plot (radar chart) showing enrichment ratios for trait-cluster combinations.

This function calculates the completion ratio using the complete_ratio function and visualizes the results as a circular bar plot (radar chart). It handles directory creation for output files and passes appropriate parameters to the radar_trait plotting function.

Parameters

adataAnnData: Input AnnData object containing the data to be visualized.
layerstr, optional: Specify the layer of the matrix to be processed. If None, uses the main matrix.
trait_namestr, default “All”: The name of the trait being analyzed, used for filtering data.
dir_namestr, default “feature”: Folder name for generating and saving circular bar plot outputs.
columnstr, default “value”: The column name containing the binary enrichment values.
groupbystr, default “clusters”: The column name in adata.obs that defines the cell clusters.
colorUnion[collection, str], optional: Color specification for the plot. Can be a color collection or a column name to use for coloring bars based on data values.
groupby_sortOptional[list], optional: Custom order for clusters. If None, uses default sorting.
widthfloat, default 2: The width of the output figure in inches.
heightfloat, default 2: The height of the output figure in inches.
rotationfloat, default 25: Rotation angle for the circular plot in degrees.
titlestr, optional: The title of the plot. If None, no title is displayed.
value_topfloat, default 0.1: Vertical offset for value labels in the plot.
text_topfloat, default 1.2: Vertical offset for text labels in the plot.
is_fixedbool, default False: If True, use fixed scaling for the plot.
is_anglebool, default True: If True, use angular positioning for bars.
y_limitTuple, default (-0.5, 1): The y-axis limits for the plot.
y_axis_scaleTuple, default (0, 1): The scale range for the y-axis values.
plot_outputpath, optional: Directory path for saving output files. If None, figures are not saved.
showbool, default True: If True, display the figure interactively.
closebool, default False: If True, close the figure after saving.

Returns

None: This function does not return any value. Outputs are saved to files or displayed.

Venn

Venn diagram visualization functions.

Plot a three-set Venn diagram.

Parameters

set1collection: First set of elements.
set2collection: Second set of elements.
set3collection: Third set of elements.
name1str, optional: Name of the first set.
name2str, optional: Name of the second set.
name3str, optional: Name of the third set.
widthfloat, optional: Width of the diagram.
heightfloat, optional: Height of the diagram.
bottomfloat, optional: Bottom of the diagram.
colorslist, optional: Colors for the sets.
x_namestr, optional: X name.
y_namestr, optional: Y name.
titlestr, optional: Title of the diagram.
outputpath, optional: Output path.
showbool, optional: Whether to show.
closebool, optional: Whether to close.
kwargsAny, optional: Keyword arguments.

Returns

None

Plot a two-set Venn diagram.

Parameters

set1collection: First set of elements.
set2collection: Second set of elements.
name1str, optional: Name of the first set.
name2str, optional: Name of the second set.
widthfloat, optional: Width of the diagram.
heightfloat, optional: Height of the diagram.
bottomfloat, optional: Bottom of the diagram.
colorslist, optional: Colors for the sets.
x_namestr, optional: X name.
y_namestr, optional: Y name.
titlestr, optional: Title of the diagram.
outputpath, optional: Output path.
showbool, optional: Whether to show.
closebool, optional: Whether to close.
kwargsAny, optional: Keyword arguments.

Returns

None

Preprocessing (.pp)

Data preprocessing interface, used for single-cell data cleaning, differential analysis, and enrichment analysis.

Return reshaped AnnData organized by given column values.

Parameters

adataAnnData: Input data.
groupbystr: Name of the grouping column.
extra_columnOptional[str], optional: Extra column to retain, based on the grouping column.
axisLiteral[0, 1], optional: The dimension used for grouping. {1: adata.obs, 0: adata.var}.
layerstr, optional: The matrix layer to process.
methodcollection | str, optional: The aggregation method. Five methods are supported, alone or in combination: {“mean”, “sum”, “median”, “max”, “min”}.

Returns

AnnData: The grouped data.

sciv.pp.adata_map_df(adata: AnnData, column: str = 'value', layer: str | None = None) → DataFrame

Convert an AnnData matrix to long (row, column, value) form.

Parameters

adataAnnData: The AnnData object to convert.
columnstr: Name of the value column.
layerstr, optional: The matrix layer to process.

Returns

DataFrame: The data in row, column, value form.

sciv.pp.filter_data(adata: AnnData, min_cells: int = 1, min_peaks: int = 1, min_peaks_counts: int = 1, min_cells_counts: int = 1, cell_rate: float | None = None, peak_rate: float | None = None, is_copy: bool = False, is_min_cell: bool = True, is_min_peak: bool = False) → AnnData

Filter scATAC-seq data.

Parameters

adataAnnData: scATAC-seq data
min_peaks_countsint, optional: Minimum number of counts required for a peak to pass filtering
min_cellsint, optional: Minimum number of cells expressed required for a peak to pass filtering
min_cells_countsint, optional: Minimum number of counts required for a cell to pass filtering
min_peaksint, optional: Minimum number of peaks expressed required for a cell to pass filtering
cell_rateOptional[float], optional: Filtering threshold given as a percentage of the total cell count. Only takes effect when min_cells is None.
peak_rateOptional[float], optional: Filtering threshold given as a percentage of the total peak count. Only takes effect when min_peaks is None.
is_copybool, optional: Whether to deep-copy the data.
is_min_cellbool, optional: Whether to filter cells.
is_min_peakbool, optional: Whether to filter peaks.

Returns

AnnData: scATAC-seq data

sciv.pp.get_difference_genes(adata: AnnData, groupby: str, method: Literal['logreg', 't-test', 'wilcoxon', 't-test_overestim_var'] | None = 'wilcoxon', cell_anno: DataFrame | None = None, diff_genes_file: str | None = None) → AnnData

Get differentially expressed/active genes.

Parameters

adataAnnData: scATAC-seq data
groupbystr: groupby name
method_Method, optional: Method to use for differentially expressed gene analysis.
cell_annoOptional[DataFrame], optional: Cell annotation DataFrame.
diff_genes_fileOptional[str], optional: Output file name.

Returns

AnnData: scATAC-seq data

sciv.pp.get_difference_peaks(adata: AnnData, genome_anno, groupby: str, cell_anno: DataFrame | None = None, min_log_fc: float = 0.25, min_pct: float = 0.05, peak_matrix_save_file: str | Path | None = None, diff_peaks_save_file: str | Path | None = None) → AnnData

Get differential peaks.

Parameters

adataAnnData: scATAC-seq data.
genome_annoDataFrame: Genome annotation.
groupbystr: Cluster name.
cell_annoOptional[DataFrame], optional: Cell annotation.
min_log_fcfloat, optional: Minimum log2 fold change.
min_pctfloat, optional: Minimum percentage.
peak_matrix_save_fileOptional[path], optional: Peak matrix save file.
diff_peaks_save_fileOptional[path], optional: Difference peaks save file.

Returns

AnnData: Differential peak data.

sciv.pp.get_gene_enrichment(adata: AnnData, top_number: int = 50, threshold: float = 0.01, layer: str | None = None, is_order_or_lt: bool = True, is_top: bool = True, gene_sets: list[str] | set = ('GO_Biological_Process_2023', 'GO_Cellular_Component_2023', 'GO_Molecular_Function_2023', 'GWAS_Catalog_2023', 'KEGG_2016'), organism: Literal['Human', 'Mouse', 'Yeast', 'Fly', 'Fish', 'Worm'] | None = 'human', output_dir: str | None = None) → DataFrame

Get gene enrichment analysis.

Parameters

adataAnnData: Input data;
top_numberint, optional: Top number of genes to use.
thresholdfloat, optional: Threshold to use.
layerOptional[str], optional: Specify the matrix to be processed;
is_order_or_ltbool, optional: Whether to order or filter by threshold.
is_topbool, optional: Whether to get top top_number genes.
gene_setsUnion[list[str], set], optional: Gene sets to use.
organism_Datasets, optional: Organism to use.
output_dirOptional[str], optional: Output directory.

Returns

DataFrame: GSEA enrichr results DataFrame.

sciv.pp.get_gene_expression(adata: AnnData, genome_anno, min_cells: int = 5, gene_save_file: str | Path | None = None) → AnnData

Get gene expression matrix.

Parameters

adataAnnData: scATAC-seq data
genome_annoDataFrame: Genome annotation.
min_cellsint, optional: Minimum cells.
gene_save_fileOptional[path], optional: Gene save file path.

Returns

AnnData: Gene expression matrix.

sciv.pp.get_peak_matrix(adata: AnnData, genome_anno, groupby: str, cell_anno: DataFrame | None = None, peak_matrix_save_file: str | Path | None = None) → AnnData

Generate peak matrix from scATAC-seq data.

This function processes scATAC-seq fragment files to generate a cell-by-peak matrix through peak calling using MACS3. It performs quality control, tile matrix generation, feature selection, and peak calling at the specified cluster level.

Parameters

adataAnnData: scATAC-seq data
genome_annoDataFrame: Genome annotation information.
groupbystr: Column name in cell annotation indicating cluster labels for peak calling.
cell_annoOptional[DataFrame], optional: Cell annotation DataFrame containing cluster information.
peak_matrix_save_fileOptional[path], optional: Path to save the output peak matrix h5ad file.

Returns

AnnData: Cell-by-peak matrix.

sciv.pp.get_sc_atac(fragment_file: str | Path, genome_anno, h5ad_file: str | Path | None = None, min_num_fragments: int = 200, sorted_by_barcode: bool = False, bin_size: int = 500, min_tsse: float = 5.0, counting_strategy: Literal['fragment', 'insertion', 'paired-insertion'] = 'paired-insertion', need_features: int | float | None = None, is_filter_doublets: bool = True) → AnnData

Get scATAC-seq data from fragment file or h5ad file.

This function processes scATAC-seq data by importing fragment files, performing quality control, adding tile matrices, selecting features, and filtering doublets. It can also read pre-processed h5ad files.

Parameters

fragment_filepath: Path to the fragment file or h5ad file.
genome_annoDataFrame: Genome annotation.
h5ad_fileOptional[path], optional: Path to save the h5ad file. If None, a temporary cache file will be used.
min_num_fragmentsint, optional: Minimum number of fragments required for a cell to pass filtering.
sorted_by_barcodebool, optional: Whether the input fragment file is sorted by barcode.
bin_sizeint, optional: Size of consecutive genomic regions used to record the counts.
min_tssefloat, optional: Minimum TSS enrichment score required for a cell to pass filtering.
counting_strategyLiteral[‘fragment’, ‘insertion’, ‘paired-insertion’], optional: Strategy to count fragments in bins.
need_featuresOptional[Union[int | float]], optional: Number or proportion of features to select.
is_filter_doubletsbool, optional: Whether to filter doublets.

Returns

AnnData: Processed scATAC-seq data.

Get TF data.

Parameters

adataAnnData: scATAC-seq data
genome_annoDataFrame: Genome annotation.
groupbystr: Cluster name.
cell_annoOptional[DataFrame], optional: Cell annotation.
p_valuefloat, optional: P-value threshold.
peak_matrix_save_fileOptional[path], optional: Peak matrix save file.
tf_save_fileOptional[path], optional: TF save file.

Returns

AnnData: TF data.

sciv.pp.gsea_enrichr(gene_list: list[str], gene_sets: list[str] | set = ('GO_Biological_Process_2023', 'GO_Cellular_Component_2023', 'GO_Molecular_Function_2023', 'GWAS_Catalog_2023', 'KEGG_2016'), organism: Literal['Human', 'Mouse', 'Yeast', 'Fly', 'Fish', 'Worm'] | None = 'human', is_verbose: bool = True, output_dir: str | None = None) → DataFrame

GSEA enrichr analysis.

Parameters

gene_listlist[str]: Gene list.
gene_setsUnion[list[str], set], optional: Gene sets to use.
organism_Datasets, optional: Organism to use.
is_verbosebool, optional: Whether to print verbose messages.
output_dirOptional[str], optional: Output directory.

Returns

DataFrame: GSEA enrichr results DataFrame.

sciv.pp.merge_sc_atac(files: dict, genome_anno, merge_key: str = 'merge_sc_atac', min_num_fragments: int = 200, sorted_by_barcode: bool = False, bin_size: int = 500, min_tsse: float = 5.0, counting_strategy: Literal['fragment', 'insertion', 'paired-insertion'] = 'paired-insertion', max_iter_harmony: int = 20, harmony_groupby: str | list[str] | None = None, is_selected: bool = False, is_batch: bool = True, need_features: int | float | None = None, output_path: str | Path | None = None) → AnnData

Integrate multiple scATAC-seq datasets using SnapATAC2.

Reference: https://kzhang.org/SnapATAC2/tutorials/integration.html

Note: Please do not move the generated files during this processing.

Parameters

filesdict: Dictionary mapping sample names to file paths of scATAC-seq data. Format: {file_key: file_path, …}
genome_annoDataFrame: Genome annotation. Commonly snap.genome.hg38 or snap.genome.hg19.
merge_keystr, optional: Key used to form the final H5AD file name. Default is “merge_sc_atac”.
min_num_fragmentsint, optional: Minimum number of unique fragments required for a cell to pass filtering. Default is 200.
sorted_by_barcodebool, optional: Whether the input fragment file is sorted by barcode. Default is False.
bin_sizeint, optional: Size of consecutive genomic regions used to record counts. Default is 500.
min_tssefloat, optional: Minimum TSS enrichment score required for a cell to pass filtering. Default is 5.0.
counting_strategyLiteral[‘fragment’, ‘insertion’, ‘paired-insertion’], optional: Strategy to count fragments in bins. Default is ‘paired-insertion’.
max_iter_harmonyint, optional: Maximum number of iterations for the harmony algorithm. Default is 20.
harmony_groupbyOptional[Union[str, list[str]]], optional: If specified, split data into groups and perform batch correction on each group separately.
is_selectedbool, optional: If True, perform additional filtering based on feature selection from each sample using the snap.pp.select_features method.
is_batchbool, optional: If True, perform batch correction by sample. Default is True.
need_featuresOptional[Union[int, float]], optional: Number or proportion of features to select. If <= 1, interpreted as a proportion of total features. If > 1, interpreted as absolute number.
output_pathOptional[path], optional: Directory path for output files. If None, temporary files are used.

Returns

AnnData: Integrated scATAC-seq data.
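
The need_features rule described above (values <= 1 are a proportion, values > 1 are an absolute number) can be sketched as a small helper. This is only an illustration and not sciv's implementation: the name resolve_feature_count, the handling of None, and the rounding and capping behaviour are all assumptions.

```python
def resolve_feature_count(need_features, total_features):
    """Turn a need_features value into an absolute feature count.

    Hypothetical helper. Values <= 1 are read as a proportion of
    total_features and values > 1 as an absolute number, as documented.
    Returning all features for None, rounding down, and capping at
    total_features are assumptions made for this sketch.
    """
    if need_features is None:
        return total_features
    if need_features <= 0:
        raise ValueError("need_features must be positive")
    if need_features <= 1:
        # Proportion of the total; keep at least one feature.
        return max(1, int(total_features * need_features))
    # Absolute number; cannot exceed what is available.
    return min(int(need_features), total_features)


half = resolve_feature_count(0.5, 1000)
fixed = resolve_feature_count(250, 1000)
```

Note that under the documented rule a value of exactly 1 means "all features" (a proportion of 1.0), not "one feature".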

sciv.pp.paga_trajectory(adata: AnnData, layer: str | None = None, latent: str = 'X_pca', groups: str = 'louvain', position: list | set | Tuple | ndarray | None = None, lsi_components: int = 50, root_cluster: str | None = None, n_neighbors: int = 15, resolution: float = 1.0, is_denoise: bool = True) → None

Compute the PAGA trajectory.

Parameters

adataAnnData: scATAC-seq data
layerOptional[str], optional: Specify the matrix to be processed;
latentstr, optional: Latent space to use.
groupsstr, optional: Group name to use.
positionOptional[collection], optional: Position to use.
lsi_componentsint, optional: Number of components to use.
root_clusterOptional[str], optional: Root cluster to use.
n_neighborsint, optional: Number of neighbors to use.
resolutionfloat, optional: Resolution to use.
is_denoisebool, optional: Whether to denoise.

Returns

None

sciv.pp.poisson_vi(adata: AnnData, max_epochs: int = 500, lr: float = 0.0001, batch_size: int = 128, eps: float = 1e-08, early_stopping: bool = True, early_stopping_patience: int = 50, strategy: str = 'ddp_notebook_find_unused_parameters_true', batch_key: str | None = None, resolution: float = 0.5, dp_delta: float = 0.05, latent_name: str = 'latent', model_dir: str | Path | None = None) → AnnData

Process the data with PoissonVI to obtain the latent sample representation and, after Leiden clustering, the differential peak data for each cluster.

Parameters

adataAnnData: Input data to be processed.
max_epochsint, default 500: The maximum number of epochs for PoissonVI training.
lrfloat, default 1e-4: Learning rate for optimization.
batch_sizeint, default 128: Minibatch size to use during training.
epsfloat, default 1e-08: Optimizer epsilon.
early_stoppingbool, default True: Whether to perform early stopping with respect to the validation set.
early_stopping_patienceint, default 50: How many epochs to wait for improvement before early stopping.
strategystr, default “ddp_notebook_find_unused_parameters_true”: DDP strategy.
batch_keystr, optional: Batch information in scATAC-seq data.
resolutionfloat, default 0.5: Resolution of the Leiden clustering.
dp_deltafloat, default 0.05: Empirical effect size threshold for PeakVI method in differential analysis.
latent_namestr, default “latent”: The name of latent representation.
model_dirstr, optional: The folder name for saving the trained model.

Returns

AnnData: Differential peak data of clustering types.

Tool (.tl)

Tool function interface, including core computing functions such as algorithms, matrix operations, and random walks.

Algorithm

Algorithm related functions.

Add Bernoulli fluctuation noise to the counts matrix (add 1 with probability noise_level)

Parameters

counts_matrixmatrix_data: Input counts matrix
noise_levelfloat, default 0.1: Noise level, i.e., the probability of randomly adding 1 (range: 0.0 - 1.0)

Returns

matrix_data: Matrix after adding noise
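
A minimal NumPy sketch of the idea described above: each entry of the counts matrix is incremented by 1 with probability noise_level. The function name add_bernoulli_noise and the seed argument are hypothetical and only serve the illustration; the sketch assumes a dense array, whereas the documented function accepts matrix_data.

```python
import numpy as np


def add_bernoulli_noise(counts, noise_level=0.1, seed=None):
    """Add 1 to each entry independently with probability noise_level.

    Hypothetical helper for illustration; dense input is assumed.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0.0, 1.0]")
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    # One Bernoulli draw per entry: True where a count is added.
    flips = rng.random(counts.shape) < noise_level
    return counts + flips.astype(counts.dtype)


counts = np.array([[0, 2, 0], [1, 0, 3]])
noisy = add_bernoulli_noise(counts, noise_level=0.1, seed=0)
```

Because the noise only ever adds counts, every entry of the result is greater than or equal to the corresponding input entry.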

Add peak percentage noise to each cell

Parameters

datamatrix_data: Input counts matrix
ratefloat: Noise level, i.e., the probability of randomly adding 1 (range: 0.0 - 1.0)

Returns

matrix_data: Matrix after adding noise

Adjusted mutual information (AMI); range (0, 1).

Parameters

labels_predcollection: Predicted cluster labels.
labels_truecollection: True cluster labels.

Returns

float: AMI score.

Adjusted Rand index (ARI); range (-1, 1).

Parameters

labels_predcollection: Predicted cluster labels.
labels_truecollection: True cluster labels.

Returns

float: ARI score.
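
To make the score concrete, here is a small standard-library sketch of the adjusted Rand index computed from pair counts. It is not taken from sciv, which may simply delegate to a library routine; the function name adjusted_rand_index is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from the contingency counts of two labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    # Pairs of samples that share a cluster in both, true, or predicted labels.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    total_pairs = comb(n, 2)
    expected = in_true * in_pred / total_pairs if total_pairs else 0.0
    maximum = (in_true + in_pred) / 2
    if maximum == expected:
        # Degenerate case (e.g. a single cluster on both sides): perfect match.
        return 1.0
    return (both - expected) / (maximum - expected)


same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The score depends only on the partition, not on the label names, so relabeled but identical clusterings score 1.0, while a clustering that disagrees more than chance would predict scores below 0.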

Calculate binary classification metrics: accuracy, recall, F1, FPR, TPR, AUROC, and AUPRC.

Parameters

labels_truecollection: True labels.
labels_predcollection: Predicted labels.

Returns

tuple: The binary classification metrics.
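
The threshold-based metrics in this list follow directly from the confusion counts. The sketch below is a standard-library illustration under the assumption that labels are 0/1; the name binary_metrics and the dictionary return type are hypothetical, and AUROC and AUPRC are left out because they need continuous scores rather than hard labels.

```python
def binary_metrics(labels_true, labels_pred):
    """Accuracy, recall (TPR), F1 and FPR from 0/1 labels (illustrative)."""
    pairs = list(zip(labels_true, labels_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        # Report 0.0 when a metric is undefined (empty denominator).
        return numerator / denominator if denominator else 0.0

    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),
        "tpr": recall,  # TPR is another name for recall
    }


metrics = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```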

Calculate the initial trait- or disease-related cell score.

Parameters

input_datadict: Dictionary with two entries:

data: The fragments matrix, converted from the counts matrix using scvi.data.reads_to_fragments.
overlap_data: Peaks-traits/diseases data.

block_sizeint: Block size used for block-wise matrix multiplication. If the value is less than or equal to zero, no blocking is performed.

Returns

matrix_data: Initial TRS.

sciv.tl.calculate_init_score_weight(adata: AnnData, da_peaks_adata: AnnData, overlap_adata: AnnData, layer: str | None = 'fragments', diff_peak_value: Literal['emp_effect', 'bayes_factor', 'emp_prob1', 'all'] = 'emp_effect', is_simple: bool = True, block_size: int = -1) → AnnData

Calculate the initial trait- or disease-related cell score with weight.

Parameters

adataAnnData: scATAC-seq data;
da_peaks_adataAnnData: Differential peak data;
overlap_adataAnnData: Peaks-traits/diseases data;
layerstr, optional: The layer of the scATAC-seq data to use;
diff_peak_valuedifference_peak_optional: The differential-peak statistic used to correct for cluster-type differences. One of {‘emp_effect’, ‘bayes_factor’, ‘emp_prob1’, ‘all’};
is_simplebool: If True, store only the final result and skip unnecessary intermediate variables. Note that is_ablation is ignored when this is True and takes effect only when this is False;
block_sizeint: Block size used for block-wise matrix multiplication. If the value is less than or equal to zero, no blocking is performed.

Returns

AnnData: Initial TRS with weight.

Calculate the Calinski-Harabasz index, an indicator of clustering quality. It measures the compactness within clusters and the separation between clusters. The larger the value, the better the clustering.

Parameters

datamatrix_data: Input data.
labelscollection: Predicted labels for each sample.

Returns

float: Calinski-Harabasz index.

Calculate the coefficient of variation.

Parameters

matrixmatrix_data: Input matrix data.
axisLiteral[0, 1, -1], optional: Axis to calculate the coefficient of variation. Default is 0.
defaultfloat, optional: Default value for division by zero. Default is 0.

Returns

Union[float, collection]: Coefficient of variation.
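
For a single vector, the coefficient of variation is the standard deviation divided by the mean, with the default value returned when the mean is zero. The following standard-library sketch assumes the population standard deviation; whether sciv uses the population or sample form is not stated here, and the function name is hypothetical.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(values, default=0.0):
    """Standard deviation divided by mean; default when the mean is zero.

    Illustrative only: the population standard deviation is an assumption.
    """
    mean = fmean(values)
    if mean == 0:
        return default
    return pstdev(values) / mean


cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])  # mean 5, std 2
```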

Davies-Bouldin index (DBI).

Parameters

datamatrix_data: A list of n_features-dimensional data points. Each row corresponds to a single data point;
labelscollection: Predicted labels for each sample.

Returns

float: Davies-Bouldin index.

Calculate the Euclidean distance between two matrices.

Parameters

data1matrix_data: First data;
data2matrix_data: Second data (If the second data is empty, it will default to the first data.)
block_sizeint: The size of the segmentation stored in block wise matrix multiplication. If the value is less than or equal to zero, no block operation will be performed.

Returns

matrix_data: Data of Euclidean distance.

sciv.tl.is_asc_sort(positions_list: list) → bool

Check whether the positions are in ascending order.

Parameters

positions_listlist: Positions list.

Returns

bool: True for ascending order, otherwise False.

Calculate the Jaccard similarity matrix.

Parameters

datamatrix_data: Input cell feature data;
n_jobsint, optional: The number of jobs to use for the computation.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

matrix_data: Jaccard similarity matrix.
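
As an illustration of what a Jaccard similarity matrix contains, the sketch below binarizes a dense cell-by-feature matrix and computes, for every pair of rows, the size of the intersection of non-zero features divided by the size of their union. The function name and the dense NumPy approach are assumptions; the documented function additionally supports sparse input and parallel jobs.

```python
import numpy as np


def jaccard_similarity(data):
    """Pairwise Jaccard similarity between the rows of a matrix (dense sketch)."""
    binary = (np.asarray(data) > 0).astype(np.int64)
    # Shared non-zero features for every pair of rows.
    intersection = binary @ binary.T
    row_sums = binary.sum(axis=1)
    union = row_sums[:, None] + row_sums[None, :] - intersection
    similarity = np.zeros(intersection.shape, dtype=float)
    # Pairs with an empty union (two all-zero rows) keep similarity 0.
    np.divide(intersection, union, out=similarity, where=union > 0)
    return similarity


similarity = jaccard_similarity([[1, 1, 0], [1, 0, 0], [0, 0, 1]])
```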

Perform k-means clustering on data.

Parameters

datamatrix_data: Input data matrix;
n_clustersint, optional: The number of clusters to form as well as the number of centroids to generate.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

collection: Cluster labels from k-means clustering.

Calculate KL divergence for two data.

Parameters

data1matrix_data: First data.
data2matrix_data: Second data.

Returns

float: KL divergence score.

SVD LSI.

Parameters

datamatrix_data: Input cell feature data;
n_componentsint, optional: Number of dimensions to reduce to.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

dense_data: Reduced dimensional data (SVD LSI model).

Marginal standardization.

Parameters

matrixmatrix_data: The data matrix to standardize;
axisLiteral[0, 1], optional: The dimension along which to standardize;
defaultfloat, optional: Value added to the denominator to prevent division by zero.

Returns

matrix_data: Standardized data.

Calculate the mean symmetric.

Parameters

datamatrix_data: Input data;
axisLiteral[0, 1, -1], optional: The dimension along which to standardize.
is_verbosebool, optional: Whether to log information.

Returns

matrix_data: Standardized data after average symmetry.

Apply min-max standardization to the data.

Parameters

datamatrix_data: Input data;
axisLiteral[0, 1, -1], optional: The dimension along which to standardize.

Returns

dense_data: Standardized data.
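
Min-max standardization rescales values to [0, 1] via (x - min) / (max - min). The NumPy sketch below illustrates this; the function name is hypothetical, and reading axis=-1 as "use the whole matrix" is an assumption about the documented Literal[0, 1, -1] option, not something the text above confirms.

```python
import numpy as np


def min_max_scale(data, axis=-1):
    """Rescale data to [0, 1] along an axis (illustrative sketch).

    Assumption: axis=-1 means the minimum and maximum are taken over
    the whole matrix; 0 and 1 are the usual NumPy axes.
    """
    data = np.asarray(data, dtype=float)
    reduce_axis = None if axis == -1 else axis
    low = data.min(axis=reduce_axis, keepdims=True)
    high = data.max(axis=reduce_axis, keepdims=True)
    span = high - low
    scaled = np.zeros_like(data)
    # Constant slices (span == 0) are left at 0 instead of dividing by zero.
    np.divide(data - low, span, out=scaled, where=span > 0)
    return scaled


whole = min_max_scale([[1, 2], [3, 5]])
by_column = min_max_scale([[1, 2], [3, 5]], axis=0)
```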

Calculate cell-cell correlation

Parameters

adataAnnData: scATAC-seq data;
kint: Number of neighbors connected to each node when building the M-KNN network (AND);
or_kint: Number of neighbors connected to each node when building the M-KNN network (OR);
weightfloat: The weight of interactions or operations;
local_kint: Number of neighbors for the adaptive kernel;
kernelLiteral[“laplacian”, “gaussian”]: The kernel function to use;
gammaOptional[Union[float, str, collection]]: If None and kernel is “laplacian”, defaults to the reciprocal of the dimension of the cells' latent representation. If None and kernel is “gaussian”, defaults to an adaptive value obtained from local information using the parameter local_k. Otherwise, it must be strictly positive;
is_simplebool: If True, store only the final result and skip unnecessary intermediate variables. Note that is_ablation is ignored when this is True and takes effect only when this is False.

Returns

AnnData: Cell similarity data.

sciv.tl.overlap(regions: DataFrame, variants: DataFrame) → DataFrame

Map variant sites to peak regions.

Parameters

regionsDataFrame: Information of peaks.
variantsDataFrame: Information of variants.

Returns

DataFrame: The variants mapped to peak regions.

sciv.tl.overlap_sum(regions: AnnData, variants: dict, trait_info: DataFrame, n_jobs: int = -1) → AnnData

Overlap region data with variant data, and sum the PP values of all variants within a region to obtain the value for that region.

Parameters

regionsAnnData: Data of peaks.
variantsdict: Data of variants.
trait_infoDataFrame: Information of traits.
n_jobsint: The maximum number of concurrently running jobs.

PCA.

Parameters

datamatrix_data: Input cell feature data;
n_componentsint, optional: Number of dimensions to reduce to.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

dense_data: Reduced dimensional data.

Randomly perturbs the positions of a percentage of data.

Parameters

datacollection: List of data elements to be perturbed.
percentagefloat: Percentage of data to be perturbed.

Returns

collection: Perturbed data list.

Safe KL divergence calculation to avoid division by zero.

Parameters

pcollection: First data.
qcollection: Second data.
epsilonfloat, optional: The small value to add to the denominator to avoid zeros.

Returns

float: KL divergence score.
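
One common way to keep a KL divergence finite is to smooth both inputs with a small epsilon before normalizing them into probability distributions. The sketch below shows that approach with the standard library only. It is an assumption about the general technique, not a description of sciv's code, which according to the parameter description adds epsilon to the denominator; the function name is hypothetical.

```python
import math


def safe_kl_divergence(p, q, epsilon=1e-10):
    """KL(p || q) with epsilon smoothing so zeros never reach log or division.

    Illustrative sketch: both inputs are smoothed and renormalized.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_smooth = [value + epsilon for value in p]
    q_smooth = [value + epsilon for value in q]
    p_total = sum(p_smooth)
    q_total = sum(q_smooth)
    divergence = 0.0
    for a, b in zip(p_smooth, q_smooth):
        pa = a / p_total
        qb = b / q_total
        divergence += pa * math.log(pa / qb)
    return divergence


kl = safe_kl_divergence([0.5, 0.5], [0.9, 0.1])
kl_disjoint = safe_kl_divergence([1.0, 0.0], [0.0, 1.0])
```

Without smoothing, the second call would divide by zero; with it, the result is large but finite.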

Mutual KNN with weight.

Parameters

datamatrix_data: Input data matrix;
kint, optional: The number of nearest neighbors (AND);
or_kint, optional: The number of nearest neighbors (OR);
weightfloat, optional: The weight of interactions or operations;
is_forbool, optional: If True, obtain the nearest neighbors of each node by looping over the rows of the matrix. This suits large samples when memory is limited.
is_mknn_fully_connectedbool, optional: Whether the M-KNN network should be a fully connected graph. If True, each node is guaranteed to be connected to at least its nearest node other than itself. This parameter does not affect the SM-KNN result (the first result); it only affects the traditional M-KNN result (the second result).

Returns

matrix_data: Adjacency weight matrix.

Sigmoid function.

Parameters

datacollection, matrix_data: Input data.

Returns

collection, matrix_data: Sigmoid output.
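
For reference, the sigmoid maps any real number x to 1 / (1 + exp(-x)). A scalar, numerically stable version can be sketched with the standard library as follows; the function name is illustrative and the documented function operates on collections and matrices rather than single numbers.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a single number."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, which avoids overflow in exp(-x).
    z = math.exp(x)
    return z / (1.0 + z)


values = [sigmoid(x) for x in (-2.0, 0.0, 2.0)]
```

Splitting on the sign of x keeps the exponent non-positive, so very large magnitudes saturate to 0.0 or 1.0 instead of raising OverflowError.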

Silhouette score.

Parameters

datamatrix_data: An array of pairwise distances between samples, or a feature array.
labelscollection: Predicted labels for each sample.

Returns

float: silhouette score.

Spectral clustering on data.

Parameters

datamatrix_data: Input data matrix;
n_clustersint, optional: The number of clusters to form.
n_componentsint, optional: The dimension of the projection subspace.
eigen_solverstr, optional: Default use of Nyström approximation.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

collection: Cluster labels from spectral clustering.

Spectral Eigenmaps.

Parameters

datamatrix_data: Input cell feature data;
n_componentsint, optional: Number of dimensions to reduce to.
eigen_solverOptional[_EigenSolver], optional: The eigenvalue decomposition strategy to use.
affinity: method
n_jobsint, optional: The number of jobs to use for the computation.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

dense_data: Reduced dimensional data.

Symmetric scaling function.

Parameters

datamatrix_data: Input data;
axisLiteral[0, 1, -1], optional: The dimension along which to standardize;
scaleUnion[number, collection], optional: Scaling factor.
is_verbosebool, optional: Whether to log information.

Returns

matrix_data: Standardized data.

TF-IDF transformer.

Parameters

datamatrix_data: Matrix data that needs to be converted;
ri_sparsebool, optional: (return_is_sparse) Whether to return sparse matrix.

Returns

matrix_data: Matrix processed by TF-IDF.

t-SNE dimensionality reduction on data.

Parameters

datamatrix_data: Data matrix that requires dimensionality reduction;
n_componentsint, optional: Dimension of the embedded space.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

matrix_data: Reduced dimensional data matrix.

UMAP dimensionality reduction on data.

Parameters

datamatrix_data: Data matrix that requires dimensionality reduction;
n_neighborsfloat, optional: The size of local neighborhood (in terms of number of neighboring sample points) used for manifold approximation. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved. In general values should be in the range 2 to 100;
n_componentsint, optional: The dimension of the space to embed into. This defaults to 2 to provide easy visualization, but can reasonably be set to any integer value in the range 2 to 100.
min_distfloat, optional: The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. The value should be set relative to the spread value, which determines the scale at which embedded points will be spread out.
is_to_densebool, optional: Whether to convert the data into a dense matrix.

Returns

matrix_data: Reduced dimensional data matrix.

Matrix standardization (z-score, marginal).

Parameters

matrixmatrix_data: The data matrix to standardize.
axisLiteral[0, 1], optional: The dimension along which to standardize.

Returns

matrix_data, matrix_data: Standardized data. First element is the z-score data, second element is the mean data.

Matrix standardization (z-score).

Parameters

datamatrix_data: The data matrix to standardize.
with_meanbool, optional: If True, center the data before scaling.
ri_sparsebool | None, optional: (return_is_sparse) Whether to return sparse matrix.
is_sklearnbool, optional: This parameter represents whether to use the sklearn package.

Returns

dense_data, sparse_matrix: Standardized matrix.

Convert z-score to p-value.

Parameters

z_scorematrix_data: Input z-score data.

Returns

matrix_data: P-value data.
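
The conversion relies on the standard normal distribution. The scalar sketch below uses only the standard library and defaults to the one-sided upper-tail p-value; whether sciv uses a one-sided or two-sided test is not stated above, so the two_sided flag and the function name are assumptions made for illustration.

```python
import math


def z_to_p(z, two_sided=False):
    """P-value of a z-score under the standard normal distribution.

    Illustrative only: the one-sided upper tail is assumed by default.
    """
    if two_sided:
        # Probability of a value at least as extreme in either direction.
        return math.erfc(abs(z) / math.sqrt(2.0))
    # Upper tail: P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_one_sided = z_to_p(1.959964)
p_two_sided = z_to_p(1.959964, two_sided=True)
```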

Random Walk

Random walk related functions.

class sciv.tl.RandomWalk(cc_adata: AnnData, init_status: AnnData, epsilon: float = 1e-05, max_steps: int = 300, gamma: float = 0.05, enrichment_gamma: float = 0.05, p: int = 2, n_jobs: int = -1, min_seed_cell_rate: float = 0.01, max_seed_cell_rate: float = 0.05, credible_threshold: float = 0, enrichment_threshold: Literal['golden', 'half', 'e', 'pi', 'none'] | float = 'golden', benchmark_count: int = 10, is_ablation: bool = False, is_simple: bool = True)

Bases: object

Random walk analysis.

run_ablation_m_knn() → None: Use the fully connected M-KNN cell network.

run_ablation_ncsw() → None: Remove cell weights in the random walk and cluster-type weights in the initial scores.

run_ablation_ncw() → None: Remove cell cluster-type weights in the initial scores.

run_ablation_nsw() → None: Remove cell weights from the random walk.

run_benchmark() → None: Perform random walks from random seeds on all traits.

run_core() → None: Calculate the weighted random walk.

run_en_ablation_m_knn() → None: Use the fully connected M-KNN cell network (enrichment analysis).

run_en_ablation_ncsw() → None: Remove cell weights in the random walk and cluster-type weights in the initial scores.

run_en_ablation_ncw() → None: Remove cell cluster-type weights in the initial scores.

run_en_ablation_nsw() → None: Remove cell weights from the random walk.

run_enrichment() → None: Enrichment analysis.

run_knock(trs: AnnData, knock_trait: str, is_control: bool = False) → None

Knockout analysis.

Parameters

trsAnnData: Input AnnData object.
knock_traitstr: Knockout trait or disease.
is_controlbool, optional: Whether to control the knockout. Default is False.

Scale normalization of the score matrix.

Parameters

scorematrix_data: Score matrix.
is_verbosebool, optional: Whether to print the progress. Defaults to False.

Returns

matrix_data: The normalized score matrix.

class sciv.tl.TraitDataParallel(module: T, device_ids: Sequence[int | device] | None = None, output_device: int | device | None = None, dim: int = 0)

Bases: DataParallel

Data parallel module for trait analysis.

gather(outputs, output_device)

Collect the results after parallel processing, check for the existence of results, and merge the results by column (each result matrix has the same number of rows but different numbers of columns).

Parameters

outputslist: Output results of each device
output_deviceint: Output device ID.

Returns

Tensor: The merged results sorted by column.

scatter(inputs, kwargs, device_ids)

Scatter the input data to multiple devices.

Parameters

inputslist: List of input data to be scattered.
kwargsdict: Dictionary of keyword arguments to be scattered.
device_idslist: List of device IDs to scatter the data to.

Returns

tuple: Tuple of scattered input data and keyword arguments.

Random walk analysis.

Parameters

seed_cell_weightmatrix_data: Seed cell weight matrix, where each column represents a seed cell.
weightmatrix_data: Transition probability matrix (weight matrix). Defaults to None.
gammafloat: Random walk parameter. Defaults to 0.05.
epsilonfloat: Convergence threshold. Defaults to 1e-5.
max_stepsint: Maximum number of steps. Defaults to 300.
pint: Order of the random walk. Defaults to 2.
n_jobsint: Number of jobs to run in parallel. Defaults to -1, which means using all available processors.
devicestr: Device to run the analysis on. Defaults to ‘auto’.

Returns

matrix_data: The association score matrix, where each column represents the association score of a seed cell.
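The parameters above suggest an iterative propagation that stops on convergence or after a step limit. The following is only a minimal sketch of a generic random walk with restart, not the SCIV implementation: the function name `random_walk_with_restart` is hypothetical, `gamma` is assumed to act as the restart probability, and convergence is assumed to be measured with the Euclidean norm. The `p`, `n_jobs`, and `device` parameters are not modeled.

```python
import numpy as np

def random_walk_with_restart(seed, weight, gamma=0.05, epsilon=1e-5, max_steps=300):
    """Hypothetical helper: iterate s <- (1 - gamma) * W @ s + gamma * s0."""
    seed = np.asarray(seed, dtype=float)
    weight = np.asarray(weight, dtype=float)
    score = seed.copy()
    for _ in range(max_steps):
        updated = (1.0 - gamma) * (weight @ score) + gamma * seed
        # Assumption: stop once the largest per-column change is below epsilon.
        if np.linalg.norm(updated - score, axis=0).max() < epsilon:
            return updated
        score = updated
    return score

# Toy graph of three cells with a column-stochastic transition matrix.
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
# Each column is one seed vector, as in the description above.
seeds = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
scores = random_walk_with_restart(seeds, W)
```

In this toy case each column of `scores` keeps a total of 1, and the seeded cell retains the highest score in its column.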

Standardize and normalize the cell scores.

Parameters

scorematrix_data: Cell scores matrix.
axisLiteral[0, 1, -1]: Axis to apply the standardization and normalization. Defaults to 0.
is_verbosebool: Whether to print the progress. Defaults to True.

Returns

matrix_data: The standardized and normalized cell scores matrix.

Matrix

Matrix operation related functions.

Down-sampling data.

Parameters

dataUnion[matrix_data | collection]: Data that requires down-sampling.
sample_numberint, optional: Number of samples (values) to keep after down-sampling.

Returns

list: Data after down-sampling.
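As a rough illustration of down-sampling to a fixed number of values, the sketch below uses only the standard library. The helper name `down_sample`, its `seed` argument, and the default of 1000 are assumptions made for this example; whether SCIV samples with or without replacement is not stated in this reference.

```python
import random

def down_sample(values, sample_number=1000, seed=None):
    """Hypothetical helper: draw at most `sample_number` values without replacement."""
    values = list(values)
    # Nothing to do when the data is already small enough.
    if sample_number >= len(values):
        return values
    return random.Random(seed).sample(values, sample_number)

scores = [0.1 * i for i in range(50)]
subset = down_sample(scores, sample_number=10, seed=0)
```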

Apply a callback function to a matrix, optionally block by block.

Parameters

matrixmatrix_data: Input matrix.
callbackcallable: Callback function.
block_sizeint, optional: Block size used for block-wise processing. If the value is less than or equal to zero, no block operation is performed.
datamatrix_data, optional: Preallocated placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data: Result Matrix (CSR format)

Dividing a matrix by another value, vector, or matrix.

Parameters

matrixmatrix_data: Matrix
valueUnion[float, int, collection, matrix_data]: Value, vector, or matrix
block_sizeint, optional: Block size used for block-wise processing. If the value is less than or equal to zero, no block operation is performed.
datamatrix_data, optional: Preallocated placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data: Result Matrix (CSR format)

Compute the Cartesian product of two matrices block by block.

Parameters

data1matrix_data: Matrix 1
data2matrix_data: Matrix 2
block_sizeint, optional: Block size used for block-wise processing. If the value is less than or equal to zero, no block operation is performed.
is_return_sparsebool, optional: Whether to return a sparse matrix.
datamatrix_data, optional: Preallocated placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data: Cartesian product result

Compute the Hadamard (element-wise) product of two matrices block by block.

Parameters

data1matrix_data: Matrix 1
data2matrix_data: Matrix 2
block_sizeint, optional: Block size used for block-wise processing. If the value is less than or equal to zero, no block operation is performed.
datamatrix_data, optional: Preallocated placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data: Hadamard product result
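To make the role of the block size and the preallocated result matrix concrete, here is a minimal dense NumPy sketch of a block-wise element-wise product. The name `hadamard_by_blocks` and its `out` argument are hypothetical stand-ins chosen for this example; sparse inputs and CSR output are not modeled.

```python
import numpy as np

def hadamard_by_blocks(a, b, block_size=2, out=None):
    """Hypothetical helper: multiply two equally shaped matrices in row blocks."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("Both matrices must have the same shape.")
    # Reuse a caller-supplied buffer when given, otherwise allocate one.
    if out is None:
        out = np.empty_like(a)
    if block_size <= 0:
        # No blocking: one vectorized multiplication.
        np.multiply(a, b, out=out)
        return out
    for start in range(0, a.shape[0], block_size):
        stop = start + block_size
        # Only one block of rows is processed per iteration.
        np.multiply(a[start:stop], b[start:stop], out=out[start:stop])
    return out

a = np.arange(12, dtype=float).reshape(4, 3)
b = np.full((4, 3), 2.0)
buffer = np.empty((4, 3))
product = hadamard_by_blocks(a, b, block_size=3, out=buffer)
```

Passing `out` lets repeated calls write into the same buffer instead of allocating a new result each time.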

Perform element-wise addition, subtraction, multiplication, and division on two sparse matrices by blocks, supporting memory-efficient processing.

Parameters

data1matrix_data: Sparse matrix 1
data2Union[matrix_data, number]: Sparse matrix 2, or a number.
chunk_sizeint, optional: Chunk size used for the block-wise element-wise operation. If the value is less than or equal to zero, no block operation is performed.
defaultfloat, optional: Value used for a division when the denominator is 0. If the value is 0, a ValueError is raised.
operationLiteral[‘+’, ‘-’, ‘*’, ‘/’], optional: Element-wise operation to apply; one of ‘+’, ‘-’, ‘*’, ‘/’.

Returns

sparse_matrix: Result sparse matrix (CSR format)

Merge multiple matrix data into one matrix.

Parameters

dataslist: List of matrix data.
axisLiteral[0, 1], optional: Axis to merge the matrix. Default is 0.

Returns

matrix_data: Merged matrix data.

Split matrix into multiple parts.

Parameters

datamatrix_data: Input matrix data.
axisLiteral[0, 1], optional: Axis to split the matrix. Default is 0.
chunk_numberint, optional: Number of parts to split the matrix. Default is 1000.

Returns

list: List of split matrix data.
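Splitting and merging are natural inverses, which the NumPy sketch below demonstrates with a round trip. The helper names `split_chunks` and `merge_chunks` are hypothetical, and capping the number of chunks at the axis length is an assumption of this example rather than documented behavior.

```python
import numpy as np

def split_chunks(data, axis=0, chunk_number=1000):
    """Hypothetical helper: split a matrix into nearly equal parts along `axis`."""
    data = np.asarray(data)
    # Never request more chunks than there are rows/columns along `axis`.
    n_chunks = max(1, min(chunk_number, data.shape[axis]))
    return np.array_split(data, n_chunks, axis=axis)

def merge_chunks(chunks, axis=0):
    """Hypothetical helper: concatenate the parts back along the same axis."""
    return np.concatenate(chunks, axis=axis)

matrix = np.arange(20).reshape(5, 4)
parts = split_chunks(matrix, axis=0, chunk_number=3)
restored = merge_chunks(parts, axis=0)
```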

Broadcast two vectors along rows and columns, respectively, and multiply them element-wise (Hadamard product).

Parameters

data1collection: Vector 1
data2collection: Vector 2
block_sizeint, optional: Block size used for block-wise processing. If the value is less than or equal to zero, no block operation is performed.
datamatrix_data, optional: Preallocated placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data: Result Matrix (CSR format)
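One plausible reading of this operation is an outer product: the first vector is stretched across columns, the second across rows, and the two are multiplied element-wise. The sketch below shows that interpretation with NumPy. The name `broadcast_hadamard` is hypothetical, the orientation of the result is an assumption, and blocking and CSR output are omitted.

```python
import numpy as np

def broadcast_hadamard(v1, v2):
    """Hypothetical helper: result[i, j] = v1[i] * v2[j] via broadcasting."""
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    # v1 becomes a column, v2 a row; broadcasting fills the full grid.
    return v1[:, None] * v2[None, :]

result = broadcast_hadamard([1, 2, 3], [10, 20])
```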

Util (.ul)

A universal tool interface that includes constant definitions, logging, and auxiliary functions.

sciv.ul.check_adata_get(adata: AnnData, layer: str | None = None, is_dense: bool = True, is_matrix: bool = False) → AnnData

Check if layer is in .layers, and instantiate a new AnnData with it as .X.

Parameters

adataAnnData: Input AnnData object.
layerstr, optional: Layer of the data. Default is None.
is_densebool, optional: Whether to return dense matrix. Default is True.
is_matrixbool, optional: Whether to return matrix. Default is False.

Returns

AnnData: Data.

sciv.ul.check_gpu_availability(verbose: bool = False) → bool

Check the availability of GPU.

Parameters

verbosebool, optional: Whether to print the information. Default is False.

Returns

bool: Whether the GPU is available.

Examples

>>> availability = sciv.ul.check_gpu_availability()

sciv.ul.file_method(name: str | None = None, is_verbose: bool = False) → StaticMethod

Create a file method handler.

Create a StaticMethod class instance based on the given name for handling file operations. If a name is provided, it will be combined with the project name as the handler file name; otherwise, only the project name will be used.

Parameters

namestr, optional: File handler name suffix. Default is None.
is_verbosebool, optional: Whether to display log information. Default is False.

Returns

StaticMethod: Configured StaticMethod class instance

sciv.ul.generate_hex_colors(num_colors: int) → list

Generate random hex colors.

Parameters

num_colorsint: Number of colors to generate.

Returns

list: List of random hex colors.

Examples

>>> colors3 = sciv.ul.generate_hex_colors(3)
>>> colors5 = sciv.ul.generate_hex_colors(5)
>>> print(f"Generate three colors: {colors3}")
>>> print(f"Generate five colors: {colors5}")

sciv.ul.generate_str(length: int = 10) → str

Generate a random string.

Parameters

lengthint, optional: Length of the string. Default is 10.

Returns

str: Random string.

sciv.ul.get_index(position: int | float, positions_list: list, is_sort: bool = True) → int | Tuple[int, int]

Search for a position in a list, similar to binary search. If the position exists in the list, return its index. If it does not exist, return the two indexes between which it is located.

Parameters

positionnumber: Position to search for.
positions_listlist: Position list.
is_sortbool, optional: Whether to sort the list. Default is True.

Returns

Union[int, Tuple[int, int]]: Position index.
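The lookup described above can be sketched with the standard `bisect` module. This is an illustration of the idea, not the SCIV code: the helper name `find_index` is hypothetical, and the handling of positions outside the range of the list is an assumption, since this reference does not specify it.

```python
from bisect import bisect_left

def find_index(position, positions_list, is_sort=True):
    """Hypothetical helper: exact index, or the pair of neighbouring indexes."""
    ordered = sorted(positions_list) if is_sort else list(positions_list)
    i = bisect_left(ordered, position)
    if i < len(ordered) and ordered[i] == position:
        return i
    # Not found: the position lies between indexes i - 1 and i.
    return i - 1, i

positions = [10, 20, 30, 40]
exact = find_index(30, positions)
between = find_index(25, positions)
```

Here `exact` is the index of 30 in the sorted list, while `between` is the pair of indexes surrounding 25.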

sciv.ul.get_real_predict_label(df: DataFrame, map_groupby: str | list | set | Tuple | ndarray, groupby: str = 'clusters', value: str = 'value') → Tuple[DataFrame, int, list]

Get the real and predicted labels of the trait.

Parameters

dfDataFrame: Input data.
map_groupbyUnion[str, collection]: Map of the cluster.
groupbystr, optional: Name of the column of the cluster. Default is “clusters”.
valuestr, optional: Name of the column of the value. Default is “value”.

Returns

Tuple[DataFrame, int, list]: The sorted DataFrame, the number of clusters, and the list of clusters.

sciv.ul.list_duplicate_set(data: list) → list

Append numbering to duplicated entries so that every element is unique. If data is None, return an empty list. If data is a list, return it as is. If data is a collection, return it converted to a list.

Parameters

datalist: Input data.

Returns

list: Unique data with the same number of elements as the input.
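A small standard-library sketch shows what numbering duplicates can look like. The helper name `number_duplicates` and the `_1`, `_2` suffix format are assumptions for illustration only; the suffix format SCIV actually uses is not given in this reference.

```python
from collections import Counter

def number_duplicates(items):
    """Hypothetical helper: keep the first occurrence, suffix later repeats."""
    seen = Counter()
    result = []
    for item in items:
        seen[item] += 1
        # First occurrence stays unchanged; repeats get a running number.
        result.append(item if seen[item] == 1 else f"{item}_{seen[item] - 1}")
    return result

labels = number_duplicates(["a", "b", "a", "a"])
```

The output has the same length as the input, and every label is distinct as long as no original item already looks like a generated suffix.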

sciv.ul.list_index(data: list) → Tuple[list, list | set | Tuple | ndarray]

Get the index of each element in a list.

Parameters

datalist: Input data.

Returns

Tuple[list, collection]: Index of each element in the list. Types of the elements in the list.

sciv.ul.log(name: str | None = None) → Logger

Create a log handler.

Create a Logger class instance based on the given name for logging. If a name is provided, it will be combined with the project name as the log file name; otherwise, only the project name will be used.

Parameters

namestr, optional: Log handler name suffix. Default is None.

Returns

Logger: Configured Logger class instance

sciv.ul.merge_matrix(datas: list, axis: Literal[0, 1] = 0) → list

Merge multiple matrices into one matrix.

Parameters

dataslist: Input data.
axisLiteral[0, 1], optional: Axis to merge the matrices. Default is 0.

Returns

list: Merged data.

sciv.ul.numerical_bisection_step(min_value: float, max_value: float, step_length: float) → Tuple[list | set | Tuple | ndarray, int]

Get the numerical bisection step.

Parameters

min_valuefloat: Minimum value of the step.
max_valuefloat: Maximum value of the step.
step_lengthfloat: Step length of the bisection.

Returns

Tuple[collection, int]: Numerical bisection step. Number of steps.

Set the infinite values of the matrix to the maximum value of the matrix.

Parameters

matrixmatrix_data: Input matrix.
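The replacement of infinite entries can be illustrated with NumPy as follows. This sketch makes several assumptions that the reference does not confirm: the helper name `replace_infinite_with_max` is hypothetical, only positive infinity is replaced, the maximum is taken over finite entries, and a copy is returned rather than modifying the input in place.

```python
import numpy as np

def replace_infinite_with_max(matrix):
    """Hypothetical helper: swap +inf entries for the largest finite entry."""
    result = np.array(matrix, dtype=float)  # works on a copy
    finite = np.isfinite(result)
    if finite.any():
        result[np.isposinf(result)] = result[finite].max()
    return result

raw = np.array([[1.0, np.inf],
                [3.0, 2.0]])
cleaned = replace_infinite_with_max(raw)
```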

Split a matrix into multiple parts.

Parameters

datamatrix_data: Input data.
axisLiteral[0, 1], optional: Axis to split the matrix. Default is 0.
chunk_numberint, optional: Number of parts to split the matrix. Default is 1000.

Returns

list: Split data.

sciv.ul.strings_map_numbers(str_list: list, start: int = 0) → list

Map strings to numerical values.

Parameters

str_listlist: Input strings.
startint, optional: Start value of the mapping. Default is 0.

Returns

list: Mapped numerical values.
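A mapping from strings to integer codes can be sketched in a few lines. The helper name `map_strings_to_numbers` is hypothetical, and assigning codes in order of first appearance is an assumption; SCIV may order the codes differently, for example alphabetically.

```python
def map_strings_to_numbers(str_list, start=0):
    """Hypothetical helper: give each distinct string a consecutive integer."""
    codes = {}
    for label in str_list:
        # A new string receives the next unused code.
        codes.setdefault(label, start + len(codes))
    return [codes[label] for label in str_list]

clusters = ["T cell", "B cell", "T cell", "NK cell"]
encoded = map_strings_to_numbers(clusters)
encoded_from_one = map_strings_to_numbers(clusters, start=1)
```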

Obtain the minimum and maximum row sums of the matrix.

If data is None, return (0, 0). For both dense and sparse matrices, return the minimum and maximum of the row sums.

Returns

Tuple[number, number]: Minimum row sum and maximum row sum.

Convert sparse matrix to dense matrix

Convert a sparse matrix to a dense matrix. If sm is None, return an empty dense matrix. If sm is a dense matrix, return it as is. If sm is a sparse matrix, return it converted to array form.

Returns

dense_data: Converted dense matrix.

Convert dense matrix to sparse matrix

Convert a dense matrix to a sparse matrix. If dm is None, return an empty sparse matrix. If dm is a sparse matrix, return it as is. If dm is a dense matrix, return it converted to sparse form.

Returns

sparse_matrix: Converted sparse matrix.

sciv.ul.track_with_memory(interval: float = 60) → Callable

Create memory tracking decorator

Create a decorator function that records memory usage at fixed intervals during function execution. Returns the result, elapsed time, and memory list.

Parameters

intervalfloat, optional: Sampling interval (seconds), default is 60 seconds.

Returns

Callable: Decorator function; when the wrapped function is called, it returns a dictionary containing:
- ‘result’: the original function’s return value.
- ‘time’: function execution time (seconds) if is_monitor is True, otherwise None.
- ‘memory’: list of sampled memory usage (bytes) if is_monitor is True, otherwise None.
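The general pattern of a sampling decorator can be sketched with the standard library alone. This is not the SCIV implementation: the name `track_memory` is hypothetical, `tracemalloc` is used as an assumed stand-in that measures Python allocations rather than total process memory, and the `is_monitor` switch mentioned above is not modeled.

```python
import threading
import time
import tracemalloc
from functools import wraps

def track_memory(interval=60.0):
    """Hypothetical decorator: sample traced memory while the function runs."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            samples = []
            stop = threading.Event()

            def sample():
                # Record once immediately, then every `interval` seconds.
                while True:
                    samples.append(tracemalloc.get_traced_memory()[0])
                    if stop.wait(interval):
                        break

            tracemalloc.start()
            sampler = threading.Thread(target=sample, daemon=True)
            started = time.perf_counter()
            sampler.start()
            try:
                result = func(*args, **kwargs)
            finally:
                # Always stop the sampler, even if the function raises.
                stop.set()
                sampler.join()
                tracemalloc.stop()
            elapsed = time.perf_counter() - started
            return {"result": result, "time": elapsed, "memory": samples}
        return wrapper
    return decorator

@track_memory(interval=0.01)
def build_squares(n):
    return [i * i for i in range(n)]

report = build_squares(10_000)
```

The wrapped call returns a dictionary with the same three keys as described above, so the original return value is read from `report["result"]`.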