2. SCIV API

Download (.dl)

Data download interface, used to download single-cell data and trait files.

sciv.dl.download_sc_atac_file(is_force: bool = False) -> None

Download the scATAC-seq file from the remote server to the local cache.

Parameters

is_forcebool, optional

If True, force re-download even if the file exists. Default is False.

Examples

>>> sciv.dl.download_sc_atac_file()

sciv.dl.download_trait_file(is_force: bool = False) -> None

Download the trait file from the remote server to the local cache.

Parameters

is_forcebool, optional

If True, force re-download even if the file exists. Default is False.

Examples

>>> sciv.dl.download_trait_file()

sciv.dl.download_trs_file(is_force: bool = False) -> None

Download the TRS file from the remote server to the local cache.

Parameters

is_forcebool, optional

If True, force re-download even if the file exists. Default is False.

Examples

>>> sciv.dl.download_trs_file()

sciv.dl.download_trs_score_file(is_force: bool = False) -> None

Download the TRS score file from the remote server to the local cache.

Parameters

is_forcebool, optional

If True, force re-download even if the file exists. Default is False.

Examples

>>> sciv.dl.download_trs_score_file()

sciv.dl.read_sc_atac_file() -> AnnData

Read the scATAC-seq file from the local cache.

Returns

AnnData

The scATAC-seq data.

Examples

>>> adata = sciv.dl.read_sc_atac_file()

sciv.dl.read_trait_file() -> Tuple[dict, DataFrame]

Read the trait files from the local cache.

Returns

Tuple[dict, DataFrame]

The trait data: the variant data for each trait (dict) and the trait annotation table (DataFrame).

Examples

>>> variants, trait_info = sciv.dl.read_trait_file()

sciv.dl.read_trs_file() -> AnnData

Read the TRS file from the local cache.

Returns

AnnData

The TRS data.

Examples

>>> trs = sciv.dl.read_trs_file()

sciv.dl.read_trs_score_file() -> AnnData

Read the TRS score file from the local cache.

Returns

AnnData

The TRS score data.

Examples

>>> trs_score = sciv.dl.read_trs_score_file()
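The download and read functions above are meant to be used in pairs: fetch a file into the local cache once, then load it from there. A minimal sketch of that pattern, assuming sciv is installed and the remote server is reachable. The helper name load_example_data is hypothetical; only the functions documented above are called.

```python
def load_example_data(is_force: bool = False):
    """Download the example files (if needed) and read them from the local cache."""
    # Imported lazily so that the helper can be defined without sciv installed.
    import sciv

    # Fetch the files into the local cache; is_force=True re-downloads existing files.
    sciv.dl.download_sc_atac_file(is_force=is_force)
    sciv.dl.download_trait_file(is_force=is_force)

    # Read the cached files back into memory.
    adata = sciv.dl.read_sc_atac_file()
    variants, trait_info = sciv.dl.read_trait_file()
    return adata, variants, trait_info
```

Calling load_example_data() a second time should reuse the cached files, since is_force defaults to False.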

File (.fl)

File read/write interface for single-cell ATAC data and for H5AD, H5, and other file formats.

sciv.fl.barcodes_add_anno(annotation_file: str | Path, cell_anno: DataFrame, clusters: str | None = None) -> DataFrame

Add user-supplied cell information to the cell annotation data.

Parameters

annotation_filepath

User-supplied file with additional cell information. It must contain a column named barcodes.

cell_annoDataFrame

Cell annotation data read from the scATAC-seq input files.

clustersstr, optional

Column name for cell clusters or cell types. (In most cases, this parameter can be left unset.) Only this column is checked for NA values: any NA values are replaced with unknown, and all other values are left unchanged.

Returns

DataFrame

The complete cell annotation data, including the user-supplied cell information.
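The handling of the barcodes column and of NA values in the clusters column can be pictured with plain Python. The sketch below is an illustration only, not sciv's implementation: lists of dicts stand in for DataFrames, the helper names are hypothetical, and the left-join behaviour is an assumption.

```python
import math


def is_na(value) -> bool:
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def add_annotation(cell_anno, annotation, clusters=None):
    """Merge extra per-barcode fields into cell_anno (both are lists of dicts)."""
    # Index the user-supplied rows by their barcode for fast lookup.
    extra = {row["barcodes"]: row for row in annotation}
    merged = []
    for cell in cell_anno:
        # Assumption: cells without a matching annotation row are kept as-is.
        row = {**cell, **extra.get(cell["barcodes"], {})}
        # Only the clusters column is checked for missing values.
        if clusters is not None and is_na(row.get(clusters)):
            row[clusters] = "unknown"
        merged.append(row)
    return merged


cells = [{"barcodes": "AAAC-1"}, {"barcodes": "AAAG-1"}]
anno = [{"barcodes": "AAAC-1", "cell_type": "T cell"}]
result = add_annotation(cells, anno, clusters="cell_type")
```

Here the second cell has no annotation row, so its cell_type is filled with unknown.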

sciv.fl.read_barcodes_file(barcodes_file: str | Path, clusters: str | None = None, barcode_split_character: str = '-', annotation_file: str | Path | None = None) -> DataFrame

Read a barcodes file.

Parameters

barcodes_filepath

Path to the barcodes file.

clustersstr, optional

Column name for cell clusters or cell types. (In most cases, this parameter can be left unset.) Only this column is checked for NA values: any NA values are replaced with unknown, and all other values are left unchanged.

barcode_split_characterstr, default=’-’

Character used to split barcode information (for metadata).

annotation_filepath, optional

User-supplied file with additional cell information. It must contain a column named barcodes.

Returns

DataFrame

Cell annotation data, including any user-supplied cell information.

sciv.fl.read_h5(file: str | Path, is_close: bool = False)

Read AnnData data from an h5 file.

Parameters

filepath

Path to the h5 file.

is_closebool, default=False

If True, close the file. Default is False.

Returns

AnnData data.

The loaded AnnData data from the h5 file.

sciv.fl.read_h5ad(file: str | Path, is_verbose: bool = True) -> AnnData

Read AnnData from an h5ad file.

Parameters

filepath

Path to the h5ad file.

is_verbosebool, default=True

If True, print log information. Default is True.

Returns

AnnData

The loaded AnnData object.

sciv.fl.read_pkl(file: str | Path, is_verbose: bool = True)

Read data from a pickle file.

Parameters

filepath

Path to the pickle file.

is_verbosebool, default=True

If True, print log information. Default is True.

Returns

Python variable data.

The loaded Python variable data from the pickle file.

sciv.fl.read_sc_atac(resource: str | Path | None = None, is_transpose: bool = True, barcode_split_character: str = '-', on_barcode_split_character: str | None = None, annotation_file: str | Path | None = None, clusters: str | None = None, peak_split_character: Tuple = (':', '-')) -> AnnData

Read scATAC-seq data and return it in AnnData format.

Parameters

resourcepath, optional

Input data source. Can be one of the following:
  1. Path to a directory containing the matrix, bed file, etc. (output from cell-ranger).

  2. H5 file obtained through cell-ranger.

  3. A comprehensive h5ad file.

  4. A table file with cell or peak columns and indexes, where the content is fragment counts.

Default is None.

is_transposebool, default=True

Whether the matrix must be transposed when the matrix file is read.

barcode_split_characterstr, default=’-’

Character used to split barcode information (for metadata).

on_barcode_split_characterstr, optional

Character used to split barcode information (for matrix). If None, uses barcode_split_character. Default is None.

annotation_filepath, optional

File containing additional cell information. Must contain a ‘barcodes’ column. Default is None.

clustersstr, optional

Column name for cell clusters or cell types. If NA values exist in this column, they will be assigned as ‘unknown’. Default is None.

peak_split_characterTuple, default=(“:”, “-“)

Characters used to split peak information (chromosome, start, end). First element splits chromosome from start, second splits start from end.

Returns

AnnData

scATAC-seq data in AnnData format with cell and peak annotations.
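To make the role of peak_split_character concrete, the sketch below splits a peak name into chromosome, start, and end. It is an illustration under stated assumptions, not sciv's parser: the helper name split_peak is hypothetical, and the peak names are made up.

```python
def split_peak(peak: str, peak_split_character=(":", "-")):
    """Split a peak name such as 'chr1:1000-1500' into (chromosome, start, end)."""
    chrom_sep, pos_sep = peak_split_character
    # The first character separates the chromosome from the start position.
    chrom, _, interval = peak.partition(chrom_sep)
    # The second character separates the start position from the end position.
    start, _, end = interval.partition(pos_sep)
    return chrom, int(start), int(end)


default_style = split_peak("chr1:1000-1500")
underscore_style = split_peak("chr1_1000_1500", peak_split_character=("_", "_"))
```

Both calls yield the same triple, which is why the separator pair has to match the naming scheme of the input peaks.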

sciv.fl.read_sc_atac_10x_h5(file: str | Path, clusters: str | None = None, barcode_split_character: str = '-', annotation_file: str | Path | None = None, peak_split_character: Tuple = (':', '-')) -> AnnData

Read an HDF5 file produced by Cell Ranger v3 or later.

Parameters

filepath

Path to the h5 file. (It can be obtained through cell-ranger.)

clustersstr, optional

Column name for cell clusters or cell types. (In most cases, this parameter can be left unset.) Only this column is checked for NA values: any NA values are replaced with unknown, and all other values are left unchanged.

barcode_split_characterstr, default=’-’

Character used to split barcode information (for metadata).

annotation_filepath, optional

File with additional cell information. It must contain a column named barcodes.

peak_split_charactertuple, default=(‘:’, ‘-’)

Characters used to split peak information (chromosome, start, end).

Returns

AnnData

scATAC-seq data.

sciv.fl.read_variants(base_path: str | Path | None = None, files: list | set | Tuple | ndarray | None = None, labels: dict | None = None, column_map: dict | None = None, repeat_symbol: str = '_#') -> Tuple[dict, DataFrame]

Read a set of variant files.

Parameters

base_pathpath, optional

Path where the variant trait data is stored. Each file must contain the following columns: chr, position, rsId, pp, where the ID represents the trait name. Default is None.

filescollection, optional

Collection of paths to variant trait data files. Default is None.

labelsdict, optional

Classification labels for each trait or disease. Default is None.

column_mapdict, optional

Mapping from the columns of the variant file to the required column names. For example: {0: “chr”, 1: “position”, 2: “rsId”, 3: “pp”}. Default is None.

repeat_symbolstr, default=”_#”

Symbol used to distinguish duplicate trait names. If two files have the same name abbreviation, this symbol and a number are appended to one of the abbreviations.

Returns

dict

Dictionary containing AnnData objects for each trait or disease, where keys are trait names and values are AnnData objects with variant information.

DataFrame

Annotated information on traits or diseases, including summary statistics such as pp_sum, pp_mean, count, and filename.
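The column_map and repeat_symbol parameters can be illustrated without sciv. The sketch below is a guess at the general idea, not the library's behaviour: the helper name unique_trait_names, the numbering scheme, the trait names, and the rsId are all assumptions made for the example.

```python
def unique_trait_names(names, repeat_symbol="_#"):
    """Append repeat_symbol plus a counter to repeated trait names."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        # Assumption: the first occurrence keeps its name, later ones get a suffix.
        result.append(name if count == 0 else f"{name}{repeat_symbol}{count}")
        seen[name] = count + 1
    return result


# Map positional columns of one row onto the required column names.
column_map = {0: "chr", 1: "position", 2: "rsId", 3: "pp"}
row = ["chr1", "12345", "rs0000001", "0.8"]  # made-up values
record = {column_map[index]: value for index, value in enumerate(row)}

trait_names = unique_trait_names(["HEIGHT", "BMI", "HEIGHT"])
```

With these inputs, the second HEIGHT file would be stored under a suffixed name so that it does not overwrite the first.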

sciv.fl.save_h5(data: dict, save_file: str | Path, group_name: str = 'matrix') -> None

Save data to an H5 file.

Parameters

datadict

Input data to save.

save_filepath

Path of the file to write.

group_namestr, default=”matrix”

Name of the group in the H5 file.

Returns

None

Writes the H5 file to save_file.

sciv.fl.save_h5ad(data: AnnData, file: str | Path) -> AnnData

Save an AnnData object to an h5ad file.

Parameters

dataAnnData

Input AnnData object to save.

filepath

Path to save file.

Returns

AnnData

The input AnnData object.

sciv.fl.save_pkl(data, save_file: str | Path, is_verbose: bool = False) -> None

Save Python data to a pickle file.

Parameters

dataany

Input data to save.

save_filepath

Path of the file to write.

is_verbosebool, default=False

If True, print log information. Default is False.

Returns

None

Writes the pickle file to save_file.

sciv.fl.to_fragments(adata: AnnData, fragments: str, layer: str | None = None, batch_size: int = 100000, is_sort: bool = True, is_gz: bool = True, is_keep: bool = False) -> None

Convert AnnData data into a fragments file.

Parameters

adataAnnData

Input AnnData object containing single-cell data.

fragmentsstr

Output file path for the fragments file.

layerstr, optional

The layer of data to use for generating fragments file. If None, uses the main data matrix (adata.X).

batch_sizeint, default=100000

Batch size for processing data. Smaller values reduce memory consumption.

is_sortbool, default=True

Whether to sort the output by chromosome and start position. Sorts chromosomes in natural order (chr1, chr2, …, chrX, chrY, chrM).

is_gzbool, default=True

Whether to compress the output file using gzip. Uses pysam.tabix_compress for compression.

is_keepbool, default=False

Whether to keep the uncompressed fragments file after compression. Only effective when is_gz is True. If False, the uncompressed file is deleted after successful compression.

Returns

None

Writes fragments file to the specified path.

Note

To export results processed by SnapATAC2, please use snapatac2.ex.export_fragments directly. Using this function is not recommended.
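The natural chromosome order mentioned for is_sort (chr1, chr2, …, chrX, chrY, chrM) differs from plain string sorting, which would place chr10 before chr2. A minimal sketch of such a sort key follows; it is an illustration, not sciv's code, and the helper name chrom_sort_key and the example records are hypothetical.

```python
def chrom_sort_key(chrom: str):
    """Sort key: numbered chromosomes first (numerically), then X, Y, M, then others."""
    name = chrom[3:] if chrom.startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    special = {"X": 1, "Y": 2, "M": 3, "MT": 3}
    # Unknown contigs go last, ordered alphabetically among themselves.
    return (1, special.get(name, 4), name)


# Fragment-like records: (chromosome, start, end, barcode, count), made-up values.
records = [
    ("chrM", 5, 50, "AAAC-1", 1),
    ("chr10", 100, 200, "AAAC-1", 2),
    ("chrX", 10, 90, "AAAG-1", 1),
    ("chr2", 300, 400, "AAAG-1", 1),
    ("chr2", 100, 150, "AAAC-1", 1),
]
# Sort by chromosome in natural order, then by start position.
sorted_records = sorted(records, key=lambda r: (chrom_sort_key(r[0]), r[1]))
sorted_chroms = [r[0] for r in sorted_records]
```

Sorting by chromosome and start position in this way is also what coordinate-based indexing tools generally expect of their input.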

sciv.fl.to_meta(adata: AnnData, dir_path: str | Path, layer: str | None = None, feature_name: str = 'peaks.bed', field: Literal['real', 'complex', 'pattern', 'integer'] | None = None) -> None

Convert an AnnData object into a metadata directory containing the matrix, feature files, etc.

This function exports single-cell data into standard 10x Genomics format, including:
- matrix.mtx: Sparse matrix file in Matrix Market format
- annotation.txt: Cell annotation information
- barcodes.tsv: Cell barcodes list
- peaks.bed or specified feature file: Genomic feature information

Parameters

adataAnnData

Input AnnData object containing single-cell data.

dir_pathpath

Output directory path for storing generated metadata files.

layerstr, optional

The layer of data to export to the metadata files. If None, uses adata.X as the main data matrix.

feature_namestr, default=”peaks.bed”

Output name for the feature file. If starts with “peaks”, feature indices will be parsed by chromosome position into BED format.

fieldLiteral[‘real’, ‘complex’, ‘pattern’, ‘integer’], optional

Matrix data type field. Available values:
- ‘real’: Real numbers
- ‘complex’: Complex numbers
- ‘pattern’: Pattern matrix (no values)
- ‘integer’: Integer values

If None, automatically determined from the data type.

Returns

None

Writes the metadata files to dir_path.
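The field values correspond to the data-type keyword in the header line of a Matrix Market coordinate file. The sketch below writes such a file body by hand for a tiny sparse matrix. It is an illustration of the file format only; the helper name to_matrix_market is hypothetical, the ‘complex’ field is left out for brevity, and nothing here reflects how sciv lays out its matrix.

```python
def to_matrix_market(entries, n_rows, n_cols, field="integer"):
    """Render zero-based (row, col, value) entries as Matrix Market coordinate text."""
    if field not in ("real", "integer", "pattern"):
        raise ValueError("this sketch supports only 'real', 'integer' and 'pattern'")
    lines = [
        f"%%MatrixMarket matrix coordinate {field} general",
        f"{n_rows} {n_cols} {len(entries)}",  # size line: rows, columns, non-zeros
    ]
    for row, col, value in entries:
        if field == "pattern":
            # Pattern matrices store positions only, without values.
            lines.append(f"{row + 1} {col + 1}")
        else:
            # Matrix Market indices are one-based.
            lines.append(f"{row + 1} {col + 1} {value}")
    return "\n".join(lines) + "\n"


mtx_text = to_matrix_market([(0, 0, 2), (1, 2, 1)], n_rows=2, n_cols=3)
```

Choosing ‘integer’ for count data keeps the file compact, while ‘pattern’ drops the values entirely and records only which entries are non-zero.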

Model (.ml)

Core model interface, providing functions for cell-type association analysis and causal variant identification.

sciv.ml.association_score(adata: AnnData, score_name: str = 'association_score', layer: str = 'trs_source', axis: Literal[0, 1] = 0) -> None

Calculate association score for traits or diseases. This function calculates the association score for traits or diseases based on the TRS (Trait Relevance Score) data in the input AnnData object.

Parameters

adataAnnData

Input AnnData object containing TRS data.

score_namestr, optional

Name of the score column in the AnnData object. Default is “association_score”.

layerstr, optional

Layer name in the AnnData object containing TRS data. Default is “trs_source”.

axisLiteral[0, 1], optional

Axis to calculate the score (0 for traits, 1 for diseases). Default is 0.

Returns

None

sciv.ml.core(adata: AnnData, variants: dict, trait_info: DataFrame, cell_rate: float | None = None, peak_rate: float | None = None, max_epochs: int = 500, lr: float = 1e-05, batch_size: int = 128, eps: float = 1e-08, early_stopping: bool = True, early_stopping_patience: int = 50, strategy: str = 'ddp_notebook_find_unused_parameters_true', batch_key: str | None = None, resolution: float = 0.5, k: int = 30, or_k: int = 10, weight: float = 0.5, kernel: Literal['laplacian', 'gaussian'] = 'gaussian', local_k: int = 10, kernel_gamma: float | str | list | set | Tuple | ndarray | None = None, epsilon: float = 1e-05, max_steps: int = 300, gamma: float = 0.05, enrichment_gamma: float = 0.05, p: int = 2, n_jobs: int = -1, min_seed_cell_rate: float = 0.01, max_seed_cell_rate: float = 0.05, credible_threshold: float = 0, diff_peak_value: Literal['emp_effect', 'bayes_factor', 'emp_prob1', 'all'] = 'emp_effect', enrichment_threshold: Literal['golden', 'half', 'e', 'pi', 'none'] | float = 'golden', is_ablation: bool = False, model_dir: str | Path | None = None, save_path: str | Path | None = None, is_simple: bool = True, is_save_random_walk_model: bool = False, is_file_exist_loading: bool = False, filename_dict: dict | None = None, block_size: int = -1) AnnData

The core algorithm of sciv. It runs the complete analysis workflow, including plotting and saving data. Throughout the algorithm, samples are in rows and traits or diseases are in columns. Traits or diseases do not interact with one another, which keeps the results stable.

Meaning of main variables:
  1. overlap_adata, (obs: peaks, var: traits/diseases) Peaks-traits/diseases data obtained by overlaying variant data with peaks.

  2. da_peaks, (obs: clusters (Leiden), var: peaks) Differential peak data of cell clustering, used for weight correction of cells.

  3. init_score, (obs: cells, var: traits/diseases) This is the initial TRS data.

  4. cc_data, (obs: cells, var: cells) Cell similarity data.

  5. random_walk, RandomWalk class.

  6. trs, (obs: cells, var: traits/diseases) This is the final TRS data.
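For orientation, the step from init_score to trs can be pictured as a random walk with restart over the cell similarity network. The sketch below is a generic textbook version for a single trait, not sciv's RandomWalk class: the update rule, the column normalization, and the helper name random_walk_with_restart are assumptions. The arguments gamma, epsilon, and max_steps mirror the parameters of the same names documented for this function.

```python
def random_walk_with_restart(weights, seed, gamma=0.05, epsilon=1e-5, max_steps=300):
    """Propagate seed scores over a weighted cell-cell network (lists of lists)."""
    n = len(weights)
    # Column-normalize the similarity matrix so that each column sums to 1.
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [weights[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Normalize the seed vector (it must contain at least one non-zero entry).
    total = sum(seed)
    restart = [s / total for s in seed]
    score = restart[:]
    for _ in range(max_steps):
        walked = [sum(transition[i][j] * score[j] for j in range(n)) for i in range(n)]
        # With weight gamma the walker jumps back to the seed distribution.
        updated = [(1 - gamma) * w + gamma * r for w, r in zip(walked, restart)]
        change = sum(abs(u - s) for u, s in zip(updated, score))
        score = updated
        if change < epsilon:  # stop once the scores no longer move
            break
    return score


# Three cells in a chain (0 - 1 - 2), with cell 0 as the only seed cell.
similarity = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = random_walk_with_restart(similarity, seed=[1, 0, 0])
```

In this toy network, cell 0 ends up with a higher score than cell 2 even though both have the same neighbor, because the restart term keeps returning mass to the seed.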

Parameters

adataAnnData

scATAC-seq data.

variantsdict

Variant data. This data is recommended to be obtained by executing the fl.read_variants method.

trait_infoDataFrame

Variant annotation file information.

cell_rateOptional[float], default None

Filtering threshold expressed as a fraction of the total cell count. It only takes effect when the min_cells parameter is None.

peak_rateOptional[float], default None

Filtering threshold expressed as a fraction of the total peak count. It only takes effect when the min_peaks parameter is None.

max_epochsint, default 500

The maximum number of epochs for PoissonVI training.

lrfloat, default 1e-5

Learning rate for optimization.

batch_sizeint, default 128

Minibatch size to use during training.

epsfloat, default 1e-08

Optimizer eps.

early_stoppingbool, default True

Whether to perform early stopping with respect to the validation set.

early_stopping_patienceint, default 50

How many epochs to wait for improvement before early stopping.

strategystr, default “ddp_notebook_find_unused_parameters_true”

DDP strategy.

batch_keyOptional[str], default None

Batch information in scATAC-seq data.

resolutionfloat, default 0.5

Resolution for Leiden clustering. Recommended values are 0.4, 0.9, 1.3, or 1.5.

kint, default 30

Number of neighbors connected to each node when building the mKNN network (AND operation).

or_kint, default 10

Number of neighbors connected to each node when building the mKNN network (OR operation).

weightfloat, default 0.5

The weight used to combine the AND and OR operations.

kernelLiteral[“laplacian”, “gaussian”], default “gaussian”

Determine the kernel function to be used.

local_kint, default 10

Determining the number of neighbors for the adaptive kernel.

kernel_gammaOptional[Union[float, str, collection]], default None

When the value of kernel is “laplacian”, if it is None, then it is the reciprocal of the latent representation dimension of the cell. When the value of kernel is “gaussian”, if it is None, then it defaults to an adaptive value obtained through local information of the parameter local_k. Otherwise, it should be strictly positive.

epsilonfloat, default 1e-05

Conditions for stopping in random walk.

max_stepsint, default 300

Maximum number of steps in a random walk with restart.

gammafloat, default 0.05

Reset weight for random walk.

enrichment_gammafloat, default 0.05

Reset weight for random walk for enrichment.

pint, default 2

Distance used for loss {1: Manhattan distance, 2: Euclidean distance}.

n_jobsint, default -1

The maximum number of concurrently running jobs.

min_seed_cell_ratefloat, default 0.01

The minimum percentage of seed cells in all cells.

max_seed_cell_ratefloat, default 0.05

The maximum percentage of seed cells in all cells.

credible_thresholdfloat, default 0

The threshold for determining the credibility of enriched cells in the context of enrichment, i.e. the threshold for judging enriched cells.

diff_peak_valueLiteral[‘emp_effect’, ‘bayes_factor’, ‘emp_prob1’, ‘all’], default ‘emp_effect’

The value used for correction in the differential peak correction of cell clusters. One of {‘emp_effect’, ‘bayes_factor’, ‘emp_prob1’, ‘all’}.

enrichment_thresholdUnion[enrichment_optional, float], default ‘golden’

Threshold applied to the standardized output TRS to select the enriched portion of the results. Accepts one of the strings {‘golden’, ‘half’, ‘e’, ‘pi’, ‘none’} or a float in the range (0, log1p(1)).

is_ablationbool, default False

If True, obtain the results of the ablation experiment. This parameter only takes effect when is_simple is set to False.

model_dirOptional[path], default None

The folder in which the trained model is saved. If a trained model file (model.pt) already exists in this path, it is loaded automatically and the training of the PoissonVI model is skipped.

save_pathOptional[path], default None

Save path for process files and result files.

is_simplebool, default True

If True, only the final result is added, without unnecessary intermediate variables. Note that is_ablation has no effect when this is True; it only takes effect when this is False.

is_save_random_walk_modelbool, default False

If False (the default), the random walk model is not saved. When setting this to True, ensure sufficient storage, as the saved pkl file is relatively large.

is_file_exist_loadingbool, default False

By default, the file will be overwritten. When set to True, if the file exists, the process will be skipped and the file will be directly read as the result.

filename_dictOptional[dict], default None

The names of the existing files. Default: {“sc_atac”: “sc_atac.h5ad”, “da_peaks”: “da_peaks.h5ad”, “atac_overlap”: “atac_overlap.h5ad”, “init_score”: “init_score.h5ad”, “cc_data”: “cc_data.h5ad”, “random_walk”: “random_walk.h5ad”, “trs”: “trs.h5ad”}

block_sizeint, default -1

The block size used for block-wise matrix multiplication, which trades extra running time for lower memory consumption. If the value is less than or equal to zero, no block operation is performed.

Returns

AnnData

AnnData object containing TRS (Trait Relevance Score) results. (obs: cells, var: traits/diseases) This is the final TRS data.
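A typical call reads the scATAC-seq data and the variant files, runs the core algorithm, and saves the result. The sketch below is one possible arrangement, assuming sciv is installed. The helper name run_sciv and all paths are hypothetical, and only parameters documented in this reference are passed.

```python
def run_sciv(sc_atac_path="data/sc_atac", variants_dir="data/variants", save_path="result"):
    """Run the sciv core algorithm on user-supplied data (all paths are placeholders)."""
    # Imported lazily so that the helper can be defined without sciv installed.
    import sciv

    # Read the scATAC-seq data and the variant (trait) files.
    adata = sciv.fl.read_sc_atac(resource=sc_atac_path)
    variants, trait_info = sciv.fl.read_variants(base_path=variants_dir)

    # Run the full workflow; existing intermediate files in save_path are reused.
    trs = sciv.ml.core(
        adata=adata,
        variants=variants,
        trait_info=trait_info,
        save_path=save_path,
        is_file_exist_loading=True,
    )

    # Store the final TRS data (obs: cells, var: traits/diseases).
    sciv.fl.save_h5ad(trs, file=f"{save_path}/trs_result.h5ad")
    return trs
```

Setting is_file_exist_loading=True is a convenience for repeated runs; leave it at the default False to recompute and overwrite the intermediate files.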

sciv.ml.knock(trs: AnnData, sc_atac: AnnData, da_peaks: AnnData, cc_data: AnnData, knock_trait: str, knock_info: dict[str, Union[str, list, set, Tuple, numpy.ndarray]], knock_value: float = 0, is_add_control: bool = False) -> AnnData

Perform variant knockdown or knockout analysis on a specific trait.

This function simulates the effect of knocking down or knocking out specific variants associated with a trait, and re-runs the random walk algorithm to compute the resulting TRS (Trait Relevance Score) changes.

Parameters

trsAnnData

TRS result data from ml.core, containing parameters, variants, trait_info and trs_source.

sc_atacAnnData

scATAC-seq data used in the original analysis.

da_peaksAnnData

Differential accessibility peaks data from the original analysis.

cc_dataAnnData

Cell-cell similarity network data from the original analysis.

knock_traitstr

The trait ID to perform knockdown/knockout on.

knock_infodict[str, Union[str, collection]]

Dictionary mapping knock group names to variant IDs (rsId) to be knocked down. Each key is a group name, and each value is either a single variant ID (str) or a collection of variant IDs to knock down together.

knock_valuefloat, default 0

The value to set for knocked-down variants. Default is 0 (complete knockout). Values >= 1e-3 are not recommended as they may not achieve the desired effect.

is_add_controlbool, default False

Whether to add control experiments (knocking out background variants).

Returns

AnnData

AnnData object containing TRS results after knockdown/knockout. Includes knock parameters in .uns[“params”] and knock-specific metadata.
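Because each value in knock_info may be either a single rsId or a collection of rsIds, it can help to picture the dictionary in a normalized form. The sketch below shows one way to do that. The helper name normalize_knock_info, the group names, and the rsIds are made up for illustration; whether sciv normalizes its input this way is not stated in this reference.

```python
def normalize_knock_info(knock_info):
    """Return a copy of knock_info in which every value is a list of variant IDs."""
    normalized = {}
    for group, variant_ids in knock_info.items():
        if isinstance(variant_ids, str):
            # A single rsId: this group knocks down one variant.
            normalized[group] = [variant_ids]
        else:
            # A collection: the variants in this group are knocked down together.
            normalized[group] = list(variant_ids)
    return normalized


knock_info = {
    "single_variant": "rs0000001",
    "variant_pair": ["rs0000001", "rs0000002"],
}
groups = normalize_knock_info(knock_info)
```

A dictionary like knock_info above would define two knock groups: one that removes a single variant and one that removes two variants at once.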

Plot (.pl)

Visualization interface, including multiple chart types for data analysis and presentation.

Graph

Network diagram visualization functions.

sciv.pl.communities_graph(adata: AnnData, labels: list | set | Tuple | ndarray, layer: str | None = None, groupby: str = 'clusters', x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, node_size: float = 2.0, line_widths: float = 0.001, start_color_index: int = 0, color_step_size: int = 0, output: str | Path | None = None, show: bool = True, close: bool = False)

Plot a cell-cell network diagram with community detection coloring.

This function visualizes a network graph where nodes represent cells and edges represent connections between cells. Nodes are colored based on their community assignments.

Parameters

adataAnnData

Annotated data matrix with observations (cells) and variables (genes).

labelscollection

Community labels for grouping nodes. Each community is a collection of node indices.

layerstr, optional

Name of the layer in adata to use for adjacency matrix. If None, uses adata.X.

groupbystr, default=”clusters”

Column name in adata.obs containing cluster information for color assignment.

x_namestr, optional

Label for the x-axis.

y_namestr, optional

Label for the y-axis.

titlestr, optional

Title of the plot.

widthfloat, default=2

Width of the figure in inches.

heightfloat, default=2

Height of the figure in inches.

bottomfloat, default=0

Bottom margin adjustment.

node_sizefloat, default=2.0

Size of the nodes in the network.

line_widthsfloat, default=0.001

Width of the node edges and network edges.

start_color_indexint, default=0

Starting index for color selection from the color palette.

color_step_sizeint, default=0

Step size for selecting colors from the palette for different communities.

outputpath, optional

Path to save the figure.

showbool, default=True

Whether to display the figure.

closebool, default=False

Whether to close the figure after display.

Returns

None

The function displays and/or saves the network plot.

sciv.pl.graph(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, labels: list | set | Tuple | ndarray | None = None, node_size: int = 50, name: str | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, is_font: bool = False, output: str | Path | None = None, show: bool = True, close: bool = False) None

Plot a graph from an adjacency matrix.

Parameters

datamatrix_data

Adjacency matrix representing the graph connections.

labelscollection, optional

Labels for each node in the graph.

node_sizeint, default=50

Size of the nodes in the plot.

namestr, optional

Name of the graph.

x_namestr, optional

Label for the x-axis.

y_namestr, optional

Label for the y-axis.

titlestr, optional

Title of the plot.

widthfloat, default=2

Width of the figure in inches.

heightfloat, default=2

Height of the figure in inches.

bottomfloat, default=0

Bottom margin adjustment.

is_fontbool, default=False

Whether to display node labels.

outputpath, optional

Path to save the figure.

showbool, default=True

Whether to display the figure.

closebool, default=False

Whether to close the figure after display.

sciv.pl.network_two_types(data_pairs: list, type1_scores: dict, type2_scores: dict, type1_node_size: dict | list | float | None = 50, type2_node_size: dict | list | float | None = 50, label_nodes: list | None = None, width: float = 4, height: float = 3, k: float | None = None, iterations: int = 50, scale: float = 1, radius: float = 0.35, type1_node_shape: str = 'o', type2_node_shape: str = 's', type1_bar_label: str = 'Score', type2_bar_label: str = 'Score', type1_cmap_str: str = 'winter', type2_cmap_str: str = 'YlOrRd', node_alpha: float = 0.8, edge_alpha: float = 0.8, is_fluctuate: bool = True, layout_type: str = 'spring', output: str | Path | None = None, show: bool = True, close: bool = False)

Plot a bipartite network graph with two types of nodes.

This function visualizes a network where nodes are divided into two distinct types (e.g., genes and variants), with edges representing connections between them. Each node type can have different sizes, colors, and shapes based on their scores.

Parameters

data_pairslist

List of tuples representing edges between type1 and type2 nodes.

type1_scoresdict

Dictionary mapping type1 node names to their score values for color mapping.

type2_scoresdict

Dictionary mapping type2 node names to their score values for color mapping.

type1_node_sizeUnion[dict, list, float], default=50

Size of type1 nodes. Can be a single value, list, or dict mapping nodes to sizes.

type2_node_sizeUnion[dict, list, float], default=50

Size of type2 nodes. Can be a single value, list, or dict mapping nodes to sizes.

label_nodeslist, optional

List of node names to display labels for.

widthfloat, default=4

Width of the figure in inches.

heightfloat, default=3

Height of the figure in inches.

kfloat, optional

Optimal distance between nodes for spring layout. If None, uses default.

iterationsint, default=50

Number of iterations for spring layout optimization.

scalefloat, default=1

Scale factor for the layout positions.

radiusfloat, default=0.35

Radius for positioning connected nodes around their parent nodes in custom layouts.

type1_node_shapestr, default=’o’

Matplotlib marker shape for type1 nodes.

type2_node_shapestr, default=’s’

Matplotlib marker shape for type2 nodes.

type1_bar_labelstr, default=’Score’

Label for the color bar of type1 nodes.

type2_bar_labelstr, default=’Score’

Label for the color bar of type2 nodes.

type1_cmap_strstr, default=”winter”

Colormap name for type1 node colors.

type2_cmap_strstr, default=”YlOrRd”

Colormap name for type2 node colors.

node_alphafloat, default=0.8

Transparency level for nodes (0-1).

edge_alphafloat, default=0.8

Transparency level for edges (0-1).

is_fluctuatebool, default=True

Whether to add random fluctuation to node positions in custom layouts.

layout_typestr, default=’spring’

Layout algorithm to use. Options: ‘spring’, ‘kamada_kawai’, ‘circular’, ‘shell’, ‘circular_type1’, ‘circular_type2’, ‘square_type1’, ‘square_type2’.

outputpath, optional

Path to save the figure.

showbool, default=True

Whether to display the figure.

closebool, default=False

Whether to close the figure after display.

Returns

None

The function displays and/or saves the network plot.
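The three required inputs of network_two_types fit together as an edge list plus one score dictionary per node type. The sketch below builds a small input set and checks that every node in an edge has a score. The node names, the scores, the helper names, and the assumption that each tuple is ordered (type1, type2) are all illustrative; the plotting call itself needs sciv installed and is therefore wrapped in a function.

```python
# Edges between type1 nodes (here: genes) and type2 nodes (here: variants).
data_pairs = [("GENE_A", "rs0000001"), ("GENE_A", "rs0000002"), ("GENE_B", "rs0000002")]
type1_scores = {"GENE_A": 0.9, "GENE_B": 0.4}
type2_scores = {"rs0000001": 0.7, "rs0000002": 0.2}


def missing_scores(pairs, scores1, scores2):
    """Return the nodes that appear in an edge but have no score entry."""
    missing1 = {first for first, _ in pairs} - set(scores1)
    missing2 = {second for _, second in pairs} - set(scores2)
    return missing1, missing2


def plot_network(output=None):
    """Draw the network with the documented parameters (requires sciv)."""
    import sciv

    sciv.pl.network_two_types(
        data_pairs,
        type1_scores,
        type2_scores,
        label_nodes=["GENE_A"],
        layout_type="spring",
        output=output,
        show=False,
    )


unscored = missing_scores(data_pairs, type1_scores, type2_scores)
```

Checking the inputs before plotting makes it easier to spot a node that would otherwise lack a color value.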

Heatmap

Heatmap visualization functions.

sciv.pl.heatmap(adata: AnnData, layer: str | None = None, title: str | None = None, width: float = 4, height: float = 4, bottom: float = 0, annot: bool = False, square: bool = True, is_cluster: bool = False, cmap: str = 'Oranges', line_widths: float = 1, fmt: str = '.2f', rotation: float = 65, x_name: str | None = None, y_name: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Generate a simple heatmap using seaborn.

Parameters

adataAnnData

Input AnnData object containing the data matrix.

layerstr, default None

Layer name in adata.layers to use for plotting. If None, uses adata.X.

titleOptional[str], default None

Title of the figure.

widthfloat, default 4

Width of the figure in inches.

heightfloat, default 4

Height of the figure in inches.

bottomfloat, default 0

Bottom margin of the figure.

annotbool, default False

Whether to annotate each cell with its numeric value.

squarebool, default True

Whether to make cells square-shaped.

is_clusterbool, default False

Whether to perform hierarchical clustering (uses clustermap instead of heatmap).

cmapstr, default “Oranges”

Colormap for the heatmap.

line_widthsfloat, default 1

Width of the lines that divide cells.

fmtstr, default “.2f”

String formatting code for annotations.

rotationfloat, default 65

Rotation angle for x-axis labels.

x_namestr, default None

Label for the x-axis.

y_namestr, default None

Label for the y-axis.

outputpath, default None

File path to save the figure. If None, figure is not saved.

showbool, default True

Whether to display the figure.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to seaborn heatmap or clustermap.

Returns

None

Displays or saves the heatmap figure.

sciv.pl.heatmap_annotation(adata: AnnData, layer: str | None = None, width: float = 4, height: float = 4, title: str | None = None, label: str = 'value', row_name: str | None = None, col_name: str | None = None, row_names: str | None = None, col_names: str | None = None, row_anno_label: bool = False, col_anno_label: bool = False, row_anno_text: bool = False, col_anno_text: bool = False, row_legend: bool = False, col_legend: bool = False, row_show_names: bool = False, col_show_names: bool = False, row_cluster: bool = False, col_cluster: bool = False, cluster_method: str = 'average', cluster_metric: str = 'correlation', row_names_side: str = 'left', col_names_side: str = 'bottom', bottom: float = 0.01, label_size: float = 9, fontsize: float = 9, level_bar_height: float | None = None, anno_specific_labels: list | None = None, x_label_rotation: float = 245, y_label_rotation: float = 0, row_color_start_index: int = 0, col_color_start_index: int = 10, row_split: int | Series | None = None, col_split: int | Series | None = None, row_split_order: list | str | None = None, col_split_order: list | str | None = None, row_split_gap: float = 0.5, col_split_gap: float = 0.2, frac: float = 0.2, relpos: Tuple = (0, 1), anno_label_height: float | None = None, selected_anno_label_height: float = 2.5, category_height: float | None = 2.5, x_name: str | None = None, y_name: str | None = None, row_score_name: str = 'association_score', cmap: str = 'Oranges', is_sort: bool = True, show: bool = True, close: bool = False, output: str | Path | None = None, **kwargs) None

Generate a heatmap with row and column annotations.

Parameters

adataAnnData

Input AnnData object containing the data matrix and metadata.

layerOptional[str], default None

Layer name in adata.layers to use for plotting. If None, uses adata.X.

widthfloat, default 4

Width of the figure in inches.

heightfloat, default 4

Height of the figure in inches.

titleOptional[str], default None

Title of the figure.

labelstr, default “value”

Label for the heatmap color bar.

row_nameOptional[str], default None

Column name in adata.obs for row annotations.

col_nameOptional[str], default None

Column name in adata.var for column annotations.

row_namesOptional[str], default None

Column name in adata.obs to use as row index labels.

col_namesOptional[str], default None

Column name in adata.var to use as column index labels.

row_anno_labelbool, default False

Whether to display merged labels for row annotations.

col_anno_labelbool, default False

Whether to display merged labels for column annotations.

row_anno_textbool, default False

Whether to display text labels on row annotation bars.

col_anno_textbool, default False

Whether to display text labels on column annotation bars.

row_legendbool, default False

Whether to show legend for row annotations.

col_legendbool, default False

Whether to show legend for column annotations.

row_show_namesbool, default False

Whether to display row names (index labels) on the heatmap.

col_show_namesbool, default False

Whether to display column names (index labels) on the heatmap.

row_clusterbool, default False

Whether to perform hierarchical clustering on rows.

col_clusterbool, default False

Whether to perform hierarchical clustering on columns.

cluster_methodstr, default “average”

Linkage method for hierarchical clustering (e.g., “average”, “single”, “complete”).

cluster_metricstr, default “correlation”

Distance metric for hierarchical clustering (e.g., “correlation”, “euclidean”).

row_names_sidestr, default “left”

Side to display row names (“left” or “right”).

col_names_sidestr, default “bottom”

Side to display column names (“top” or “bottom”).

bottomfloat, default 0.01

Bottom margin of the figure.

label_sizefloat, default 9

Font size for row and column name labels.

fontsizefloat, default 9

Font size for axis titles.

level_bar_heightfloat, default None

Height of the association score bar plot annotation.

anno_specific_labelslist, default None

List of specific row labels to highlight in the annotation.

x_label_rotationfloat, default 245

Rotation angle for x-axis labels (column names).

y_label_rotationfloat, default 0

Rotation angle for y-axis labels (row names).

row_color_start_indexint, default 0

Starting index in the color palette for row annotations.

col_color_start_indexint, default 10

Starting index in the color palette for column annotations.

row_splitUnion[int, pd.Series], default None

Number of clusters or grouping series for splitting rows.

col_splitUnion[int, pd.Series], default None

Number of clusters or grouping series for splitting columns.

row_split_orderUnion[list, str], default None

Order for row splits or ‘cluster_between_groups’ for auto-clustering.

col_split_orderUnion[list, str], default None

Order for column splits or ‘cluster_between_groups’ for auto-clustering.

row_split_gapfloat, default 0.5

Gap size between row splits in mm.

col_split_gapfloat, default 0.2

Gap size between column splits in mm.

fracfloat, default 0.2

Fraction parameter for annotation label positioning.

relposTuple, default (0, 1)

Relative position for annotation labels.

anno_label_heightOptional[float], default None

Height of the annotation label bar.

selected_anno_label_heightfloat, default 2.5

Height of the selected annotation label bar.

category_heightOptional[float], default 2.5

Height of the category annotation bar.

x_nameOptional[str], default None

Label for the x-axis.

y_nameOptional[str], default None

Label for the y-axis.

row_score_namestr, default “association_score”

Column name in adata.obs for the association score bar plot.

cmapstr, default “Oranges”

Colormap for the heatmap.

is_sortbool, default True

Whether to sort rows and columns before plotting.

showbool, default True

Whether to display the figure.

closebool, default False

Whether to close the figure after saving.

outputpath, default None

File path to save the figure. If None, figure is not saved.

**kwargs

Additional keyword arguments passed to ClusterMapPlotter.

Returns

None

Displays or saves the heatmap figure.
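
The default cluster_metric, “correlation”, groups rows by the shape of their profile rather than by magnitude. The sketch below is a minimal pure-Python illustration, assuming the metric follows the usual definition of correlation distance (1 minus the Pearson correlation); this reference does not state how sciv computes it, and the helper name correlation_distance is hypothetical.

```python
import math

def correlation_distance(u, v):
    """Return 1 - Pearson correlation of two equal-length sequences (assumed definition)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return 1.0 - numerator / denominator

row_a = [1, 2, 3]
row_b = [10, 20, 30]   # same shape as row_a, ten times larger
row_c = [3, 2, 1]      # opposite trend

print(correlation_distance(row_a, row_b))  # close to 0: the rows would cluster together
print(correlation_distance(row_a, row_c))  # close to 2: the rows are maximally distant
```

With “euclidean” instead, row_a and row_b would be far apart because of their difference in scale.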

Scatter

Scatter chart visualization function.

sciv.pl.manhattan_causal_variant(df: DataFrame, y: str = 'pp', chr_name: str = 'chr', label: str = 'rsId', size: int = 30, labels: list | None = None, colors: list | None = None, width: float = 8, height: float = 2, bottom: float = 0, title: str | None = None, is_sort: bool = True, line_width: float = 0.5, y_round: int = 3, x_name: str | None = 'Chromosome', y_name: str | None = 'pp', y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a Manhattan plot for causal variant visualization across chromosomes.

Parameters

dfDataFrame

Input data containing variant information with chromosome and position data

ystr, default “pp”

Column name for y-axis values (typically posterior probability or p-value)

chr_namestr, default “chr”

Column name for chromosome identifiers

labelstr, default “rsId”

Column name for variant labels/identifiers

sizeint, default 30

Size of scatter points

labelsOptional[list], optional

List of specific variant labels to annotate on the plot

colorsOptional[list], optional

Custom color palette for different chromosomes

widthfloat, default 8

Figure width in inches

heightfloat, default 2

Figure height in inches

bottomfloat, default 0

Bottom margin adjustment

titlestr, optional

Plot title

is_sortbool, default True

Whether to sort data by chromosome

line_widthfloat, default 0.5

Width of separator lines between chromosomes and grid lines

y_roundint, default 3

Number of decimal places for y-value annotations

x_nameOptional[str], default “Chromosome”

Label for x-axis

y_nameOptional[str], default “pp”

Label for y-axis

y_limitTuple[float, float], default (0, 1)

Y-axis limits for the plot

outputpath, optional

Output file path

showbool, default True

Whether to display the plot

closebool, default False

Whether to close the figure after saving

**kwargsAny

Additional arguments passed to ax.axvline
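
The following sketch shows one way to prepare the inputs for this function: ordering variants by chromosome and choosing which labels to annotate. It uses plain dictionaries in place of DataFrame rows. The records, the chrom_key helper and the 0.5 cutoff are illustrative assumptions; the ordering sciv applies when is_sort is True is not described here and may differ.

```python
import re

# Hypothetical variant records standing in for rows of `df`.
variants = [
    {"chr": "chr10", "rsId": "rs3", "pp": 0.12},
    {"chr": "chr2", "rsId": "rs2", "pp": 0.91},
    {"chr": "chrX", "rsId": "rs4", "pp": 0.40},
    {"chr": "chr1", "rsId": "rs1", "pp": 0.05},
]

def chrom_key(name):
    """Sort numbered chromosomes numerically, then named ones (X, Y, ...) alphabetically."""
    suffix = re.sub(r"^chr", "", name)
    if suffix.isdigit():
        return (0, int(suffix), "")
    return (1, 0, suffix)

ordered = sorted(variants, key=lambda v: chrom_key(v["chr"]))

# Candidate values for the `labels` parameter: variants with pp of at least 0.5.
labels = [v["rsId"] for v in ordered if v["pp"] >= 0.5]

print([v["chr"] for v in ordered])
print(labels)

# With sciv and pandas installed, the call would then look like:
# sciv.pl.manhattan_causal_variant(df, y="pp", chr_name="chr", label="rsId", labels=labels)
```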

sciv.pl.pseudo_time_score(df: DataFrame, x: str, y: str, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 1.2, bottom: float = 0, alpha: float = 0.65, line_width: float = 1.5, step_length: int = 5, polyorder: int = 1, size: float | list | set | Tuple | ndarray = 1.0, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a scatter plot showing pseudo-time scores with a smoothed trend line.

Parameters

dfDataFrame

Input data containing pseudo-time and score values

xstr

Column name for pseudo-time values (x-axis)

ystr

Column name for score values (y-axis)

x_namestr, optional

Label for x-axis

y_namestr, optional

Label for y-axis

titlestr, optional

Plot title

widthfloat, default 2

Figure width in inches

heightfloat, default 1.2

Figure height in inches

bottomfloat, default 0

Bottom margin adjustment

alphafloat, default 0.65

Transparency of scatter points

line_widthfloat, default 1.5

Width of the smoothed trend line

step_lengthint, default 5

Step length for determining Savitzky-Golay filter window size

polyorderint, default 1

Polynomial order for Savitzky-Golay filter

sizeUnion[float, collection], default 1.0

Size of scatter points

outputpath, optional

Output file path

showbool, default True

Whether to display the plot

closebool, default False

Whether to close the figure after saving

**kwargsAny

Additional arguments passed to ax.scatter
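
With the default polyorder of 1, a Savitzky-Golay filter reduces to a centered moving average at interior points, because a straight line fitted over a symmetric window passes through the window mean at its center. The sketch below illustrates that smoothing in pure Python. The moving_average helper and the window of 3 are assumptions for illustration; how step_length is converted into the filter window is not specified in this reference, and edge handling here (a truncated window) differs from a true Savitzky-Golay filter.

```python
def moving_average(values, window):
    """Centered moving average with a truncated window at both ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical (pseudo-time, score) pairs; the trend line needs them ordered by pseudo-time.
points = [(3, 0.4), (1, 0.1), (2, 0.3), (5, 0.9), (4, 0.5)]
points.sort(key=lambda p: p[0])
scores = [score for _, score in points]

smoothed = moving_average(scores, window=3)
print(smoothed)
```

A larger window gives a smoother trend line at the cost of flattening short-lived changes in the score.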

sciv.pl.scatter_3d(df: DataFrame, x: str, y: str, z: str, hue: str | None = None, x_name: str | None = None, y_name: str | None = None, z_name: str | None = None, title: str | None = None, width: float = 7, height: float = 7, elev: float = 30, azim: float = -60, is_add_legend: bool = True, cmap: str | ListedColormap = 'tab20', font_size: int = 14, edge_color: str | None = None, size: float | list | set | Tuple | ndarray = 0.1, legend_name: str | None = None, is_add_max_label: bool = False, text_left_offset: float = 0.5, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a 3D scatter plot with customizable aesthetics.

Parameters

dfDataFrame

Input data containing x, y, z coordinates

xstr

Column name for x-axis values

ystr

Column name for y-axis values

zstr

Column name for z-axis values

huestr, optional

Column name for color grouping

x_namestr, optional

Label for x-axis

y_namestr, optional

Label for y-axis

z_namestr, optional

Label for z-axis

titlestr, optional

Plot title

widthfloat, default 7

Figure width in inches

heightfloat, default 7

Figure height in inches

elevfloat, default 30

Elevation angle for 3D view

azimfloat, default -60

Azimuth angle for 3D view

is_add_legendbool, default True

Whether to add legend

cmapUnion[str, ListedColormap], default ‘tab20’

Colormap for coloring

font_sizeint, default 14

Font size for labels and title

edge_colorstr, optional

Edge color for scatter points

sizeUnion[float, collection], default 0.1

Size of scatter points

legend_namestr, optional

Title for legend

is_add_max_labelbool, default False

Whether to add label for maximum z value point

text_left_offsetfloat, default 0.5

Horizontal offset for max value label

outputpath, optional

Output file path

showbool, default True

Whether to display the plot

closebool, default False

Whether to close the figure after saving

**kwargsAny

Additional arguments passed to ax.scatter

sciv.pl.scatter_atac(adata: AnnData, columns: Tuple[str, str] = ('UMAP1', 'UMAP2'), groupby: str = 'clusters', hue_order: list | None = None, width: float = 2, height: float = 2, x_name: str | None = None, y_name: str | None = None, start_color_index: int = 0, color_step_size: int = 0, type_colors: list | set | Tuple | ndarray | None = None, edge_color: str | None = None, size: float = 1.0, text_fontsize: float = 7, legend_fontsize: float = 7, is_text: bool = False, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a scatter plot for ATAC-seq data with cluster coloring.

Parameters

adataAnnData

AnnData object containing observations and coordinates

columnsTuple[str, str], default (“UMAP1”, “UMAP2”)

Column names for x and y coordinates in adata.obs

groupbystr, default “clusters”

Column name for cluster labels in adata.obs

hue_orderlist, optional

Order of clusters for legend

widthfloat, default 2

Figure width in inches

heightfloat, default 2

Figure height in inches

x_namestr, optional

Label for x-axis

y_namestr, optional

Label for y-axis

start_color_indexint, default 0

Starting index in color palette

color_step_sizeint, default 0

Step size for color selection

type_colorscollection, optional

Custom color palette

edge_colorstr, optional

Edge color for scatter points

sizefloat, default 1.0

Size of scatter points

text_fontsizefloat, default 7

Font size for annotation text

legend_fontsizefloat, default 7

Font size for legend text

is_textbool, default False

Whether to add text annotations

outputpath, optional

Output file path

showbool, default True

Whether to display the plot

closebool, default False

Whether to close the figure after saving

**kwargsAny

Additional arguments passed to scatter_base

sciv.pl.scatter_base(df: DataFrame, x: str, y: str, hue: str | None = None, hue_order: list | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, bar_label: str | None = None, cmap: str = 'Oranges', width: float = 2, height: float = 2, right: float = 0.9, bottom: float = 0, text_fontsize: float = 7, legend_fontsize: float = 7, start_color_index: int = 0, color_step_size: int = 0, type_colors: list | set | Tuple | ndarray | None = None, edge_color: str | None = None, size: float | list | set | Tuple | ndarray = 1.0, legend: dict | None = None, number: bool = False, is_text: bool = False, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a base scatter plot with customizable aesthetics.

Parameters

dfDataFrame

Input data containing x, y coordinates and optional hue values

xstr

Column name for x-axis values

ystr

Column name for y-axis values

huestr, optional

Column name for color grouping

hue_orderlist, optional

Order of hue categories for legend

x_namestr, optional

Label for x-axis

y_namestr, optional

Label for y-axis

titlestr, optional

Plot title

bar_labelstr, optional

Label for colorbar when number=True

cmapstr, default “Oranges”

Colormap for continuous coloring

widthfloat, default 2

Figure width in inches

heightfloat, default 2

Figure height in inches

rightfloat, default 0.9

Position for legend anchor

bottomfloat, default 0

Bottom margin adjustment

text_fontsizefloat, default 7

Font size for annotation text

legend_fontsizefloat, default 7

Font size for legend text

start_color_indexint, default 0

Starting index in color palette

color_step_sizeint, default 0

Step size for color selection

type_colorscollection, optional

Custom color palette

edge_colorstr, optional

Edge color for scatter points

sizeUnion[float, collection], default 1.0

Size of scatter points

legenddict, optional

Mapping to rename hue categories

numberbool, default False

Whether to use continuous color scale

is_textbool, default False

Whether to add text annotations

outputpath, optional

Output file path

showbool, default True

Whether to display the plot

closebool, default False

Whether to close the figure after saving

**kwargsAny

Additional arguments passed to sns.scatterplot

sciv.pl.scatter_trait(trait_adata: AnnData, title: str | None = None, bar_label: str | None = None, trait_name: str = 'All', layers: list | set | Tuple | ndarray | None = None, columns: Tuple[str, str] = ('UMAP1', 'UMAP2'), cmap: str = 'viridis', width: float = 2, height: float = 2, right: float = 0.9, x_name: str | None = None, y_name: str | None = None, number: bool = True, edge_color: str | None = None, size: float | list | set | Tuple | ndarray = 1.0, text_fontsize: float = 7, legend_fontsize: float = 7, start_color_index: int = 0, color_step_size: int = 0, type_colors: list | set | Tuple | ndarray | None = None, is_text: bool = False, legend: dict | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a scatter plot of trait data.

Parameters

trait_adataAnnData

AnnData object containing trait/disease scores and cell metadata

titlestr, optional

Title prefix for the plot

bar_labelstr, optional

Label for colorbar when number=True

trait_namestr, default “All”

Name of trait/disease to plot, or “All” to plot all traits

layersUnion[None, collection], optional

List of layer names to plot from trait_adata.layers

columnsTuple[str, str], default (“UMAP1”, “UMAP2”)

Column names for x and y coordinates in trait_adata.obs

cmapstr, default “viridis”

Colormap for continuous coloring

widthfloat, default 2

Figure width in inches

heightfloat, default 2

Figure height in inches

rightfloat, default 0.9

Position for legend anchor

x_namestr, optional

Label for x-axis

y_namestr, optional

Label for y-axis

numberbool, default True

Whether to use continuous color scale for trait scores

edge_colorstr, optional

Edge color for scatter points

sizeUnion[float, collection], default 1.0

Size of scatter points

text_fontsizefloat, default 7

Font size for annotation text

legend_fontsizefloat, default 7

Font size for legend text

start_color_indexint, default 0

Starting index in color palette

color_step_sizeint, default 0

Step size for color selection

type_colorscollection, optional

Custom color palette

is_textbool, default False

Whether to add text annotations

legenddict, optional

Mapping to rename hue categories

outputpath, optional

Output directory path for saving plots

showbool, default True

Whether to display the plot

closebool, default False

Whether to close the figure after saving

**kwargsAny

Additional arguments passed to scatter_base

sciv.pl.volcano_base(df: DataFrame, x: str = 'Log2(Fold change)', y: str = '-Log10(P value)', hue: str = 'type', size: int = 3, palette: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, y_min: float = 0, axh_value: float = 3.0, axv_left_value: float = -1, axv_right_value: float = 1, title: str | None = None, x_name: str | None = None, y_name: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a volcano plot.

Parameters

dfDataFrame

Input data containing the x, y and hue columns.

xstr, default “Log2(Fold change)”

Column name for x-axis values.

ystr, default “-Log10(P value)”

Column name for y-axis values.

huestr, default “type”

Column name for color grouping.

sizeint, default 3

Size of scatter points.

paletteOptional[list], optional

Color palette for the hue categories.

widthfloat, default 2

Figure width in inches.

heightfloat, default 2

Figure height in inches.

bottomfloat, default 0

Bottom margin adjustment.

y_minfloat, default 0

Lower limit of the y-axis.

axh_valuefloat, default 3.0

Y value of the horizontal reference line.

axv_left_valuefloat, default -1

X value of the left vertical reference line.

axv_right_valuefloat, default 1

X value of the right vertical reference line.

titlestr, optional

Plot title.

x_nameOptional[str], optional

Label for the x-axis.

y_nameOptional[str], optional

Label for the y-axis.

outputpath, optional

Output file path.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to sns.scatterplot.

Returns

None
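
The default column names suggest the input table holds log2 fold changes, -log10 P values and a category column used for coloring. The sketch below shows one way such a table could be derived, assuming axh_value, axv_left_value and axv_right_value act as significance and fold-change cutoffs. The classify helper, the category names and the records are hypothetical and are not part of sciv.

```python
import math

def classify(log2_fc, neg_log10_p, axh_value=3.0, axv_left_value=-1.0, axv_right_value=1.0):
    """Assign a category from the fold-change and significance cutoffs (assumed semantics)."""
    if neg_log10_p < axh_value:
        return "not significant"
    if log2_fc >= axv_right_value:
        return "up"
    if log2_fc <= axv_left_value:
        return "down"
    return "not significant"

# Hypothetical (name, fold change, P value) records.
records = [
    ("geneA", 4.0, 1e-5),
    ("geneB", 0.25, 1e-6),
    ("geneC", 1.0, 0.04),
    ("geneD", 8.0, 0.01),
]

rows = []
for name, fold_change, p_value in records:
    log2_fc = math.log2(fold_change)
    neg_log10_p = -math.log10(p_value)
    rows.append({
        "name": name,
        "Log2(Fold change)": log2_fc,
        "-Log10(P value)": neg_log10_p,
        "type": classify(log2_fc, neg_log10_p),
    })

types = [row["type"] for row in rows]
print(types)

# With pandas and sciv installed, the rows could then be plotted with:
# sciv.pl.volcano_base(pandas.DataFrame(rows))
```

Note that geneD has a large fold change but fails the significance cutoff, so it is not counted as up-regulated.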

Violin

Violin chart visualization function.

sciv.pl.violin_base(df: DataFrame, value: str = 'value', x_name: str | None = None, y_name: str = 'value', kind: Literal['strip', 'swarm', 'box', 'violin', 'boxen', 'point', 'bar', 'count'] = 'violin', groupby: str = 'clusters', palette: Tuple | list | None = None, hue: str | None = None, width: float = 2, height: float = 2, bottom: float = 0.3, rotation: float = 65, line_width: float = 0.5, title: str | None = None, split: bool = False, is_sort: bool = True, order_names: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

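Create a violin plot, or another categorical plot selected by kind.

Parameters

dfDataFrame

Input data.

valuestr, default “value”

Column name containing the values to plot.

x_namestr, optional

Label for the x-axis.

y_namestr, default “value”

Label for the y-axis.

kind_Kind, default “violin”

Kind of categorical plot: one of “strip”, “swarm”, “box”, “violin”, “boxen”, “point”, “bar” or “count”.

groupbystr, default “clusters”

Column name containing cluster assignments.

paletteUnion[Tuple, list], optional

Color palette for the plot.

huestr, optional

Column name for color grouping.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0.3

Bottom margin of the figure.

rotationfloat, default 65

Rotation angle for x-axis labels in degrees.

line_widthfloat, default 0.5

Width of the plot lines.

titlestr, optional

Title of the plot.

splitbool, default False

Whether to split the violins when using hue.

is_sortbool, default True

Whether to sort the clusters.

order_nameslist, optional

Custom order for cluster names.

outputpath, optional

File path to save the figure.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments.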

Returns

None

sciv.pl.violin_trait(trait_df: DataFrame, trait_name: str | list = 'All', trait_column_name: str = 'id', value: str = 'value', groupby: str = 'clusters', kind: Literal['strip', 'swarm', 'box', 'violin', 'boxen', 'point', 'bar', 'count'] = 'violin', x_name: str | None = None, y_name: str = 'value', palette: Tuple | None = None, width: float = 2, height: float = 2, rotation: float = 65, line_width: float = 0.1, bottom: float = 0.3, split: bool = False, is_sort: bool = True, order_names: list | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Plot violin plot for trait data.

This function creates violin plots (or other categorical plots) for trait data, allowing visualization of trait distributions across different clusters.

Parameters

trait_dfDataFrame

Input trait data containing trait information and values.

trait_nameUnion[str, list], optional

Name(s) of the trait(s) to plot. Use “All” to plot all traits.

trait_column_namestr, optional

Column name in trait_df that contains trait identifiers.

valuestr, optional

Column name containing the values to plot.

groupbystr, optional

Column name containing cluster assignments.

kind_Kind, optional

Type of categorical plot to create (e.g., “violin”, “box”, “strip”).

x_namestr, optional

Label for the x-axis.

y_namestr, optional

Label for the y-axis.

paletteTuple, optional

Color palette for the plot.

widthfloat, optional

Width of the figure in inches.

heightfloat, optional

Height of the figure in inches.

rotationfloat, optional

Rotation angle for x-axis labels in degrees.

line_widthfloat, optional

Width of the plot lines.

bottomfloat, optional

Bottom margin of the figure.

splitbool, optional

Whether to split the violin plot when using hue.

is_sortbool, optional

Whether to sort clusters by median value.

order_nameslist, optional

Custom order for cluster names.

titlestr, optional

Title prefix for the plot.

outputpath, optional

Directory path to save the output files.

showbool, optional

Whether to display the plot.

closebool, optional

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to violin_base.

Returns

None

Box

Box plot visualization function.

sciv.pl.box_base(df: DataFrame, x: str = 'clusters', y: str = 'value', x_name: str | None = None, y_name: str = 'value', palette: Tuple | list | None = None, width: float = 2, height: float = 2, bottom: float = 0.3, line_width: float = 0.3, marker_size: float = 0.2, rotation: float = 65, orient: str | None = None, title: str | None = None, whis: float = 1.5, show_fliers: bool = True, is_sort: bool = True, order_names: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a box plot with customizable styling options.

Parameters

dfDataFrame

Input data containing the values to plot.

xstr, default “clusters”

Column name for the x-axis categorical variable.

ystr, default “value”

Column name for the y-axis numerical variable.

x_namestr, optional

Custom label for the x-axis. If None, uses the x column name.

y_namestr, default “value”

Custom label for the y-axis.

paletteUnion[Tuple, list], optional

Color palette for the boxes. If None and “color” column exists, uses that.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0.3

Bottom margin adjustment for the plot.

line_widthfloat, default 0.3

Width of lines in the plot (box edges, whiskers, etc.).

marker_sizefloat, default 0.2

Size of outlier markers.

rotationfloat, default 65

Rotation angle for x-axis tick labels in degrees.

orientstr, optional

Orientation of the plot (“v” for vertical, “h” for horizontal).

titlestr, optional

Title of the plot.

whisfloat, default 1.5

Proportion of the IQR past the low and high quartiles to extend the whiskers.

show_fliersbool, default True

Whether to display outlier points beyond the whiskers.

is_sortbool, default True

Whether to sort boxes by median value in descending order.

order_nameslist, optional

Custom order for x-axis categories. Only used if is_sort is False.

outputpath, optional

File path to save the plot. If None, plot is not saved.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after displaying.

**kwargsAny

Additional keyword arguments passed to seaborn.boxplot.
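
The effect of whis and is_sort can be checked by hand. The sketch below is a pure-Python illustration of the standard box plot rules: whiskers reach the most extreme data points within whis times the IQR of the quartiles, and boxes are ordered by descending median. The groups and the whisker_bounds helper are hypothetical, and the quartile interpolation used by the plotting backend may differ slightly from the "inclusive" method chosen here.

```python
import statistics

# Hypothetical values per cluster, standing in for the `x` and `y` columns of `df`.
groups = {
    "T cell": [1, 2, 3, 4, 100],
    "B cell": [5, 6, 7, 8, 9],
    "NK": [0, 1, 1, 2, 2],
}

def whisker_bounds(values, whis=1.5):
    """Return the lowest and highest data points within whis * IQR of the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return min(inside), max(inside)

# Order clusters by median, largest first, as described for is_sort=True.
order = sorted(groups, key=lambda name: statistics.median(groups[name]), reverse=True)

print(order)
print(whisker_bounds(groups["T cell"]))  # 100 lies beyond the upper fence, so it is a flier
```

Setting show_fliers to False would hide points such as the 100 in the "T cell" group without changing the whiskers.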

sciv.pl.box_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', value: str = 'value', groupby: str = 'clusters', x_name: str | None = None, y_name: str = 'value', palette: Tuple | list | None = None, orient: str | None = None, width: float = 2, height: float = 2, line_width: float = 0.1, marker_size: float = 0.5, bottom: float = 0.3, rotation: float = 65, whis: float = 1.5, show_fliers: bool = True, is_sort: bool = True, order_names: list | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create box plots for trait/disease data across different clusters.

This function generates box plots for each trait or a specific trait from the input dataframe. It filters data by trait and creates individual box plots using the box_base function.

Parameters

trait_dfDataFrame

Input data containing trait/disease information and values to plot.

trait_namestr, default “All”

Name of the trait/disease to plot. Use “All” to plot all traits.

trait_column_namestr, default “id”

Column name in trait_df that contains trait/disease identifiers.

valuestr, default “value”

Column name for the numerical values to be plotted on y-axis.

groupbystr, default “clusters”

Column name for the cluster categories to be plotted on x-axis.

x_namestr, optional

Custom label for the x-axis. If None, uses the clusters column name.

y_namestr, default “value”

Custom label for the y-axis.

paletteUnion[Tuple, list], optional

Color palette for the boxes.

orientstr, optional

Orientation of the plot (“v” for vertical, “h” for horizontal).

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

line_widthfloat, default 0.1

Width of lines in the plot.

marker_sizefloat, default 0.5

Size of outlier markers.

bottomfloat, default 0.3

Bottom margin adjustment for the plot.

rotationfloat, default 65

Rotation angle for x-axis tick labels in degrees.

whisfloat, default 1.5

Proportion of the IQR to extend the whiskers.

show_fliersbool, default True

Whether to display outlier points beyond the whiskers.

is_sortbool, default True

Whether to sort boxes by median value.

order_nameslist, optional

Custom order for x-axis categories.

titlestr, optional

Base title for the plots. Trait name will be appended.

outputpath, optional

Directory path to save the plots. If None, plots are not saved.

showbool, default True

Whether to display the plots.

closebool, default False

Whether to close the figure after displaying.

**kwargsAny

Additional keyword arguments passed to box_base function.
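
The per-trait filtering described above can be sketched as follows, using a list of dictionaries in place of trait_df. The split_by_trait helper and the sample rows are hypothetical; they only illustrate how trait_name and trait_column_name select the rows that each individual box plot receives.

```python
from collections import defaultdict

# Hypothetical long-format trait data: one row per (trait, cell) value.
rows = [
    {"id": "height", "clusters": "c1", "value": 0.2},
    {"id": "height", "clusters": "c2", "value": 0.7},
    {"id": "asthma", "clusters": "c1", "value": 0.9},
    {"id": "asthma", "clusters": "c2", "value": 0.1},
]

def split_by_trait(rows, trait_name="All", trait_column_name="id"):
    """Group rows by trait, keeping every trait for "All" or only the named one."""
    selected = defaultdict(list)
    for row in rows:
        trait = row[trait_column_name]
        if trait_name == "All" or trait == trait_name:
            selected[trait].append(row)
    return dict(selected)

all_traits = split_by_trait(rows)
one_trait = split_by_trait(rows, trait_name="asthma")

print(list(all_traits))   # one plot per trait
print(list(one_trait))    # a single plot

# Each subset would then be drawn separately, for example:
# for trait, subset in all_traits.items():
#     sciv.pl.box_base(pandas.DataFrame(subset), x="clusters", y="value", title=trait)
```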

KDE

Kernel density estimation (KDE) plot visualization function.

sciv.pl.kde(adata: AnnData, layer: str | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 4, height: float = 2, bottom: float = 0.3, axis: Literal[-1, 0, 1] = -1, sample_number: int = 1000000, is_legend: bool = True, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Plot Kernel Density Estimation (KDE) for single-cell data.

Parameters

adataAnnData

Annotated data matrix with observations (rows) and variables (columns).

layerstr, optional

Which layer of adata to use. If None, uses adata.X.

x_namestr, optional

Label for the x-axis.

y_namestr, optional

Label for the y-axis.

titlestr, optional

Title of the plot.

widthfloat, default 4

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0.3

Bottom margin of the figure.

axisLiteral[-1, 0, 1], default -1

Axis along which to compute the KDE:

-1: Flatten all data and compute a single KDE.

0: Compute a KDE for each column (variable).

1: Compute a KDE for each row (observation).

sample_numberint, default 1000000

Maximum number of samples to use for KDE computation. If the data exceeds this, random downsampling is applied.

is_legendbool, default True

Whether to display the legend when axis is 0 or 1.

outputpath, optional

Path to save the figure. If None, figure is not saved.

showbool, default True

Whether to display the figure.

closebool, default False

Whether to close the figure after displaying.

**kwargsAny

Additional keyword arguments passed to seaborn.kdeplot.
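
The axis and sample_number options determine which sets of values receive a density curve. The sketch below illustrates the three modes on a small nested list standing in for adata.X. The kde_inputs helper and the series names are hypothetical, and applying the downsampling to each series separately is an assumption; this reference does not say at which stage sciv downsamples.

```python
import random

def kde_inputs(matrix, axis=-1, sample_number=1_000_000, seed=0):
    """Return a mapping of series name to the values that would get one density curve."""
    if axis == -1:
        series = {"all": [value for row in matrix for value in row]}
    elif axis == 0:
        series = {f"var{j}": [row[j] for row in matrix] for j in range(len(matrix[0]))}
    elif axis == 1:
        series = {f"obs{i}": list(row) for i, row in enumerate(matrix)}
    else:
        raise ValueError("axis must be -1, 0 or 1")
    rng = random.Random(seed)
    # Randomly downsample any series that holds more than sample_number values.
    return {
        name: rng.sample(values, sample_number) if len(values) > sample_number else values
        for name, values in series.items()
    }

matrix = [[1, 2, 3], [4, 5, 6]]  # 2 observations (rows) by 3 variables (columns)

print(kde_inputs(matrix, axis=-1))   # a single curve over all six values
print(kde_inputs(matrix, axis=0))    # three curves, one per variable
print(kde_inputs(matrix, axis=1))    # two curves, one per observation
print(len(kde_inputs(matrix, axis=-1, sample_number=4)["all"]))  # downsampled to 4 values
```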

Line

Line chart visualization function.

sciv.pl.base_line(data: AnnData | DataFrame, x: str, y: str, layer: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, title: str | None = None, x_name: str | None = None, y_name: str | None = None, label: str | None = None, legend: str | None = None, legend_list: list | None = None, start_color_index: int = 0, color_step_size: int = 0, color_type: str = 'set', colors: list | None = None, line_width: float = 1.5, x_name_rotation: float = 65, x_ticks: int | list | set | Tuple | ndarray | None = None, y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, is_str: bool = True, show: bool = True, close: bool = False, **kwargs: Any) None

Base line plot function for visualizing data trends over time or categories.

This function creates a line plot from either AnnData or DataFrame objects, supporting grouped data visualization with customizable colors, legends, and styling.

Parameters

dataUnion[AnnData, DataFrame]

Input data object, can be either AnnData (single-cell data) or pandas DataFrame.

xstr

Column name to use for x-axis values.

ystr

Column name to use for y-axis values.

layerOptional[str], default None

Specific layer to use from AnnData.layers when data is AnnData.

widthfloat, default 2

Figure width in inches.

heightfloat, default 2

Figure height in inches.

bottomfloat, default 0

Bottom margin adjustment for the plot.

titleOptional[str], default None

Title of the plot.

x_nameOptional[str], default None

Label for x-axis. If None, uses x column name.

y_nameOptional[str], default None

Label for y-axis. If None, uses y column name.

labelOptional[str], default None

Column name used for grouping data (creates separate lines).

legendOptional[str], default None

Title for the legend. If None and label is provided, uses “category”.

legend_listlist, default None

List of specific group values to include in the plot.

start_color_indexint, default 0

Starting index for color selection from the color palette.

color_step_sizeint, default 0

Step size for selecting colors from the palette.

color_typestr, default “set”

Type of color palette to use (key from plot_color_types).

colorslist, default None

Custom list of colors to use for the plot.

line_widthfloat, default 1.5

Width of the lines in the plot.

x_name_rotationfloat, default 65

Rotation angle for x-axis tick labels (in degrees).

x_ticksOptional[Union[int, collection]], default None

Custom tick positions or number of ticks for x-axis.

y_limitTuple[float, float], default (0, 1)

Y-axis limits as (min, max) tuple.

outputOptional[path], default None

File path to save the figure. If None, figure is not saved.

is_strbool, default True

Whether to treat x-axis values as strings (affects tick formatting).

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after display.

**kwargsAny

Additional keyword arguments passed to seaborn.lineplot.

Returns

None

The function displays and/or saves the plot but does not return any value.

Bar

Bar chart visualization function.

sciv.pl.bar(ax_x: list | set | Tuple | ndarray, ax_y: list | set | Tuple | ndarray, x_name: str | None = None, y_name: str | None = None, title: str | None = None, color: str = '#70b5de', text_color: str = '#000205', width: float = 2, height: float = 2, bottom: float = 0, text_left_move: float = 0.1, direction: Literal['vertical', 'horizontal'] = 'vertical', output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a simple bar chart with optional value labels.

This function generates a bar plot (vertical or horizontal) with customizable appearance and automatically adds numerical value labels on each bar.

Parameters

ax_xcollection

Categories or labels for the x-axis (or y-axis if horizontal).

ax_ycollection

Numerical values for the bar heights (or widths if horizontal).

x_namestr, optional

Label for the x-axis. Default is None.

y_namestr, optional

Label for the y-axis. Default is None.

titlestr, optional

Title of the plot. Default is None.

colorstr, default “#70b5de”

Color of the bars.

text_colorstr, default “#000205”

Color of the value labels on bars.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0

Bottom margin adjustment.

text_left_movefloat, default 0.1

Horizontal adjustment for text position on bars.

directionLiteral[‘vertical’, ‘horizontal’], default “vertical”

Orientation of the bars.

outputpath, optional

File path to save the figure. Default is None.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to matplotlib’s bar/barh function.

Returns

None

The function displays and/or saves the plot but does not return any value.
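Examples

A minimal sketch of a call, using only parameters listed in the signature above. The cluster names and counts are made-up illustration data, and the sciv import is guarded so that the data preparation still runs where the package is not installed.

```python
# Sketch: one bar per cell type, with illustrative (made-up) counts.
try:
    import sciv
except ImportError:  # keep the sketch runnable without the package
    sciv = None

cell_counts = {"B cell": 120, "T cell": 340, "Monocyte": 95}

ax_x = list(cell_counts.keys())    # categories along the x-axis
ax_y = list(cell_counts.values())  # one numeric value per category

if sciv is not None:
    sciv.pl.bar(
        ax_x,
        ax_y,
        x_name="Cell type",
        y_name="Cell count",
        width=3,       # the 2-inch default is small for labelled bars
        height=3,
        show=False,    # do not open a window in this sketch
    )
```

ax_x and ax_y are matched by position, so both collections should have the same length.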

sciv.pl.bar_significance(df: DataFrame, x: str, y: str, hue: str, x_name: str | None = None, y_name: str | None = None, anchor: str | None = None, legend: str | None = None, legend_list: list | None = None, hue_order: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, legend_gap: float = 1.15, line_width: float = 0.5, capsize: float = 0.1, errcolor: str = 'k', start_color_index: int = 0, color_step_size: int = 0, color_type: str = 'set', test: str = 't-test_ind', ci: str | float = 'sd', x_rotation: float = 0, x_deviation: float = 0.02, y_deviation: float = 0.02, y_limit: Tuple[float, float] = (0, 1), anno: bool = False, anno_fontsize: float = 7, line_height: float = 0.01, line_offset: float = 0.01, colors: list | dict | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a bar chart with statistical significance annotations relative to an anchor group.

This function generates a grouped bar plot with error bars and performs pairwise statistical significance testing between an anchor group and other groups within each category. It supports custom color palettes, legend positioning, and various statistical tests.

Parameters

dfDataFrame

Input DataFrame containing the data to plot.

xstr

Column name for x-axis categories.

ystr

Column name for y-axis values.

huestr

Column name for grouping bars by color.

x_namestr, optional

Label for x-axis. Default is None.

y_namestr, optional

Label for y-axis. Default is None.

anchorstr, optional

Reference group name for pairwise significance testing. If provided, statistical comparisons will be made between this group and all other groups within each x category.

legendstr, optional

Legend title. If None (the default), “category” is used.

legend_listlist, optional

Subset of hue values to include in the plot. If provided, only these values will be plotted. Default is None.

hue_orderlist, optional

Order of hue categories for plotting and legend. Default is None.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0

Bottom margin adjustment.

legend_gapfloat, default 1.15

Vertical gap between plot and legend, specified as a ratio of the y-axis height.

line_widthfloat, default 0.5

Width of error bars and significance annotation lines.

capsizefloat, default 0.1

Width of the error bar caps.

errcolorstr, default “k”

Color of the error bars.

start_color_indexint, default 0

Starting index in the color palette for the first hue category.

color_step_sizeint, default 0

Step size when cycling through the color palette for subsequent hue categories.

color_typestr, default “set”

Name of the seaborn color palette to use. Must be a key in plot_color_types.

teststr, default “t-test_ind”

Statistical test for pairwise comparisons. Options include: {“t-test_ind”, “t-test_welch”, “t-test_paired”, “Mann-Whitney”, “Mann-Whitney-gt”, “Mann-Whitney-ls”, “Levene”, “Wilcoxon”, “Kruskal”, “Brunner-Munzel”}.

ciUnion[str, float], default “sd”

Confidence interval type or value for error bars. Can be “sd” for standard deviation or a float for confidence interval percentage.

x_rotationfloat, default 0

Rotation angle for x-axis tick labels in degrees.

x_deviationfloat, default 0.02

Horizontal offset for bar value annotations.

y_deviationfloat, default 0.02

Vertical offset adjustment for bar value annotations.

y_limitTuple[float, float], default (0, 1)

Y-axis limits for the plot.

annobool, default False

Whether to annotate bars with their numerical values.

anno_fontsizefloat, default 7

Font size for bar value annotations.

line_heightfloat, default 0.01

Height of significance annotation lines as a fraction of y-axis range.

line_offsetfloat, default 0.01

Vertical offset for significance annotation lines from the bar tops.

colorsUnion[list, dict], optional

Custom color list or dictionary mapping hue values to colors. If provided, overrides the default color palette. Default is None.

titlestr, optional

Title of the plot. Default is None.

outputpath, optional

File path to save the figure. Default is None.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to seaborn’s barplot function.

Returns

None

The function displays and/or saves the plot but does not return any value.
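Examples

A minimal sketch of the long-form input this function appears to expect: one row per observation, with one column each for x, y and hue. The column names (cell_type, method, score), the group names and the scores are hypothetical. The pandas and sciv imports are guarded so that the data preparation runs without them.

```python
# Sketch: replicate scores per (cell type, method), compared against an anchor.
try:
    import pandas as pd
    import sciv
except ImportError:  # keep the sketch runnable without the packages
    pd = sciv = None

# Hypothetical replicate scores; several values per group give the
# error bars and the pairwise test something to work with.
scores = {
    ("B cell", "method_a"): [0.62, 0.66, 0.64],
    ("B cell", "method_b"): [0.48, 0.51, 0.45],
    ("T cell", "method_a"): [0.71, 0.69, 0.74],
    ("T cell", "method_b"): [0.70, 0.68, 0.73],
}

records = [
    {"cell_type": cell_type, "method": method, "score": score}
    for (cell_type, method), values in scores.items()
    for score in values
]

anchor = "method_a"  # reference group within each x category
hue_values = sorted({row["method"] for row in records})

if sciv is not None:
    df = pd.DataFrame(records)
    sciv.pl.bar_significance(
        df,
        x="cell_type",
        y="score",
        hue="method",
        anchor=anchor,
        test="t-test_ind",
        show=False,
    )
```

The anchor should be one of the values in the hue column, since comparisons are made between it and the other groups.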

sciv.pl.bar_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', value: str = 'rate', groupby: str = 'clusters', x_name: str = 'Cell type', y_name: str = 'Enrichment ratio', color: Tuple = ('#2e6fb7', '#f7f7f7'), legend: Tuple = ('Enrichment', 'Conservative'), text_color: str = '#000205', groupby_sort: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, title: str | None = None, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create stacked bar charts for multiple traits or a specific trait.

This function generates enrichment bar plots for traits (e.g., diseases, gene sets) in the input DataFrame. It can plot all traits or a specific trait based on the trait_name parameter. Each trait’s enrichment data is visualized using the class_bar function, with results saved to individual files.

Parameters

trait_dfDataFrame

Input DataFrame containing trait enrichment data. Must include columns for trait identifiers, cluster labels, and enrichment values.

trait_namestr, default “All”

The specific trait to plot. If “All”, plots bar charts for all unique traits in the trait_column_name column.

trait_column_namestr, default “id”

Column name in trait_df that contains trait identifiers.

valuestr, default “rate”

Column name containing the numerical enrichment values to plot.

groupbystr, default “clusters”

Column name containing cluster or cell type labels.

x_namestr, default “Cell type”

Label for the x-axis.

y_namestr, default “Enrichment ratio”

Label for the y-axis.

colorTuple, default (“#2e6fb7”, “#f7f7f7”)

Colors for the two bar segments (enrichment color, conservative color).

legendTuple, default (“Enrichment”, “Conservative”)

Labels for the legend corresponding to the two bar segments.

text_colorstr, default “#000205”

Color of the value labels on bars.

groupby_sortOptional[list], default None

Custom order for clusters. If provided, clusters will be sorted according to this list. If None, clusters are sorted by enrichment value.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0

Bottom margin adjustment.

rotationfloat, default 65

Rotation angle for x-axis tick labels in degrees.

titlestr, optional

Base title of the plot. The trait name will be appended to this title. Default is None.

text_left_movefloat, default 0.15

Horizontal adjustment for text position on bars.

y_limitTuple[float, float], default (0, 1)

The y-axis limits for the plot.

outputpath, optional

Directory path to save the figures. If provided, each trait’s plot will be saved as a PDF file in this directory. Default is None.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to the class_bar function.

Returns

None

The function displays and/or saves the plots but does not return any value.

sciv.pl.class_bar(df: DataFrame, value: str = 'rate', by: str = 'value', groupby: str = 'clusters', color: Tuple = ('#2e6fb7', '#f7f7f7'), x_name: str = 'Cell type', y_name: str = 'Enrichment ratio', legend: Tuple = ('Enrichment', 'Conservative'), text_color: str = '#000205', groupby_sort: list | None = None, width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, title: str | None = None, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a stacked bar chart for enrichment analysis with two categories.

This function filters a DataFrame by a binary column, sorts the data by clusters, and generates a stacked bar plot using the two_bar function. It is typically used to visualize enrichment ratios where one category represents enriched values and the other represents conservative values.

Parameters

dfDataFrame

Input DataFrame containing the data to plot.

valuestr, default “rate”

Column name containing the numerical values to plot.

bystr, default “value”

Column name used to filter the DataFrame into two categories (typically binary: 0 and 1).

groupbystr, default “clusters”

Column name containing the cluster labels or categories.

colorTuple, default (“#2e6fb7”, “#f7f7f7”)

Colors for the two bar segments (enrichment color, conservative color).

x_namestr, default “Cell type”

Label for the x-axis.

y_namestr, default “Enrichment ratio”

Label for the y-axis.

legendTuple, default (“Enrichment”, “Conservative”)

Labels for the legend corresponding to the two bar segments.

text_colorstr, default “#000205”

Color of the value labels on bars.

groupby_sortOptional[list], default None

Custom order for clusters. If provided, clusters will be sorted according to this list. If None, clusters will be sorted by value in descending order.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0

Bottom margin adjustment.

rotationfloat, default 65

Rotation angle for x-axis tick labels in degrees.

titlestr, optional

Title of the plot. Default is None.

text_left_movefloat, default 0.15

Horizontal adjustment for text position on bars.

y_limitTuple[float, float], default (0, 1)

The y-axis limits for the plot.

outputpath, optional

File path to save the figure. Default is None.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to the two_bar function.

Returns

None

The function displays and/or saves the plot but does not return any value.
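Examples

The filter-and-sort step described above can be pictured with plain Python. This is a sketch under stated assumptions, not the library's implementation: split_for_stacking is a hypothetical helper, and it assumes that rows with by == 1 hold the enrichment rate and rows with by == 0 hold the conservative rate for the same cluster.

```python
# Sketch: split rows by a binary column and order clusters for stacking.
rows = [
    {"clusters": "B cell", "value": 1, "rate": 0.7},
    {"clusters": "B cell", "value": 0, "rate": 0.3},
    {"clusters": "T cell", "value": 1, "rate": 0.2},
    {"clusters": "T cell", "value": 0, "rate": 0.8},
    {"clusters": "Monocyte", "value": 1, "rate": 0.5},
    {"clusters": "Monocyte", "value": 0, "rate": 0.5},
]


def split_for_stacking(rows, value="rate", by="value",
                       groupby="clusters", groupby_sort=None):
    """Return (labels, (first_segment, second_segment)) for a stacked bar."""
    enriched = {r[groupby]: r[value] for r in rows if r[by] == 1}
    conservative = {r[groupby]: r[value] for r in rows if r[by] == 0}
    if groupby_sort is None:
        # Assumed default: clusters ordered by enrichment, highest first.
        order = sorted(enriched, key=enriched.get, reverse=True)
    else:
        order = [c for c in groupby_sort if c in enriched]
    first = [enriched[c] for c in order]
    second = [conservative.get(c, 0.0) for c in order]
    return order, (first, second)


ax_x, ax_y = split_for_stacking(rows)
print(ax_x)  # cluster labels in plotting order
print(ax_y)  # (enrichment values, conservative values)
```

The resulting labels and the two-element tuple have the same shape as the ax_x and ax_y arguments documented for two_bar.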

sciv.pl.rate_bar_plot(adata: AnnData, layer: str | None = None, trait_name: str = 'All', dir_name: str = 'feature', column: str = 'value', groupby: str = 'clusters', color: Tuple = ('#2e6fb7', '#f7f7f7'), legend: Tuple = ('Enrichment', 'Conservative'), x_name: str = 'Cell type', y_name: str = 'Enrichment ratio', groupby_sort: list | None = None, text_color: str = '#000205', width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, title: str | None = None, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), plot_output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Generate a bar plot showing enrichment ratios for trait-cluster combinations.

This function calculates the completion ratio using the complete_ratio function and visualizes the results as a bar plot. It handles directory creation for output files and passes appropriate parameters to the bar_trait plotting function.

Parameters

adataAnnData

Input AnnData object containing the data to be visualized.

layerstr, optional

Specify the layer of the matrix to be processed. If None, uses the main matrix.

trait_namestr, default “All”

The name of the trait being analyzed, used for filtering data.

dir_namestr, default “feature”

Folder name for generating and saving bar plot outputs.

columnstr, default “value”

The column name containing the binary enrichment values.

groupbystr, default “clusters”

The column name in adata.obs that defines the cell clusters.

colorTuple, default (“#2e6fb7”, “#f7f7f7”)

Color tuple for the bar plot (enrichment color, conservative color).

legendTuple, default (“Enrichment”, “Conservative”)

Legend labels for the two categories in the plot.

x_namestr, default “Cell type”

The label for the x-axis.

y_namestr, default “Enrichment ratio”

The label for the y-axis.

groupby_sortOptional[list], optional

Custom order for clusters. If None, uses default sorting.

text_colorstr, default “#000205”

Color for text annotations in the plot.

widthfloat, default 2

The width of the output figure in inches.

heightfloat, default 2

The height of the output figure in inches.

bottomfloat, default 0

Bottom margin adjustment for the plot.

rotationfloat, default 65

Rotation angle for x-axis labels in degrees.

titlestr, optional

The title of the plot. If None, no title is displayed.

text_left_movefloat, default 0.15

Horizontal adjustment for text position.

y_limitTuple[float, float], default (0, 1)

The y-axis limits for the plot.

plot_outputpath, optional

Directory path for saving output files. If None, figures are not saved.

showbool, default True

If True, display the figure interactively.

closebool, default False

If True, close the figure after saving.

**kwargsAny

Additional keyword arguments passed to bar_trait function.

Returns

None

This function does not return any value. Outputs are saved to files or displayed.

sciv.pl.two_bar(ax_x: list | set | Tuple | ndarray, ax_y: Tuple, x_name: str | None = None, y_name: str | None = None, legend: Tuple = ('1', '2'), color: Tuple = ('#2e6fb7', '#f7f7f7'), text_color: str = '#000205', width: float = 2, height: float = 2, bottom: float = 0, rotation: float = 65, text_left_move: float = 0.15, y_limit: Tuple[float, float] = (0, 1), title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a stacked bar chart with two categories.

This function generates a stacked bar plot where two sets of values are displayed as stacked bars. It automatically adds numerical value labels on the first bar segment and includes a legend for the two categories.

Parameters

ax_xcollection

Categories or labels for the x-axis.

ax_yTuple

A tuple containing two collections of numerical values for the two bar segments. The second segment will be stacked on top of the first.

x_namestr, optional

Label for the x-axis. Default is None.

y_namestr, optional

Label for the y-axis. Default is None.

legendTuple, default (“1”, “2”)

Labels for the legend corresponding to the two bar segments.

colorTuple, default (“#2e6fb7”, “#f7f7f7”)

Colors for the two bar segments (first segment, second segment).

text_colorstr, default “#000205”

Color of the value labels on bars.

widthfloat, default 2

Width of the figure in inches.

heightfloat, default 2

Height of the figure in inches.

bottomfloat, default 0

Bottom margin adjustment.

rotationfloat, default 65

Rotation angle for x-axis tick labels in degrees.

text_left_movefloat, default 0.15

Horizontal adjustment for text position on bars.

y_limitTuple[float, float], default (0, 1)

The y-axis limits for the plot.

titlestr, optional

Title of the plot. Default is None.

outputpath, optional

File path to save the figure. Default is None.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after saving.

**kwargsAny

Additional keyword arguments passed to matplotlib’s bar function.

Returns

None

The function displays and/or saves the plot but does not return any value.
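Examples

A minimal sketch of building the two-element ax_y tuple. The enrichment ratios are made-up, and using the complement (1 - ratio) as the second segment is an assumption chosen so that each stacked bar fills the default y_limit of (0, 1). The sciv import is guarded so the data preparation runs without the package.

```python
# Sketch: stacked bars where the second segment is the complement of the first.
try:
    import sciv
except ImportError:  # keep the sketch runnable without the package
    sciv = None

enrichment = {"B cell": 0.70, "Monocyte": 0.50, "T cell": 0.20}

ax_x = list(enrichment)
first_segment = list(enrichment.values())
# Rounded to avoid floating-point noise such as 0.30000000000000004.
second_segment = [round(1.0 - v, 2) for v in first_segment]
ax_y = (first_segment, second_segment)

if sciv is not None:
    sciv.pl.two_bar(
        ax_x,
        ax_y,
        x_name="Cell type",
        y_name="Enrichment ratio",
        legend=("Enrichment", "Conservative"),
        show=False,
    )
```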

Barcode

Barcode visualization function.

sciv.pl.barcode_base(df: DataFrame, groupby_list: list, sort_column: str = 'value', column: str = 'clusters', width: float = 1, height: float = 3, trait_column_name: str = 'id', title: str | None = None, cmap: str = 'Oranges', bar_label: str = 'TRS', is_ticks: bool = True, colors: list | None = None, ground_true: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False) None

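Create a barcode plot.

Parameters

dfDataFrame

Input DataFrame containing the values and cluster labels to plot.

groupby_listlist

List of cluster labels.

sort_columnstr, optional

Column name used for sorting values in the barcode plot. Default is “value”.

columnstr, optional

Column name containing cluster labels. Default is “clusters”.

widthfloat, optional

Width of the figure in inches. Default is 1.

heightfloat, optional

Height of the figure in inches. Default is 3.

trait_column_namestr, optional

Column name containing trait identifiers. Default is “id”.

titlestr, optional

Title of the plot. Default is None.

cmapstr, optional

Colormap name. Default is “Oranges”.

bar_labelstr, optional

Label for the bar. Default is “TRS”.

is_ticksbool, optional

Whether to show ticks. Default is True.

colorslist, optional

Custom list of colors. Default is None.

ground_truelist, optional

Ground truth labels. Default is None.

outputpath, optional

File path to save the figure. Default is None.

showbool, optional

Whether to display the plot. Default is True.

closebool, optional

Whether to close the figure after display. Default is False.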

sciv.pl.barcode_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', sort_column: str = 'value', groupby: str = 'clusters', cmap: str = 'viridis', width: float = 1, height: float = 3, is_ticks: bool = True, colors: list | None = None, ground_true: list | None = None, title: str | None = None, suffix: str = 'pdf', output: str | Path | None = None, show: bool = True, close: bool = False) None

Plot barcode plots for traits/diseases.

This function generates barcode visualizations for specified traits or all traits in the dataset. It creates individual plots for each trait showing the distribution of trait scores across different clusters.

Parameters

trait_dfDataFrame

Input DataFrame containing trait scores and cluster information.

trait_namestr, optional

Name of the trait/disease to plot. Use “All” to plot all traits. Default is “All”.

trait_column_namestr, optional

Column name in the DataFrame that contains trait identifiers. Default is “id”.

sort_columnstr, optional

Column name used for sorting values in the barcode plot. Default is “value”.

groupbystr, optional

Column name in the DataFrame that contains cluster assignments. Default is “clusters”.

cmapstr, optional

Colormap name for the value heatmap. Default is “viridis”.

widthfloat, optional

Width of the figure in inches. Default is 1.

heightfloat, optional

Height of the figure in inches. Default is 3.

is_ticksbool, optional

Whether to display colorbar ticks. Default is True.

colorslist, optional

Custom color list for cluster visualization. If None, uses default colors. Default is None.

ground_truelist, optional

Ground truth cluster labels for ordering. Default is None.

titlestr, optional

Base title for the plots. Trait name will be appended. Default is None.

suffixstr, optional

File extension for output plots (e.g., “pdf”, “png”). Default is “pdf”.

outputpath, optional

Directory path for saving output files. Default is None.

showbool, optional

Whether to display the plots interactively. Default is True.

closebool, optional

Whether to close the figure after display. Default is False.
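Examples

A minimal sketch of a call for a single trait. The table layout (one row per cell with id, clusters and value columns) is an assumption based on the default column names in the signature; the trait name, cluster labels and scores are made-up. The pandas and sciv imports are guarded so the data preparation runs without them.

```python
# Sketch: per-cell trait scores with cluster labels, plotted for one trait.
try:
    import pandas as pd
    import sciv
except ImportError:  # keep the sketch runnable without the packages
    pd = sciv = None

# Hypothetical rows: trait identifier, cluster label, trait score.
rows = [
    {"id": "trait_a", "clusters": "B cell", "value": 0.91},
    {"id": "trait_a", "clusters": "B cell", "value": 0.84},
    {"id": "trait_a", "clusters": "T cell", "value": 0.35},
    {"id": "trait_a", "clusters": "T cell", "value": 0.12},
    {"id": "trait_a", "clusters": "Monocyte", "value": 0.58},
    {"id": "trait_a", "clusters": "Monocyte", "value": 0.47},
]

traits = sorted({row["id"] for row in rows})
clusters = sorted({row["clusters"] for row in rows})

if sciv is not None:
    trait_df = pd.DataFrame(rows)
    sciv.pl.barcode_trait(
        trait_df,
        trait_name="trait_a",
        sort_column="value",
        groupby="clusters",
        suffix="png",
        show=False,
    )
```

With output left at None, nothing is written to disk; passing a directory path saves the plots with the chosen suffix.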

Pie

Pie chart visualization function.

sciv.pl.base_pie(values: list, labels: list, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, pct_distance: float = 0.6, label_distance: float = 1.1, colors: list | None = None, autopct: str = '%1.2f%%', output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a basic pie chart with customizable parameters.

This function generates a simple pie chart using matplotlib, with support for custom colors, labels, and various display options.

Parameters

valueslist

The values to be plotted in the pie chart.

labelslist

The labels corresponding to each value in the pie chart.

x_namestr, optional

The label for the x-axis. Default is None.

y_namestr, optional

The label for the y-axis. Default is None.

titlestr, optional

The title of the pie chart. Default is None.

widthfloat, optional

The width of the figure in inches. Default is 2.

heightfloat, optional

The height of the figure in inches. Default is 2.

bottomfloat, optional

The bottom margin of the figure. Default is 0.

pct_distancefloat, optional

The distance of the percentage labels from the center of the pie. Default is 0.6.

label_distancefloat, optional

The distance of the labels from the center of the pie. Default is 1.1.

colorslist, optional

A list of colors to use for the pie slices. If None, default colors will be used. Default is None.

autopctstr, optional

The format string for the percentage labels. Default is ‘%1.2f%%’.

outputpath, optional

The file path to save the figure. If None, the figure will not be saved. Default is None.

showbool, optional

Whether to display the figure. Default is True.

closebool, optional

Whether to close the figure after displaying. Default is False.

**kwargsAny

Additional keyword arguments passed to matplotlib’s pie function.
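Examples

A minimal sketch showing what the default autopct format produces. matplotlib applies a string autopct with the % operator to each slice's share in percent, which the code below reproduces by hand. The values and labels are made-up, and the sciv import is guarded so the formatting part runs without the package.

```python
# Sketch: preview the percentage labels that the default autopct yields.
try:
    import sciv
except ImportError:  # keep the sketch runnable without the package
    sciv = None

values = [1, 2, 5]
labels = ["B cell", "T cell", "Monocyte"]
autopct = "%1.2f%%"  # the documented default

total = sum(values)
# Each slice's share in percent, formatted the way a string autopct is applied.
shown = [autopct % (100 * v / total) for v in values]
print(dict(zip(labels, shown)))

if sciv is not None:
    sciv.pl.base_pie(values, labels, autopct=autopct, show=False)
```

The doubled %% in the format string is an escaped percent sign, so each label ends in a single %.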

sciv.pl.pie_label(df: DataFrame, map_groupby: str | list | set | Tuple | ndarray, value: str = 'value', groupby: str = 'clusters', x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, radius: float = 0.6, fontsize: float = 17, pct_distance: float = 0.6, label_distance: float = 1.1, colors: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create a donut-style pie chart showing cluster label distribution.

This function generates a pie chart with a central hole (donut chart) to visualize the distribution of predicted cluster labels against true labels. The chart displays the percentage of correctly predicted labels in the center.

Parameters

dfDataFrame

The input data containing cluster and value information.

map_groupbyUnion[str, collection]

The mapping of clusters, can be a column name or a collection of cluster labels.

valuestr, optional

The column name for values in the DataFrame. Default is “value”.

groupbystr, optional

The column name for cluster labels in the DataFrame. Default is “clusters”.

x_namestr, optional

The label for the x-axis. Default is None.

y_namestr, optional

The label for the y-axis. Default is None.

titlestr, optional

The title of the pie chart. Default is None.

widthfloat, optional

The width of the figure in inches. Default is 2.

heightfloat, optional

The height of the figure in inches. Default is 2.

bottomfloat, optional

The bottom margin of the figure. Default is 0.

radiusfloat, optional

The radius of the inner white circle to create donut effect. Default is 0.6.

fontsizefloat, optional

The font size for the percentage text in the center. Default is 17.

pct_distancefloat, optional

The distance of the percentage labels from the center of the pie. Default is 0.6.

label_distancefloat, optional

The distance of the labels from the center of the pie. Default is 1.1.

colorslist, optional

A list of colors to use for the pie slices. If None, default colors will be used. Default is None.

outputpath, optional

The file path to save the figure. If None, the figure will not be saved. Default is None.

showbool, optional

Whether to display the figure. Default is True.

closebool, optional

Whether to close the figure after displaying. Default is False.

**kwargsAny

Additional keyword arguments passed to matplotlib’s pie function.

sciv.pl.pie_trait(trait_df: DataFrame, trait_groupby_map: dict, trait_name: str = 'All', groupby: str = 'clusters', trait_column_name: str = 'id', value: str = 'value', x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, radius: float = 0.6, fontsize: float = 17, pct_distance: float = 0.6, label_distance: float = 1.1, colors: list | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Create pie charts for trait/disease cluster distribution analysis.

This function generates donut-style pie charts to visualize the distribution of trait-specific scores across different cell clusters. It supports batch processing for multiple traits or single trait analysis.

Parameters

trait_dfDataFrame

The input data containing trait information, cluster labels, and values.

trait_groupby_mapdict

A dictionary mapping trait names to their corresponding cluster mappings. Keys are trait names, values are cluster label mappings.

trait_namestr, optional

The specific trait to plot. Use “All” to plot all traits in the data. Default is “All”.

groupbystr, optional

The column name for cluster labels in the DataFrame. Default is “clusters”.

trait_column_namestr, optional

The column name for trait identifiers in the DataFrame. Default is “id”.

valuestr, optional

The column name for values/scores in the DataFrame. Default is “value”.

x_namestr, optional

The label for the x-axis. Default is None.

y_namestr, optional

The label for the y-axis. Default is None.

titlestr, optional

The base title for the pie charts. Trait name will be appended if provided. Default is None.

widthfloat, optional

The width of the figure in inches. Default is 2.

heightfloat, optional

The height of the figure in inches. Default is 2.

radiusfloat, optional

The radius of the inner white circle to create donut effect. Default is 0.6.

fontsizefloat, optional

The font size for the percentage text in the center. Default is 17.

pct_distancefloat, optional

The distance of the percentage labels from the center of the pie. Default is 0.6.

label_distancefloat, optional

The distance of the labels from the center of the pie. Default is 1.1.

colorslist, optional

A list of colors to use for the pie slices. If None, default colors will be used. Default is None.

outputpath, optional

The directory path to save the figures. If None, figures will not be saved. Default is None.

showbool, optional

Whether to display the figure. Default is True.

closebool, optional

Whether to close the figure after displaying. Default is False.

**kwargsAny

Additional keyword arguments passed to the pie_label function.
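Examples

A minimal sketch of the trait_groupby_map argument: a dictionary keyed by trait name whose values are the cluster labels associated with that trait. The trait names, cluster labels, scores and the table layout (id, clusters, value columns) are assumptions for illustration. The pandas and sciv imports are guarded so the data preparation runs without them.

```python
# Sketch: one cluster mapping per trait, checked against the trait table.
try:
    import pandas as pd
    import sciv
except ImportError:  # keep the sketch runnable without the packages
    pd = sciv = None

# Hypothetical trait table using the default column names.
trait_rows = [
    {"id": "trait_a", "clusters": "B cell", "value": 0.9},
    {"id": "trait_a", "clusters": "T cell", "value": 0.2},
    {"id": "trait_b", "clusters": "T cell", "value": 0.8},
    {"id": "trait_b", "clusters": "Monocyte", "value": 0.7},
]

# Keys are trait names; values are the cluster labels mapped to each trait.
trait_groupby_map = {
    "trait_a": ["B cell"],
    "trait_b": ["T cell", "Monocyte"],
}

# Traits present in the table but absent from the mapping.
missing = sorted({row["id"] for row in trait_rows} - set(trait_groupby_map))

if sciv is not None and not missing:
    trait_df = pd.DataFrame(trait_rows)
    sciv.pl.pie_trait(
        trait_df,
        trait_groupby_map,
        trait_name="All",
        show=False,
    )
```

Checking for missing traits before plotting avoids a lookup failure when trait_name is “All” and a trait has no mapping.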

Bubble

Bubble chart visualization function.

sciv.pl.bubble(df: DataFrame, x: str, y: str, hue: str | None = None, size: str | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 2, height: float = 2, bottom: float = 0, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Create a bubble plot using seaborn’s relplot.

Parameters

dfDataFrame

Input data structure.

xstr

Column name for x-axis values.

ystr

Column name for y-axis values.

huestr, optional

Column name for color encoding.

sizestr, optional

Column name for size encoding.

x_namestr, optional

Custom label for x-axis.

y_namestr, optional

Custom label for y-axis.

titlestr, optional

Plot title.

widthfloat, default 2

Figure width in inches.

heightfloat, default 2

Figure height in inches.

bottomfloat, default 0

Bottom margin adjustment.

outputpath, optional

File path to save the figure.

showbool, default True

Whether to display the plot.

closebool, default False

Whether to close the figure after display.

**kwargsAny

Additional arguments passed to seaborn.relplot.
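Examples

A minimal sketch of a call in which four columns of one table drive position, color and bubble size. The column names (cell_type, trait, score, n_cells) and the values are hypothetical. The pandas and sciv imports are guarded so the data preparation runs without them.

```python
# Sketch: one bubble per (cell type, trait) pair.
try:
    import pandas as pd
    import sciv
except ImportError:  # keep the sketch runnable without the packages
    pd = sciv = None

records = [
    {"cell_type": "B cell", "trait": "trait_a", "score": 0.82, "n_cells": 120},
    {"cell_type": "B cell", "trait": "trait_b", "score": 0.15, "n_cells": 120},
    {"cell_type": "T cell", "trait": "trait_a", "score": 0.30, "n_cells": 340},
    {"cell_type": "T cell", "trait": "trait_b", "score": 0.77, "n_cells": 340},
]

if sciv is not None:
    df = pd.DataFrame(records)
    sciv.pl.bubble(
        df,
        x="cell_type",   # position along the x-axis
        y="trait",       # position along the y-axis
        hue="score",     # color encoding
        size="n_cells",  # bubble size encoding
        width=3,
        height=3,
        show=False,
    )
```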

Radar

Radar visualization function.

sciv.pl.base_radar(df: DataFrame, ax_x: str, ax_y: str, hue: str, x_name: str | None = None, y_name: str | None = None, title: str | None = None, width: float = 4, height: float = 4, bottom: float = 0, colors: list | set | Tuple | ndarray | None = None, line_width: float = 0.5, y_limit: Tuple = (0, 1), bbox_to_anchor: Tuple = (1.3, 1.1), is_fill: bool = True, fill_alpha: float = 0.2, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Plot a radar chart with multiple groups.

Parameters

dfDataFrame

Input data containing the values to plot.

ax_xstr

Column name for category labels (x-axis categories).

ax_ystr

Column name for values to plot (y-axis values).

huestr

Column name for grouping different lines.

x_namestr, optional

Label for the x-axis. Default is None.

y_namestr, optional

Label for the y-axis. Default is None.

titlestr, optional

Title of the chart. Default is None.

widthfloat, optional

Width of the figure in inches. Default is 4.

heightfloat, optional

Height of the figure in inches. Default is 4.

bottomfloat, optional

Bottom margin adjustment. Default is 0.

colorscollection, optional

Colors for each group line. Default is None.

line_widthfloat, optional

Width of the radar lines. Default is 0.5.

y_limitTuple, optional

Y-axis limits as (min, max). Default is (0, 1).

bbox_to_anchorTuple, optional

Position for the legend box. Default is (1.3, 1.1).

is_fillbool, optional

Whether to fill the radar area. Default is True.

fill_alphafloat, optional

Transparency level for the filled area. Default is 0.2.

outputpath, optional

File path to save the figure. Default is None.

showbool, optional

Whether to display the figure. Default is True.

closebool, optional

Whether to close the figure after display. Default is False.

**kwargsAny

Additional keyword arguments for plotting.

Returns

None
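Examples

A minimal sketch of the long-form input (one row per category and group) together with the usual radar-chart geometry: categories are spread evenly around the circle and each outline is closed by repeating its first point. The geometry is the general technique, not necessarily how the library computes it. Column names, group names and scores are hypothetical, and the pandas and sciv imports are guarded so the rest runs without them.

```python
# Sketch: long-form radar data plus evenly spaced angles for the categories.
import math

try:
    import pandas as pd
    import sciv
except ImportError:  # keep the sketch runnable without the packages
    pd = sciv = None

categories = ["B cell", "T cell", "Monocyte", "NK cell"]
scores = {
    "method_a": [0.8, 0.4, 0.6, 0.3],
    "method_b": [0.5, 0.7, 0.2, 0.6],
}

# One row per (group, category) pair.
records = [
    {"cell_type": category, "method": method, "score": score}
    for method, values in scores.items()
    for category, score in zip(categories, values)
]

# General radar geometry: one angle per category, outline closed at the end.
n = len(categories)
angles = [2 * math.pi * i / n for i in range(n)]
closed_angles = angles + angles[:1]
closed_values = scores["method_a"] + scores["method_a"][:1]

if sciv is not None:
    df = pd.DataFrame(records)
    sciv.pl.base_radar(
        df,
        ax_x="cell_type",
        ax_y="score",
        hue="method",
        show=False,
    )
```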

sciv.pl.radar(ax_x: list | set | Tuple | ndarray, ax_y: list | set | Tuple | ndarray, x_name: str | None = None, y_name: str | None = None, title: str | None = None, colors: list | set | Tuple | ndarray | None = None, width: float = 4, height: float = 4, bottom: float = 0, center_text: str | None = None, rotation: float = 25, value_top: float = 0.1, text_top: float = 1.2, is_fixed: bool = False, is_angle: bool = True, y_limit: Tuple = (-0.5, 1), y_axis_scale: Tuple = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Plot a radar chart.

Parameters

ax_xcollection

Category labels for the radar chart.

ax_ycollection

Data values for each category.

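x_namestr, optional

Label for the x-axis. Default is None.

y_namestr, optional

Label for the y-axis. Default is None.

titlestr, optional

Title of the chart. Default is None.

colorscollection, optional

Colors for the radar chart. Default is None.

widthfloat, optional

Width of the figure in inches. Default is 4.

heightfloat, optional

Height of the figure in inches. Default is 4.

bottomfloat, optional

Bottom margin adjustment. Default is 0.

center_textstr, optional

Text displayed at the center of the chart. Default is None.

rotationfloat, optional

Rotation angle for the radar chart in degrees. Default is 25.

value_topfloat, optional

Vertical offset for value labels above bars. Default is 0.1.

text_topfloat, optional

Radial position for category labels. Default is 1.2.

is_fixedbool, optional

If True, value labels are placed at a fixed position. Default is False.

is_anglebool, optional

If True, rotate labels to align with radar angles. Default is True.

y_limitTuple, optional

Y-axis limits as (min, max). Default is (-0.5, 1).

y_axis_scaleTuple, optional

Scale range for y-axis ticks as (min, max). Default is (0, 1).

outputpath, optional

File path to save the figure. Default is None.

showbool, optional

Whether to display the plot. Default is True.

closebool, optional

Whether to close the figure after display. Default is False.

**kwargsAny

Additional keyword arguments for plotting.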

Returns

None

sciv.pl.radar_trait(trait_df: DataFrame, trait_name: str = 'All', trait_column_name: str = 'id', value: str = 'rate', clusters: str = 'clusters', color: list | set | Tuple | ndarray | str | None = None, clusters_sort: list | None = None, width: float = 4, height: float = 4, rotation: float = 65, title: str | None = None, value_top: float = 0.1, text_top: float = 1.2, is_fixed: bool = False, is_angle: bool = True, y_limit: Tuple = (-0.5, 1), y_axis_scale: Tuple = (0, 1), output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any)

Plot radar charts for trait enrichment analysis.

This function creates radar charts to visualize trait/disease enrichment scores across different clusters. It can plot either a single trait or all traits in the dataset.

Parameters

trait_dfDataFrame

Input dataframe containing trait enrichment data.

trait_namestr, optional

Name of the trait to plot. Use “All” to plot all traits. Default is “All”.

trait_column_namestr, optional

Column name in trait_df that contains trait identifiers. Default is “id”.

valuestr, optional

Column name containing the enrichment values to plot. Default is “rate”.

clustersstr, optional

Column name containing cluster identifiers. Default is “clusters”.

colorUnion[collection, str], optional

Colors for the radar chart bars. Can be a column name (str) or a collection of colors.

clusters_sortOptional[list], optional

Custom order for clusters. If None, clusters are sorted by value in descending order.

widthfloat, optional

Width of the figure in inches. Default is 4.

heightfloat, optional

Height of the figure in inches. Default is 4.

rotationfloat, optional

Rotation angle for text labels in degrees. Default is 65.

titlestr, optional

Base title for the plot. Trait name will be appended if provided.

value_topfloat, optional

Vertical offset for value labels above bars. Default is 0.1.

text_topfloat, optional

Radial position for category labels. Default is 1.2.

is_fixedbool, optional

If True, value labels are placed at a fixed position. Default is False.

is_anglebool, optional

If True, rotate labels to align with radar angles. Default is True.

y_limitTuple, optional

Y-axis limits as (min, max). Default is (-0.5, 1).

y_axis_scaleTuple, optional

Scale range for y-axis ticks as (min, max). Default is (0, 1).

outputpath, optional

Directory path to save output PDF files. If None, files are not saved.

showbool, optional

Whether to display the plot. Default is True.

closebool, optional

Whether to close the figure after display. Default is False.

kwargsAny, optional

Additional keyword arguments passed to the radar function.

Returns

None

The function saves plots to files and/or displays them based on parameters.

sciv.pl.rate_circular_bar_plot(adata: AnnData, layer: str | None = None, trait_name: str = 'All', dir_name: str = 'feature', column: str = 'value', groupby: str = 'clusters', color: list | set | Tuple | ndarray | str | None = None, groupby_sort: list | None = None, width: float = 2, height: float = 2, rotation: float = 25, title: str | None = None, value_top: float = 0.1, text_top: float = 1.2, is_fixed: bool = False, is_angle: bool = True, y_limit: Tuple = (-0.5, 1), y_axis_scale: Tuple = (0, 1), plot_output: str | Path | None = None, show: bool = True, close: bool = False) None

Generate a circular bar plot (radar chart) showing enrichment ratios for trait-cluster combinations.

This function calculates the completion ratio using the complete_ratio function and visualizes the results as a circular bar plot (radar chart). It handles directory creation for output files and passes appropriate parameters to the radar_trait plotting function.

Parameters

adataAnnData

Input AnnData object containing the data to be visualized.

layerstr, optional

Specify the layer of the matrix to be processed. If None, uses the main matrix.

trait_namestr, default “All”

The name of the trait being analyzed, used for filtering data.

dir_namestr, default “feature”

Folder name for generating and saving circular bar plot outputs.

columnstr, default “value”

The column name containing the binary enrichment values.

groupbystr, default “clusters”

The column name in adata.obs that defines the cell clusters.

colorUnion[collection, str], optional

Color specification for the plot. Can be a color collection or a column name to use for coloring bars based on data values.

groupby_sortOptional[list], optional

Custom order for clusters. If None, uses default sorting.

widthfloat, default 2

The width of the output figure in inches.

heightfloat, default 2

The height of the output figure in inches.

rotationfloat, default 25

Rotation angle for the circular plot in degrees.

titlestr, optional

The title of the plot. If None, no title is displayed.

value_topfloat, default 0.1

Vertical offset for value labels in the plot.

text_topfloat, default 1.2

Vertical offset for text labels in the plot.

is_fixedbool, default False

If True, use fixed scaling for the plot.

is_anglebool, default True

If True, use angular positioning for bars.

y_limitTuple, default (-0.5, 1)

The y-axis limits for the plot.

y_axis_scaleTuple, default (0, 1)

The scale range for the y-axis values.

plot_outputpath, optional

Directory path for saving output files. If None, figures are not saved.

showbool, default True

If True, display the figure interactively.

closebool, default False

If True, close the figure after saving.

Returns

None

This function does not return any value. Outputs are saved to files or displayed.

Venn

Venn diagram visualization functions.

sciv.pl.three_venn(set1: list | set | Tuple | ndarray, set2: list | set | Tuple | ndarray, set3: list | set | Tuple | ndarray, name1: str = 'Set1', name2: str = 'Set2', name3: str = 'Set3', width: float = 2, height: float = 2, bottom: float = 0, colors: list | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Plot a three-set Venn diagram.

Parameters

set1collection

First set of elements.

set2collection

Second set of elements.

set3collection

Third set of elements.

name1str, optional

Name of the first set.

name2str, optional

Name of the second set.

name3str, optional

Name of the third set.

widthfloat, optional

Width of the diagram.

heightfloat, optional

Height of the diagram.

bottomfloat, optional

Bottom margin adjustment.

colorslist, optional

Colors for the sets.

x_namestr, optional

Label for the x-axis.

y_namestr, optional

Label for the y-axis.

titlestr, optional

Title of the diagram.

outputpath, optional

Output path.

showbool, optional

Whether to display the plot.

closebool, optional

Whether to close the figure after display.

kwargsAny, optional

Keyword arguments.

Returns

None
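
Examples

The seven regions that a three-set Venn diagram displays can be computed with plain set operations. This is a minimal sketch of that arithmetic, not sciv's plotting code; the helper name `three_venn_regions` and the region keys are assumptions made for illustration.

```python
def three_venn_regions(set1, set2, set3):
    """Return the sizes of the seven regions of a three-set Venn diagram."""
    a, b, c = set(set1), set(set2), set(set3)
    return {
        "only_1": len(a - b - c),
        "only_2": len(b - a - c),
        "only_3": len(c - a - b),
        "1_and_2_only": len((a & b) - c),
        "1_and_3_only": len((a & c) - b),
        "2_and_3_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Any collection works as input because each one is converted to a set first.
regions = three_venn_regions([1, 2, 3, 4], [3, 4, 5], [4, 5, 6, 7])
print(regions)
```

The region sizes always sum to the size of the union of the three sets, which is a quick sanity check on the input before plotting.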

sciv.pl.two_venn(set1: list | set | Tuple | ndarray, set2: list | set | Tuple | ndarray, name1: str = 'Set1', name2: str = 'Set2', width: float = 2, height: float = 2, bottom: float = 0, colors: list | None = None, x_name: str | None = None, y_name: str | None = None, title: str | None = None, output: str | Path | None = None, show: bool = True, close: bool = False, **kwargs: Any) None

Plot a two-set Venn diagram.

Parameters

set1collection

First set of elements.

set2collection

Second set of elements.

name1str, optional

Name of the first set.

name2str, optional

Name of the second set.

widthfloat, optional

Width of the diagram.

heightfloat, optional

Height of the diagram.

bottomfloat, optional

Bottom margin adjustment.

colorslist, optional

Colors for the sets.

x_namestr, optional

Label for the x-axis.

y_namestr, optional

Label for the y-axis.

titlestr, optional

Title of the diagram.

outputpath, optional

Output path.

showbool, optional

Whether to display the plot.

closebool, optional

Whether to close the figure after display.

kwargsAny, optional

Keyword arguments.

Returns

None

Preprocessing (.pp)

Data preprocessing interface, used for single-cell data cleaning, differential analysis, and enrichment analysis.

sciv.pp.adata_group(adata: AnnData, groupby: str, extra_column: str | None = None, axis: Literal[0, 1] = 1, layer: str | None = None, method: list | set | Tuple | ndarray | str = ('mean', 'sum', 'median')) AnnData

Return reshaped AnnData organized by given column values.

Parameters

adataAnnData

Input data.

groupbystr

Grouping column.

extra_columnOptional[str], optional

Extra column to retain alongside the grouping column.

axisLiteral[0, 1], optional

Dimension used for grouping: 1 groups by adata.obs, 0 groups by adata.var.

layerstr, optional

Layer of the matrix to process.

methodcollection | str, optional

Aggregation method or methods. Any combination of the following five is supported: {“mean”, “sum”, “median”, “max”, “min”}.

Returns

AnnData

The grouped AnnData.
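
Examples

The grouping idea can be sketched without AnnData: rows that share a label are collapsed into one row per label using one of the five supported aggregations. This is an illustrative sketch only; the helper `group_rows`, the nested-list matrix, and the label list are assumptions and do not reflect how sciv stores or returns the result.

```python
from collections import defaultdict
from statistics import mean, median

# The five aggregation names listed for the `method` parameter.
AGGREGATORS = {"mean": mean, "sum": sum, "median": median, "max": max, "min": min}


def group_rows(matrix, labels, method="mean"):
    """Aggregate the rows of `matrix` that share a label, column by column."""
    grouped = defaultdict(list)
    for row, label in zip(matrix, labels):
        grouped[label].append(row)
    aggregate = AGGREGATORS[method]
    return {
        label: [aggregate(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


matrix = [[1, 0, 2], [3, 0, 4], [0, 5, 0]]
labels = ["T cell", "T cell", "B cell"]
summed = group_rows(matrix, labels, "sum")
averaged = group_rows(matrix, labels, "mean")
print(summed)
print(averaged)
```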

sciv.pp.adata_map_df(adata: AnnData, column: str = 'value', layer: str | None = None) DataFrame

Convert AnnData to a long-format table of (row, column, value) records.

Parameters

adataAnnData

The AnnData object to convert.

columnstr

Name of the value column.

layerstr, optional

Layer of the matrix to process.

Returns

DataFrame

DataFrame of (row, column, value) records.
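
Examples

The reshaping from a matrix to row, column, value records can be sketched with plain lists. The helper `to_long_format`, the record keys "row" and "col", and the choice to emit every entry (including zeros) are assumptions for illustration; the actual column names and handling of zero entries in the returned DataFrame may differ.

```python
def to_long_format(matrix, row_names, column_names, value_name="value"):
    """Flatten a matrix into one record per (row, column) pair."""
    records = []
    for row_name, row in zip(row_names, matrix):
        for column_name, value in zip(column_names, row):
            records.append({"row": row_name, "col": column_name, value_name: value})
    return records


records = to_long_format(
    [[2, 0], [1, 3]],
    row_names=["cell1", "cell2"],
    column_names=["peak1", "peak2"],
)
for record in records:
    print(record)
```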

sciv.pp.filter_data(adata: AnnData, min_cells: int = 1, min_peaks: int = 1, min_peaks_counts: int = 1, min_cells_counts: int = 1, cell_rate: float | None = None, peak_rate: float | None = None, is_copy: bool = False, is_min_cell: bool = True, is_min_peak: bool = False) AnnData

Filter scATAC-seq data.

Parameters

adataAnnData

scATAC-seq data

min_peaks_countsint, optional

Minimum number of counts required for a peak to pass filtering

min_cellsint, optional

Minimum number of cells expressed required for a peak to pass filtering

min_cells_countsint, optional

Minimum number of counts required for a cell to pass filtering

min_peaksint, optional

Minimum number of peaks expressed required for a cell to pass filtering

cell_rateOptional[float], optional

Removal threshold expressed as a percentage of the total cell count. Only takes effect when the min_cells parameter is None.

peak_rateOptional[float], optional

Removal threshold expressed as a percentage of the total peak count. Only takes effect when the min_peaks parameter is None.

is_copybool, optional

Whether to deep-copy the data.

is_min_cellbool, optional

Whether to filter cells.

is_min_peakbool, optional

Whether to filter peaks.

Returns

AnnData

scATAC-seq data
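
Examples

The two count-based thresholds min_cells and min_peaks can be illustrated on a small cell-by-peak matrix. This is a rough sketch under stated assumptions: rows are cells, columns are peaks, peaks are filtered before cells, and the helper `filter_matrix` is hypothetical. The order of operations and the effect of the other parameters in sciv are not specified here.

```python
def filter_matrix(matrix, min_cells=1, min_peaks=1):
    """Filter a cell-by-peak count matrix (rows are cells, columns are peaks)."""
    n_peaks = len(matrix[0]) if matrix else 0
    # A peak is kept when it is non-zero in at least `min_cells` cells.
    kept_peaks = [
        j for j in range(n_peaks)
        if sum(1 for row in matrix if row[j] > 0) >= min_cells
    ]
    # A cell is kept when at least `min_peaks` of the kept peaks are non-zero.
    kept_cells = [
        i for i, row in enumerate(matrix)
        if sum(1 for j in kept_peaks if row[j] > 0) >= min_peaks
    ]
    filtered = [[matrix[i][j] for j in kept_peaks] for i in kept_cells]
    return filtered, kept_cells, kept_peaks


counts = [
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [3, 0, 1, 1],
]
filtered, kept_cells, kept_peaks = filter_matrix(counts, min_cells=2, min_peaks=1)
print(filtered, kept_cells, kept_peaks)
```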

sciv.pp.get_difference_genes(adata: AnnData, groupby: str, method: Literal['logreg', 't-test', 'wilcoxon', 't-test_overestim_var'] | None = 'wilcoxon', cell_anno: DataFrame | None = None, diff_genes_file: str | None = None) AnnData

Get differentially expressed/active genes.

Parameters

adataAnnData

scATAC-seq data

groupbystr

groupby name

method_Method, optional

Method to use for differentially expressed gene analysis.

cell_annoOptional[DataFrame], optional

Cell annotation DataFrame.

diff_genes_fileOptional[str], optional

Output file name.

Returns

AnnData

scATAC-seq data

sciv.pp.get_difference_peaks(adata: AnnData, genome_anno, groupby: str, cell_anno: DataFrame | None = None, min_log_fc: float = 0.25, min_pct: float = 0.05, peak_matrix_save_file: str | Path | None = None, diff_peaks_save_file: str | Path | None = None) AnnData

Get differential peaks.

Parameters

adataAnnData

scATAC-seq data.

genome_annoDataFrame

Genome annotation.

groupbystr

Cluster name.

cell_annoOptional[DataFrame], optional

Cell annotation.

min_log_fcfloat, optional

Minimum log2 fold change.

min_pctfloat, optional

Minimum percentage.

peak_matrix_save_fileOptional[path], optional

Peak matrix save file.

diff_peaks_save_fileOptional[path], optional

Difference peaks save file.

Returns

AnnData

Differential peak data.

sciv.pp.get_gene_enrichment(adata: AnnData, top_number: int = 50, threshold: float = 0.01, layer: str | None = None, is_order_or_lt: bool = True, is_top: bool = True, gene_sets: list[str] | set = ('GO_Biological_Process_2023', 'GO_Cellular_Component_2023', 'GO_Molecular_Function_2023', 'GWAS_Catalog_2023', 'KEGG_2016'), organism: Literal['Human', 'Mouse', 'Yeast', 'Fly', 'Fish', 'Worm'] | None = 'human', output_dir: str | None = None) DataFrame

Get gene enrichment analysis.

Parameters

adataAnnData

Input data;

top_numberint, optional

Top number of genes to use.

thresholdfloat, optional

Threshold to use.

layerOptional[str], optional

Specify the matrix to be processed;

is_order_or_ltbool, optional

Whether to order or filter by threshold.

is_topbool, optional

Whether to get top top_number genes.

gene_setsUnion[list[str], set], optional

Gene sets to use.

organism_Datasets, optional

Organism to use.

output_dirOptional[str], optional

Output directory.

Returns

DataFrame

GSEA enrichr results DataFrame.

sciv.pp.get_gene_expression(adata: AnnData, genome_anno, min_cells: int = 5, gene_save_file: str | Path | None = None) AnnData

Get gene expression matrix.

Parameters

adataAnnData

scATAC-seq data

genome_annoDataFrame

Genome annotation.

min_cellsint, optional

Minimum cells.

gene_save_fileOptional[path], optional

Gene save file path.

Returns

AnnData

Gene expression matrix.

sciv.pp.get_peak_matrix(adata: AnnData, genome_anno, groupby: str, cell_anno: DataFrame | None = None, peak_matrix_save_file: str | Path | None = None) AnnData

Generate peak matrix from scATAC-seq data.

This function processes scATAC-seq fragment files to generate a cell-by-peak matrix through peak calling using MACS3. It performs quality control, tile matrix generation, feature selection, and peak calling at the specified cluster level.

Parameters

adataAnnData

scATAC-seq data

genome_annoDataFrame

Genome annotation information.

groupbystr

Column name in cell annotation indicating cluster labels for peak calling.

cell_annoOptional[DataFrame], optional

Cell annotation DataFrame containing cluster information.

peak_matrix_save_fileOptional[path], optional

Path to save the output peak matrix h5ad file.

Returns

AnnData

Cell-by-peak matrix.

sciv.pp.get_sc_atac(fragment_file: str | Path, genome_anno, h5ad_file: str | Path | None = None, min_num_fragments: int = 200, sorted_by_barcode: bool = False, bin_size: int = 500, min_tsse: float = 5.0, counting_strategy: Literal['fragment', 'insertion', 'paired-insertion'] = 'paired-insertion', need_features: int | float | None = None, is_filter_doublets: bool = True) AnnData

Get scATAC-seq data from fragment file or h5ad file.

This function processes scATAC-seq data by importing fragment files, performing quality control, adding tile matrices, selecting features, and filtering doublets. It can also read pre-processed h5ad files.

Parameters

fragment_filepath

Path to the fragment file or h5ad file.

genome_annoDataFrame

Genome annotation.

h5ad_fileOptional[path], optional

Path to save the h5ad file. If None, a temporary cache file will be used.

min_num_fragmentsint, optional

Minimum number of fragments required for a cell to pass filtering.

sorted_by_barcodebool, optional

Whether the input fragment file is sorted by barcode.

bin_sizeint, optional

Size of consecutive genomic regions used to record the counts.

min_tssefloat, optional

Minimum TSS enrichment score required for a cell to pass filtering.

counting_strategyLiteral[‘fragment’, ‘insertion’, ‘paired-insertion’], optional

Strategy to count fragments in bins.

need_featuresOptional[Union[int | float]], optional

Number or proportion of features to select.

is_filter_doubletsbool, optional

Whether to filter doublets.

Returns

AnnData

Processed scATAC-seq data.

sciv.pp.get_tf_data(adata: AnnData, genome_anno, groupby: str, cell_anno: DataFrame | None = None, p_value: float = 0.01, peak_matrix_save_file: str | Path | None = None, tf_save_file: str | Path | None = None) AnnData

Get TF data.

Parameters

adataAnnData

scATAC-seq data

genome_annoDataFrame

Genome annotation.

groupbystr

Cluster name.

cell_annoOptional[DataFrame], optional

Cell annotation.

p_valuefloat, optional

P-value threshold.

peak_matrix_save_fileOptional[path], optional

Peak matrix save file.

tf_save_fileOptional[path], optional

TF save file.

Returns

AnnData

TF data.

sciv.pp.gsea_enrichr(gene_list: list[str], gene_sets: list[str] | set = ('GO_Biological_Process_2023', 'GO_Cellular_Component_2023', 'GO_Molecular_Function_2023', 'GWAS_Catalog_2023', 'KEGG_2016'), organism: Literal['Human', 'Mouse', 'Yeast', 'Fly', 'Fish', 'Worm'] | None = 'human', is_verbose: bool = True, output_dir: str | None = None) DataFrame

GSEA enrichr analysis.

Parameters

gene_listlist[str]

Gene list.

gene_setsUnion[list[str], set], optional

Gene sets to use.

organism_Datasets, optional

Organism to use.

is_verbosebool, optional

Whether to print verbose messages.

output_dirOptional[str], optional

Output directory.

Returns

DataFrame

GSEA enrichr results DataFrame.

sciv.pp.merge_sc_atac(files: dict, genome_anno, merge_key: str = 'merge_sc_atac', min_num_fragments: int = 200, sorted_by_barcode: bool = False, bin_size: int = 500, min_tsse: float = 5.0, counting_strategy: Literal['fragment', 'insertion', 'paired-insertion'] = 'paired-insertion', max_iter_harmony: int = 20, harmony_groupby: str | list[str] | None = None, is_selected: bool = False, is_batch: bool = True, need_features: int | float | None = None, output_path: str | Path | None = None) AnnData

Integrate multiple scATAC-seq data through snapATAC2.

This function integrates multiple scATAC-seq datasets using snapATAC2. Reference: https://kzhang.org/SnapATAC2/tutorials/integration.html

Note: Please do not move the generated files during this processing.

Parameters

filesdict

Dictionary mapping sample names to file paths of scATAC-seq data. Format: {file_key: file_path, …}

genome_annoDataFrame

Genome annotation. Commonly snap.genome.hg38 or snap.genome.hg19.

merge_keystr, optional

Key used to form the final H5AD file name. Default is “merge_sc_atac”.

min_num_fragmentsint, optional

Minimum number of unique fragments required for a cell to pass filtering. Default is 200.

sorted_by_barcodebool, optional

Whether the input fragment file is sorted by barcode. Default is False.

bin_sizeint, optional

Size of consecutive genomic regions used to record counts. Default is 500.

min_tssefloat, optional

Minimum TSS enrichment score required for a cell to pass filtering. Default is 5.0.

counting_strategyLiteral[‘fragment’, ‘insertion’, ‘paired-insertion’], optional

Strategy to count fragments in bins. Default is ‘paired-insertion’.

max_iter_harmonyint, optional

Maximum number of iterations for the harmony algorithm. Default is 20.

harmony_groupbyOptional[Union[str, list[str]]], optional

If specified, split data into groups and perform batch correction on each group separately.

is_selectedbool, optional

If True, perform additional filtering based on feature selection from each sample using the snap.pp.select_features method.

is_batchbool, optional

If True, perform batch correction by sample. Default is True.

need_featuresOptional[Union[int, float]], optional

Number or proportion of features to select. If <= 1, interpreted as a proportion of total features. If > 1, interpreted as absolute number.

output_pathOptional[path], optional

Directory path for output files. If None, temporary files are used.

Returns

AnnData

Integrated scATAC-seq data.
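
Examples

The need_features rule described above (values up to 1 are a proportion, larger values are an absolute count) can be written as a small helper. This is a sketch of the stated rule only; the name `resolve_feature_count`, the truncation toward zero, and returning None when nothing is requested are assumptions.

```python
def resolve_feature_count(need_features, total_features):
    """Translate `need_features` into an absolute number of features."""
    if need_features is None:
        return None  # assumption: no explicit feature count was requested
    if need_features <= 1:
        # Interpreted as a proportion of the total number of features.
        return int(total_features * need_features)
    # Interpreted as an absolute number of features.
    return int(need_features)


print(resolve_feature_count(0.25, 1000))
print(resolve_feature_count(500, 1000))
```

Note that under this rule a value of exactly 1 means "all features" rather than "one feature".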

sciv.pp.paga_trajectory(adata: AnnData, layer: str | None = None, latent: str = 'X_pca', groups: str = 'louvain', position: list | set | Tuple | ndarray | None = None, lsi_components: int = 50, root_cluster: str | None = None, n_neighbors: int = 15, resolution: float = 1.0, is_denoise: bool = True) None

Compute the PAGA trajectory.

Parameters

adataAnnData

scATAC-seq data

layerOptional[str], optional

Specify the matrix to be processed;

latentstr, optional

Latent space to use.

groupsstr, optional

Group name to use.

positionOptional[collection], optional

Position to use.

lsi_componentsint, optional

Number of components to use.

root_clusterOptional[str], optional

Root cluster to use.

n_neighborsint, optional

Number of neighbors to use.

resolutionfloat, optional

Resolution to use.

is_denoisebool, optional

Whether to denoise.

Returns

None

sciv.pp.poisson_vi(adata: AnnData, max_epochs: int = 500, lr: float = 0.0001, batch_size: int = 128, eps: float = 1e-08, early_stopping: bool = True, early_stopping_patience: int = 50, strategy: str = 'ddp_notebook_find_unused_parameters_true', batch_key: str | None = None, resolution: float = 0.5, dp_delta: float = 0.05, latent_name: str = 'latent', model_dir: str | Path | None = None) AnnData

Process the data with PoissonVI to obtain the latent sample representation and, after Leiden clustering, the differential peak data.

Parameters

adataAnnData

Input data to be processed.

max_epochsint, default 500

The maximum number of epochs for PoissonVI training.

lrfloat, default 1e-4

Learning rate for optimization.

batch_sizeint, default 128

Minibatch size to use during training.

epsfloat, default 1e-08

Optimizer epsilon.

early_stoppingbool, default True

Whether to perform early stopping with respect to the validation set.

early_stopping_patienceint, default 50

How many epochs to wait for improvement before early stopping.

strategystr, default “ddp_notebook_find_unused_parameters_true”

DDP strategy.

batch_keystr, optional

Batch information in scATAC-seq data.

resolutionfloat, default 0.5

Resolution of the Leiden clustering.

dp_deltafloat, default 0.05

Empirical effect size threshold for PeakVI method in differential analysis.

latent_namestr, default “latent”

The name of latent representation.

model_dirstr, optional

The folder name for saving the trained model.

Returns

AnnData

Differential peak data for each cluster.

Tool (.tl)

Tool function interface, including core computing functions such as algorithms, matrix operations, and random walks.

Algorithm

Algorithm related functions.

sciv.tl.add_bernoulli_fluctuation_noise(counts_matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, noise_level: float = 0.1) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Add Bernoulli fluctuation noise to the counts matrix (add 1 with probability noise_level)

Parameters

counts_matrixmatrix_data

Input counts matrix

noise_levelfloat, default 0.1

Noise level, i.e., the probability of randomly adding 1 (range: 0.0 - 1.0)

Returns

matrix_data

Matrix after adding noise
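
Examples

The noise model described above, where each entry gains 1 with probability noise_level, can be sketched on a nested list. This is not sciv's implementation; the helper name `add_bernoulli_noise` and the `seed` argument are assumptions added so the sketch is reproducible.

```python
import random


def add_bernoulli_noise(counts, noise_level=0.1, seed=None):
    """Add 1 to each entry independently with probability `noise_level`."""
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must lie in [0.0, 1.0]")
    rng = random.Random(seed)
    return [
        [value + (1 if rng.random() < noise_level else 0) for value in row]
        for row in counts
    ]


counts = [[0, 2, 0], [1, 0, 0]]
noisy = add_bernoulli_noise(counts, noise_level=0.3, seed=0)
print(noisy)
```

With noise_level set to 0.0 the matrix is returned unchanged, and with 1.0 every entry is incremented, which bounds the behavior at the two ends of the documented range.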

sciv.tl.add_noise_perturb(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, rate: float) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Add peak percentage noise to each cell

Parameters

datamatrix_data

Input counts matrix

ratefloat

Noise level, i.e., the probability of randomly adding 1 (range: 0.0 - 1.0)

Returns

matrix_data

Matrix after adding noise

sciv.tl.ami(labels_pred: list | set | Tuple | ndarray, labels_true: list | set | Tuple | ndarray) float

Adjusted mutual information (AMI), with values in (0, 1).

Parameters

labels_predcollection

Predicted cluster labels.

labels_truecollection

True cluster labels.

Returns

float

AMI score.

sciv.tl.ari(labels_pred: list | set | Tuple | ndarray, labels_true: list | set | Tuple | ndarray) float

Adjusted Rand index (ARI), with values in (-1, 1).

Parameters

labels_predcollection

Predicted cluster labels.

labels_truecollection

True cluster labels.

Returns

float

ARI score.
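
Examples

For readers who want to see what the score measures, the adjusted Rand index can be computed from pair counts in the contingency table of the two labelings. The sketch below is a standalone pure-Python version of that standard formula, written for illustration; the name `adjusted_rand_index` and the handling of degenerate inputs are assumptions, and sciv may delegate to another library.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_pred, labels_true):
    """Adjusted Rand index computed from pair counts."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # assumption: trivial inputs count as perfect agreement
    # Pairs of samples that share a cluster in both labelings.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. one cluster in both labelings
    return (sum_cells - expected) / (maximum - expected)


same_partition = adjusted_rand_index([1, 1, 0, 0], [0, 0, 1, 1])
crossed_partition = adjusted_rand_index([0, 1, 0, 1], [0, 0, 1, 1])
print(same_partition, crossed_partition)
```

The first call shows that the score depends on the partition rather than on the label names: swapping the names of two clusters leaves it unchanged.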

sciv.tl.binary_indicator(labels_true: list | set | Tuple | ndarray, labels_pred: list | set | Tuple | ndarray) Tuple[float, float, float, float, float, float, float]

Compute binary classification metrics: accuracy, recall, F1, FPR, TPR, AUROC, and AUPRC.

Parameters

labels_truecollection

True labels.

labels_predcollection

Predicted labels.

Returns

tuple

Binary Indicators.
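
Examples

The threshold-based metrics in the returned tuple follow from the four confusion-matrix counts. The sketch below covers accuracy, recall, F1, FPR, and TPR only (AUROC and AUPRC need ranking scores and are omitted); it assumes labels coded as 0 and 1, returns 0.0 for undefined ratios, and uses the hypothetical helper name `binary_metrics`.

```python
def binary_metrics(labels_true, labels_pred):
    """Accuracy, recall, F1, FPR and TPR for labels coded as 0 and 1."""
    pairs = list(zip(labels_true, labels_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    # TPR is the same quantity as recall.
    return {"accuracy": accuracy, "recall": recall, "f1": f1, "fpr": fpr, "tpr": recall}


metrics = binary_metrics(
    labels_true=[1, 1, 1, 0, 0, 0, 0, 1],
    labels_pred=[1, 0, 1, 0, 1, 0, 0, 1],
)
print(metrics)
```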

sciv.tl.calculate_fragment_weighted_accessibility(input_data: dict, block_size: int = -1) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Calculate the initial trait- or disease-related cell score.

Parameters

input_datadict
  1. data: The fragments matrix, obtained by converting the counts matrix with scvi.data.reads_to_fragments.

  2. overlap_data: Peaks-traits/diseases data.

block_sizeint

Block size for block-wise matrix multiplication. If the value is less than or equal to zero, no blocking is performed.

Returns

matrix_data

Initial TRS.

sciv.tl.calculate_init_score_weight(adata: AnnData, da_peaks_adata: AnnData, overlap_adata: AnnData, layer: str | None = 'fragments', diff_peak_value: Literal['emp_effect', 'bayes_factor', 'emp_prob1', 'all'] = 'emp_effect', is_simple: bool = True, block_size: int = -1) AnnData

Calculate the initial trait- or disease-related cell score with weight.

Parameters

adataAnnData

scATAC-seq data;

da_peaks_adataAnnData

Differential peak data;

overlap_adataAnnData

Peaks-traits/diseases data;

layerstr

Optional. The layer of the scATAC-seq data to use.

diff_peak_valuedifference_peak_optional

Which value from the cluster-level differential peak analysis to use for the correction. One of {‘emp_effect’, ‘bayes_factor’, ‘emp_prob1’, ‘all’}.

is_simplebool

If True, only the final result is added and unnecessary intermediate variables are omitted. Note that the is_ablation parameter has no effect when this is True; it takes effect only when this is False.

block_sizeint

Block size for block-wise matrix multiplication. If the value is less than or equal to zero, no blocking is performed.

Returns

AnnData

Initial TRS with weight.

sciv.tl.calinski_harabasz(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, labels: list | set | Tuple | ndarray) float

The Calinski-Harabasz index evaluates the quality of a clustering by measuring how compact each cluster is and how well separated the clusters are from one another. Larger values indicate better clustering.

Parameters

datamatrix_data

Input data.

labelscollection

Predicted labels for each sample.

Returns

float

Calinski-Harabasz index.

sciv.tl.coefficient_of_variation(matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1, -1] = 0, default: float = 0) float | list | set | Tuple | ndarray

Calculate the coefficient of variation.

Parameters

matrixmatrix_data

Input matrix data.

axisLiteral[0, 1, -1], optional

Axis to calculate the coefficient of variation. Default is 0.

defaultfloat, optional

Default value for division by zero. Default is 0.

Returns

Union[float, collection]

Coefficient of variation.
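
Examples

The coefficient of variation is the standard deviation divided by the mean, with the default value substituted when the mean is zero. The sketch below illustrates this on nested lists. It assumes the population standard deviation, and that axis 0 means per column, axis 1 per row, and axis -1 the whole matrix; these conventions and the helper name `cv_sketch` are assumptions, not confirmed sciv behavior.

```python
from statistics import mean, pstdev


def cv_sketch(matrix, axis=0, default=0.0):
    """Coefficient of variation (population std / mean) along an axis."""

    def cv(values):
        mu = mean(values)
        # Fall back to `default` when the mean is zero.
        return pstdev(values) / mu if mu != 0 else default

    if axis == -1:
        return cv([value for row in matrix for value in row])
    groups = zip(*matrix) if axis == 0 else matrix
    return [cv(list(group)) for group in groups]


matrix = [[2, 0], [4, 0], [6, 0]]
per_column = cv_sketch(matrix, axis=0)
per_row = cv_sketch(matrix, axis=1)
print(per_column, per_row)
```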

sciv.tl.davies_bouldin(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, labels: list | set | Tuple | ndarray) float

Davies-Bouldin index (DBI).

Parameters

datamatrix_data

A list of n_features-dimensional data points. Each row corresponds to a single data point;

labelscollection

Predicted labels for each sample.

Returns

float

Davies-Bouldin index.

sciv.tl.euclidean_distances(data1: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, data2: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | None = None, block_size: int = -1) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Calculate the Euclidean distance between two matrices.

Parameters

data1matrix_data

First data;

data2matrix_data

Second data (If the second data is empty, it will default to the first data.)

block_sizeint

Block size for block-wise matrix multiplication. If the value is less than or equal to zero, no blocking is performed.

Returns

matrix_data

Matrix of Euclidean distances.
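
Examples

The role of block_size can be sketched as follows: the rows of the first matrix are processed in chunks, and the result is identical to the unblocked computation. This is a pure-Python illustration with a hypothetical helper `pairwise_euclidean`; sciv's block-wise matrix multiplication is presumably vectorized, which this sketch does not attempt to reproduce.

```python
from math import sqrt


def pairwise_euclidean(data1, data2=None, block_size=-1):
    """Pairwise Euclidean distances between the rows of two matrices."""
    if data2 is None:
        data2 = data1  # the second matrix defaults to the first
    # A non-positive block size means the rows are processed in one block.
    step = block_size if block_size > 0 else max(len(data1), 1)
    distances = []
    for start in range(0, len(data1), step):
        block = data1[start:start + step]
        for row in block:
            distances.append([
                sqrt(sum((a - b) ** 2 for a, b in zip(row, other)))
                for other in data2
            ])
    return distances


points = [[0, 0], [3, 4], [6, 8]]
full = pairwise_euclidean(points)
blocked = pairwise_euclidean(points, block_size=2)
print(full)
```

Blocking changes only how much data is handled at a time, so it trades peak memory for extra passes without changing the output.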

sciv.tl.is_asc_sort(positions_list: list) bool

Check whether the positions are in ascending order.

Parameters

positions_listlist

Positions list.

Returns

bool

True for ascending order, otherwise False.
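
Examples

An ascending-order check compares each position with its successor. The sketch below assumes that equal neighbors still count as ascending (non-strict order); whether sciv requires strictly increasing positions is not stated, and the name `is_ascending` is hypothetical.

```python
def is_ascending(positions):
    """Return True when every position is <= the one that follows it."""
    return all(a <= b for a, b in zip(positions, positions[1:]))


print(is_ascending([100, 250, 250, 900]))
print(is_ascending([100, 90, 250]))
```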

sciv.tl.jaccard_similarity(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_jobs: int = -1, is_to_dense: bool = False) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Calculate the Jaccard similarity matrix.

Parameters

datamatrix_data

Input cell feature data;

n_jobsint, optional

The number of jobs to use for the computation.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

matrix_data

Jaccard similarity matrix.
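
Examples

For two cells, the Jaccard similarity is the number of features present in both divided by the number present in either. The sketch below builds the full cell-by-cell matrix from a small count matrix. It assumes that any non-zero entry counts as "present" and that two empty cells get a similarity of 0.0; the helper name `jaccard_matrix` is hypothetical.

```python
def jaccard_matrix(data):
    """Pairwise Jaccard similarity between the rows of a count matrix."""
    # Each row is reduced to the set of column indices with non-zero values.
    supports = [{j for j, value in enumerate(row) if value != 0} for row in data]
    n = len(supports)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            union = supports[i] | supports[j]
            score = len(supports[i] & supports[j]) / len(union) if union else 0.0
            result[i][j] = result[j][i] = score
    return result


cells = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1]]
similarity = jaccard_matrix(cells)
print(similarity)
```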

sciv.tl.k_means(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_clusters: int = 8, is_to_dense: bool = False) list | set | Tuple | ndarray

Perform k-means clustering on data.

Parameters

datamatrix_data

Input data matrix;

n_clustersint, optional

The number of clusters to form as well as the number of centroids to generate.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

collection

Cluster labels from k-means clustering.

sciv.tl.kl_divergence(data1: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, data2: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list) float

Calculate the KL divergence between two datasets.

Parameters

data1matrix_data

First data.

data2matrix_data

Second data.

Returns

float

KL divergence score.
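
Examples

The KL divergence of a distribution P from Q is the sum of p * log(p / q) over all entries. The sketch below works on flat lists of non-negative values. It assumes the inputs are first normalized to sum to 1, uses the natural logarithm, and returns infinity when Q is zero where P is not; how sciv flattens matrices or guards against zeros is not specified, and `kl_sketch` is a hypothetical name.

```python
from math import inf, log


def kl_sketch(p_values, q_values):
    """KL divergence D(P || Q) after normalizing both inputs to sum to 1."""
    p_total, q_total = sum(p_values), sum(q_values)
    score = 0.0
    for p_raw, q_raw in zip(p_values, q_values):
        p, q = p_raw / p_total, q_raw / q_total
        if p == 0:
            continue  # terms with p = 0 contribute nothing
        if q == 0:
            return inf  # P puts mass where Q has none
        score += p * log(p / q)
    return score


print(kl_sketch([1, 1], [1, 1]))
print(kl_sketch([1, 0], [1, 1]))
```

The measure is not symmetric: swapping the two arguments in the second call gives infinity under the assumptions above.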

sciv.tl.lsi(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_components: int = 50, is_to_dense: bool = False) ndarray | matrix | list

Latent semantic indexing (LSI) via SVD.

Parameters

datamatrix_data

Input cell feature data.

n_componentsint, optional

Number of dimensions to reduce to.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

dense_data

Reduced dimensional data (SVD LSI model).

sciv.tl.marginal_normalize(matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1] = 0, default: float = 1e-50) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Marginal standardization.

Parameters

matrixmatrix_data

Data matrix to be standardized;

axisLiteral[0, 1], optional

The axis along which to standardize;

defaultfloat, optional

Value added to the denominator to prevent division by zero.

Returns

matrix_data

Standardized data.
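The idea of dividing by a marginal sum with a small constant in the denominator can be sketched as follows. This is a NumPy illustration, not sciv's code: the helper name is hypothetical, and the axis convention shown here follows NumPy's sum and may differ from the one sciv uses.

```python
import numpy as np


def marginal_normalize_sketch(matrix, axis=0, default=1e-50):
    """Divide each entry by the sum along `axis`, with `default` added to that sum."""
    matrix = np.asarray(matrix, dtype=float)
    totals = matrix.sum(axis=axis, keepdims=True)
    # Adding `default` keeps the division defined when a row or column sums to zero.
    return matrix / (totals + default)


counts = np.array([[1.0, 3.0], [1.0, 1.0], [0.0, 0.0]])
by_column = marginal_normalize_sketch(counts, axis=0)  # each column sums to about 1
by_row = marginal_normalize_sketch(counts, axis=1)     # each non-zero row sums to about 1
print(by_column)
print(by_row)
```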

sciv.tl.mean_symmetric_scale(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1, -1] = -1, is_verbose: bool = True) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Calculate the mean symmetric scale.

Parameters

datamatrix_data

Input data;

axisLiteral[0, 1, -1], optional

The axis along which to standardize.

is_verbosebool, optional

Whether to log information.

Returns

matrix_data

Standardized data after mean symmetric scaling.

sciv.tl.min_max_norm(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1, -1] = -1) ndarray | matrix | list

Calculate min-max normalized data.

Parameters

datamatrix_data

Input data;

axisLiteral[0, 1, -1], optional

The axis along which to standardize.

Returns

dense_data

Standardized data.
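A minimal NumPy sketch of min-max scaling is given below. It assumes that axis=-1 means "use the minimum and maximum of the whole matrix"; that reading, like the helper name, is an assumption and is not confirmed by this reference.

```python
import numpy as np


def min_max_sketch(data, axis=-1):
    """Rescale values to [0, 1]; axis=-1 is treated here as 'whole matrix'."""
    data = np.asarray(data, dtype=float)
    reduce_axis = None if axis == -1 else axis
    low = data.min(axis=reduce_axis, keepdims=True)
    high = data.max(axis=reduce_axis, keepdims=True)
    # Constant slices would give 0 / 0, so their span is replaced by 1.
    span = np.where(high > low, high - low, 1.0)
    return (data - low) / span


values = np.array([[1.0, 2.0], [3.0, 5.0]])
scaled_all = min_max_sketch(values)           # global minimum and maximum
scaled_cols = min_max_sketch(values, axis=0)  # per-column minimum and maximum
print(scaled_all)
print(scaled_cols)
```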

sciv.tl.obtain_cell_cell_network(adata: AnnData, k: int = 30, or_k: int = 10, weight: float = 0.5, kernel: Literal['laplacian', 'gaussian'] = 'gaussian', local_k: int = 10, gamma: float | str | list | set | Tuple | ndarray | None = None, is_simple: bool = True) AnnData

Calculate cell-cell correlation.

Parameters

adataAnnData

scATAC-seq data;

kint

When building the M-KNN network, the number of neighbors each node connects to (AND);

or_kint

When building the M-KNN network, the number of neighbors each node connects to (OR);

weightfloat

The weight of interactions or operations;

local_kint

Number of neighbors for the adaptive kernel;

kernelLiteral[“laplacian”, “gaussian”]

Determine the kernel function to be used;

gammaOptional[Union[float, str, collection]]

When the value of kernel is “laplacian”, if it is None, then it is the reciprocal of the latent representation dimension of the cell. When the value of kernel is “gaussian”, if it is None, then it defaults to an adaptive value obtained through local information of the parameter local_k. Otherwise, it should be strictly positive;

is_simplebool

If True, unnecessary intermediate variables are not added and only the final result is stored. Note that the is_ablation parameter has no effect when is_simple is True; it only takes effect when is_simple is False;

Returns

AnnData

Cell similarity data.

sciv.tl.overlap(regions: DataFrame, variants: DataFrame) DataFrame

Map variant sites to peak regions.

Parameters

regionsDataFrame

Information of peaks.

variantsDataFrame

Information of variants.

Returns

DataFrame

Variants mapped to their overlapping peak regions.

sciv.tl.overlap_sum(regions: AnnData, variants: dict, trait_info: DataFrame, n_jobs: int = -1) AnnData

Overlap regional data and mutation data and sum the PP values of all mutations in a region as the values for that region.

Parameters

regionsAnnData

Data of peaks.

variantsdict

Data of variants.

trait_infoDataFrame

Information of traits.

n_jobsint

The maximum number of concurrently running jobs.
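The overlap-and-sum idea can be shown with a few lines of standard-library Python. The tuple layouts, the toy values, and the inclusive interval bounds below are all hypothetical; the sketch does not reflect how sciv stores peaks, variants, or traits.

```python
from collections import defaultdict

# Hypothetical toy inputs: peaks as (chrom, start, end), variants as (chrom, position, pp).
peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 150)]
variants = [("chr1", 120, 0.30), ("chr1", 180, 0.25), ("chr1", 350, 0.10), ("chr2", 500, 0.90)]


def sum_pp_per_peak(peaks, variants):
    """Sum the PP value of every variant whose position falls inside each peak."""
    totals = defaultdict(float)
    for chrom, start, end in peaks:
        for v_chrom, position, pp in variants:
            if v_chrom == chrom and start <= position <= end:
                totals[(chrom, start, end)] += pp
    # Peaks without any overlapping variant get 0.0.
    return {peak: totals.get(peak, 0.0) for peak in peaks}


peak_scores = sum_pp_per_peak(peaks, variants)
print(peak_scores)
```

The nested loop is quadratic and only meant to make the rule explicit; a real implementation would index variants by chromosome and position.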

sciv.tl.pca(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_components: int = 50, is_to_dense: bool = False) ndarray | matrix | list

PCA.

Parameters

datamatrix_data

Input cell feature data;

n_componentsint, optional

Number of dimensions to reduce to.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

dense_data

Reduced dimensional data.

sciv.tl.perturb_data(data: list | set | Tuple | ndarray, percentage: float) list | set | Tuple | ndarray

Randomly perturbs the positions of a percentage of data.

Parameters

datacollection

List of data elements to be perturbed.

percentagefloat

Percentage of data to be perturbed.

Returns

collection

Perturbed data list.
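One way to perturb the positions of a fraction of the elements is sketched below. It assumes percentage is a fraction between 0 and 1 and adds a hypothetical seed argument for reproducibility; neither detail is stated in this reference.

```python
import numpy as np


def perturb_sketch(data, percentage, seed=0):
    """Shuffle the values held at a random `percentage` of the positions."""
    values = np.array(data)
    rng = np.random.default_rng(seed)
    n_perturb = int(len(values) * percentage)
    # Choose which positions take part, then permute the values among them.
    positions = rng.choice(len(values), size=n_perturb, replace=False)
    values[positions] = values[rng.permutation(positions)]
    return values


original = np.arange(20)
perturbed = perturb_sketch(original, percentage=0.25)
print(perturbed)
```

All values are preserved; only the positions of at most 25% of them change.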

sciv.tl.safe_kl_divergence(p: list | set | Tuple | ndarray, q: list | set | Tuple | ndarray, epsilon: float = 1e-10) float

Safe KL divergence calculation to avoid division by zero.

Parameters

pcollection

First data.

qcollection

Second data.

epsilonfloat, optional

The small value to add to the denominator to avoid zeros.

Returns

float

KL divergence score.
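The sketch below shows one common way to keep a KL divergence finite when the inputs contain zeros. It adds epsilon to both inputs before normalizing, which is an assumption; where exactly sciv adds epsilon is not specified beyond the parameter description above.

```python
import numpy as np


def safe_kl_sketch(p, q, epsilon=1e-10):
    """KL(p || q) with `epsilon` added to both inputs before normalizing."""
    p = np.asarray(p, dtype=float) + epsilon
    q = np.asarray(q, dtype=float) + epsilon
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))


print(safe_kl_sketch([0.5, 0.5], [0.9, 0.1]))
print(safe_kl_sketch([0.5, 0.5], [1.0, 0.0]))  # finite despite the zero in q
```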

sciv.tl.semi_mutual_knn_weight(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, k: int = 30, or_k: int = 10, weight: float = 0.5, is_for: bool = True, is_mknn_fully_connected: bool = True) Tuple[coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list]

Mutual KNN with weight.

Parameters

datamatrix_data

Input data matrix;

kint, optional

The number of nearest neighbors (AND);

or_kint, optional

The number of nearest neighbors (OR);

weightfloat, optional

The weight of interactions or operations;

is_forbool, optional

Whether to obtain the nearest neighbors of each node by looping over the rows of the matrix. Setting it to True suits large samples with limited memory.

is_mknn_fully_connectedbool, optional

Whether the M-KNN network must be a fully connected graph. If True, each node is guaranteed to be connected to at least its nearest neighbor other than itself. This parameter does not affect the SM-KNN result (the first result); it only affects the traditional M-KNN result (the second result).

Returns

matrix_data, matrix_data

Adjacency weight matrices. The first element is the SM-KNN result and the second is the traditional M-KNN result.
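To make the AND and OR neighbor rules concrete, the sketch below builds both kinds of edge set from a toy similarity matrix. It only illustrates the two rules; how sciv combines k, or_k, and weight into the final adjacency weights is not described here. The helper name and the choice of a similarity (rather than distance) matrix are assumptions.

```python
import numpy as np


def knn_mask(similarity, k):
    """Boolean mask: mask[i, j] is True when j is among the k most similar nodes to i."""
    sim = np.array(similarity, dtype=float)
    np.fill_diagonal(sim, -np.inf)  # a node is not its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    mask = np.zeros(sim.shape, dtype=bool)
    np.put_along_axis(mask, neighbors, True, axis=1)
    return mask


similarity = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
])
directed = knn_mask(similarity, k=1)
mutual_and = directed & directed.T  # edge kept only if both nodes pick each other
mutual_or = directed | directed.T   # edge kept if either node picks the other
print(mutual_and.astype(int))
print(mutual_or.astype(int))
```

The AND rule yields a sparser graph than the OR rule, which is why a strict mutual KNN network can leave some nodes isolated.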

sciv.tl.sigmoid(data: list | set | Tuple | ndarray | coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | matrix) list | set | Tuple | ndarray | coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | matrix

Sigmoid function.

Parameters

datacollection, matrix_data

Input data.

Returns

collection, matrix_data

Sigmoid output.
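For reference, the logistic sigmoid is 1 / (1 + exp(-x)). The NumPy sketch below evaluates it in a numerically stable way for dense input; it is an illustration with a hypothetical helper name, not sciv's implementation, and it does not handle sparse matrices.

```python
import numpy as np


def sigmoid_sketch(data):
    """Numerically stable logistic function 1 / (1 + exp(-x)) for dense arrays."""
    x = np.asarray(data, dtype=float)
    out = np.empty_like(x)
    positive = x >= 0
    out[positive] = 1.0 / (1.0 + np.exp(-x[positive]))
    exp_x = np.exp(x[~positive])  # x < 0 here, so exp cannot overflow
    out[~positive] = exp_x / (1.0 + exp_x)
    return out


print(sigmoid_sketch([-1000.0, 0.0, 1000.0]))
```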

sciv.tl.silhouette(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, labels: list | set | Tuple | ndarray) float

Silhouette score.

Parameters

datamatrix_data

An array of pairwise distances between samples, or a feature array.

labelscollection

Predicted labels for each sample.

Returns

float

Silhouette score.

sciv.tl.spectral_clustering(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_clusters: int = 8, n_components=30, eigen_solver='arpack', is_to_dense: bool = False) list | set | Tuple | ndarray

Spectral clustering on data.

Parameters

datamatrix_data

Input data matrix;

n_clustersint, optional

The number of clusters to form.

n_componentsint, optional

The dimension of the projection subspace.

eigen_solverstr, optional

The eigenvalue decomposition strategy to use. Default is 'arpack'.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

collection

Cluster labels after spectral clustering.

sciv.tl.spectral_eigenmaps(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_components: int = 30, affinity: Literal['nearest_neighbors', 'rbf', 'precomputed', 'precomputed_nearest_neighbors', 'jaccard'] = 'nearest_neighbors', eigen_solver: Literal['arpack', 'lobpcg', 'amg'] | None = None, n_jobs: int = -1, is_to_dense: bool = False) ndarray | matrix | list

Spectral Eigenmaps.

Parameters

datamatrix_data

Input cell feature data;

n_componentsint, optional

Number of dimensions to reduce to.

eigen_solverOptional[_EigenSolver], optional

The eigenvalue decomposition strategy to use.

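affinityLiteral[‘nearest_neighbors’, ‘rbf’, ‘precomputed’, ‘precomputed_nearest_neighbors’, ‘jaccard’], optional

The method used to build the affinity matrix.

n_jobsint, optional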

The number of jobs to use for the computation.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

dense_data

Reduced dimensional data.

sciv.tl.symmetric_scale(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, scale: int | float | list | set | Tuple | ndarray = 2.0, axis: Literal[0, 1, -1] = -1, is_verbose: bool = True) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Symmetric scale function.

Parameters

datamatrix_data

Input data;

axisLiteral[0, 1, -1], optional

The axis along which to standardize;

scaleUnion[number, collection], optional

Scaling factor.

is_verbosebool, optional

Whether to log information.

Returns

matrix_data

Standardized data.

sciv.tl.tf_idf(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, ri_sparse: bool = True) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

TF-IDF transformer.

Parameters

datamatrix_data

Matrix data that needs to be converted;

ri_sparsebool, optional

(return_is_sparse) Whether to return sparse matrix.

Returns

matrix_data

Matrix processed by TF-IDF.
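Several TF-IDF variants are in use for count matrices. The sketch below shows one common form (row-normalized term frequency multiplied by a log-scaled inverse document frequency) on a dense toy matrix. The exact formula used by sciv is not given in this reference, so the weighting here is an assumption.

```python
import numpy as np


def tf_idf_sketch(counts):
    """TF-IDF for a cells-by-features count matrix (one common variant)."""
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]
    # Term frequency: each cell's counts divided by that cell's total.
    row_totals = counts.sum(axis=1, keepdims=True)
    tf = counts / np.where(row_totals > 0, row_totals, 1.0)
    # Inverse document frequency: rarer features get a larger weight.
    cells_with_feature = (counts > 0).sum(axis=0)
    idf = np.log(1.0 + n_cells / np.maximum(cells_with_feature, 1))
    return tf * idf


toy_counts = np.array([[1, 0], [1, 1]])
weighted = tf_idf_sketch(toy_counts)
print(weighted)
```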

sciv.tl.tsne(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_components: int = 2, is_to_dense: bool = False) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

t-SNE dimensionality reduction on data.

Parameters

datamatrix_data

Data matrix that requires dimensionality reduction;

n_componentsint, optional

Dimension of the embedded space.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

matrix_data

Reduced dimensional data matrix.

sciv.tl.umap(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, n_neighbors: float = 15, n_components: int = 2, min_dist: float = 0.15, is_to_dense: bool = False) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

UMAP dimensionality reduction on data.

Parameters

datamatrix_data

Data matrix that requires dimensionality reduction;

n_neighborsfloat, optional

The size of local neighborhood (in terms of number of neighboring sample points) used for manifold approximation. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved. In general values should be in the range 2 to 100;

n_componentsint, optional

The dimension of the space to embed into. This defaults to 2 to provide easy visualization, but can reasonably be set to any integer value in the range 2 to 100.

min_distfloat, optional

The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. The value should be set relative to the spread value, which determines the scale at which embedded points will be spread out.

is_to_densebool, optional

Whether to convert the data into a dense matrix.

Returns

matrix_data

Reduced dimensional data matrix.

sciv.tl.z_score_marginal(matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1] = 0) Tuple[coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list]

Matrix standardization (z-score, marginal).

Parameters

matrixmatrix_data

Data matrix to be standardized.

axisLiteral[0, 1], optional

The axis along which to standardize.

Returns

matrix_data, matrix_data

Standardized data. First element is the z-score data, second element is the mean data.

sciv.tl.z_score_normalize(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, with_mean: bool = True, ri_sparse: bool | None = None, is_sklearn: bool = False) ndarray | matrix | list | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix

Matrix standardization (z-score).

Parameters

datamatrix_data

Data matrix to be standardized.

with_meanbool, optional

If True, center the data before scaling.

ri_sparsebool | None, optional

(return_is_sparse) Whether to return sparse matrix.

is_sklearnbool, optional

This parameter represents whether to use the sklearn package.

Returns

dense_data, sparse_matrix

Standardized matrix.
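A minimal dense z-score sketch is shown below. It standardizes each column, in the manner of scikit-learn's StandardScaler, and leaves constant columns unscaled. The column-wise convention and the helper name are assumptions, not a statement about sciv's behavior.

```python
import numpy as np


def z_score_sketch(data, with_mean=True):
    """Column-wise z-score; constant columns are left unscaled."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0) if with_mean else 0.0
    std = data.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # avoid division by zero
    return (data - mean) / std


toy = np.array([[1.0, 10.0], [3.0, 10.0]])
print(z_score_sketch(toy))
print(z_score_sketch(toy, with_mean=False))
```

Skipping the centering step (with_mean=False) keeps zeros at zero, which is what allows a sparse matrix to stay sparse.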

sciv.tl.z_score_to_p_value(z_score: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Convert z-score to p-value.

Parameters

z_scorematrix_data

Input z-score data.

Returns

matrix_data

P-value data.
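The conversion from a z-score to a p-value under a standard normal distribution can be written with the complementary error function. This reference does not say whether sciv returns one-sided or two-sided p-values, so the sketch exposes both through a hypothetical two_sided flag.

```python
import math

import numpy as np


def z_to_p_sketch(z_score, two_sided=True):
    """Convert z-scores to p-values under a standard normal distribution."""
    z = np.asarray(z_score, dtype=float)
    # Upper-tail probability P(Z >= v) via the complementary error function.
    tail = np.vectorize(lambda v: 0.5 * math.erfc(v / math.sqrt(2.0)))
    return 2.0 * tail(np.abs(z)) if two_sided else tail(z)


print(z_to_p_sketch([0.0, 1.96, -1.96]))
print(z_to_p_sketch([0.0, 1.96], two_sided=False))
```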

Random Walk

Random walk related functions.

class sciv.tl.RandomWalk(cc_adata: AnnData, init_status: AnnData, epsilon: float = 1e-05, max_steps: int = 300, gamma: float = 0.05, enrichment_gamma: float = 0.05, p: int = 2, n_jobs: int = -1, min_seed_cell_rate: float = 0.01, max_seed_cell_rate: float = 0.05, credible_threshold: float = 0, enrichment_threshold: Literal['golden', 'half', 'e', 'pi', 'none'] | float = 'golden', benchmark_count: int = 10, is_ablation: bool = False, is_simple: bool = True)

Bases: object

Random walk analysis.

run_ablation_m_knn() None

Using M-KNN fully connected cellular network.

run_ablation_ncsw() None

Removed cell weights in random walk and cluster type weights in initial scores.

run_ablation_ncw() None

Removed cell cluster type weights in initial scores.

run_ablation_nsw() None

Removed cell weights from random walk.

run_benchmark() None

Perform random walk of random seeds on all traits.

run_core() None

Calculate weighted random walk.

run_en_ablation_m_knn() None

Using M-KNN fully connected cellular network (Enrichment analysis).

run_en_ablation_ncsw() None

Removed cell weights in random walk and cluster type weights in initial scores.

run_en_ablation_ncw() None

Removed cell cluster type weights in initial scores.

run_en_ablation_nsw() None

Removed cell weights from random walk.

run_enrichment() None

Enrichment analysis.

run_knock(trs: AnnData, knock_trait: str, is_control: bool = False) None

Knockout analysis.

Parameters

trsAnnData

Input AnnData object.

knock_traitstr

Knockout trait or disease.

is_controlbool, optional

Whether to control the knockout. Default is False.

static scale_norm(score: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, is_verbose: bool = False) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Scale normalization of the score matrix.

Parameters

scorematrix_data

Score matrix.

is_verbosebool, optional

Whether to print the progress. Defaults to False.

Returns

matrix_data

The normalized score matrix.

class sciv.tl.TraitDataParallel(module: T, device_ids: Sequence[int | device] | None = None, output_device: int | device | None = None, dim: int = 0)

Bases: DataParallel

Data parallel module for trait analysis.

gather(outputs, output_device)

Collect the results after parallel processing, check for the existence of results, and merge the results by column (each result matrix has the same number of rows but different numbers of columns).

Parameters

outputslist

Output results of each device

output_deviceint

Output device ID.

Returns

Tensor

The merged results sorted by column.

scatter(inputs, kwargs, device_ids)

Scatter the input data to multiple devices.

Parameters

inputslist

List of input data to be scattered.

kwargsdict

Dictionary of keyword arguments to be scattered.

device_idslist

List of device IDs to scatter the data to.

Returns

tuple

Tuple of scattered input data and keyword arguments.

sciv.tl.random_walk(seed_cell_weight: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, weight: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, gamma: float = 0.05, epsilon: float = 1e-05, max_steps: int = 300, p: int = 2, n_jobs: int = -1, device: str = 'auto') coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Random walk analysis.

Parameters

seed_cell_weightmatrix_data

Seed cell weight matrix, where each column represents a seed cell.

weightmatrix_data

Transition probability matrix (weight matrix).

gammafloat

Random walk parameter. Defaults to 0.05.

epsilonfloat

Convergence threshold. Defaults to 1e-5.

max_stepsint

Maximum number of steps. Defaults to 300.

pint

Order of the random walk. Defaults to 2.

n_jobsint

Number of jobs to run in parallel. Defaults to -1, which means using all available processors.

devicestr

Device to run the analysis on. Defaults to ‘auto’.

Returns

matrix_data

The association score matrix, where each column represents the association score of a seed cell.
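A generic random walk with restart can be sketched as below. It treats gamma as the restart probability and p as the order of the norm used in the convergence test; both readings are assumptions, since this reference only calls them the "random walk parameter" and the "order of the random walk". The sketch ignores n_jobs and device.

```python
import numpy as np


def random_walk_sketch(seed_weight, transition, gamma=0.05, epsilon=1e-5, max_steps=300, p=2):
    """Random walk with restart, iterated until the update is smaller than `epsilon`."""
    start = np.asarray(seed_weight, dtype=float)
    transition = np.asarray(transition, dtype=float)
    score = start.copy()
    for _ in range(max_steps):
        # With probability gamma jump back to the seeds, otherwise follow an edge.
        updated = (1.0 - gamma) * transition @ score + gamma * start
        # Column-wise norm, so a matrix of several seed columns is handled too.
        if np.max(np.linalg.norm(updated - score, ord=p, axis=0)) < epsilon:
            return updated
        score = updated
    return score


# Toy column-stochastic transition matrix for a 3-node chain.
transition = np.array([[0.0, 0.5, 0.0], [1.0, 0.0, 1.0], [0.0, 0.5, 0.0]])
seeds = np.array([1.0, 0.0, 0.0])
scores = random_walk_sketch(seeds, transition, gamma=0.3)
print(scores)
```

Because the toy transition matrix is column-stochastic and the seed vector sums to 1, the scores keep summing to 1 at every step.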

sciv.tl.trs_scale_norm(score: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1, -1] = 0, is_verbose: bool = True) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Standardize and normalize the cell scores.

Parameters

scorematrix_data

Cell scores matrix.

axisLiteral[0, 1, -1]

Axis to apply the standardization and normalization. Defaults to 0.

is_verbosebool

Whether to print the progress. Defaults to True.

Returns

matrix_data

The standardized and normalized cell scores matrix.

Matrix

Matrix operation related functions.

sciv.tl.down_sampling_data(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | set | Tuple, sample_number: int = 1000000) list

Down-sampling data.

Parameters

dataUnion[matrix_data | collection]

Data that requires down-sampling;

sample_numberint, optional

Number of samples (values) to keep after down-sampling.

Returns

list

Data after down-sampling.

sciv.tl.matrix_callback_block_storage(matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, callback, block_size: int = 10000, data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | None = None) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Apply a callback function to a matrix block by block.

Parameters

matrixmatrix_data

Matrix

callbackcallable

Callback function.

block_sizeint, optional

The block size used when processing the matrix block by block. If the value is less than or equal to zero, no block operation is performed.

datamatrix_data, optional

Placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data

Result Matrix (CSR format)

sciv.tl.matrix_division_block_storage(matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, value: float | int | list | set | Tuple | ndarray | coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | matrix, block_size: int = 10000, data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | None = None) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Dividing a matrix by another value, vector, or matrix.

Parameters

matrixmatrix_data

Matrix

valueUnion[float, int, collection, matrix_data]

Value, vector, or matrix

block_sizeint, optional

The block size used when processing the matrix block by block. If the value is less than or equal to zero, no block operation is performed.

datamatrix_data, optional

Placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data

Result Matrix (CSR format)

sciv.tl.matrix_dot_block_storage(data1: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, data2: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, block_size: int = 10000, is_return_sparse: bool = False, data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | None = None) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Compute the matrix (dot) product of two matrices using block storage.

Parameters

data1matrix_data

Matrix 1

data2matrix_data

Matrix 2

block_sizeint, optional

The block size used for block-wise matrix multiplication. If the value is less than or equal to zero, no block operation is performed.

is_return_sparsebool, optional

Whether to return a sparse matrix.

datamatrix_data, optional

Placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data

Matrix product result
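The block-wise idea behind the block_size parameter can be illustrated with dense NumPy arrays: the left operand is multiplied one slice of rows at a time, so only a small intermediate is alive at each step. The helper below is a hypothetical sketch and does not cover sparse input or the data placeholder.

```python
import numpy as np


def blockwise_dot(data1, data2, block_size=10000):
    """Compute data1 @ data2 one block of rows at a time."""
    data1, data2 = np.asarray(data1), np.asarray(data2)
    if block_size <= 0:
        return data1 @ data2  # no blocking requested
    result = np.empty((data1.shape[0], data2.shape[1]), dtype=np.result_type(data1, data2))
    for start in range(0, data1.shape[0], block_size):
        stop = start + block_size
        # Only this slice of rows is multiplied in the current step.
        result[start:stop] = data1[start:stop] @ data2
    return result


rng = np.random.default_rng(0)
left, right = rng.random((5, 3)), rng.random((3, 4))
product = blockwise_dot(left, right, block_size=2)
print(product.shape)
```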

sciv.tl.matrix_multiply_block_storage(data1: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, data2: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, block_size: int = 10000, data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | None = None) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Perform Hadamard product of two matrices through block storage method.

Parameters

data1matrix_data

Matrix 1

data2matrix_data

Matrix 2

block_sizeint, optional

The block size used for block-wise matrix multiplication. If the value is less than or equal to zero, no block operation is performed.

datamatrix_data, optional

Placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data

Hadamard product result

sciv.tl.matrix_operation_memory_efficient(data1: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, data2: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | int | float, chunk_size: int = 10000, default: float = 100000000.0, operation: Literal['+', '-', '*', '/'] = '*') coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix

Perform element-wise addition, subtraction, multiplication, and division on two sparse matrices by blocks, supporting memory-efficient processing.

Parameters

data1matrix_data

Sparse matrix 1

data2Union[matrix_data, number]

Sparse matrix 2, or a scalar

chunk_sizeint, optional

The chunk size used for block-wise element-wise operations. If the value is less than or equal to zero, no block operation is performed.

defaultfloat, optional

Default value for division operation when denominator is 0. If the value is 0, it will raise a ValueError.

operationLiteral[‘+’, ‘-’, ‘*’, ‘/’], optional

Element-wise operation type: one of ‘+’, ‘-’, ‘*’, ‘/’.

Returns

sparse_matrix

Result sparse matrix (CSR format)

sciv.tl.merge_matrix(datas: list, axis: Literal[0, 1] = 0) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Merge multiple matrix data into one matrix.

Parameters

dataslist

List of matrix data.

axisLiteral[0, 1], optional

Axis to merge the matrix. Default is 0.

Returns

matrix_data

Merged matrix data.

sciv.tl.split_matrix(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1] = 0, chunk_number: int = 1000) list

Split matrix into multiple parts.

Parameters

datamatrix_data

Input matrix data.

axisLiteral[0, 1], optional

Axis to split the matrix. Default is 0.

chunk_numberint, optional

Number of parts to split the matrix. Default is 1000.

Returns

list

List of split matrix data.
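Splitting a matrix into chunks and merging the chunks back is a lossless round trip. The sketch below shows the idea with NumPy's own array_split and concatenate rather than with the sciv functions, whose exact chunking rule is not described here.

```python
import numpy as np

matrix = np.arange(24).reshape(6, 4)

# Split the rows into 4 nearly equal chunks, then stitch them back together.
chunks = np.array_split(matrix, 4, axis=0)
restored = np.concatenate(chunks, axis=0)

print([chunk.shape for chunk in chunks])
```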

sciv.tl.vector_multiply_block_storage(data1: list | set | Tuple | ndarray, data2: list | set | Tuple | ndarray, block_size: int = 10000, data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list | None = None) coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list

Broadcast two vectors along rows and columns respectively, then multiply them element-wise (Hadamard product).

Parameters

data1collection

Vector 1

data2collection

Vector 2

block_sizeint, optional

The block size used for block-wise multiplication. If the value is less than or equal to zero, no block operation is performed.

datamatrix_data, optional

Placeholder for the result matrix. If provided, it reduces memory consumption.

Returns

matrix_data

Result Matrix (CSR format)
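Broadcasting one vector down the rows and the other across the columns, then multiplying element-wise, gives the outer product. The dense sketch below builds it in row blocks. It assumes result[i, j] = data1[i] * data2[j] and returns a dense array rather than CSR; the helper name is hypothetical.

```python
import numpy as np


def blockwise_outer(data1, data2, block_size=10000):
    """Build the matrix data1[i] * data2[j] one block of rows at a time."""
    v1 = np.asarray(data1, dtype=float)
    v2 = np.asarray(data2, dtype=float)
    result = np.empty((v1.size, v2.size))
    step = block_size if block_size > 0 else max(v1.size, 1)
    for start in range(0, v1.size, step):
        # v1 is broadcast down the rows, v2 across the columns.
        result[start:start + step] = v1[start:start + step, None] * v2[None, :]
    return result


outer = blockwise_outer([1.0, 2.0, 3.0], [10.0, 20.0], block_size=2)
print(outer)
```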

Util (.ul)

A universal tool interface that includes constant definitions, logging, and auxiliary functions.

sciv.ul.check_adata_get(adata: AnnData, layer: str | None = None, is_dense: bool = True, is_matrix: bool = False) AnnData

Check if layer is in .layers, and instantiate a new AnnData with it as .X.

Parameters

adataAnnData

Input AnnData object.

layerstr, optional

Layer of the data. Default is None.

is_densebool, optional

Whether to return dense matrix. Default is True.

is_matrixbool, optional

Whether to return matrix. Default is False.

Returns

AnnData

New AnnData object with the selected layer as .X.

sciv.ul.check_gpu_availability(verbose: bool = False) bool

Check the availability of GPU.

Parameters

verbosebool, optional

Whether to print the information. Default is False.

Returns

bool

Whether the GPU is available.

Examples

>>> availability = sciv.ul.check_gpu_availability()
sciv.ul.file_method(name: str | None = None, is_verbose: bool = False) StaticMethod

Create file method handler class

Create a StaticMethod class instance based on the given name for handling file operations. If a name is provided, it will be combined with the project name as the handler file name; otherwise, only the project name will be used.

Parameters

namestr, optional

File handler name suffix. Default is None.

is_verbosebool, optional

Whether to display log information. Default is False.

Returns

StaticMethod

Configured StaticMethod class instance

sciv.ul.generate_hex_colors(num_colors: int) list

Generate random hex colors.

Parameters

num_colorsint

Number of colors to generate.

Returns

list

List of random hex colors.

Examples

>>> colors3 = sciv.ul.generate_hex_colors(3)
>>> colors5 = sciv.ul.generate_hex_colors(5)
>>> print(f"Generate three colors: {colors3}")
>>> print(f"Generate five colors: {colors5}")
sciv.ul.generate_str(length: int = 10) str

Generate a random string.

Parameters

lengthint, optional

Length of the string. Default is 10.

Returns

str

Random string.
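
A random string of a given length can be produced with the standard library alone. This is a minimal sketch, not sciv's code; the helper name random_string and the alphabet (ASCII letters and digits) are assumptions.

```python
import random
import string

def random_string(length=10):
    """Hypothetical helper: random string drawn from letters and digits."""
    alphabet = string.ascii_letters + string.digits  # assumed alphabet
    return "".join(random.choices(alphabet, k=length))

token = random_string()
```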

sciv.ul.get_index(position: int | float, positions_list: list, is_sort: bool = True) int | Tuple[int, int]

Search for a position in a list, similar to binary search.

If the position exists in the list, return its index. If it does not exist, return the indexes of the two neighboring elements between which it falls.

Parameters

positionnumber

Position to search for.

positions_listlist

Position list.

is_sortbool, optional

Whether to sort the list. Default is True.

Returns

Union[int, Tuple[int, int]]

Position index.
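
The lookup described above can be sketched with the standard bisect module. The helper name find_position is hypothetical, and two details are assumptions rather than documented behavior: is_sort is taken to mean "sort the list before searching", and a position outside the list's range yields a pair with an out-of-range index.

```python
from bisect import bisect_left

def find_position(position, positions, is_sort=True):
    """Hypothetical helper: index of `position`, or the neighboring index pair."""
    ordered = sorted(positions) if is_sort else list(positions)
    i = bisect_left(ordered, position)
    if i < len(ordered) and ordered[i] == position:
        return i
    # Not found: `position` falls between indexes i - 1 and i.
    return (i - 1, i)

exact = find_position(30, [10, 20, 30, 40])    # value present
between = find_position(25, [40, 10, 30, 20])  # value absent, list sorted first
```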

sciv.ul.get_real_predict_label(df: DataFrame, map_groupby: str | list | set | Tuple | ndarray, groupby: str = 'clusters', value: str = 'value') Tuple[DataFrame, int, list]

Get the real and predicted labels of the trait.

Parameters

dfDataFrame

Input data.

map_groupbyUnion[str, collection]

Map of the cluster.

groupbystr, optional

Name of the column of the cluster. Default is “clusters”.

valuestr, optional

Name of the column of the value. Default is “value”.

Returns

Tuple[DataFrame, int, list]

Sorted DataFrame, number of clusters, and list of clusters.

sciv.ul.list_duplicate_set(data: list) list

Append numbering to duplicate entries so that every element is unique. If data is None, return an empty list. If data is a list, return it as is. If data is a collection, return it converted to a list.

Parameters

datalist

Input data.

Returns

list

Data made unique, with the same number of elements as the input.
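
Numbering duplicates can be sketched as follows. The helper name number_duplicates and the "_1", "_2" suffix format are assumptions; the source does not state how sciv formats the numbering.

```python
from collections import Counter

def number_duplicates(items):
    """Hypothetical helper: make repeated entries unique with a numeric suffix."""
    seen = Counter()
    result = []
    for item in items:
        seen[item] += 1
        # The first occurrence stays unchanged; later ones get a suffix.
        result.append(item if seen[item] == 1 else f"{item}_{seen[item] - 1}")
    return result

labels = number_duplicates(["a", "b", "a", "a"])
```

The output has the same length as the input, and each repeated entry is distinguishable from the ones before it.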

sciv.ul.list_index(data: list) Tuple[list, list | set | Tuple | ndarray]

Get the index of each element in a list.

Parameters

datalist

Input data.

Returns

Tuple[list, collection]

Index of each element in the list. Types of the elements in the list.

sciv.ul.log(name: str | None = None) Logger

Create log handler class

Create a Logger class instance based on the given name for logging. If a name is provided, it will be combined with the project name as the log file name; otherwise, only the project name will be used.

Parameters

namestr, optional

Log handler name suffix. Default is None.

Returns

Logger

Configured Logger class instance

sciv.ul.merge_matrix(datas: list, axis: Literal[0, 1] = 0) list

Merge multiple matrices into one matrix.

Parameters

dataslist

Input data.

axisLiteral[0, 1], optional

Axis to merge the matrices. Default is 0.

Returns

list

Merged data.

sciv.ul.numerical_bisection_step(min_value: float, max_value: float, step_length: float) Tuple[list | set | Tuple | ndarray, int]

Get the numerical bisection step.

Parameters

min_valuefloat

Minimum value of the step.

max_valuefloat

Maximum value of the step.

step_lengthfloat

Step length of the bisection.

Returns

Tuple[collection, int]

Numerical bisection step. Number of steps.

sciv.ul.set_inf_value(matrix: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list) None

Set infinite values in the matrix to the maximum value of the matrix.

Parameters

matrixmatrix_data

Input matrix.
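
For a dense NumPy array, the replacement can be sketched like this. The helper name replace_inf_with_max is hypothetical, and the sketch assumes that "maximum value" means the largest finite entry, that only positive infinity is replaced, and that the matrix is modified in place (the signature returns None).

```python
import numpy as np

def replace_inf_with_max(matrix):
    """Hypothetical helper: overwrite +inf entries with the largest finite value."""
    finite = np.isfinite(matrix)
    # Largest finite entry; assumes at least one finite value exists.
    max_value = matrix[finite].max()
    matrix[np.isposinf(matrix)] = max_value  # modifies the array in place

values = np.array([[1.0, np.inf], [3.0, 2.0]])
replace_inf_with_max(values)
```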

sciv.ul.split_matrix(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: Literal[0, 1] = 0, chunk_number: int = 1000) list

Split a matrix into multiple parts.

Parameters

datamatrix_data

Input data.

axisLiteral[0, 1], optional

Axis to split the matrix. Default is 0.

chunk_numberint, optional

Number of parts into which the matrix is split. Default is 1000.

Returns

list

Split data.

sciv.ul.strings_map_numbers(str_list: list, start: int = 0) list

Map strings to numerical values.

Parameters

str_listlist

Input strings.

startint, optional

Start value of the mapping. Default is 0.

Returns

list

Mapped numerical values.
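
One way to map strings to numbers is to assign each distinct string the next free integer. This is a minimal sketch with a hypothetical helper name; numbering by order of first appearance is an assumption, and sciv may order the values differently (for example, alphabetically).

```python
def map_strings(str_list, start=0):
    """Hypothetical helper: map each distinct string to an integer code."""
    mapping = {}
    for s in str_list:
        # Assign the next free number the first time a string is seen.
        mapping.setdefault(s, start + len(mapping))
    return [mapping[s] for s in str_list]

codes = map_strings(["T", "B", "T", "NK"], start=1)
```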

sciv.ul.sum_min_max(data: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, axis: int = 1) Tuple[int | float, int | float]

Obtain the minimum/maximum sum of rows in the matrix

Compute the sums along the given axis (rows by default) and return the smallest and largest sum. If data is None, return (0, 0). Dense and sparse matrices are both supported.

Returns

Tuple[number, number]

Minimum row sum, maximum row sum.
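
For a dense array, the computation amounts to the following plain NumPy sketch. It does not call sciv and assumes the default axis of 1.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 1]])

row_sums = data.sum(axis=1)  # [6, 15, 1]
smallest, largest = row_sums.min(), row_sums.max()
```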

sciv.ul.to_dense(sm: coo_array | csr_array | csc_array | dok_array | lil_array | bsr_array | dia_array | sparray | coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix | ndarray | matrix | list, is_array: bool = False) ndarray | matrix | list

Convert sparse matrix to dense matrix

Convert a sparse matrix to a dense matrix. If sm is None, return an empty dense matrix. If sm is a dense matrix, return it as is. If sm is a sparse matrix, return it converted to array form.

Returns

dense_data

Converted dense matrix.

sciv.ul.to_sparse(dm: ~numpy.ndarray | ~numpy.matrix | list, way_callback=<class 'scipy.sparse._csr.csr_matrix'>, is_matrix: bool = True) coo_matrix | csr_matrix | csc_matrix | dok_matrix | lil_matrix | bsr_matrix | dia_matrix | spmatrix

Convert dense matrix to sparse matrix

Convert a dense matrix to a sparse matrix. If dm is None, return an empty sparse matrix. If dm is a sparse matrix, return it as is. If dm is a dense matrix, return it converted to sparse form.

Returns

sparse_matrix

Converted sparse matrix.

sciv.ul.track_with_memory(interval: float = 60) Callable

Create memory tracking decorator

Create a decorator function that records memory usage at fixed intervals during function execution. Returns the result, elapsed time, and memory list.

Parameters

intervalfloat, optional

Sampling interval in seconds. Default is 60.

Returns

Callable

Decorator function. When the wrapped function is called, it returns a dictionary containing:

- ‘result’: the original function’s return value.
- ‘time’: function execution time (seconds) if is_monitor is True, otherwise None.
- ‘memory’: list of sampled memory usage (bytes) if is_monitor is True, otherwise None.
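
The general shape of such a decorator can be sketched with the standard library. The name track_memory is hypothetical, and the sketch swaps in tracemalloc, which only counts memory allocated by Python, for whatever process-level measurement sciv uses. It also always monitors, so it has no counterpart to is_monitor.

```python
import threading
import time
import tracemalloc
from functools import wraps

def track_memory(interval=60.0):
    """Hypothetical decorator: sample traced memory every `interval` seconds."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            samples = []
            stop = threading.Event()

            def sample():
                # Record once immediately, then once per interval until stopped.
                while True:
                    samples.append(tracemalloc.get_traced_memory()[0])
                    if stop.wait(interval):
                        break

            tracemalloc.start()
            sampler = threading.Thread(target=sample, daemon=True)
            start = time.perf_counter()
            sampler.start()
            try:
                result = func(*args, **kwargs)
            finally:
                # Stop sampling before tracing is switched off.
                stop.set()
                sampler.join()
                elapsed = time.perf_counter() - start
                tracemalloc.stop()
            return {"result": result, "time": elapsed, "memory": samples}
        return wrapper
    return decorator

@track_memory(interval=0.01)
def sum_of_range(n):
    return sum([i for i in range(n)])

report = sum_of_range(100_000)
```

Sampling in a background thread keeps the measurement independent of the wrapped function's own control flow; the event-based wait lets the sampler stop promptly instead of sleeping through a full interval.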