cell-extract (1.0.2)
Extract specific columns from multiple tabular files and merge by row identifier.
cell-extract reads CSV or TSV files, extracts a user-specified column from each, and produces a single merged output table. The first column is treated as the row identifier, and rows are unioned across all input files.
Installation
pip install -e .
Or with dev dependencies (for testing):
pip install -e ".[dev]"
Quick Start
Given two files:
alpha.csv
gene,expr,pval
BRCA1,2.3,0.01
TP53,5.1,0.001
beta.csv
gene,expr,pval
BRCA1,4.7,0.02
TP53,3.2,0.05
Run:
cell-extract alpha.csv beta.csv --column expr
Output:
Origin,alpha,beta
BRCA1,2.3,4.7
TP53,5.1,3.2
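The merge in the Quick Start can be approximated with the standard library. This is only a sketch of the documented behavior, not the package's actual implementation: the helper names `extract_column` and `merge` are hypothetical, the file contents are inlined as strings, and ordering rows by first appearance is an assumption.

```python
import csv
import io


def extract_column(text, column, delimiter=","):
    """Return {row_id: value} for one table; the first column is the row key."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    idx = header.index(column)  # raises ValueError if the column is absent
    return {row[0]: row[idx] for row in reader if row}


def merge(tables, column):
    """tables maps a label (e.g. the file stem) to file text; returns output rows."""
    extracted = {label: extract_column(text, column) for label, text in tables.items()}
    # Union of row identifiers, kept in order of first appearance (an assumption).
    row_ids = list(dict.fromkeys(rid for col in extracted.values() for rid in col))
    rows = [["Origin", *extracted]]
    for rid in row_ids:
        # A row missing from one table yields an empty cell for that table.
        rows.append([rid, *(col.get(rid, "") for col in extracted.values())])
    return rows


alpha = "gene,expr,pval\nBRCA1,2.3,0.01\nTP53,5.1,0.001\n"
beta = "gene,expr,pval\nBRCA1,4.7,0.02\nTP53,3.2,0.05\n"

rows = merge({"alpha": alpha, "beta": beta}, "expr")
for row in rows:
    print(",".join(row))
```

With the two sample files, this prints the same three lines as the Output above.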
Usage
cell-extract [OPTIONS] FILE [FILE ...]
Required Argument
| Option | Description |
|---|---|
| --column COLUMN, -c COLUMN | Name of the column to extract from each file |
Options
| Option | Description |
|---|---|
| --output FILE, -o FILE | Write output to FILE instead of stdout |
| --output-format {csv,tsv}, -f {csv,tsv} | Output format (default: csv) |
| --with-source, -s | Add a Source column recording which original column was extracted (useful for tracing/debugging) |
| --version, -V | Show version and exit |
| --help, -h | Show help message |
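For readers who want to see how these flags fit together, the interface documented above could be declared with `argparse` roughly as follows. This is an illustrative sketch, not the package's own parser: `build_parser` is a hypothetical name, and the destination names (`files`, `output_format`, `with_source`) are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options documented above."""
    parser = argparse.ArgumentParser(
        prog="cell-extract",
        description="Extract a column from multiple tabular files and merge by row identifier.",
    )
    parser.add_argument("files", nargs="+", metavar="FILE")
    parser.add_argument("-c", "--column", required=True, metavar="COLUMN",
                        help="name of the column to extract from each file")
    parser.add_argument("-o", "--output", metavar="FILE", default=None,
                        help="write output to FILE instead of stdout")
    parser.add_argument("-f", "--output-format", choices=["csv", "tsv"], default="csv",
                        help="output format (default: csv)")
    parser.add_argument("-s", "--with-source", action="store_true",
                        help="add a Source column")
    parser.add_argument("-V", "--version", action="version", version="%(prog)s 1.0.2")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-c", "count", "-f", "tsv", "sample_A.tsv", "sample_B.tsv"])
print(args.column, args.output_format, args.files)
```

`--help`/`-h` needs no declaration because `argparse` adds it automatically.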
Behavior
- Row union: All row identifiers from all files are included. If a row exists in file A but not file B, the cell for file B will be empty.
- Empty cells: Missing values (NaN-equivalent) are written as empty fields.
- Delimiter detection: .csv → comma; .tsv or .tab → tab.
- Output format: CSV by default; use -f tsv for TSV output.
- First column is the identifier: The leftmost column is always treated as the row key; it appears as "Origin" in the output header.
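The delimiter and empty-cell rules above can be sketched in a few lines of Python. The names `detect_delimiter` and `write_table` are hypothetical, and two details are assumptions rather than documented behavior: extensions are matched case-insensitively, and an unrecognized extension raises an error.

```python
import csv
import io
from pathlib import Path

# Extension-to-delimiter mapping as described in the list above.
DELIMITERS = {".csv": ",", ".tsv": "\t", ".tab": "\t"}


def detect_delimiter(filename):
    """Pick the input delimiter from the file extension."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match is an assumption
    if suffix not in DELIMITERS:
        # How the real tool treats other extensions is not documented.
        raise ValueError(f"unsupported extension: {suffix!r}")
    return DELIMITERS[suffix]


def write_table(rows, output_format="csv"):
    """Serialize merged rows; missing values are written as empty fields."""
    delimiter = "\t" if output_format == "tsv" else ","
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = [
    ["Origin", "set1", "set2"],
    ["BRCA1", "2.3", ""],  # BRCA1 is absent from set2, so its cell stays empty
    ["TP53", "5.1", "3.2"],
]
print(write_table(rows, "tsv"))
```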
Examples
Multiple files with different row sets
cell-extract set1.csv set2.csv set3.csv -c expression
TSV input and output
cell-extract -c count -f tsv sample_A.tsv sample_B.tsv
Save to file
cell-extract -c fold_change -o merged.csv *.csv
With source tracing column
cell-extract alpha.csv beta.csv -c expr -s
# Origin,alpha,beta,Source
# BRCA1,2.3,4.7,expr
# TP53,5.1,3.2,expr
Useful when tracking which original column a value came from during debugging.
Development
Run tests:
pip install -e ".[dev]"
pytest tests/
License
MIT
Requirements
Requires Python: >=3.9