cell-extract (1.1.0)

Published 2026-05-05 21:31:45 +00:00 by ross in ross/cell-extract

Installation

pip install --index-url  cell-extract

About this package

Extract specific columns from multiple tabular files and merge by row identifier.

cell-extract

Extract specific columns from multiple tabular files and merge them by row identifier.

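cell-extract reads CSV or TSV files, extracts a user-specified column from each, and produces a single merged output table. The first column of each file is treated as the row identifier, and rows are unioned across all input files. As the Quick Start shows, each output column is named after its input file, without the extension.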

Installation

pip install -e .

Or with dev dependencies (for testing):

pip install -e ".[dev]"

Quick Start

Given two files:

alpha.csv

gene,expr,pval
BRCA1,2.3,0.01
TP53,5.1,0.001

beta.csv

gene,expr,pval
BRCA1,4.7,0.02
TP53,3.2,0.05

Run:

cell-extract alpha.csv beta.csv --column expr

Output:

Origin,alpha,beta
BRCA1,2.3,4.7
TP53,5.1,3.2
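
For readers who want to see the same transformation in code, the following is a minimal sketch that reproduces the Quick Start result using only the Python standard library. It is not the package's actual implementation; the helper name extract_column and the in-memory inputs mapping are assumptions made for illustration.

```python
import csv
import io
from pathlib import Path


def extract_column(text, column, delimiter=","):
    """Map each row identifier (first column) to its value in `column`."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    idx = header.index(column)  # raises ValueError if the column is absent
    return {row[0]: row[idx] for row in reader if row}


# The two Quick Start files, held in memory instead of on disk.
inputs = {
    "alpha.csv": "gene,expr,pval\nBRCA1,2.3,0.01\nTP53,5.1,0.001\n",
    "beta.csv": "gene,expr,pval\nBRCA1,4.7,0.02\nTP53,3.2,0.05\n",
}

# Label each extracted column with the file name minus its extension.
extracted = {
    Path(name).stem: extract_column(text, "expr")
    for name, text in inputs.items()
}

# Both files share the same row identifiers here, so iterating over the
# first file's keys is enough for this example.
rows = [["Origin", *extracted]]
for row_id in extracted["alpha"]:
    rows.append([row_id, *(values.get(row_id, "") for values in extracted.values())])

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
merged_text = buffer.getvalue()
print(merged_text, end="")
```

The printed table matches the Output block above.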

Usage

cell-extract [OPTIONS] FILE [FILE ...]

Required Argument

Option                      Description
--column COLUMN, -c COLUMN  Name of the column to extract from each file

Options

Option                                   Description
--output FILE, -o FILE                   Write output to FILE instead of stdout
--output-format {csv,tsv}, -f {csv,tsv}  Output format (default: csv)
--with-source, -s                        Add a Source column recording which original column was extracted (useful for tracing/debugging)
--version, -V                            Show version and exit
--help, -h                               Show help message
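
The documented interface can be mirrored with argparse. The sketch below is a hypothetical parser built only from the option tables above; the function name build_parser and the help strings are assumptions, and the real package may define its command line differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented cell-extract options."""
    parser = argparse.ArgumentParser(
        prog="cell-extract",
        description="Extract a column from several tabular files and merge by row identifier.",
    )
    parser.add_argument("files", nargs="+", metavar="FILE",
                        help="input CSV or TSV files")
    parser.add_argument("-c", "--column", required=True,
                        help="name of the column to extract from each file")
    parser.add_argument("-o", "--output", metavar="FILE",
                        help="write output to FILE instead of stdout")
    parser.add_argument("-f", "--output-format", choices=["csv", "tsv"],
                        default="csv", help="output format (default: csv)")
    parser.add_argument("-s", "--with-source", action="store_true",
                        help="add a Source column naming the extracted column")
    parser.add_argument("-V", "--version", action="version",
                        version="%(prog)s 1.1.0")
    return parser


# Parse an argument list equivalent to:
#   cell-extract alpha.csv beta.csv -c expr -f tsv
args = build_parser().parse_args(["alpha.csv", "beta.csv", "-c", "expr", "-f", "tsv"])
print(args.files, args.column, args.output_format, args.with_source)
```

argparse adds --help/-h automatically, so it is not declared explicitly here.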

Behavior

  • Row union: All row identifiers from all files are included. If a row exists in file A but not file B, the cell for file B will be empty.
  • Empty cells: Missing values (NaN-equivalent) are written as empty fields.
  • Delimiter detection: The input delimiter is chosen from the file extension: .csv → comma; .tsv or .tab → tab.
  • Output format: CSV by default; use -f tsv for TSV output.
  • First column is the identifier: The leftmost column is always treated as the row key; it appears as "Origin" in the output header.
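
The rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the package's code: the names detect_delimiter and union_rows are hypothetical, and three details are assumptions the text does not specify (extension matching is case-insensitive, unknown extensions raise an error, and rows keep first-seen order). The EGFR row is made-up sample data.

```python
from pathlib import Path

# Extension-to-delimiter mapping as listed under "Delimiter detection".
DELIMITERS = {".csv": ",", ".tsv": "\t", ".tab": "\t"}


def detect_delimiter(filename):
    """Pick the input delimiter from the file extension."""
    suffix = Path(filename).suffix.lower()  # assumption: case-insensitive
    if suffix not in DELIMITERS:
        # Assumption: the documentation does not say what happens here.
        raise ValueError(f"unsupported extension: {suffix!r}")
    return DELIMITERS[suffix]


def union_rows(columns):
    """Merge {label: {row_id: value}} into a table with an "Origin" key column.

    Every row identifier from every input appears once; a value missing
    from an input becomes an empty string (an empty field when written out).
    """
    row_ids = []
    seen = set()
    for values in columns.values():
        for row_id in values:
            if row_id not in seen:  # assumption: keep first-seen order
                seen.add(row_id)
                row_ids.append(row_id)
    table = [["Origin", *columns]]
    for row_id in row_ids:
        table.append([row_id, *(values.get(row_id, "") for values in columns.values())])
    return table


table = union_rows({
    "a": {"BRCA1": "2.3", "TP53": "5.1"},
    "b": {"BRCA1": "4.7", "EGFR": "1.9"},
})
for row in table:
    print(",".join(row))
```

Here TP53 is absent from b and EGFR is absent from a, so each of those rows carries one empty cell.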

Examples

Multiple files with different row sets

cell-extract set1.csv set2.csv set3.csv -c expression

TSV input and output

cell-extract -c count -f tsv sample_A.tsv sample_B.tsv

Save to file

cell-extract -c fold_change -o merged.csv *.csv

With source tracing column

cell-extract alpha.csv beta.csv -c expr -s
# Origin,alpha,beta,Source
# BRCA1,2.3,4.7,expr
# TP53,5.1,3.2,expr

This is useful for tracking which original column a value came from when debugging.
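
As a rough illustration of what the Source column amounts to, the sketch below appends it to an already merged table. The function name add_source_column is hypothetical, and the package may build the column differently.

```python
def add_source_column(rows, column):
    """Append a "Source" column holding the extracted column's name."""
    header, *body = rows
    return [[*header, "Source"]] + [[*row, column] for row in body]


merged = [
    ["Origin", "alpha", "beta"],
    ["BRCA1", "2.3", "4.7"],
    ["TP53", "5.1", "3.2"],
]
traced = add_source_column(merged, "expr")
for row in traced:
    print(",".join(row))
```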

Development

Run tests:

pip install -e ".[dev]"
pytest tests/

License

MIT

Requirements

Requires Python: >=3.9