Bulk Cell Preprocessing

deside.bulk_cell.read_counts2tpm(read_counts_file_path: str, annotation_file_path: str, result_dir: str, file_type='htseq.counts', file_name_prefix: str = '')

Convert read counts (htseq.counts) to TPM (transcripts per million).
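The conversion can be illustrated with the standard TPM definition: divide each gene's count by its length in kilobases, then rescale so that the values of a sample sum to one million. The following is a minimal sketch of that formula in plain Python, not DeSide's actual implementation, which may apply extra steps such as filtering genes by gene_type. The function name counts_to_tpm and the gene names are hypothetical.

```python
def counts_to_tpm(counts, exon_lengths):
    """Convert the read counts of one sample to TPM.

    counts:       dict mapping gene -> raw read count
    exon_lengths: dict mapping gene -> exon length in base pairs
    """
    # Reads per kilobase: normalise each count by the gene length in kb.
    rpk = {gene: counts[gene] / (exon_lengths[gene] / 1000) for gene in counts}
    scale = sum(rpk.values())
    if scale == 0:
        # No reads at all in this sample: every TPM is zero.
        return {gene: 0.0 for gene in counts}
    # Rescale so that the values of the sample sum to one million.
    return {gene: value / scale * 1e6 for gene, value in rpk.items()}


# Toy data (hypothetical genes): one sample, three genes.
sample = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 0}
lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}
tpm = counts_to_tpm(sample, lengths)
```

In this toy sample GENE_A and GENE_B get the same TPM even though their raw counts differ by a factor of three, because GENE_B is three times longer. This is why exon_length is needed in the annotation file.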

Parameters:
  • read_counts_file_path – path to the merged read counts file (.htseq.counts), tab- or comma-separated, with genes as rows and samples as columns

  • result_dir – the directory in which result files are saved

  • annotation_file_path – path to gencode.gene.info.v22.tsv, downloaded from https://api.gdc.cancer.gov/data/b011ee3e-14d8-4a97-aed4-e0b10f6bbe82, or to another annotation file; the file must include the columns gene_type, gene_name and exon_length

  • file_type – type of the input file; the default, htseq.counts, is the raw data type downloaded from https://portal.gdc.cancer.gov/

  • file_name_prefix – prefix added to the result file names; used only for naming

Returns:

None
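
Example

A call might look like the sketch below. The import path and keyword names are taken from the signature above. All file paths and the prefix are hypothetical placeholders, and creating result_dir beforehand is a precaution, not a documented requirement.

```python
from pathlib import Path

# Hypothetical paths and prefix; replace them with your own files.
params = dict(
    read_counts_file_path="data/merged_read_counts.htseq.counts.csv",
    annotation_file_path="data/gencode.gene.info.v22.tsv",
    result_dir="results/tpm",
    file_type="htseq.counts",
    file_name_prefix="example_",
)


def run_conversion(params):
    # Imported here so the snippet can be read without DeSide installed.
    from deside.bulk_cell import read_counts2tpm

    # Assumption: make sure the output directory exists before the call.
    Path(params["result_dir"]).mkdir(parents=True, exist_ok=True)
    # Returns None; the results are written to result_dir.
    read_counts2tpm(**params)


# Uncomment once DeSide is installed and the input files are in place:
# run_conversion(params)
```

Because the function returns None, check result_dir for the output files after the call.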