RNA-seq: Principles and Calculation of FPKM Normalization for mRNA Expression

mRNA Expression HT-Seq Normalization

RNA-Seq expression level read counts produced by HT-Seq are normalized using two similar methods: FPKM and FPKM-UQ. Normalized values should be used only within the context of the entire gene set. Users are encouraged to normalize raw read count values if a subset of genes is investigated.

FPKM

The Fragments per Kilobase of transcript per Million mapped reads (FPKM) calculation normalizes read count by dividing it by the gene length and the total number of reads mapped to protein-coding genes.

Upper Quartile FPKM

The upper quartile FPKM (FPKM-UQ) is a modified FPKM calculation in which the total protein-coding read count is replaced by the 75th percentile read count value for the sample.

Calculations

FPKM = (RCg * 10^9) / (RCpc * L)

FPKM-UQ = (RCg * 10^9) / (RCg75 * L)

  • RCg: Number of reads mapped to the gene
  • RCpc: Number of reads mapped to all protein-coding genes
  • RCg75: The 75th percentile read count value for genes in the sample
  • L: Length of the gene in base pairs, calculated as the sum of the lengths of all exons in the gene

Note: The read count is multiplied by a scalar (10^9) during normalization to account for the kilobase and ‘million mapped reads’ units.
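
The definitions above can be sketched as two small Python functions. This is a minimal illustration, not code from HT-Seq or any pipeline; the function and parameter names (fpkm, fpkm_uq, rc_g, rc_pc, rc_g75, length_bp) are hypothetical and simply mirror the symbols RCg, RCpc, RCg75, and L.

```python
def fpkm(rc_g: float, rc_pc: float, length_bp: float) -> float:
    """FPKM = RCg * 10^9 / (RCpc * L).

    rc_g      -- reads mapped to the gene (RCg)
    rc_pc     -- reads mapped to all protein-coding genes (RCpc)
    length_bp -- gene length in base pairs, sum of exon lengths (L)
    """
    if rc_pc <= 0 or length_bp <= 0:
        raise ValueError("rc_pc and length_bp must be positive")
    # 10^9 = 10^3 (bases per kilobase) * 10^6 (reads per million)
    return rc_g * 1e9 / (rc_pc * length_bp)


def fpkm_uq(rc_g: float, rc_g75: float, length_bp: float) -> float:
    """FPKM-UQ = RCg * 10^9 / (RCg75 * L).

    Same as fpkm(), but the total protein-coding read count is replaced
    by the 75th percentile read count of the sample (RCg75).
    """
    if rc_g75 <= 0 or length_bp <= 0:
        raise ValueError("rc_g75 and length_bp must be positive")
    return rc_g * 1e9 / (rc_g75 * length_bp)
```

Both functions take the gene's raw read count and length; only the denominator term that stands in for sequencing depth differs between them.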

Example

Sample 1: Gene A

  • Gene length: 3,000 bp
  • 1,000 reads mapped to Gene A
  • 1,000,000 reads mapped to all protein-coding regions
  • Read count in Sample 1 for 75th percentile gene: 2,000

FPKM for Gene A = (1,000)*(10^9)/[(3,000)*(1,000,000)] = 333.33

FPKM-UQ for Gene A = (1,000)*(10^9)/[(3,000)*(2,000)] = 166,666.67
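
The same arithmetic can be applied to a whole sample at once. The sketch below uses a made-up five-gene table (all gene names, counts, and lengths are hypothetical) and assumes every listed gene is protein-coding. It also assumes the 75th percentile is taken with linear interpolation over all listed genes via Python's statistics.quantiles; the text above does not specify the percentile method or which genes enter it, so treat that choice as an assumption.

```python
from statistics import quantiles

# Hypothetical toy sample: gene -> (raw read count, exon length in bp).
# Assumption: every gene listed here is protein-coding.
sample = {
    "GeneA": (1000, 3000),
    "GeneB": (500, 1500),
    "GeneC": (2000, 4000),
    "GeneD": (250, 1000),
    "GeneE": (4000, 8000),
}


def normalize(sample):
    """Return gene -> (FPKM, FPKM-UQ) for one sample."""
    counts = [rc for rc, _ in sample.values()]
    rc_pc = sum(counts)  # RCpc: total protein-coding read count
    # RCg75: third quartile cut point (assumed interpolation method).
    rc_g75 = quantiles(counts, n=4, method="inclusive")[2]
    result = {}
    for gene, (rc_g, length_bp) in sample.items():
        fpkm = rc_g * 1e9 / (rc_pc * length_bp)
        fpkm_uq = rc_g * 1e9 / (rc_g75 * length_bp)
        result[gene] = (fpkm, fpkm_uq)
    return result


result = normalize(sample)
for gene, (fpkm, fpkm_uq) in result.items():
    print(f"{gene}\tFPKM={fpkm:.2f}\tFPKM-UQ={fpkm_uq:.2f}")
```

Because RCpc and RCg75 are computed from the genes in the table, the values change if the gene set changes, which is why normalized values are only meaningful within the context of the full gene set.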

If you repost this article, please credit the source: https://www.ouq.net/951.html
