Understanding Volcano Plots
What Is a Volcano Plot and What Is It Used For?
The JMP Clinical Documentation Glossary defines a volcano plot as “a scatterplot of the negative log10-transformed p-values derived from a specific t-test against the log2-fold change in expression”. When analyzing data for RNA-based gene expression profiling (GEP), the volcano plot shows the negative log10-transformed adjusted p-value against the log2 fold change for each probe in the assay.
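To make the two transformations concrete, here is a minimal Python sketch of how the coordinates of a single point could be computed. The function name volcano_coordinates and its inputs (a mean expression value per group and an already computed p-value) are assumptions made for illustration; they are not taken from JMP Clinical or any HTG software.

```python
import math

def volcano_coordinates(mean_control, mean_treated, p_value):
    """Return (x, y) for one probe: log2 fold change and -log10 p-value.

    Hypothetical helper for illustration only. It assumes the group
    means are on a linear (not already log-transformed) scale.
    """
    if mean_control <= 0 or mean_treated <= 0:
        raise ValueError("mean expression values must be positive")
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the interval (0, 1]")
    # x-axis: positive when expression is higher in the treated group.
    log2_fold_change = math.log2(mean_treated / mean_control)
    # y-axis: smaller p-values give larger values, so they plot higher.
    neg_log10_p = -math.log10(p_value)
    return log2_fold_change, neg_log10_p

# A four-fold increase with p = 0.001 lands about 2 units right of zero
# and about 3 units up.
x, y = volcano_coordinates(100.0, 400.0, 0.001)
```

Because the y-axis uses the negative logarithm, a tenfold drop in the p-value moves a point up by one unit.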
A volcano plot is useful for identifying events that differ significantly between two groups of experimental subjects. The name volcano plot comes from its resemblance to a volcanic eruption with the most significant points at the top, like spewed pieces of molten lava.
How Do You Interpret a Volcano Plot?
Each point on the graph represents a gene. The log2 fold change between the groups is plotted on the x-axis, and the -log10 p-value is plotted on the y-axis. The horizontal dashed line represents the significance threshold specified in the analysis, usually derived using a multiple testing correction.
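The article does not say which multiple testing correction is used. As one common possibility, the sketch below applies the Benjamini-Hochberg procedure to a list of raw p-values and computes where the dashed line would sit when adjusted p-values are plotted. The function name benjamini_hochberg and the ALPHA value of 0.05 are assumptions for illustration.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch; the correction used in a given analysis
    may differ.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

ALPHA = 0.05  # assumed significance level, not from the article
# Height of the horizontal dashed line on the -log10 scale (about 1.3).
threshold_y = -math.log10(ALPHA)

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Under these assumptions, any gene whose adjusted p-value is below ALPHA plots above the dashed line.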
Genes whose expression is decreased versus the comparison group are located to the left of zero on the x-axis, while genes whose expression is increased are located to the right of zero. Genes with statistically significant differential expression lie above the horizontal threshold line. Points closer to zero on the x-axis indicate less change, while points farther from zero in either direction indicate more change. Volcano plots provide an effective means for visualizing the direction, magnitude, and significance of changes in gene expression.
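The reading rules above can be sketched as a small Python function that labels one gene from its two plotted quantities. The name classify_gene, the default alpha of 0.05, and the optional min_abs_log2_fc cutoff are assumptions for illustration. The text describes only a horizontal significance threshold, so the fold-change cutoff defaults to zero (no cutoff).

```python
def classify_gene(log2_fold_change, adjusted_p, alpha=0.05, min_abs_log2_fc=0.0):
    """Label a gene by the direction and significance of its change.

    Hypothetical helper: alpha and min_abs_log2_fc are illustrative
    defaults, not values taken from the article.
    """
    # At or below the horizontal line: not statistically significant.
    if adjusted_p >= alpha:
        return "not significant"
    # Optional vertical cutoffs; a change of exactly zero has no direction.
    if log2_fold_change == 0 or abs(log2_fold_change) < min_abs_log2_fc:
        return "not significant"
    # Right of zero means increased, left of zero means decreased.
    return "increased" if log2_fold_change > 0 else "decreased"

labels = [
    classify_gene(2.0, 0.001),   # upper right of the plot
    classify_gene(-1.5, 0.01),   # upper left of the plot
    classify_gene(3.0, 0.2),     # large change, but below the line
]
```

Passing a nonzero min_abs_log2_fc, for example 1.0 for a two-fold change, would additionally exclude significant genes that sit close to zero on the x-axis.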
Volcano plots like the one shown above are useful when there are many observations (thousands or even millions) with a wide range of differences, both positive and negative. Such a plot exhibits a densely populated, symmetrical “V” shape. When there are fewer observations or the variation in response is less evenly distributed, the volcano plot might appear as shown below.
The HTG EdgeSeq Reveal (Reveal) software is a powerful, simple-to-use integrated solution for interrogating and visualizing gene expression data using the HTG platform. Reveal allows for easy generation of not only Volcano plots but also Principal Component Analysis plots, Heat Maps and Expression Profiles.
Regardless of the shape of any given volcano plot, most researchers will be eager to examine the genes that show the greatest change in expression relative to the control. The distinct characteristics that separate two groups, such as healthy vs. diseased tissue or placebo vs. treatment patients, are currently guiding cutting-edge discoveries about the biology of, and potential treatment targets for, health conditions of all types.