Graphs and tables are powerful storytelling tools and critical components of scientific publications. Readers often skip the main text of a manuscript entirely and look only at the display items. Large, complex datasets that would be complicated to explain in words can be communicated quickly through tables and figures. It is therefore important that your display items clearly communicate your most important findings and can stand alone from the text. Tables are an easy way to summarize large amounts of data, and a well-composed figure can make a convincing argument simply by visualizing the data. Below we present some guidelines to consider when presenting data in a manuscript.
Tables are a concise and effective way to present large amounts of data. If you simply want to summarize specific information or if your message requires precise values, you should use a table. Tables are also a convenient display tool when you have many different units of measure, which can be difficult to present in an easy-to-read manner in a graph.
A well-designed table should have:
Tables are a great way to present large amounts of data; however, they can take a long time to interpret and do not readily communicate trends. The viewer must connect the dots between individual values to see those trends. When you want to show relationships in the data, illustrate trends, or make comparisons, data plots are best.
Data plots can quickly convey information from large quantities of data and are often used to show a functional or statistical relationship between two or more items.
Well-designed data plots should have:
Data plots are used to display quantitative data, that is, objective measurements or counts, which can be either discrete or continuous. Examples of quantitative data include weight, height, temperature, and counts.
Discrete quantitative data are counts and cannot be meaningfully divided into smaller increments. For example, a single household can have 1 or 2 pets, but it cannot have 1.5. Only certain distinct values, typically whole numbers, can be recorded for a single observation.
Common display formats for discrete data include:
Figure 1. Bar graph summarizing questionnaire responses.
Figure 2. Line graph summarizing the number of recovered patients over time.
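As a rough illustration of how discrete data feed a bar graph such as Figure 1, the sketch below tallies the number of pets per household using Python's standard library. The `pets_per_household` values are invented for this example and do not come from any real survey; the height of each bar would correspond to one of the tallied counts.

```python
from collections import Counter

# Hypothetical observations: number of pets in each of ten households.
# These values are invented for illustration only.
pets_per_household = [0, 1, 2, 1, 0, 3, 1, 2, 1, 0]

# Tally how many households reported each discrete value.
counts = Counter(pets_per_household)

# Print a simple text bar chart, one row per observed value.
for value in sorted(counts):
    print(f"{value} pets: {'#' * counts[value]} ({counts[value]})")
```

Because each observation is a whole number, every household falls into exactly one bar, and no binning decisions are needed.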
Continuous data can take on any numeric value within a range and can be divided into ever smaller increments, including fractional and decimal values. There are an infinite number of possible values between any two values. Measurements such as height, weight, and temperature are all examples of continuous data.
Common display formats for continuous data include:
Figure 3. Normal vs. skewed distributions displayed as histograms.
Figure 4. Example of clustered data points and data point outlier displayed as dot plots.
Figure 5. Average monthly temperatures.
Figure 6. Scatter plot demonstrating a positive and negative relationship and the corresponding Pearson's correlations.
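To make the correlation values mentioned for Figure 6 more concrete, here is a minimal sketch of how Pearson's correlation coefficient can be computed by hand. The helper name `pearson_r` and the `x`, `y_rising`, and `y_falling` values are assumptions made up for this example; they are not the data behind the figure.

```python
import math

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of the products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the root summed squared deviations.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical measurements, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_rising = [2.1, 3.9, 6.2, 7.8, 10.1]   # tends to increase with x
y_falling = [9.8, 8.1, 6.0, 4.2, 1.9]   # tends to decrease with x

r_pos = pearson_r(x, y_rising)
r_neg = pearson_r(x, y_falling)
print(f"positive relationship: r = {r_pos:.3f}")
print(f"negative relationship: r = {r_neg:.3f}")
```

A value near +1 indicates a strong positive linear relationship and a value near -1 a strong negative one. The coefficient only summarizes the relationship, so it is best reported alongside the scatter plot rather than in place of it.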
Avoid using bar or line graphs to plot continuous data. Bar and line graphs obscure the data distribution and do not give the reader a complete picture, because many different distributions can produce similar bar and line graphs. The figure below demonstrates how different datasets can produce the same bar graph.
Each of the scatterplots displayed on the right could produce the bar graph shown on the left, which shows a difference between groups. In panel B, the data are symmetrically distributed with a high degree of overlap between groups. In panel C, the difference between groups is largely driven by an outlier. The data in panel D are bimodally distributed in each group, suggesting potential subgroups that may warrant further investigation. In panel E, there are twice as many data points in Group A as in Group B. The narrower distribution of Group B may simply reflect the smaller number of data points, suggesting that more data are needed to verify the apparent between-group difference.
Summary statistics (e.g., the bar graph above) may suggest conclusions that differ from those supported by the full dataset. When displaying continuous data, use a graph format that clearly shows the distribution of the data so that readers can interpret it appropriately.
Figure 7. Example of different data distributions producing the same bar graph. The data shown in panels B-E could all produce the bar graph shown in panel A.
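The point made by Figure 7 can also be checked numerically. The sketch below builds three small groups that share the same mean, and would therefore produce bars of identical height, yet differ in shape. All values are invented for illustration and are not the data plotted in the figure; the group labels are likewise hypothetical.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
# Each group is constructed so that its mean is exactly 10.0.
groups = {
    "symmetric": [8.0, 9.0, 10.0, 11.0, 12.0],
    "outlier":   [7.0, 7.5, 8.0, 8.5, 19.0],
    "bimodal":   [5.0, 5.5, 6.0, 14.0, 14.5, 15.0],
}

# Compute the statistic a bar graph would show (the mean) alongside
# two others that hint at the underlying distribution.
summary = {
    name: {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }
    for name, values in groups.items()
}

for name, stats in summary.items():
    print(
        f"{name:>9}: mean={stats['mean']:.2f} "
        f"median={stats['median']:.2f} stdev={stats['stdev']:.2f}"
    )
```

The means are identical, but the medians and standard deviations are not: the outlier group's median sits well below its mean, and the bimodal group has a much larger spread. A plot that shows every data point reveals these differences directly.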
The reader's understanding of a dataset is limited to what the authors present in their manuscripts. Figures and tables are effective tools for communicating large amounts of data that would be complicated to explain in text. When composing a figure, be sure to choose a graph format that fully describes the data and gives readers a complete picture. To make the most of your figures, consider the question that you aim to answer, the type of data that you are presenting, and what your readers can learn from it.
Published on 09/10/2020