7 Best Practices for Data Visualization

Across every industry and vertical, organizations are increasingly overwhelmed with data. Recognizing its potential value, nearly every company is tracking and storing data in some form. However, you can’t easily explore raw data and extract meaningful conclusions; there’s just too much of it and the format isn’t always readily consumable. That’s where data visualization comes into play. Data visualization allows people to explore data in a fast and meaningful way. You can understand trends and patterns that might not be obvious from a table of summary statistics or rows in a spreadsheet.

While the importance of data visualization is reflected in the growing adoption of data exploration and visualization platforms, creating an effective visualization is not always straightforward. We’ve compiled a list of some best practices to help you get started.

1. Have a clear question and goal in mind

Poor planning is a common challenge when creating a data visualization. While many people are tempted to jump right into a visualization project, it’s important to understand what you’re trying to accomplish with the exercise. Are you trying to uncover emerging trends? Identify weak points in a product? Highlight problem areas in a process flow?

Let’s imagine you run marketing at a company that sells 10 different products. These items sell well, but to varying degrees. Price points are different, profit margins are different, and support time after the sale is different.

How do you know where you should focus your marketing dollars to optimize product revenue? This is a perfect opportunity for data visualization, but you’ll need to think through your goal before starting. It’s not as simple as looking at sales volumes, because volume alone doesn’t account for profit margins, market share, or seasonal trends. What are the financial measures of a successful product? Which of those measures is the most important?

There could be any number of goals you’re trying to achieve or questions you’re trying to answer, but it’s always important to have your question in mind upfront. If you know what you’re trying to accomplish, it’ll be easier to build your visualization.
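To make the marketing example concrete, here is a minimal Python sketch of why sales volume alone can mislead. The `Product` class, the three products, and all of their figures are hypothetical and invented purely for illustration; a real analysis would also weigh factors such as support time, market share, and seasonality.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    units_sold: int
    unit_price: float
    margin: float  # fraction of revenue kept as profit, e.g. 0.25

    @property
    def revenue(self) -> float:
        return self.units_sold * self.unit_price

    @property
    def profit(self) -> float:
        return self.revenue * self.margin


# Hypothetical figures, not taken from any real company.
products = [
    Product("A", units_sold=1000, unit_price=20.0, margin=0.10),
    Product("B", units_sold=400, unit_price=50.0, margin=0.40),
    Product("C", units_sold=700, unit_price=30.0, margin=0.20),
]


def rank(items, key):
    """Return product names ordered from highest to lowest by `key`."""
    return [p.name for p in sorted(items, key=key, reverse=True)]


by_volume = rank(products, lambda p: p.units_sold)
by_profit = rank(products, lambda p: p.profit)

print("Ranked by units sold:", by_volume)
print("Ranked by profit:    ", by_profit)
```

With these made-up numbers the two rankings are exact opposites, which is why the measure you choose to visualize should follow from the question you are asking.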

2. Select a question that will provide useful information

This seems like an obvious point, but we’ve seen many organizations and groups ignore relevancy for the sake of producing a visualization quickly. This is especially true in more siloed organizations, where groups don’t have a solid understanding of other departments’ challenges or goals. It doesn’t make sense to spend any time answering a question that provides little to no value to the organization.

Reach out to stakeholders outside of your own department, especially if the results will affect other groups and the actions will require their buy-in. Ask if the results from the question you’re looking to answer would help them do their job more effectively. Differing perspectives from across the organization will also help to refine your question or pull in more relevant data sources.

However, it’s important not to let their feedback dilute the overall goal of your data visualization. Answer one question at a time, and don’t get caught up trying to build a comprehensive data visualization solution that addresses all pain points across the organization. As we mentioned in point 1, keep your goal well-defined and straightforward.

Sticking with the example from point 1, we’re interested in figuring out where to spend our marketing dollars. We would expect to gather feedback from members of R&D, sales, customer support, and any other teams that would have valuable input on this specific topic. Can the marketing team run a targeted campaign shortly after learning the results of the data visualization? Can the sales team handle the extra potential inbound leads that result from the marketing campaign?

3. Use the right tools

You don’t have to do this all alone or create visualizations from the ground up. And you certainly don’t have to rely only on Excel. Excel can help with simple visualizations, but it can’t support larger, more complex datasets or interactive visualization components at scale. You’re limited in the types of charts you can create in Excel, and you also lose reproducibility.

Luckily, there are plenty of tools that can help you design a simple-to-understand visualization. There are numerous solutions in the marketplace for data visualization and exploration, but here are a few that we’ve used at TCB Analytics and highly recommend:

  • Tableau – Tableau is a business intelligence and data exploration solution that enables fast creation of interactive dashboards. We frequently leverage Tableau to create executive-level dashboards and as a prototyping tool to iterate quickly on our clients’ ideas.
  • Shiny by RStudio – We can quickly build valuable visualizations and dashboards for our clients in Shiny that take advantage of R’s extensive library of statistical, graphical and machine learning packages.
  • ggplot2 – ggplot2 is a must-learn R package if you’re serious about data visualization. It implements the “grammar of graphics,” an approach that forces the user to think about their data structure prior to creating a visualization.

Which tools you use will depend on several factors. Tableau is a great place to start for basic summary statistics, aggregations, and bar charts, and for combining those components into a KPI (Key Performance Indicator) dashboard. Tableau is also a good fit if there are hierarchical relationships in the data or you want to show high-level summaries with the ability to drill down into more granular data. Those capabilities require no coding, as Tableau has an intuitive drag-and-drop interface.

We recommend using Shiny if you require more advanced statistical analysis within your dashboard, such as a Monte Carlo simulation, a recommendation engine, or any type of predictive modeling. We use R when we need more customization than Excel can provide, when dealing with large datasets, or when we need to apply advanced statistics or machine learning. If you’re a beginner to R, we highly recommend checking out DataCamp’s free online intro to R course.

Lastly, ggplot2 is an R library that will enable you to go beyond simple charts in Excel and create more complex, multi-layered graphics. You’d also be taking advantage of R’s reproducibility: once you write a script to generate plots from data, anyone can use that script to re-create the same plot in the future.
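The reproducibility point is not specific to R. As a rough illustration in Python (this is not ggplot2, and it uses only the standard library), the sketch below turns a small dataset into an SVG bar chart. The function name `bar_chart_svg`, its parameters, and the quarterly sales figures are all hypothetical. The point is only that the chart is a pure function of the data, so anyone who runs the script gets the same output.

```python
from html import escape


def bar_chart_svg(data, width=400, height=200, gap=10, label_room=20):
    """Render (label, value) pairs as a minimal SVG bar chart string.

    Assumes `data` is non-empty and all values are positive.
    """
    max_value = max(value for _, value in data)
    bar_width = (width - gap * (len(data) + 1)) / len(data)
    plot_height = height - label_room  # space reserved for labels below bars
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for i, (label, value) in enumerate(data):
        bar_height = value / max_value * plot_height
        x = gap + i * (bar_width + gap)
        y = plot_height - bar_height  # bars grow upward from a zero baseline
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{bar_width:.1f}" '
            f'height="{bar_height:.1f}" fill="steelblue"/>'
        )
        parts.append(
            f'<text x="{x + bar_width / 2:.1f}" y="{height - 5}" '
            f'text-anchor="middle" font-size="12">{escape(str(label))}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical quarterly sales, invented for illustration.
sales = [("Q1", 120), ("Q2", 150), ("Q3", 90)]
svg = bar_chart_svg(sales)
print(svg)
```

Saving the returned string to a file with an `.svg` extension should give a chart that opens in a browser, and rerunning the script on the same data always yields the identical file.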

4. Choose your design carefully

After you’ve defined your question, determined its business value, and decided which tools will help you with the process, it’s time to figure out your design.

Simply put, a bad design won’t provide a clear answer to your question. Too complex and it won’t make any sense; too simple and it won’t provide any value. It is possible to have a graphically appealing chart that is so overwhelming that it doesn’t help answer questions but instead creates more confusion.

Here are a few design elements to consider when creating a visualization:

  • Colors – Colors can make a big difference. Don’t use colors that are too similar, or it will be hard to tell two different data points apart. It’s also important to avoid closely combining colors that are commonly problematic for people with color vision deficiency, such as red and green.

  • Design for your audience – It’s important to organize and position your data and dashboard elements so they are easily and quickly consumed by your audience. The same chart used for researchers in a specific domain may not be suitable for senior business members. Consider the feedback that you received from other stakeholders across your organization in point 2 when customizing visualizations for each audience.
  • Accurately represent the data – To avoid exaggerating differences between variables, a good practice is to start your y-axes at zero. Use clear and detailed labeling at all times.

    [Figure: “Don’t Lose Your Integrity”]

  • Chart types – The type of chart that you select for your visualization is one of the most important factors in relevancy and usability. To help you choose, we recommend reviewing the following high-level guide from Andrew Abela: Chart Suggestions Infographic
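The advice about starting y-axes at zero can be put into numbers. One way to do so is Edward Tufte’s “lie factor”: the size of the effect shown in the graphic divided by the size of the effect in the data. The Python sketch below is a simplified illustration that assumes a bar chart in which each bar’s drawn height is its value minus the axis starting point; the function names and sample values are hypothetical.

```python
def effect_size(first, second):
    """Relative change from `first` to `second`."""
    return abs(second - first) / abs(first)


def lie_factor(first, second, axis_start=0.0):
    """Effect shown in the graphic divided by the effect in the data.

    Assumes bar heights are proportional to (value - axis_start),
    that `first` is non-zero, and that `first` is above `axis_start`.
    """
    shown = effect_size(first - axis_start, second - axis_start)
    actual = effect_size(first, second)
    return shown / actual


# Hypothetical values: a metric that rises from 50 to 55 (a 10% increase).
honest = lie_factor(50, 55, axis_start=0)
truncated = lie_factor(50, 55, axis_start=45)

print(f"Axis starts at 0:  lie factor = {honest:.1f}")
print(f"Axis starts at 45: lie factor = {truncated:.1f}")
```

With the axis starting at 45, the second bar is drawn twice as tall as the first even though the underlying value grew by only 10%, so the chart overstates the change tenfold. A lie factor close to 1 means the graphic is faithful to the data.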

5. Keep it simple

Keep clutter to a minimum. If your visualization feels busy to you, it’ll most likely feel even busier to others. There are almost always opportunities to simplify your graph. Ask yourself if the visualization will suffer any loss of meaning or impact for the audience if an element is removed.

This idea is captured in the following quote by visualization pioneer Edward Tufte:

“Graphical excellence is that which gives to the viewer the greatest number
of ideas in the shortest time with the least ink in the smallest space.”

6. Follow the 10-second rule

This is the golden rule of creating effective data visualizations. Remember, the goal is to enable the audience to answer a question quickly. Your visualization should allow different people to come to the same conclusion about your data in 10 seconds or less. If it takes longer than 10 seconds to glean actionable insight from a visualization, then it might as well be presented as plain text in a spreadsheet.

Let’s look at the following two charts as an example. Can you quickly understand from the chart on the left which country is gaining or losing market share, and at what rate? Now look at the chart on the right. It’s much easier to see that Japan is gaining share of world car production. Not only that, but a simple crosstab is placed below the chart so the user can see the detailed numbers if they so desire.
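A crosstab like the one described can be produced with very little code. The Python sketch below converts raw production counts into percentage shares and prints them with a change column. The country names and figures are hypothetical placeholders, not real car production data, and the code assumes every country has a value for every year.

```python
def market_share(production):
    """Convert raw counts into percentage share of the total for each year."""
    years = sorted({year for by_year in production.values() for year in by_year})
    totals = {
        year: sum(by_year[year] for by_year in production.values())
        for year in years
    }
    return {
        country: {year: 100 * by_year[year] / totals[year] for year in years}
        for country, by_year in production.items()
    }


def crosstab(shares):
    """Format shares as a plain-text table with a first-to-last change column."""
    years = sorted(next(iter(shares.values())))
    first, last = years[0], years[-1]
    header = (
        f"{'Country':<12}"
        + "".join(f"{year:>8}" for year in years)
        + f"{'Change':>8}"
    )
    rows = [header]
    for country, by_year in shares.items():
        cells = "".join(f"{by_year[year]:>7.1f}%" for year in years)
        change = by_year[last] - by_year[first]  # percentage points
        rows.append(f"{country:<12}{cells}{change:>+8.1f}")
    return "\n".join(rows)


# Hypothetical production counts (arbitrary units), invented for illustration.
production = {
    "Country A": {2000: 50, 2010: 40},
    "Country B": {2000: 30, 2010: 30},
    "Country C": {2000: 20, 2010: 30},
}

shares = market_share(production)
print(crosstab(shares))
```

The signed change column answers the “gaining or losing, and at what rate?” question directly, which is the same job the clearer chart does visually.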

7. Review additional resources

If you’re interested in learning more about data visualization, there’s a wealth of knowledge online to help you get started or optimize your approach. Check out some of these online resources:

  • FlowingData – This site was created by Dr. Nathan Yau, who has written two best-selling books on data visualization. FlowingData explores how data scientists use analysis, visualization, and exploration to understand data. The site also offers tutorials for beginners and provides real-world examples of data visualization.
  • Makeover Monday – Makeover Monday is a great place to learn about improving data visualizations. Each week, they post a link to a chart and its data, then ask the community to rework the chart. You can review ideas from the community to find examples that may be applicable to your visualizations.
  • DataCamp – DataCamp gives you one of the easiest ways to learn data science and data visualization online. They partner with some of the world’s most respected industry leaders and teachers to get you up-to-date on data science and data-related topics. With almost 1 million members, they’re doing something right.