March 26, 2015

Building HTML Tables That Google Search Will Love

Remember when Google started including images, local shops, PDF documents and YouTube videos right in search results back in 2007? Well, now that’s happening again with structured forms of content like HTML tables.

Tables were originally meant to present sets of information, just like spreadsheets. But back in the early days of the web, designers found that they could stretch them to encompass the entirety of a web page and use columns and rows to lay content out the way they wanted. Today this technique is considered archaic, as CSS provides far superior layout capabilities, but a significant number of sites still rely on tables for their design.

It took time, some trial and error and a lot of machine learning, but Google has worked out a way to reliably isolate good tables that contain data from the vast sea of “bad” tables that contain layout. This was no small feat, considering that bad tables outnumber the good a hundredfold and that Google had a colossal 10 billion+ tables in its index.

In this article we’ll examine some of the particulars behind what makes a good table. This way when you publish sets of data, you’ll have another traffic source from Google’s table search, better semantic preparedness (more on this later) and potentially better search results for certain queries in the future, when table search takes a more integrated role in Google SERPs.

Google’s a machine (that’s getting gradually more unsupervised). So the trick to creating tables that Google will love is making it easy for machines to understand your tables and categorize the data within them. When creating tables remember this cardinal rule: the more easily the semantic structure of a table can be discovered, the more likely it is to contain high quality data.

Dissecting a Perfect Table

diagram of table search elements
Good tabular data almost always has a row that acts as a header (with the possible exception of vertical tables, which we will explore as well). Just as headers describe columns, a strong table has a primary column that describes its rows. This column often features the entities that the table is about, which is why it’s called the subject column. Unlike table headers, subject columns aren’t declared in code: Google figures out what these are on its own.

The biggest indicator of a table that contains high quality data is a subject column whose contents remain consistent across a theme.
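
To make the idea of a subject column concrete, here is a minimal Python sketch of a naive heuristic: pick the leftmost column whose cells are mostly text and mostly distinct. This is an illustration only, not Google’s method. The function name, the thresholds and the placeholder rows are all assumptions made for this example.

```python
def guess_subject_column(header, rows, min_text=0.8, min_distinct=0.5):
    """Return the index of the leftmost column that looks like a subject column.

    Naive heuristic (an assumption, not Google's algorithm): a subject column
    is mostly non-numeric and mostly made of distinct values.
    """
    def is_numeric(cell):
        try:
            float(cell.replace(",", ""))
            return True
        except ValueError:
            return False

    if not rows:
        return None

    for col in range(len(header)):
        cells = [row[col] for row in rows]
        text_share = sum(not is_numeric(c) for c in cells) / len(cells)
        distinct_share = len(set(cells)) / len(cells)
        if text_share >= min_text and distinct_share >= min_distinct:
            return col
    return None


# Placeholder data, invented for illustration.
header = ["Title", "Author", "Year banned"]
rows = [
    ["Book A", "Author X", "1931"],
    ["Book B", "Author Y", "1955"],
    ["Book C", "Author X", "1988"],
]
subject_index = guess_subject_column(header, rows)

# A table that starts with a numeric rank column.
ranked_index = guess_subject_column(
    ["Rank", "Title"], [["1", "Book A"], ["2", "Book B"]]
)
```

The distinctness threshold is deliberately loose, since a few repeated values in the subject column are acceptable.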

Below you’ll find a list of factors that serve as indicators of a good table, ranked from highest to lowest on a speculative basis. Note that some of these factors are also slightly speculative, but likely. Others have either been explicitly described in Google’s research papers or strongly suggested.

  1. Consistent Subject Column – Great tables will have an identifiable subject column that consistently represents a class or category of items. This makes the table more suitable for classification within a semantic framework. In the example of banned books pictured above, the subject column is the book title.
  2. Table Headers – Use the <th> tag to declare headers for your tables. In an ideal table the header will be a first row that names each of the columns.
  3. Table Caption – The <caption> tag contains text that appears above a table, acting as a sort of title. Always include a caption and use one that’s keyword rich, but concise. Avoid keyword stuffing.
  4. Subject Column in Bold – To make it easier for search engines to identify your subject column, give it some custom CSS styling. The most apt option is to make the column bold. By using CSS here, you’re also telling the search engine crawler that your site isn’t using the table in question for web layout purposes.
  5. Context and its Prominence – Headers, captions and a subject column form a great table, but they frequently don’t give an accurate description of what the table is about. Google analyzes content near the table that’s likely to have a strong influence on its meaning. This includes the heading tag (<h1> through <h6>) immediately above the table, any body text above or below the table, and the page’s title. If you have a page with multiple tables, remember to place each one in a separate section denoted by a new heading tag. Describe what your table is in plain English immediately before or after the table.
  6. Rectangularity (Table Size) – A table that is largely vertical (has many more rows than columns) or largely horizontal (with many more columns than rows) is not likely to be one that’s used for web layout, as bad tables are often closer to squares in shape. It’s important to note that good tables can be square and indeed often are. But highly rectangular tables are rarely not data-driven.
  7. Lack of Duplicate Content – Bad tables are rarely confined to a single page on a domain; they recur. Furthermore, they often repeat the same content in some rows and columns in those recurring instances. Note that there are legitimate cases in which the same table format is duplicated throughout a site, for instance, in Wikipedia articles.
  8. Low Variance in Characters – A good table will not have large variance in cell size from row to row or from column to column. A bad table, on the other hand, will have little consistency in the number of characters per cell.
  9. Single Subject Column – Tables with two competing subject columns may be entirely reliable sources of data (although plausibly more complex). Unfortunately, at this time Google’s Table Search algorithms focus on only identifying one subject column. For this reason it’s prudent to avoid tables that have two potential subjects by splitting your data into multiple tables.
  10. Data Linkage (Relationships) – Once a subject column has been clearly identified, Google seeks to understand the relationship between that column and the others. Latent semantic indexing of surrounding content and identifying headings comes into play here. Sometimes, the data will all fit neatly into a semantic vocabulary such as the ones outlined on Schema.org. This will happen rarely and only for very generalized tables, but when it does Google will have very high confidence in the quality of the table. To leverage this, it’s worth considering putting simple data that has a typical structure into a table on your website. You might not ordinarily present this information as a table (you might use a repeating format like a list instead), but for the purpose of taking advantage of table search, adjusting some content in this way could prove beneficial.
  11. Numerical Units – Many tables will contain numerical values. Lack of clarity about what unit these values are being expressed in will hurt a table.
  12. Duplicate Subject Column Instances – It is acceptable to have duplicate cells in the subject column and it will not hurt the quality of the table (this may seem counter-intuitive for those accustomed to working with relational databases).
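
Several of the factors above (a heading and plain-English description near the table, a caption, a header row and a bold subject column) can be seen together in one small piece of markup. The Python sketch below embeds such a table as a string and runs a simple check on it with the standard library’s html.parser. The class name "subject", the placeholder book data and the audit function are assumptions made for this example; the check only counts tags and says nothing about how Google actually scores a table.

```python
from html.parser import HTMLParser

# Hypothetical markup: placeholder titles, and "subject" is an invented class name.
SAMPLE = """
<style>td.subject { font-weight: bold; }</style>
<h2>Banned books by year</h2>
<p>The table below lists each banned book with its author and the year it was banned.</p>
<table>
  <caption>Banned books by author and year</caption>
  <tr><th>Title</th><th>Author</th><th>Year banned</th></tr>
  <tr><td class="subject">Book A</td><td>Author X</td><td>1931</td></tr>
  <tr><td class="subject">Book B</td><td>Author Y</td><td>1955</td></tr>
  <tr><td class="subject">Book C</td><td>Author X</td><td>1988</td></tr>
</table>
"""


class TableAudit(HTMLParser):
    """Collects a few structural facts about the table markup it is fed."""

    def __init__(self):
        super().__init__()
        self.has_caption = False
        self.header_cells = 0
        self.rows = []      # text of the <td> cells, grouped per <tr>
        self._cell = None   # text buffer while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "caption":
            self.has_caption = True
        elif tag == "th":
            self.header_cells += 1
        elif tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def audit(html):
    parser = TableAudit()
    parser.feed(html)
    data_rows = [row for row in parser.rows if row]  # skip the <th>-only header row
    return {
        "has_caption": parser.has_caption,
        "header_cells": parser.header_cells,
        "data_rows": len(data_rows),
        "columns": len(data_rows[0]) if data_rows else 0,
    }


report = audit(SAMPLE)
```

A missing caption or a header count that doesn’t match the number of columns is a quick sign that the markup needs another look.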

 

Vertical Tables

vertical table example
Not all good tables fit neatly into the box defined by the factors above, however. The major exception is vertical tables. These are frequently found on sites like Wikipedia (though certainly not only there) and consist of two columns. Interestingly, such a small number of columns often removes the need for table headers, as the two cells in each row form a subject:value pair.

Take a look at the long vertical table (right) detailing all sorts of data on Jupiter’s largest moon. It may not follow the 12 rules above, but it certainly does contain structured and useful information. Wikipedia uses an interesting format for vertical tables where headers act as captions that span a two column width, and designate a subsection of the table.
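
The subject:value structure described above can be sketched in a few lines of Python. The sketch assumes the table has already been reduced to a list of rows, where a one-cell row is a spanning header that starts a subsection and a two-cell row is a subject:value pair. The function name and the default section label are made up for this example, and the sample rows are a short illustrative excerpt rather than a copy of the Wikipedia table.

```python
def read_vertical_table(rows, default_section="General"):
    """Group subject:value pairs under the most recent spanning header."""
    sections = {}
    current = default_section
    for row in rows:
        if len(row) == 1:        # spanning header: starts a new subsection
            current = row[0]
        elif len(row) == 2:      # ordinary row: subject and value
            subject, value = row
            sections.setdefault(current, {})[subject] = value
    return sections


# Illustrative rows for Ganymede.
ganymede_rows = [
    ["Discovery"],
    ["Discovered by", "Galileo Galilei"],
    ["Discovery date", "January 7, 1610"],
    ["Designations"],
    ["Alternative names", "Jupiter III"],
]
ganymede = read_vertical_table(ganymede_rows)
```

Each subsection ends up as its own small dictionary, which mirrors how the spanning headers split the table.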

 

 

Current State of Tables in Search

table in Google search
Tables are used in Google web search today, but as of 2015 they seldom appear. It’s likely, though, that the range of search results that use tables is going to increase.

 

Tables in SERPs

Tables will appear in search if the following two conditions are met:

  1. The query is fact seeking (highly informational), especially if it’s a short sentence or phrase where the object matches the table’s header row and the subject matches the subject column. Note the “Literacy Rate in Canada” example.
  2. One of the top three search results already contains a table that fits the query.

Tables will only boost your ranking if the query is highly informational and if you rank second or third already. In other words, the state of table integration into blended search is conservative at best and in a beta phase right now.

Tables for Rich Snippets

The vertical tables mentioned earlier are a great candidate for rich snippets. Remember Ganymede from before? Here’s that SERP result featuring the tabular data as a rich snippet.

tables-for-structured-data

Vertical tables, especially those found on Wikipedia, are prime candidates for this sort of snippet enrichment despite the lack of traditional semantic markup. What’s really cool about this is that information may be attached to a snippet that isn’t part of any semantic vocabulary yet.

 

Should You Use Tables?

In a word, yes. Tables afford three advantages. First, your data will be indexed in table search, which means those who use Fusion Tables or the Research Tools within Google Docs will come across it. Second, using tables can potentially enrich your result in SERPs by adding structured data, even if no classes exist yet for the sort of data being structured. This can serve as a sort of semantic preparedness and bypass, to some extent, the need to constantly check Schema.org for new vocabulary. Lastly, on occasion, your tables could appear in search and give you a minor ranking boost. This places tables in a class with other standalone ranking boosters like Google Authorship (formerly), mobile-friendly sites, SSL, etc.

Most of the tables I’ve cited so far suggest little to no commercial intent, so as a closing thought I’ll leave you with the table below and its corresponding search result.

flight data in website table

those same flights as they appear in table search

Filed under: SEO, User Experience
Author: Orun Bhuiyan

As SEOcial's marketing technologist, Orun loves to discuss his hard-won knowledge on topics like SEO, programming and design. He's an enthusiast in emerging technologies, including big data and the semantic web.