As an SEO agency specializing in eCommerce SEO, we regularly run into the most common SEO issues an eCommerce site can have. Whether these problems surface during our SEO audits or potential clients ask us about them before signing on, one thing we know for sure: they are quickly becoming some of our most frequently asked questions.
So I decided to ask some top-notch SEOs what their thoughts are on some of these issues. I was fortunate enough to get Terry Van Horne, Bill Slawski, David Harry and Hamlet Batista to provide some of their eCommerce SEO best practices as they relate to:
- Website topology
- Website architecture
- Pages targeting long-tail keyword terms
- Duplicate content
- Thin content
- Hidden content
- Link building
- Negative SEO
So without further ado, here are their responses (this is a long read, so grab a cup of coffee):
Best Practices for eCommerce Website Topology
To a degree, the navigation of your website will depend upon the complexity of your website, and the topics that it covers. Ideally, you will try to provide an experience that is accurate and comprehensive. You might be able to get a sense of how complex your coverage of a topic should ideally be by performing relevant queries on the topic, and seeing how much content exists on the web related to it, including sub-topics that should be covered.
The owners of the site or their employees are likely subject matter experts, and asking them for information and questions may be helpful. The text for your navigation should be descriptive of the content it leads to, and give potential visitors some confidence in what they will see on the other side of the links. There’s probably a decision to be made about the use of dropdown menus versus menus that are more directory-like on the pages of the site – sometimes it’s helpful for people to see the navigation text without it being in a dropdown.
I prefer to keep the navigation link text as simple as possible, with a lot of attention paid to keeping the labels as general as possible. I am not a big fan of the large fixed menus on many large ecommerce sites. IMO, they often provide the user with too many options that are too similar. Not to mention that in some cases all these links fragment the link equity being passed down the link hierarchy. The old consensus (still clung to by some) was that you wanted to link as many pages as possible from the home page. This is nowhere near as effective as it once was, and having such a broad theme actually makes it harder for Google to identify what the site is about.
This is especially true where menus include large numbers of **less relevant** links to other areas of the site. I try to keep menus to “main categories” and brands. However, with the advent of Google indexing rendered areas of a page, I am ready to experiment with putting some parts of navigation menus in divs with display:none or tabs, to control which links on the page are indexed and which (if any) are not visible to users until moused over. In that case, showing only links relevant to the page could maximize the page’s relevance through good link equity allocation. Often I will suggest to clients/designers that are “fixated” on these large fixed menus that they remove them from the home page and only use them on internal pages.
Navigation Link text;
I really don’t think a whole ton has changed over the years as far as this one is concerned. By that, I mean the tried-and-tested rule of being concise and effective. Using keywords and terms that are user-friendly is often what we want to do as far as feeding the search engines. I’m not a huge fan of trying to stuff keywords into internal navigational elements for the sake of the engines.
In terms of strict navigational elements, I generally look at the size of the site and use structures along the lines of; domain.com/section-name/category-name/product-name.html.
On a smaller site, that would be shortened; on larger ones, extended. In relation to the actual anchor text itself, we build the architecture with those in mind as well.
Fixed menus (drop downs of all categories and sub categories);
As I was saying in the first part, we base a lot of the architecture’s naming conventions and associated targeting on the terms we’ve targeted in that phase of the program, along with what makes sense to the end user of the site (UX wise).
For me, the most important part of this is often the main navigation on a site. Something we like to call the “big ass menu”. You know the ones; they often can cover the entire visible screen once hovered over. Not only is this not ideal for usability, but it’s also a poorly thought out use of internal linking ratios, which can play into passing of various ranking signals (equity). Sadly, this is a common problem.
We tend to prefer a cleaner, less-is-more approach of a focused main menu. From there, we can use various other navigational elements (such as side menus) in each of the associated sections/categories. This not only lends greater control over the aforementioned internal link ratios, but also gives better UX for the end user. Win. Win.
Once more, the topology used should be logical and in some sensible, if not semantic, format.
As you might imagine, given my above approaches, we are a “keep it simple” when possible crew around here. The semantic variances in term targeting can be spread out over a given page or multiple pages as needed. That being said, we often see those that try to laser target various terms, that I personally would allow to happen organically.
What I mean by that is that I am more interested in the high-level terms that drive (converting?) traffic than to be overly focused on semantic variants that will generally occur naturally through your focus on the more important core and secondary terms. Sure, I organize things into semantic baskets, but there is a line (for active targeting) that we don’t cross.
The best website topologies are simple, and eliminate most duplicate content issues from the start. For example:
- sitename.com/category/product_name vs www.sitename.com/product_name. The first option can potentially create multiple URLs to the same product, while the second one would avoid that.
- If the names in navigation elements are keyword-researched, the link text would be ideal. Don’t obsess over having keyword-rich link text everywhere. For example, breadcrumbs that are full of keywords but don’t help users navigate are useless.
Best Practices for eCommerce Website Architecture
Ideally, you want to make certain that there’s a definite click path that a search engine crawler can follow that will allow it to crawl every page on a site that you want indexed. URLs for products shouldn’t have categories in them, because if those products fit into 2 different categories, you could end up with 2 different URLs for them. You do want to have keywords within your URLs that help describe the product that the page is about.
Many eCommerce sites have parameters that indicate some kind of sorting or filtering that makes the site more usable for visitors, but which you might not want indexed: alphabetical order, reverse alphabetical order, price high to low, price low to high, x products per page, xx products per page. You may want to give some of these sorting pages a meta robots element with noindex, follow.
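As a rough illustration of the sorting and filtering advice above, here is a minimal Python sketch that picks a meta robots element based on the query string. The parameter names (`sort`, `order`, `page_size`, `per_page`) are assumptions made for the example; substitute whatever your cart actually uses.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names; replace with the ones your cart really emits.
SORT_AND_FILTER_PARAMS = {"sort", "order", "page_size", "per_page"}

def robots_meta_for(url: str) -> str:
    """Return a meta robots element for a category URL.

    URLs carrying any sorting/filtering parameter get "noindex, follow";
    everything else stays indexable.
    """
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if SORT_AND_FILTER_PARAMS.intersection(params):
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'

print(robots_meta_for("https://sitename.com/category?sort=asc&page_size=10"))
print(robots_meta_for("https://sitename.com/category"))
```

The "follow" part matters here: the sorted views stay out of the index, but crawlers can still follow the product links on them.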
Pagination markup (using rel=”prev” and rel=”next” link attributes in the headers of those pages) can be a good idea when a category is displayed on more than one page and consists of a series of pages that may share, or have substantially the same, titles and meta descriptions.
Canonical link elements should be set up for the ecommerce pages that are intended to be indexed as well. For paginated pages that are part of the same series, many people mistakenly use the first page in that pagination series as the URL for the canonical link element on all of the pages in the series. Instead, those pagination pages should point to themselves or to a “view all” page, if one exists.
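The pagination and canonical advice above can be sketched as a small helper that builds the head links for one page in a series. This is only a sketch: the `?page=N` URL pattern and the function name are assumptions for illustration, not something any particular cart provides.

```python
def pagination_head_links(base_url, page, last_page, view_all_url=None):
    """Build canonical/prev/next link elements for one page of a paginated category.

    The canonical points at the "view all" page if one exists, otherwise at
    the page itself (never blindly at page 1 of the series).
    """
    def page_url(n):
        # Assumed URL pattern: page 1 is the bare category URL.
        return base_url if n == 1 else f"{base_url}?page={n}"

    canonical = view_all_url or page_url(page)
    links = [f'<link rel="canonical" href="{canonical}">']
    if page > 1:
        links.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < last_page:
        links.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return links

for link in pagination_head_links("https://sitename.com/category", page=2, last_page=3):
    print(link)
```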
Parameter handling should be used for URLs that have parameters that exist solely to track sessions or users, to avoid having a lot of pages in the search engines’ indexes that shouldn’t be there.
A retention policy should be set up for products that end up being discontinued, especially if they are products that might return. In some cases, for discontinued products, you do want a 404 or 410 message sent to the search engines stating that you are discontinuing carrying that product. If you are likely to continue carrying a product, but are out of stock, a message could be added to that page, noting that those products will return, and a link to similar products could be added.
There is schema for products in different industries that could be used to provide the search engines with content that they likely want to have information about for the industry you are in. It’s worth investigating, and setting up review and ratings schema where you have ratings and reviews, so that you may end up with rich snippets in search results that show those off.
Keywords in URL;
Easy answer? Do it. Lol. While the actual scoring value of them to a search engine is highly debatable, there is a ton of research that shows users will often click on links with concise or compelling words, over those that have none. This alone makes it a wise choice and we can’t help but think that search engines find some value in them as well.
The only real ‘best practice’ I have for those is to use a logical order, as discussed earlier, and ensure the words are legible via the use of a “dash”… also noted previously; domain.com/section-name/category-name/product-name.html.
As always, use some common sense; concise and logical in approach and execution.
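To make the dash-separated structure concrete, here is a minimal Python sketch that turns section, category and product names into a URL path of that shape. The helper names are hypothetical, and the slug rule (lowercase ASCII letters and digits only) is a simplifying assumption; names with accented characters would need extra handling.

```python
import re

def slugify(text: str) -> str:
    """Lowercase the text and join its words with single dashes."""
    # Anything that is not a-z or 0-9 becomes a dash; runs collapse into one.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return slug.strip("-")

def product_path(section: str, category: str, product: str) -> str:
    """Build a path shaped like /section-name/category-name/product-name.html"""
    return f"/{slugify(section)}/{slugify(category)}/{slugify(product)}.html"

print(product_path("Section Name", "Category Name", "Product Name"))
```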
Parameters in URL;
Given the above, it stands to reason that we should avoid these whenever possible. While search engines such as Google tend to be getting better at handling them, they can (in some instances) be ignored or even not scored the same as those using search engine friendly URLs.
In some instances, we would even block them or give a specific directive to Google via the parameter tool. When parameters are unavoidable, care should be taken to ensure the pages are of value, that there is no widespread duplication being created and that crawl budgets aren’t being eaten up.
Session ID in URL;
Our approach in these situations is almost identical to those above (parameters). What is the purpose of the ID? Is it truly a page required to be indexed (often not)? Is it creating duplicate content? Is it eating up crawl budget?
Session IDs are generally more about tracking the user, than they are presenting valuable information to the search engines. Do the due-diligence, then block the pages accordingly (again, Google’s parameter tool is often the best way to go).
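Alongside blocking, it can help to know what the clean version of a URL looks like, for example when generating canonical URLs or auditing a crawl. A minimal sketch, assuming the session parameters go by the names listed below (they are illustrative, not a complete list):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session/tracking parameter names; adjust to your platform.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}

def strip_session_params(url: str) -> str:
    """Return the URL with session-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_session_params("https://sitename.com/category?color=blue&sid=012345"))
```

Note that this only covers IDs passed as query parameters; IDs embedded in the path would need their own rule.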
301 strategy for discontinued/out of stock products/SKUs
We tend to go with Google’s long standing advice;
- Product discontinued and not returning; redirect to the most logical page or category.
- Product will be returning at some point; post out of stock message, advise on related products and put a user email capture box to be notified when it comes back in stock etc…
We get that question a lot actually (“should we 301 until it’s back in stock?”). That’s never a good idea as the equity will dwindle or be lost altogether. When it’s simply never coming back, we’ll look to redirect the equity to the closest logical location (section, category or related product). Again, usability is often the driving force here. In most cases a collection (or single) product page rarely has enough equity for it to be a huge concern; UX becomes more important than scoring signals.
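The two cases above boil down to a small decision rule. Here is a minimal Python sketch of it; the `Product` fields and the fallback to a 410 when no logical replacement exists are assumptions made for the example, not a fixed rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Product:
    url: str
    in_stock: bool
    discontinued: bool
    # Closest logical page (section, category or related product), if any.
    replacement_url: Optional[str] = None

def response_for(product: Product) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a product page request."""
    if product.discontinued:
        if product.replacement_url:
            return 301, product.replacement_url  # never coming back: move the equity
        return 410, None  # assumption: nothing sensible to redirect to
    # In stock, or temporarily out of stock: keep the page live and show an
    # out of stock message, related products and an email capture instead.
    return 200, None

print(response_for(Product("/blue-widget.html", False, True, "/widgets/")))
print(response_for(Product("/red-widget.html", False, False)))
```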
Structured data is another fairly easy answer; do it when possible. While it doesn’t always make sense to implement, especially when a site redevelopment isn’t imminent, in the world of ecommerce it has increasingly been a wise move. If it’s location related, review/rating related, product markup, GoodRelations, images, it makes sense to do it.
How far to go with it will depend on the business model, complexity and size of the site and the undertaking (resources) required to implement it. As with all things, it often makes sense to start small, watch (track via analytics) and move further as you go. We always want to establish the value of any approach.
Including the keywords in URL is still a big signal for me especially if they identify a specific product or brand. When I was building custom carts I used a topology like this:
I believe having as few parameters as possible in the URL is best. I like to keep it to a maximum of three and do not use parameter names that contain ID or session. So Product= is awesome, ProductID= is riskier and ID= is riskier still, with session at one point in time blocking indexation entirely. I think pretty much any cart worth using has addressed that 2000ish issue. That said, I have seen cases where keywords in parameters seemed to be very positive. The best practice for any parameter that does not change the page is to use the Google Search Console (Google Webmaster Tools) and block that parameter using the parameters tool. Ecommerce carts that offer multiple ways to display results using parameters can also block these using the parameters tool, reducing the risks of over-indexation and squandered crawl budget.
A strategy for discontinued/out of stock products/SKUs includes not only a 301 and/or SEO strategy; it should also include a user strategy. Unfortunately this, to a large degree, is a “function” of the software chosen. So one best practice is to understand as much as possible about the product SKUs and the manufacturer’s fulfillment. IMO, asking the right questions helps to identify the functions required of the cart you are designing or implementing, so you can avoid problems with:
– each unique color, size, cut/design is a different SKU (supplemental content issue)
– some manufacturers change the product SKU by run, eg: for quality control, guitar companies change the SKU for each run of a product (essentially the same product with a new identifier but the same keyword terms)
– product out of stock
– poor manufacturer fulfillment time for re-stock
IME, from both developing and implementing “out of the box” ecommerce solutions, being able to include related products on “item pages” is by far the best way to manage most of the above. You can skip the 301 until there is more data on what is the best page to 301 it to, if ever. It is better for the user because the related items provide several options (which could be tracked to determine the best page to 301 to), and it also means that you have a static page for those keywords, which may continue to receive 1000’s of queries before users realize those terms are no longer the best keywords for the product.
I think of structured data as Braille for search engines. Just as Braille gives the sight impaired a means to understand the written word, structured data gives search engines a means to understand what is on the page. Structured data should always be:
– as accurate as possible
– written with proper syntax
– implemented with proper itemtype and itemprop
I have seen offer schema add prices to a SERP, and I believe product reviews and events have the most influence in the SERPs. Price is likely why some users are searching, while review stars and event cards draw more attention than standard listings due to pictures and SERP layout.
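To show what accurate, well-formed product markup can look like, here is a minimal Python sketch that builds a schema.org Product with an Offer and an AggregateRating. One caveat: the text above talks about itemtype and itemprop (microdata), whereas this sketch emits JSON-LD, a different syntax for the same vocabulary, chosen here only because it is easy to generate. The function name and the sample values are made up for illustration.

```python
import json

def product_jsonld(name, price, currency, rating_value, review_count):
    """Build a schema.org Product description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }
    return json.dumps(data, indent=2)

# Illustrative values only, not real product data.
snippet = product_jsonld("Blue Widget", 19.99, "USD", 4.5, 12)
print(f'<script type="application/ld+json">\n{snippet}\n</script>')
```

Only emit the rating block when the page really shows those ratings and reviews; marking up data that users cannot see works against the "as accurate as possible" rule.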
The best URL structures are descriptive and provide hints about the content. For example:
- sitename.com/0123456.html vs www.sitename.com/blue-widget.html . The first option doesn’t incentivize the click because it is not clear what to expect, while the second one makes the content clear. IDs in the URLs are fine too, for example; www.sitename.com/blue-widget-01234.html
- Search engines can handle several parameters in the URL fine, but one or two parameters in the URL are best because there will be fewer URL permutations to crawl. For example, sitename.com/category?page=2 is good, but consider www.sitename.com/category?sort=asc&page_size=10, www.sitename.com/category?sort=desc&page_size=20, etc. You provide too many URL combinations to the same content.
- Avoid session ids in your URLs, in particular when they are not URL parameters. For example: sitename.com/category;jsessionid=012345?site=abc
- For out of stock products you have some that won’t ever come back, and some that will come back (they might be seasonal.) 301 redirect products that won’t come back to 1) an alternative product 2) the parent category page. Avoid redirect chains by always redirecting to the final destination. For products that will come back, leave the page there with an out of stock message, and use canonical tags to index an alternative product, or canonical back to the parent category. The page will be removed from the index, but its link juice will be passed to another page.
- Mark up your site with structured data everywhere possible.
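The point in the list above about avoiding redirect chains can be checked mechanically. Here is a minimal Python sketch that takes a redirect map (old URL to new URL) and rewrites every entry to point straight at its final destination. The map format is an assumption; real redirect rules may live in server config or a database.

```python
def flatten_redirects(redirects: dict) -> dict:
    """Point every source URL directly at the final destination of its chain."""
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Keep following the chain until we reach a URL that is not redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat

chain = {
    "/blue-widget-2013.html": "/blue-widget-2014.html",
    "/blue-widget-2014.html": "/blue-widget.html",
}
print(flatten_redirects(chain))
```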
eCommerce SEO Best Practices for Targeting Long Tail Keyword Terms
This one is certainly situational as far as my world is concerned. In some instances (ecommerce wise) a “long tail” term could be a product name and others, a more generic piece of content or a longer named section or category. The main considerations for me are;
- What’s the actual value of the page (to us, the user and the search engine)?
- How deep in the architecture is the page/term in question?
- Does the page make sense as a stand-alone page?
These are the things I would be considering. I’ve never been a huge fan of location spam or even term spam in the form of doorway pages. Sure, they work… but they’re often bad UX and a waste of crawl budget etc. I really don’t have a problem with Google’s current stance on it.
Again, I often prefer to use the content program and architecture to surface the longer tail stuff. If need be, I can also adjust internal link architecture as well if it needs special attention. The query really needs to be worth it though.
The biggest problems I’ve encountered with targeting longtail keywords on ecommerce sites are these. In many cases it creates near-duplicate content, so it is to some extent at risk of violating the new doorway page guidelines. The pages are hard to rank with on-page tactics, and too many deep links to a page is a known Penguin algorithm characteristic. Lastly, there are fewer resources for these sorts of pages, so finding a unique “value add” is much more challenging than with other SEO techniques that target keywords that are not too granular. Techniques that avoid those problems are best.
Instead of targeting keywords too broadly, review SERPs and determine which queries result in answer boxes, Knowledge Graph panels and structured data rich snippets. Hummingbird moved SEOs from “matching keywords” to determining concepts, so longtail keywords are still important; however, techniques must be adjusted to leverage natural language usage and the elements and entity relationships tied to the “concept” of the page.
eCommerce SEO Best Practices For Handling Duplicate Content
I hate em. Whenever possible, start to work on your own descriptions etc. Not only is there some duplication risk, but most of the time the feed descriptions are crap. While duplication isn’t usually something you can get penalized for, the odds of your page having the trust/authority over every other site using them, is generally low.
Start with the products that are the most important (margins, demand etc) and work your way outwards writing new descriptions. It’s worth it at the end of the day.
Is canonicalization to one page a proper workaround for 3 colors of widgets?
Yes… yes and YES. As is the running theme for me so far, good UX is often good for the search engines. Those kinds of things should ideally be handled dynamically on the same page. If you want to feed the search engine? List “available” colours in text somewhere… I prefer a single page that loads those options without a refresh. Again, the user is who we sell to, not the search engines (they never buy anything, right Terry?).
Product feed duplication is often ignored or not given a lot of thought. In all cases where I am using a distributed feed (manufacturer provides same data to all) on the site or I am placing a feed on a 3rd party site like Amazon or a comparison search engine I always make sure:
- that the descriptions on the site are not duplicated
- that the feeds I put on 3rd party sites have unique descriptions
- that larger sites like Amazon also have unique descriptions
Though I may use the same feed on all of the 3rd party sites it is still unique on that site whereas I’ve seen instances where several vendors on a 3rd party site were all using the same manufacturers feed. I like to make and optimize unique feeds on the large sites like eBay and Amazon because they are destinations for buyers and optimizing your Amazon feed can improve results on the site substantially so the ROI improvement is worth it.
Canonicalization, IMO, is a grey area if you believe that search engines have no business telling webmasters how to build sites. For instance, say I have broken an article down into 3 pages but canonicalize all of them to page 1 because I want users to start on page 1, not part way through the article. Since I am paid by impression and this gets more ad impressions, using the canonical seems a reasonable solution.
Many SEOs believe this is contrary to proper usage; however, the HTML guidelines and search engines have not really identified this as inappropriate, and there are lots of good reasons to do this, especially in ecommerce where many pages end up in what we used to call the supplemental index. One could argue you’ve done Google a favor by basically keeping these pages out of the Google index, saving the site’s crawl budget, consolidating links to all pages into one page and, in the process, quite possibly improving the user experience.
I don’t think product feeds cause duplicate content issues unless scraped and repurposed by other sites.
In order to decide if you should canonicalize item variations, ask yourself “are people searching for the variations or navigating to them?” If they search, leave them, and update the meta data to reflect the item variations; if they navigate to the variations, canonicalize.
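That search-versus-navigate question can be turned into a rough rule if you have search demand figures per variation. A minimal sketch, assuming you can get a monthly search count for each variation URL from your keyword tool of choice; the `min_searches` cut-off is a made-up threshold you would tune yourself.

```python
def canonical_for_variations(parent_url, variation_searches, min_searches=1):
    """Map each variation URL to the URL its canonical should point at.

    variation_searches: dict of variation URL -> estimated monthly searches.
    Variations people search for keep a self-referencing canonical;
    the rest are canonicalized to the parent product page.
    """
    canonicals = {}
    for url, searches in variation_searches.items():
        canonicals[url] = url if searches >= min_searches else parent_url
    return canonicals

# Illustrative numbers only.
print(canonical_for_variations(
    "/widget.html",
    {"/widget-blue.html": 300, "/widget-teal.html": 0},
))
```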
eCommerce SEO Best Practices For Handling Thin Content
What makes it thin?
I guess we need to somewhat establish what might be considered “thin” given that ol Google doesn’t really like to give us a straight answer on things. What we do know is that it means content with little or no added value. Some examples they give tend to include;
- Automatically generated content
- Thin affiliate pages
- Content from other sources. For example: Scraped content or low-quality guest blog posts
- Doorway pages
And while they’ve never outwardly stated it, in the ecommerce setting, the manufacturer product feeds discussed above can often be an (indirect) issue as well. If there is little on your site that is unique, that is fully fleshed out, then you certainly run the risk of having thin content. This puts ecom sites with feeds in a position where secondary content is paramount.
Google says add value… how do you accomplish this?
That’s really just an extension of what I was saying above. If you have product feeds, start to rewrite descriptions or add to them and the page they reside on. Do some exact match searches and see what others with that same content have done, that rank above you.
And, in the case of ecommerce, implement a content program with elements such as “How to” and other resource articles. Add some “FAQ” articles. Start a blog. Add links to other related resources. Make some videos… you get the idea. Add value to not only the pages, but the site itself. Enhance and add uniqueness.
I think to a large degree the consensus is that thin content is only about the number of words or amount of content on a page, which to some extent is an indication of low quality or thin content. I take it further and want not only quantity but also content that adds value beyond the information on all the ranking pages. IMO, all things being equal, it is that “added value” that differentiates this content from thin content. That can be in the form of better organization of information; schema, which enables the search engine to better understand the information and its relationship to the webpage entity; and rich media like pictures, slideshares and videos.
In an e-commerce site, the easiest and most scalable way to add content to inventory pages, is to invite customers to write reviews, and ask questions. You can also “bubble up” some key product reviews to the category level.
eCommerce SEO Best Practices For Handling Hidden Content
Tabs and display:none
I decided to put these together as they tend to be related. One should always look at the display of their website template through Google’s eyes (via Webmaster Tools). Is everything (important) being displayed and digested? If it’s not being indexed, it can be a huge issue.
One of the easier secondary tests is to take some text from the tab or other scripted element that doesn’t display on load and do a “site:” search in Google for that text. Is it showing up? Is it indexed? If not, you could have a problem. At this point, although Google seems to be getting better, we’ve seen MANY cases where the ‘tabbed’ text isn’t being indexed. In some cases, the website owner had the actual description in these… not ideal as you might imagine.
It should go without saying that all attempts to cloak content are very risky. Tabs have been a problem since Google began to only index content that is displayed. So a best practice would be to put the most important content in tabs that are visible when the page loads. It is good practice to test pages that use tabs and display:none in CSS in GWT/Search Console using the “Fetch as Google” tool and the rendered option. This will show you the content that will get indexed. A “Tabs” workaround is to use anchored links that point at the tabs that are hidden.
Google and other bots might or might not index the content behind tabs on the same URL (using hash fragments #). You can check your specific case by searching for text inside the hidden tabs. You could have separate URLs for each tab, for example where the tabs hold very important information such as product reviews. But, typically, having all content in a single page from an HTML source perspective is the superior approach.
display:none is only a concern when the pages break from the normal use. Spam fighting algos look for patterns as spam has a distinctive pattern. For example, the text behind a display:none is full of keyword-rich text and keyword anchors.
eCommerce SEO Best Practices For Link Building
Best practices? I will go with “be aware”. Now, that’s not the same thing as “be afraid” which seems to be the call-of-the-day in some circles… There are still plenty of instances where one can “create” links, just not nearly as many safe ones as there used to be. As for truly natural links (if you want to call it that), a strong content program combined with PR and social media, can go a long way.
The other end of “be aware” is more about constantly keeping an eye on your link profile for any strange anomalies that could be an issue down the road. Vigilance is key for me…
Best practices and tips for ecommerce link building:
To be honest, it’s a real tough go as most folks in the space can attest to. Resist the urge to throw a ton of manufactured links at desired product pages. In fact, if you do some research, you’d likely find most of the competitors that rank don’t have a ton to those pages either.
I would focus more on the home page, then pass the equity via internal links and navigation. This is also another great reason to implement some form of a content program and associated promotion as mentioned in the section above.
Has Penguin changed thresholds on deep link building?
Trick question? Google’s Penguin algorithm has changed the game for ALL link “building”. And just when you think you’re safe? They’ll likely change the thresholds again. Be proactive and think ahead to avoid issues down the road. Build and attract links to the home page and section category pages. Build and attract links to the other content program elements on the site. From there you can start to move the equity around the site via internal links as needed.
What is not very well understood is that it is not natural for a site to have a lot of deep links, so it is best if you do not really “build” too many links to deep pages. Instead, promote your site naturally using social media, price comparison engines and coupon sites, which are natural places for deep linking to occur from. SEOs should be very careful about guest posting and using deep links. If you do decide to “build” links, then anchor text from 3rd party sites should be varied, with special attention paid to “exact match” and keyword rich anchor text.
Has Penguin changed thresholds on deep link building?
One of the characteristics that I have found in cleaning up manual penalties and Penguin is that after Penguin there was definitely a lower threshold for “exact match” and keyword rich deep links.
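One rough way to keep an eye on how varied your anchor text is: tally the anchors pointing at a deep page and look at each one’s share. A minimal Python sketch follows; the anchor list would come from whatever backlink export you use, and nothing here implies a known safe percentage, since the actual thresholds are not public.

```python
from collections import Counter

def anchor_text_shares(anchors):
    """Return each distinct anchor text's share of all links, largest first."""
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    total = sum(counts.values())
    return {text: count / total for text, count in counts.most_common()}

# Illustrative anchors for one product page.
shares = anchor_text_shares(["Blue Widgets", "blue widgets", "sitename.com", "click here"])
for text, share in shares.items():
    print(f"{text}: {share:.0%}")
```

If one keyword-rich phrase dominates the list for a deep page, that is the pattern worth a closer look.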
Focus on building “preselling resources”. The most linked pages on most e-commerce sites are resources that help clients and potential clients use your products. For example, one of my clients in the B2B space has a page that offers a chemical compatibility database. It is the most popular page on their site and gets a lot of organic links. It also attracts targeted visitors. This type of link building is future proof.
Has Penguin changed thresholds on deep link building?
You only need to worry about algorithmic thresholds when you are focused on faking the signals, which is not a good long term plan. The thresholds will change and change often.
eCommerce SEO Best Practices For Handling Negative SEO
I touched on this earlier when we discussed link building and “being aware”. This unfortunate tactic has become far more common over the last 5 yrs and is certainly something we currently advise clients have on their monthly monitoring programs. It is one area you don’t want to find out about when it’s too late.
As such, one should be monitoring Google Webmaster tools and even back up with data from other providers such as Majestic and/or Ahrefs. As someone that has a strong background in forensic SEO (cleaning up the mess) I can’t count how many times unsuspecting webmasters have been bombed without even knowing it.
Watch it… monitor it and be proactive in filing disavow reports. One Googler from the web spam team once intimated that while it may not ultimately stop a manual penalty action, getting it revoked will be ever so much easier if you have shown a history of diligence.
Hacked sites that often result in 000’s of pages added to a site with links pointing to it from other hacked sites?
Much like your link profile, you should be monitoring for this as well.
- Watch indexation and crawl levels
- Watch for terms that Google thinks the site is about, that it isn’t. (via Google Search Console)
- Watch analytics for query traffic on terms non-related to the site
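The query checks in the list above can be partly automated. A minimal sketch, assuming you have exported a list of search queries from your analytics or Search Console; the watch list of terms is purely illustrative and should be tailored to the kind of spam you would expect to see.

```python
# Hypothetical watch list; extend it with terms unrelated to your catalogue.
SPAM_TERMS = {"viagra", "casino", "payday"}

def suspicious_queries(queries):
    """Return the queries that contain any term from the watch list."""
    flagged = []
    for query in queries:
        words = set(query.lower().split())
        if words & SPAM_TERMS:
            flagged.append(query)
    return flagged

print(suspicious_queries(["blue widget price", "cheap Viagra online", "widget reviews"]))
```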
One thing that’s a positive, is that Google is getting MUCH better at noticing this kind of activity and will often notify you via Search Console and even let you know when your (popular) software might be out of date.
If it does happen, I’d get it cleaned up and keep strong documentation. If it’s a seriously problematic issue, then consider page removal tools etc… I would simply 404 them, do NOT 301 redirect lol. As with most things, monitor your Google traffic and rankings to ensure it hasn’t already caused some issues by the time you found it.
SEOs need to realize that there are different kinds of negative SEO. Some will just point toxic links at a site, but another way to attack is to erode the crawl budget, often by figuring out that a site does not handle links pointed at pages that do not exist. The person doing negative SEO will then create links to non-existent pages, often removing the link after it is indexed so the site linked to does not realize the problem. However, with time they’ll find 000’s of these URLs that Google continues to try and index because it found a link elsewhere. I have seen this affect the rankings of a site.
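One way to spot this kind of crawl budget erosion is to count how often Googlebot requests pages that return a 404. Here is a minimal Python sketch that scans access log lines; it assumes the common "combined" log format and matches on the user-agent string alone, which can be spoofed, so treat the output as a starting point rather than proof.

```python
import re
from collections import Counter

# Matches the request and status portion of a combined-format log line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def googlebot_404_paths(log_lines):
    """Count 404 responses per path for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits

# Illustrative log line, not real traffic.
sample = [
    '66.249.66.1 - - [10/Oct/2015:13:55:36 +0000] "GET /no-such-page HTTP/1.1" 404 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(googlebot_404_paths(sample).most_common(10))
```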
Hacked sites that use SQL injections and other tactics to put links on pages (often porn and pills) or creating pages (often ranking for a short time) also link to pages on your site from other hacked sites, so, after discovering the exploit be sure to monitor your incoming links and deal with them immediately.
If you’ve made it this far, you’ve probably learned some things or have something to add. We want to hear from you in the comments below or via Twitter @TopHatRank