The video below is of interest to SEOs, webmasters trying to create their own informational websites, and the Indieweb.  The video, featuring Rand Fishkin, is 32 minutes long but packs in a lot of current information.  I agree with Rand through the first two-thirds of the video, where he makes his case.  I disagree with his conclusions in the last third because I’m not an SEO, I don’t have clients trying to sell things, and I’m not trying to sell anything myself.

via The Future of SEO is on the SERP | BrightonSEO 2018 – YouTube

Why this matters to:

  1. The Indieweb: Rand touches on the social network silos and how they are increasingly not linking out; they want to keep your content within their walled gardens.  Google is now doing this too, especially in mobile search.  This is not by accident but by design.  This is why I keep hammering away that Google is one of the bad silos the Indieweb should be concerned about, especially with Google controlling 90% of search traffic.  When the social network silos implode, we will still be left with Google as the Gatekeeper.
  2. To Content Websites and Webmasters: we see in the video that, on the mobile SERP, Google is posting its own information or information scraped from our sites and reused as its own, without providing any click-through links to the originators, e.g. weather, celebrity news, sports, travel and tourism, food and dining via Google Maps, accommodation, etc., and it is growing.  Commercial content websites which rely on ads to pay the bills are not getting many ad impressions if Google borrows their content or otherwise fails to provide click-through traffic.  As Rand points out, the tacit agreement with search engines (I call it the Search Contract) is that in return for providing content and letting search engine crawlers use our bandwidth to index our sites, the search engines supply traffic.
  3. Commerce Websites: This is where Rand and I part ways.  His conclusions are probably realistic if you are trying to market a product, because Google, the social networks and Amazon are all putting you in a squeeze play.  It is the money-making sites that hire SEOs, and good SEOs have to do what is in their clients’ best interest.  In this instance you have to play the game; when your business depends on sales, it is probably not the best time to launch an anti-Google crusade.

Conclusions

Watch the video; you will learn something even if you are not an SEO and don’t care about search engines.  Rand’s presentation and the slides are telling.  Or at least watch the first two-thirds, until he gets to the recommendations for SEOs.

It explains why I think decentralized search is so important for the Indieweb and the general health of the web and why we need guerrilla search solutions.

robots.txt

BTW, Rand mentions one clear solution for content sites early on: if, over time, Google is not sending you traffic, bar Googlebot via robots.txt.  Give Bing and the smaller search engines an exclusive, if they are smart enough to take it.  If Google is not sending traffic, you are not out anything.  I say this as someone who has just launched a web directory.  I don’t know how Google treats web directories anymore, and I guess I will find out.  But if, after a year or two, I’m not getting any traffic or appear to be penalized by Google, I have no problem barring Googlebot from the site.
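
If you ever decide to pull that trigger, the robots.txt itself is trivial.  This is standard robots.txt syntax: name Googlebot and disallow everything, then leave the door open for every other crawler:

    # Keep Google's crawler out of the whole site
    User-agent: Googlebot
    Disallow: /

    # Everyone else may crawl everything
    User-agent: *
    Disallow: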


Searchking.com had two divisions: 1. a search engine, and 2. remotely hosted web directories (think WordPress.com, only for directories) run by different individuals.  This article will deal with the search engine alone.  The Searchking search engine I am going to talk about is not the Searchking directory that exists today, even though that directory uses a similar concept in presentation.

This is all from memory, having used it quite a lot.  I have no insider knowledge of how it worked or was administered.

What was Searchking?

Searchking was a minor search engine born at a time when there were 7 or 8 major search engines, many dozens of minor search engines, and many hundreds of directories.  It would seem very crude by today’s standards, but back in 1997 it was on par with many of the majors and almost all of the minors for search quality.  It was designed by Searchking CEO Bob Massa and coded by Sargeant Hatch.

The reality was that the Searchking search engine was not really a search engine.  It had no crawler, other than maybe a meta tag grabber; my memory is a little cloudy on that.  It was really a flat directory, that is, it had no hierarchy of categories.  All it searched was the Title, Description and Keywords submitted for each page.  I’m sure there was a rudimentary algo, for example giving more weight to a keyword in the title than in the description, but that was it.
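
As a rough illustration only (I have no insider knowledge of the real code; the field names and weights below are my own invention), that kind of rudimentary algo could have been as simple as this sketch:

    # A guess at how a title > description > keywords weighting might look.
    # The field names and the weights are invented for illustration.
    WEIGHTS = {"title": 3, "description": 2, "keywords": 1}

    def score(listing, terms):
        """Score one submitted listing against the searcher's terms."""
        total = 0
        for field, weight in WEIGHTS.items():
            text = listing.get(field, "").lower()
            total += sum(weight for term in terms if term in text)
        return total

    def search(listings, query):
        """Rank listings by the weighted count of matching query terms."""
        terms = query.lower().split()
        scored = [(score(listing, terms), listing) for listing in listings]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [listing for points, listing in scored if points > 0]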

How Did it Work?

For the person searching, it appeared to be a search engine.  There was just a searchbox.  When you searched, you got a SERP with ten results on page one and ten more on page two.  When you clicked through to a site, it was shown to you with a frame at either the top or the bottom.  On the frame you could vote on the quality of the site.  You could also report spam or dismiss the frame.  The quality votes also affected ranking.  The reality was that very few searchers ever voted, which means a key feature of the algo was rarely used.  The spam report was used a bit.  Mostly, people dismissed the frame as quickly as they could.

There was some human review.  I think the admins running the search engine kept a weather eye on the submissions and did some random checks, and they did act on reports.  But remember, the Web was a wide-open, wild frontier when this was built; nobody knew what worked and what didn’t.  Even some of the major search engines, like Infoseek, mainly indexed only meta tags and maybe a little on-page text for that one page.  Hardly a deep crawl.

Submitting your website to Searchking really meant submitting each page by hand.  Again, keep in mind there were no CMSs or blog platforms in general use yet, so most websites were hand-coded in HTML.  Plus, everyone was on dialup, which was slow, so websites tended to be 5 to 25 static pages.  Because of the slowness of dialup, we all tended to keep Titles, Descriptions and Keywords short.  Search, too, was rudimentary everywhere: keyword searches were one or two words on all search engines.  When the Searchking search engine was built in 1997, it was built to work within the limitations of the day.

By the year 2000, Searchking was starting to show its age, but it remained viable.  One feature the Searchking search engine had was instant listing: when you added a page, it went live instantly.  This would play a key role in 2001.

By 2001, “Mighty” Yahoo was still the king of search, while Google was rapidly gaining popularity for its deep spidering and better search results.  The other major search engines were falling behind rapidly or had disappeared.  In those days, it took Google about a month to start listing a new site that had been submitted to it.  Getting a new site to rank in Google was yet another matter: you might be listed, but you might be on page 20 of Google’s SERPs.  Also, Google was still just a web search engine, not a news search engine.  News stories might get in and rank faster, but it still took a week or two before they appeared on Google.

9-11 and the need for Instant News and Information

Then the 9-11 terrorist attacks on the Twin Towers in New York City occurred on September 11, 2001.  Everyone was stunned.  Over on the Searchking directory hosting side, we had a forum community of directory owners.  I think we all spent most of the 11th glued to our televisions and radios, but we were also hitting all the news websites for updates.  We started sharing the URLs of news stories on the forums.  Google had been caught flat-footed; for days there was little useful or current information on it, which they acknowledged and tried to correct, but that was slow.

Over at Searchking, CEO Bob Massa told us in the forums that they were watching people desperately search Searchking for any kind of information about the attacks, survivors and relief efforts.  Because Searchking could list new pages instantly, he asked us to help: to dig out the URLs of news articles, new web pages of survivors, relief news, defense news and background news, and add them to Searchking for those looking for information.  And the SK community responded.  We were searching for any news we could find ourselves anyway, and when TV and radio announcers read out the makeshift URLs of survivor lists, we could jot those down and share them too.  This was most important in those first couple of days after the attack, but people from the UK, Canada, the US, even a person in Greenland, cranked out listings of information for about two weeks straight.  Eventually Google reprogrammed their crawlers and started catching up, as did the mainstream media.  All the fancy high-tech crawlers failed, but little low-tech Searchking actually delivered.

It was probably Searchking search engine’s last best moment.

Why is this Important Today?

There are lessons to be learned from this example that would help make a new directory viable today.

  1. The idea of a flat directory, without drilling down through a hierarchy of categories, fits the way people use search in 2018.  Make the directory look and act like a search engine to the searcher.
  2. Even if you have a hierarchy, you can hide it on another page so you don’t scare people off.  Just present the searchbox front and center to the user.  (Personally, I would still have a hierarchy of categories as “spider food” for Google, Bing and Yandex, but make the searchbox so prominent that users won’t ignore it.)
  3. One might combine a directory with one of the many available open-source web search engine scripts and blend human-reviewed results with crawler results.
  4. Instant listing after a human review is still fast.
  5. Don’t rely on user voting for rankings. People want their information in as few clicks as possible.
  6. Allow for longer Titles, Descriptions and Keywords.  You can better capture what a site is about that way.  We kept these short because of dialup slowness, but that isn’t a problem anymore.

Interested in the directory hosting side of the old SK?  I will have more on that in a later post.

If I were building a search engine…

  1. You need to start building your own index of the web.  That means you need a crawler (robot) and it needs to be good.  It takes time to build an index and it is not cheap.
  2. I would gather in info from other providers: Wikipedia, Wolfram Alpha, whatever I could cobble together.
  3. Until your own index is ready, you need to have search results from a big search engine.  In English there are only two left: Google and Bing.
  4. What happens if Google or Bing refuse to sell you a search feed, or refuse to renew because you are getting too big?  You could probably hammer together a good blended backfill feed by combining Mojeek, Gigablast and Yandex using your own algorithm (a rough sketch follows this list).  You either lead with your own results and use the other three as backfill, or you blend your own results in with the three others in a sort of seamless metasearch.  (You do need to plan for “what if” scenarios.)
  5. All the above is to buy you time while you learn and refine how to crawl the web on your own.
  6. Eventually you roll out your own search index as the backbone of the search results with the others maybe on standby for queries you are weak on.
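
Here is a rough sketch of what that blended backfill might look like, assuming you have already written fetcher functions for your own index and for each provider (the function names, the result format and the simple interleave are all hypothetical; a real merge would score and de-duplicate far more carefully):

    from itertools import zip_longest

    def blended_serp(query, fetch_own, backfill_fetchers, page_size=10):
        """Lead with our own results, then fill the page from blended backfill.

        fetch_own and each entry in backfill_fetchers are callables that take
        a query string and return a list of {"url": ..., "title": ...} dicts,
        i.e. hypothetical wrappers around your own index and the Mojeek,
        Gigablast and Yandex APIs.
        """
        own = fetch_own(query)

        # Interleave the backfill providers so no single one dominates the page.
        backfill = []
        for row in zip_longest(*(fetch(query) for fetch in backfill_fetchers)):
            backfill.extend(r for r in row if r is not None)

        # Our own results lead; backfill fills the rest, de-duplicated by URL.
        seen, merged = set(), []
        for result in own + backfill:
            if result["url"] not in seen:
                seen.add(result["url"])
                merged.append(result)
            if len(merged) == page_size:
                break
        return merged

The design point is simply that your own results always lead and no single backfill provider gets to dominate the rest of the page.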

Is Duckduckgo using Bing as its backbone search provider while it builds its own index?  We know DDG has its own crawler, but we don’t know if it is building its own index or just spam hunting.  I certainly do not know.  If it were me, I would not build a successful search engine user base and then expect somebody else to provide the search feed forever.  Not when there are only two big indexes.  If I were DDG, I would have a Big Plan, plus contingency plans three or four layers deep.  DDG is pretty quiet about all this, which is probably wise.

Qwant seems to be the other player.  Their strategy is more open: they are actively building an index by crawling the web, with Bing providing backfill.  It is not obvious where Qwant’s index ends and Bing’s begins; the results are good and seamless.  With a little luck, marketing and money, Qwant will eventually need to use Bing less and less.  This too is a good plan.

Fortunately, Bing is willing to sell its search feed to just about anyone who can pay the fee.  For now.  I do not think Microsoft really knows what to do with Bing, except to milk it for all the dollars they can until they decide how it fits in with the company strategy.

Search engines are important but they are not as important as they were before the social media silos of Facebook and Twitter.  You don’t need Google to find a company’s Facebook page.

If I were trying to build a search engine today those are some of the things I would try. All this could change in a year.


I have always thought web directories should have backfill from a search engine when you search them, especially general directories.

Yahoo (dubbed “Mighty Yahoo”) was the first directory to add a backfill provider to its search results.  First Alta Vista, then Inktomi, then Google, then Inktomi again provided backfill results for Y!.  The Snap/NBCi directory had backfill, and so did Looksmart.

How backfill worked was that somebody would search the directory, and when the directory ran out of listings that matched the query, they would see search results from the backfill provider.  (Being Mighty Yahoo’s backfill provider cemented Google’s reputation and proved to be one of the biggest mistakes Yahoo ever made.)  What was important was that the search portal needed to provide something in a search that would satisfy the person searching; otherwise they would start going elsewhere.  And they could go elsewhere, because there was genuine competition in web search back then.
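
In code, the idea is not much more than this sketch (the names are hypothetical; the point is that the directory’s own matches always come first and the backfill only fills what is left of the page):

    def directory_serp(query, directory_matches, fetch_backfill, page_size=10):
        """Directory listings come first; backfill only fills the leftover slots."""
        results = list(directory_matches[:page_size])
        if len(results) < page_size:
            needed = page_size - len(results)
            results += fetch_backfill(query)[:needed]
        return results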

It gets harder for niche directories.  For example, if you go to a Star Trek niche directory and search for “uniforms”, it is assumed you only want uniform sites related to Star Trek.  In this instance you don’t want a general backfill giving results for police uniforms or nurses’ uniforms.

For backfill to work on a niche subject directory, you almost need a place to put in an extra search “slug” to bring it into context.  So in our example, you need a place in your admin panel to add “star trek” as a term for the backfill, so that somebody searching for uniforms on your Star Trek directory is automatically searching for “star trek uniforms” on the backfill.  The “star trek” is always added to the backfill query.
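
Continuing the sketch above, the slug is just a term the admin panel stores and the code quietly prepends to every backfill query (again, the names here are hypothetical):

    def backfill_query(user_query, context_slug=""):
        """Quietly prepend the directory's context slug to the searcher's terms
        before the query goes out to the backfill provider."""
        return f"{context_slug} {user_query}".strip()

    # "uniforms" on a Star Trek directory becomes "star trek uniforms"
    # at the backfill provider.
    print(backfill_query("uniforms", context_slug="star trek"))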

It gets even harder for a local or geographic directory.  Frankly I think backfill should not be used with a local directory.

I did some looking around: Google and Bing are way too expensive to get a search feed from.  Smaller engines like Mojeek.com (or Mojeek.co.uk) and Gigablast.com provide search results APIs.  Both are affordable, but Gigablast’s is really, really affordable.

Of course, I don’t know how to actually code this.  That’s what coders are for.  😀

But the trick here is this:

  • The more listings you get in your directory, the more users will use the search function.
  • You have to have a search function.
  • No matter how sophisticated your directory search is, you have a finite number of listings.
  • Mobile devices make the search function even more important.
  • This is why most directories will need backfill.
