Joho the Blog » Edge-based about-ness

Edge-based about-ness

What something is about often is so implicit that it’s precisely the thing that’s not stated. And sometimes a page can’t even know what it’s about: the manual about O-ring maintenance couldn’t know that it would actually be about the Challenger disaster.

So, I wonder how a search engine like Google would do if, when assessing the relevance of a page, it counted the content of pages directly linking to it much higher than the content of the page itself. Aren’t those linking pages more likely to state explicitly what’s on the target page that warrants a link?

Maybe Google’s PageRank algorithm(s) already does that. Anyway, I bet a bunch of people have already studied this extensively and have pre-figured out why I’m wrong.
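To make the idea concrete, here is a toy sketch of what "counting the linking pages much higher" could look like. This is not how Google or PageRank works; the function names, the plain term-frequency scoring, and the `link_weight` value of 5.0 are all assumptions made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency(text, term):
    """Fraction of tokens in `text` equal to the single-word `term`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return Counter(tokens)[term.lower()] / len(tokens)


def edge_weighted_score(term, target_text, linking_texts, link_weight=5.0):
    """Hypothetical relevance score for a target page.

    The page's own content counts once; the average term frequency
    across the pages linking to it counts `link_weight` times as much.
    """
    own = term_frequency(target_text, term)
    if linking_texts:
        inbound = sum(term_frequency(t, term) for t in linking_texts) / len(linking_texts)
    else:
        inbound = 0.0
    return own + link_weight * inbound


# Invented sample data: the manual never mentions Challenger,
# but the pages that link to it do.
manual = "Inspect each O-ring seal for wear and replace it on schedule."
linkers = [
    "This maintenance manual matters for understanding the Challenger disaster.",
    "Background reading on the Challenger accident: the O-ring manual.",
]

print(edge_weighted_score("challenger", manual, []))       # page text alone
print(edge_weighted_score("challenger", manual, linkers))  # with linking pages
```

With only its own text, the manual scores zero for "challenger"; once the linking pages are counted, it scores above zero. The same property is what would make the scheme easy to game, since whoever writes the linking pages controls the score.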


7 Responses to “Edge-based about-ness”

  1. Seems to me that your suggestion would make “gaming” Google even easier than it currently is. All one would need to do is have many links to, say, the JOHO blog on your pr0n page. I agree that it beats comment spam, but it then allows one to self-moderate one’s own page, rather than relying on the judgement of the community that is reflected (if I’m reading all this right) in the current model.

  2. http://poorbuthappy.com/ease/archives/002829.html

    Good Friday afternoon stuff: Joho the Blog: Edge-based about-ness: “What something is about often is so implicit that it’s precisely the thing that’s not stated. And sometimes a page can’t even know what it’s about: the manual about O-ring maintenance…

  3. This is a classic problem in traditional, library-oriented information organization, and one of the reasons why local, community-based classification schemes are easier to handle than the big universal ones like Dewey and Library of Congress. Often, the only way to navigate through the problem is to anticipate what people would use the page FOR; obviously, it’s easier to do that when you’re classifying documents for a distinct group, rather than anyone who could conceivably be interested.

    It gets even tougher when you are dealing with documents by or about minority groups, particularly sexual minorities who are either being allegorical to skate past a censor, or are insisting on the universality of their content, to break into the mainstream. Hence all those phrases, uttered to the press and to reviewers: “It’s not about gay people. It’s about everybody.”

    For that reason, your idea is a good one.

  4. Here is a quote from, and a pointer to, a very interesting thread in the Google News forum at webmasterworld.com

    Read this thread. It’s got it all.

    Latent Semantic Indexing
    http://www.webmasterworld.com/forum3/21115.htm

    Back in 1998 I was publisher of several trade journals. One was for the equine (that’s horses) industry. This new employee with the funny title of “webmaster” (I thought she should wear black leather with that title) took our text and put it on this thing called a “web site.”

    In a meeting she demonstrated this software called a “search engine.” You could find any article easily. Just type in what you’re looking for, and hit return, and ta da! there it was.

    OK! Our most popular story–often reprinted–was a basic feature on colic. So, I typed in colic. Couldn’t find it.

    Turns out, the article never used the word colic except in the headline, which was not indexed. Instead, the author used more informal terms, such as “tying up.”

    This happens more than you think, especially on powerful authority sites that use their own shorthand.

  5. Umm, didn’t you just describe the basic principle of Google-bombing?

    That works by manipulating an external factor of “aboutness” (link anchor text).

  6. “What something is about often is so implicit that it’s precisely the thing that’s not stated. And sometimes a page can’t even know what it’s about…”
    aah, now that’s the prof. W. that I remember.

  7. Seth, yes. I was wondering what would happen if you boosted the weighting of the semantics of linked pages (and not just the anchor text) way over (note the underlined “much”) the semantics of the target page. Maybe Google does that already.
