Joho the Blog » Degrees of RDF

Degrees of RDF

I have an undoubtedly dumb question about the Semantic Web.

Let’s say I want to express in an RDF triple not simply that A relates to B, but the degree of A’s relationship to B. E.g.:

Bill is 85% committed to Mary

The tint of paint called Purple Dawn is 30% red

Frenchie is 75% likely to beat Lefty

Niagara Falls is 80% in Canada

Other than making up a set of 100 different relationships (e.g., “is in 1%,” “is in 2%,” etc.), how can that crucial bit of metadata about the relationship be captured in RDF?

Note that I am not asking for practical reasons. I have a theoretical interest in the topic.

10 Responses to “Degrees of RDF”

  1. You can’t. What you have to do is to introduce an intermediate node with a value. So instead of A-B we have A-C-B where C has a value of the link between A and B.

    eg Bill -> Commitment(85%) -> Mary
    instead of
    Bill ->85%-> Mary
    This turned out to be a bit of a bitch in FOAF.
    Simple case: A Knows B
    Sub-class: A friendOf B
    Intermediate node: A knows KnowsValue knows B
    Alternate:
    A Knows B,
    A knowsvalue(85%) B

    It doesn’t feel right!
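
The intermediate-node idea from the first comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain (subject, predicate, object) tuples rather than real RDF, and every name in it (`hasCommitment`, `towards`, `degree`, `commitment1`) is made up for the example and belongs to no published vocabulary.

```python
# Triples as plain (subject, predicate, object) tuples.
# All identifiers are hypothetical; no real RDF vocabulary is used.
triples = {
    ("bill", "hasCommitment", "commitment1"),
    ("commitment1", "type", "Commitment"),
    ("commitment1", "towards", "mary"),
    ("commitment1", "degree", 0.85),
}


def objects(subject, predicate):
    """Return every o such that (subject, predicate, o) is in the graph."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def commitment_degree(person, target):
    """Walk person -> intermediate node -> target and return the degree.

    Returns None when no intermediate node links the two.
    """
    for node in objects(person, "hasCommitment"):
        if target in objects(node, "towards"):
            degrees = objects(node, "degree")
            if degrees:
                return degrees[0]
    return None


print(commitment_degree("bill", "mary"))
```

The degree lives on the intermediate node, so each statement stays a plain binary relation, at the cost of one extra hop when querying.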

  2. I suspect there are multiple ways, but one is with identifiers for, say, part-of-niagara-falls (ponf) with triples like:

    <ponf-1> <is-part-of> <Niagara Falls>
    <ponf-2> <is-part-of> <Niagara Falls>

    <ponf-1> <is-in> <United States>
    <ponf-1> <is-part-of-whole-to-degree> 20%

    <ponf-2> <is-in> <Canada>
    <ponf-2> <is-part-of-whole-to-degree> 80%

    .micah

  3. :bill :loves :mary .
    :loves :in { :bill :loves :mary } ;
    :hasDegree "85%" .

    Not sure though.

  4. There’s a good document about this kind of concern: Defining N-ary Relations on the Semantic Web (http://www.w3.org/TR/2006/NOTE-swbp-n-aryRelations-20060412/#useCase1)

    It describes how to handle n-ary relations and use additional attributes for a relation.

  5. As you noticed, it’s not something you can express directly in a (useful) statement in RDF. To get the benefit of machines the compromise is that you have to put things in terms that can be handled using easy logic. IANAL, but I believe it’s a computational nightmare to be creative with the relationship.

    But as Julian, Micah and Mike suggest, the same information can be expressed by twisting the sentences around a little – there is something called Commitment involving Bill, Mary and a percentage.

    It’s not a problem of RDF per se; consider how you would (again *usefully*) express the same information for a SQL DB or JavaScript.

    There’s a related write-up:

    http://www.w3.org/TR/swbp-n-aryRelations/

    ….

    Ok, must try one:

    <http://colors.org/pd> a :Tint;
    :name "Purple Dawn";
    :red "0.3"^^xsd:float .

  6. There are some other approaches to this, including using fuzzy set theory and using probabilities. Yun Peng and his students in the UMBC ebiquity lab have been doing some interesting work on integrating Bayesian reasoning and OWL — the probabilistic approach. Here are some recent papers.

  7. Just a quick pointer to a related thread at Shelley Powers’…

    Oh, and a sidebar comment about what do these percentages mean anyway? Does the Purple Dawn paint contain thirty percent of a specific colorizing agent, or does it reflect 30% of the red light when it is dry and on the wall? And how much water flows over Niagara Falls, and is that the measurement you’re looking for when you say 80% or are you looking for some simple geometric assertion? And the fourth time Frenchie assaults Lefty, does he just slap him/her around a little thus not “beat” him/her and how do you quantize a beating anyway?

    Regarding Bill’s commitment to Mary, I suspect that “greatly but not totally committed” describes the condition better than 85% committed.

    All of which goes to say that I like it that the programmers ran upstairs to chop code while the analysts are still assessing the requirements. You get a lot more work done that way.

  8. In general, this problem is similar to how a web of pages isn’t good at representing certain database structures. The web is made of autonomous chunks connecting in lots of autonomous ways.

    But, it’s sometimes harder to represent more strictly / tightly related large chunks on the web.

    (IMHO, RDF is designed for a web of data to have the same freedom of autonomy as the web of pages–so it’s more like the web than like an RDBMS.)

    Your examples could be interpreted as shorthands that imply a bunch of logical relationships. These relationships can be broken up into smaller logical chunks.

    (And, RDBMSs are often designed specifically to efficiently store these bunches of relationships together, e.g., as n-ary relations. RDF is binary relations.)

    So, breaking one example down into smaller chunks:

    “Niagara Falls is 80% in Canada”

    1. Niagara Falls is an area on a map
    2. Canada is an area on a map
    3. US is an area on a map

    4. Canada intersects with Niagara Falls
    5. US intersects with Niagara Falls

    6. [4] is an area on a map
    7. [5] is an area on a map

    8. [1] has dimension 100 square miles
    9. [6] has dimension 80 square miles
    10. [7] has dimension 20 square miles

    This is a lot more complicated than saying “Niagara Falls is 80% in Canada”, but the smaller chunks have two redeeming qualities:

    a. each statement uses the same data structure (i.e., you don’t have to create a new structure every time you want to deal with different kinds of data)

    b. most statements are totally independent of others (e.g., they could be spread out in time or place on the web)

    Of course, you could also just create an RDF-compatible percent-in-Canada vocabulary, e.g.:

    Niagara Falls percent-in-Canada 80

    In other words, if you don’t need things to be flexible or general, you can always be super specific.

    (Then, if someone needed to convert to something different, they could create an ontology that indicates how to map your percent-in-Canada vocabulary to a less specific vocab like the area-based one I suggested above.)
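
As a rough sketch of the mapping described in the eighth comment, the small area statements can be stored as independent binary triples and the "80% in Canada" figure derived from them on demand. Triples are plain Python tuples here, not real RDF, and the predicate names (`is_a`, `part_of`, `within`, `square_miles`) are assumptions invented for the example. The areas are the illustrative numbers from the list above, not real measurements.

```python
# Each statement is an independent (subject, predicate, object) tuple.
# Predicate and node names are hypothetical; the areas come from the
# illustrative figures in the comment (100, 80 and 20 square miles).
triples = [
    ("niagara_falls", "is_a", "map_area"),
    ("canada", "is_a", "map_area"),
    ("us", "is_a", "map_area"),
    ("nf_in_canada", "is_a", "map_area"),
    ("nf_in_us", "is_a", "map_area"),
    ("nf_in_canada", "part_of", "niagara_falls"),
    ("nf_in_canada", "within", "canada"),
    ("nf_in_us", "part_of", "niagara_falls"),
    ("nf_in_us", "within", "us"),
    ("niagara_falls", "square_miles", 100),
    ("nf_in_canada", "square_miles", 80),
    ("nf_in_us", "square_miles", 20),
]


def subjects(predicate, obj):
    """Return the set of s such that (s, predicate, obj) is in the graph."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def area(node):
    """Return the stated area of a node in square miles."""
    return next(o for (s, p, o) in triples if s == node and p == "square_miles")


def percent_in(whole, country):
    """Derive 'whole is N% in country' from the smaller area statements."""
    parts = subjects("part_of", whole) & subjects("within", country)
    return 100 * sum(area(part) for part in parts) / area(whole)


print(percent_in("niagara_falls", "canada"))
```

A function like `percent_in` is roughly what a mapping from the area-based vocabulary to a percent-in-Canada vocabulary would have to compute.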

  9. NF is-partly-in Canada

    (NF is-partly-in Canada) has-degree 80%
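
The statement-about-a-statement shape in the ninth comment resembles what RDF calls reification, where a node of type `rdf:Statement` points at a triple's subject, predicate and object and can then carry extra properties. Below is a minimal Python sketch of that shape using plain tuples. The `rdf:` names are written as ordinary strings, `has-degree` and `stmt1` are hypothetical, and the degree is stored as 0.8 rather than "80%" purely as a choice for the example.

```python
# Plain tuples standing in for RDF triples. "stmt1" is a made-up node that
# describes the triple (NF, is-partly-in, Canada) and carries its degree.
triples = {
    ("NF", "is-partly-in", "Canada"),
    ("stmt1", "rdf:type", "rdf:Statement"),
    ("stmt1", "rdf:subject", "NF"),
    ("stmt1", "rdf:predicate", "is-partly-in"),
    ("stmt1", "rdf:object", "Canada"),
    ("stmt1", "has-degree", 0.8),
}


def degree_of(subject, predicate, obj):
    """Find a statement node describing the given triple; return its degree.

    Returns None when no statement node describes that triple.
    """
    wanted = {
        ("rdf:subject", subject),
        ("rdf:predicate", predicate),
        ("rdf:object", obj),
    }
    statement_nodes = {
        s for (s, p, o) in triples if p == "rdf:type" and o == "rdf:Statement"
    }
    for node in statement_nodes:
        described = {(p, o) for (s, p, o) in triples if s == node}
        if wanted <= described:
            for (p, o) in described:
                if p == "has-degree":
                    return o
    return None


print(degree_of("NF", "is-partly-in", "Canada"))
```

One statement becomes six triples this way, which is one reason the intermediate-node style suggested in other comments is often easier to live with.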

  10. You’ve heard of Microformats, right?

    You may want to drop the group a line

    Web standards that are community driven…;)
