How To Tell Stuff To A Computer

AI and Knowledge

In this chapter we will be studying artificial intelligence and its relationship to knowledge representation. This is one of the shorter chapters in the tutorial, and there are several reasons for this. First of all, one of the truisms of A.I. is that the moment anything in this field becomes useful, it is no longer called A.I. In fact, most of the ideas we have discussed up to now, such as description logics and frame systems, were originally developed within the A.I. community. Another reason this chapter is short is that many ideas in A.I. are not directly related to representing knowledge. The final reason is that much of the technology simply hasn't proven to be very successful (yet).

What do we mean by A.I.?

For the sake of this primer and our focus on KR, we will refer to A.I. as the ability to deal with unexpected knowledge. If we look at frame or DL systems, we can see that these will only work if all the information they operate on has been validated by a human being to be in the correct format and to contain a predetermined vocabulary. That's why these types of systems are more appealing to the Guys in the Garage: starting with data in the correct format makes any use of the information a lot easier. A believer in pure A.I. technologies, on the other hand, would argue that with enough science, it would be possible to take the creations of The Writer and extract from them their true meaning, despite the fact that unstructured text frequently contains unexpected changes of topic, unexpected phrasing, unexpected vocabulary, unexpected nuances of context, and maybe even unexpected typos. The assumption is that knowledge can be represented and extracted in a manner that requires few compromises and doesn't require the software to be "dumbed down".
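To make "correct format and predetermined vocabulary" a bit more concrete, here is a minimal Python sketch of the kind of check a frame-style system might run before accepting a fact. The vocabulary sets and the `validate_fact` function are invented for this illustration and are not taken from any real frame or DL system.

```python
# Hypothetical vocabulary for a toy frame-style system (assumed for illustration).
KNOWN_RELATIONS = {"is", "time"}
KNOWN_TERMS = {"wally", "person", "animal", "shave", "morning", "evening"}


def validate_fact(fact):
    """Return a list of problems; an empty list means the fact is acceptable."""
    if not isinstance(fact, tuple) or len(fact) != 3:
        return ["fact must be a (relation, subject, object) triple"]
    problems = []
    relation, subject, obj = fact
    if relation not in KNOWN_RELATIONS:
        problems.append(f"unknown relation: {relation}")
    for term in (subject, obj):
        if term not in KNOWN_TERMS:
            problems.append(f"unknown term: {term}")
    return problems


# A fact in the expected format and vocabulary is accepted...
print(validate_fact(("is", "wally", "person")))
# ...but unexpected vocabulary or raw text is rejected rather than interpreted.
print(validate_fact(("is", "wally", "schnauzer")))
print(validate_fact("Wally is a schnauzer"))
```

The point of the sketch is the failure mode: the system does nothing clever with input it did not expect, which is exactly the gap a pure A.I. approach hopes to close.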

In the seventies, researchers at many academic centers put a vast amount of effort into bringing this vision into reality. Unfortunately, they remained mostly unsuccessful, which made grants for A.I. projects notoriously difficult to come by in later years. This downturn in A.I.'s fortunes is often referred to as the A.I. winter.

The different types of A.I.

Three major approaches are taken to build knowledge systems with A.I. The least interesting, for our purposes, are those that use strictly probabilistic methods. These include Bayesian systems and Markov-chain-based systems (such as hidden Markov models), commonly used for things like spam filters and automated voice dictation. No real attempt is made in these cases to "understand" the information fed into these systems. Instead, they analyze information superficially to do limited (if often very impressive) things with it. These techniques are very successful at analyzing unstructured information, but are limited in their flexibility.
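To give a feel for how superficial this kind of analysis is, here is a minimal sketch of a naive Bayes spam filter in Python. It is a toy written for this primer, with made-up training messages and function names, and it is not how any particular product works. Notice that it only counts words and never builds any representation of what a message means.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-probability under naive Bayes."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Prior: how common this label was in the training data.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word doesn't zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented training data, just large enough to show the idea.
training = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("lunch meeting at noon", "ham"),
    ("notes from the meeting", "ham"),
]
model = train(training)
print(classify("buy cheap pills", *model))
print(classify("meeting notes", *model))
```

Nothing in `word_counts` could be handed to a frame system or a description logic as knowledge, which is why this family of techniques is of limited interest for KR.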

The second major approach is to use neural networks or similar systems to manipulate information. In common parlance, scientists who work in this area are often referred to as "The Scruffies" because the way their systems represent knowledge lacks any strict formalism. A neural network, for instance, will build ad-hoc linkages between "neurons" based on an input of knowledge and can then respond to queries about the data at a later time. But the ad-hoc linkages are created in a manner that makes it impractical to extract any formal knowledge from them. For this reason, "The Scruffies" are also not very interesting for the science of knowledge representation.
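A very small example shows why formal knowledge is hard to get back out. The sketch below trains a single artificial neuron (a perceptron, about the simplest network there is) to compute logical OR. The function names and training setup are my own choices for illustration.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=10):
    """Classic perceptron rule: nudge the weights after every mistake."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            weights = [w + error * x for w, x in zip(weights, inputs)]
            bias += error
    return weights, bias


# Training data for logical OR.
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(examples)

# The neuron answers queries correctly...
print([predict(weights, bias, inputs) for inputs, _ in examples])
# ...but all it "knows" is stored as a handful of numbers.
print(weights, bias)
```

After training, the neuron gets every case right, yet nowhere in it is there a statement like "the output is true if either input is true". With one neuron you can still squint and read that off the numbers; with thousands of them you cannot.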

Symbolic A.I.

One type of A.I., however, is incredibly interesting for KR researchers: the science of Symbolic A.I. At the core of a symbolic A.I. system lies a highly formal representation system, perhaps a frame system or a description logic (this is where these KR systems first came into use). However, the data it receives from the outside will be largely unstructured. For instance, suppose the unstructured data is as follows:

    Wally got up in the morning, showered, brushed his teeth, and, later, shaved.

The system will study this information and attempt to build hypotheses as to how it can be represented in its internal KR language. A multitude of hypotheses might need to be put forward:

"Wally is a person, not the family's pet schnauzer, because animals usually don't shave"
"The word later probably doesn't mean that he shaved late in the day, since shaving is usually a morning activity"
"Wally probably got up, showered, and brushed his teeth in succession, not simultaneously"

To represent these hypotheses the system breaks them into parts: each hypothesis contains a formal bit of knowledge that can be represented in the internal vocabulary, whereas the rest of the hypothesis is information of a probabilistic nature. So it might, conceivably, be represented as follows:

    (hypothesis (is wally person) 0.9)
    (hypothesis (time shave morning) 0.6)
    (hypothesis (sequential-performed-activities wally (got-up showered brushed-teeth)) 0.8)

...where the last number represents the probability that the hypothesis is true. The system would constantly check its internal KR for inconsistencies, using a logical reasoning engine to deduce possible new facts from existing bits of knowledge. As new information is received, the system would adjust the probabilities of previous hypotheses (and possibly discard them altogether). The system also needs to maintain some type of context at all times. For instance, if it turns out that the information being analyzed is part of a children's story, then it might want to give less weight to the hypothesis that Wally is definitely a Person and not an Animal. After a text has been completely analyzed, the system would have a representation of the information in its internal KR format that it could output or use for other purposes.
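The bookkeeping described above can be sketched in a few lines of Python. This is only a sketch under loud assumptions: the text does not say how a real system adjusts its probabilities, so I have swapped in plain Bayes' rule, and the `Hypothesis` class, the `update` and `prune` functions, the likelihood numbers, and the 0.1 cutoff are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    fact: tuple          # the formal part, e.g. ("is", "wally", "person")
    probability: float   # current belief that the fact is true


def update(hypothesis, likelihood_if_true, likelihood_if_false):
    """Revise one hypothesis with Bayes' rule, given one new piece of evidence."""
    prior = hypothesis.probability
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1.0 - prior)
    hypothesis.probability = numerator / evidence
    return hypothesis


def prune(hypotheses, threshold=0.1):
    """Discard hypotheses whose probability has dropped below the threshold."""
    return [h for h in hypotheses if h.probability >= threshold]


hypotheses = [
    Hypothesis(("is", "wally", "person"), 0.9),
    Hypothesis(("time", "shave", "morning"), 0.6),
]

# New context: the text turns out to be a children's story. The assumption here
# is that such a story is three times as likely if Wally is NOT a person
# (0.6 versus 0.2); both numbers are made up.
update(hypotheses[0], likelihood_if_true=0.2, likelihood_if_false=0.6)
print(round(hypotheses[0].probability, 2))

hypotheses = prune(hypotheses)
print(len(hypotheses))
```

With these made-up numbers the belief that Wally is a person drops from 0.9 to 0.75, so the hypothesis is weakened but survives pruning. A real system would also have to run its reasoning engine over the surviving facts to look for contradictions, which this sketch leaves out entirely.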

...So Why is A.I. so difficult?

Besides the fact that a decent symbolic reasoning system is just ridiculously complex in terms of its architecture to begin with, it has another great problem to deal with: humans, when writing text or generating other information, can draw on a vast amount of "common sense" knowledge to facilitate communication with any potential reader, like "non-human animals usually don't shave themselves". Any A.I. system needs to duplicate this "common sense" to some degree if it wishes to succeed in comprehending most written text. Building such a database is mostly an "all or nothing" proposition. Although such "common sense" can help you deal with unexpected information, there are just so many different ways that unexpected information can be unexpected. And if you can only handle 40% of the cases, then there will always be some software created by The Guy in the Garage that will outperform your system in a manner that is totally unintelligent. Or maybe people will just read what The Writer has written directly, without using your software ahead of time.

[Figure: progress over time in most sciences]

[Figure: progress in A.I.]

Are there any more KR ideas in A.I., like frame systems and description logics, that haven't yet made the big time?

Although most systems in A.I. tend to have frame systems or description logics at heart, some people have been experimenting with more exotic ways of representing knowledge that have not yet been used much outside of the field of A.I. The most interesting of these try to address the way humans use metaphors and analogies to express ideas. I have not yet seen any example of how such concepts could be practically represented in a description logic. See the book Fluid Concepts and Creative Analogies. I would love to see a system that could do this!
