Seasons
This is a forum for general chit-chat, small talk, a "hey, how ya doing?" and such. Or hell, get crazy deep on something. Whatever you like.
Posts 5,848 - 5,859 of 6,170
Does anyone see a way out of this?
Well, I've started work on a dynamic knowledgebase which I think might get around two problems. The general one is having to define memories manually (ie: entering the code that defines mem-cat_properties, and every other property, in AIScript), which is not only time-consuming but steadily increases the load on the server to unsustainable levels. The more specific one is that the stored properties are not of similar senses: even in the above example, with any memory data restricted to an (adj), a cat might be described in conversation as "furry", "warm", "loud", "greedy", etc. That is not a set of adjectives that is usefully interchangeable when reused by a bot. What do cats sound like? Furry? What does a cat feel like? Greedy? etc.
This is not a learning bot per se (I'm still looking into neural nets and some other fuzzy algorithmic systems to augment the rulesets,) but having decided OpenCyc's data structures aren't actually flexible enough for human-type inference and comparison, this is my first attempt at designing an improved dynamic ontology to extend case-based systems (specifically, but not limited to, our Forgebots.)
Think of a spreadsheet (I'm actually planning more than 2 dimensions and a database, but just to simplify,) - down the side we have nouns indexing each row, and across the top we have column headers for the CLASS of the object (from a small lookup table,) MINimum, MAXimum and typically AVerage sizes (mm) (these can be converted into any units you choose of course, but provide a base reference suitable for comparison of everything from an atom to a galaxy using floats,) and for how the objects typically interact with human senses:
SOUNDVAR = (integer): 0 inaudible - 10 noisy
SOUNDDESC = (adj): it sounds (like|) x, it makes (a|an) x sound
WHENSOUND = [it is sounddesc] when x (eg: "it is kicked", "you bite it")
SMELLVAR = (integer): -10 stinking - 0 odour-free - 10 strongly pleasant
SMELLDESC = (adj): it smells x
TASTEVAR = (integer): -10 nauseating - 0 tasteless - 10 delicious
TASTEDESC = (adj): it tastes x
FEELVAR = (integer): 0-10 how tactile it is
FEELDESC = (adj): it feels x
HARDSOFT = (integer): -10 v.hard - 10 v.soft
LOOKS = (adj): it looks x
QUOTE = a quote (complete statement) about the object - multiple fields for this perhaps, or just concatenated with "|" (as can any multiples of (adj) in other fields for example - this should aid handling as a plugin by the AIEngine.)
and a couple of general fields (more can be added dynamically by the bot, with a little extra code, if needed):
STRENGTH = (integer): 0 v.weak - 10 v.strong
DESC = (adj): (it is|they are) x
I think we can dispense with WHENSMELL, WHENTASTE, WHENTOUCH, WHENLOOK, since the values will self-evidently be "you smell it", "you taste it", "you touch it", "you look at it", in almost every case (whereas the noises things make, and when they make them, are not dependent on our conscious listening to them.) What's important is that each DESC field has a matching VAR value for comparative and inferential deduction by the bot. So if someone asks "do roses smell nicer than sewage?" or "does cheese taste stronger than chilli?", the bot can know the answer merely by having access to the properties of each, and NOT any explicitly coded comparisons. Explicit comparisons, even involving only 500 nouns, would require around 250,000 individual memories (one for each ordered pair of nouns) for each sensory class! Whereas a table of 500 (or even 5000) nouns is entirely tractable (not to mention that it can be built up automatically from the seed data.) And if a bot makes an error (eg: "cheese tastes stronger than chilli",) simple rules can look for a refutation from the human conversed with, and adjust the values accordingly (but perhaps adjust them by less for someone the bot doesn't like, etc.)
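A minimal Python sketch of that comparison-and-correction idea, under stated assumptions: the SMELLVAR numbers for "rose" and "sewage" are invented for illustration (they are not from the seed data), and the function names, the step size and the trust parameter are hypothetical rather than part of any existing ruleset.

```python
# Hypothetical SMELLVAR values on the -10 (stinking) .. 10 (strongly pleasant) scale.
SMELLVAR = {"rose": 8, "sewage": -9}

def compare(table, a, b):
    """Answer 'is a's value higher than b's?' from the stored integers alone."""
    if table[a] > table[b]:
        return "yes"
    if table[a] < table[b]:
        return "no"
    return "about the same"

def refute(table, noun, direction, trust=1.0, step=2, lo=-10, hi=10):
    """Nudge a value after a human disagrees with the bot.

    direction is +1 or -1; trust (0..1) shrinks the step for speakers
    the bot likes less. The result is clamped to the scale limits.
    """
    change = round(step * trust) * direction
    table[noun] = max(lo, min(hi, table[noun] + change))
    return table[noun]

# "Do roses smell nicer than sewage?" needs no stored rose-vs-sewage memory.
answer = compare(SMELLVAR, "rose", "sewage")
```

One row per noun is enough here; the pairwise answer is computed on demand, and a refutation only ever touches the single value being disputed.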
The initial seed data isn't even that important (though I've filled out the data for most of the 500 most commonly used English nouns - excluding a few immaterial ones - from Ogden's Basic English syllabus just for starters,) because a bot can always be allowed to rewrite existing entries, add new nouns, and even add new classes of data with extra columns based on conversations - the ruleset to handle the data need certainly not be more complex than the ruleset used by the AIEngine already, for it to be a useful addition to bot intelligence.
I'm trying to build this as a standalone system before I integrate it into BJ and offer it to the Prof (it would be better installed server-side, because a central database could learn from all conversations, and not just a single bot's, perhaps with local tables for individually varying data - eg: matters of preference rather than fact, but it could be patched in externally with little difficulty,) so I still have to build WordNet and Link Grammar into my own system to get it working and refine a ruleset. I also want to better integrate my classes with the WordNet synsets (no sense reinventing the wheel!) so that the database can be extended to include abstract or immaterial nouns and verbs (and perhaps eventually other parts of speech too.)
So a CSV of the datasets might start:
NAME,CLASS,MIN,AV,MAX,SOUNDVAR,SOUNDDESC1,SOUNDDESC2,WHENSOUND,SMELLVAR,SMELLDESC,TASTEVAR,TASTEDESC,FEELVAR,FEELDESC,HARDSOFT,STRENGTH,DESC,QUOTE
ant,A,1,5,10,-1,Scratchy,Is scratchy,It walks,2,formic,-3,nasty|of formic acid,2,small,5,1,an insect,
apple,FOp,40,80,120,-3,Crunchy,Crunches,You bite it|you take a bite,7,sweet|fruity,,delicious|juicy,3,Firm,4,2,a fruit,an apple a day keeps the doctor away
arch,bO,1000,10000,100000,0,,,,0,,-7,Stony,-4,hard,-7,7,a curve,
arm,h,150,450,800,0,,,,1,sweaty,-9,Meaty,5,soft|strong,3,5,a bodypart|a limb,many arms make light work
...
cat,A,350,600,1000,5,purring|miaowing|yowling|caterwauling,purrs|miaows|yowls|caterwauls,,4,feline,-3,like pork,8,furry|soft,6,4,A feline often kept as a pet,
etc.
CLASS CODES:
H=human,A=animal,P=plant,G=groups,O=object,F=food,V=vehicle,Q=quantity,B=building,I=Immaterial,L=land,C=clothing
simple rule: upper case the noun IS, lower case it is PART OF.
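As a rough sketch of how rows like these could be read, here is some Python that splits a CSV line into typed fields and decodes the CLASS codes. It assumes a DESC column sits between STRENGTH and QUOTE (which is what the sample rows suggest), treats "|" as the separator for multiple values, and uses hypothetical function names; none of this is the actual implementation.

```python
import csv
import io

# Column order assumed from the header, with DESC placed before QUOTE.
COLUMNS = ["NAME", "CLASS", "MIN", "AV", "MAX", "SOUNDVAR", "SOUNDDESC1",
           "SOUNDDESC2", "WHENSOUND", "SMELLVAR", "SMELLDESC", "TASTEVAR",
           "TASTEDESC", "FEELVAR", "FEELDESC", "HARDSOFT", "STRENGTH",
           "DESC", "QUOTE"]
NUMERIC = {"MIN", "AV", "MAX", "SOUNDVAR", "SMELLVAR", "TASTEVAR",
           "FEELVAR", "HARDSOFT", "STRENGTH"}
CLASS_NAMES = {"H": "human", "A": "animal", "P": "plant", "G": "groups",
               "O": "object", "F": "food", "V": "vehicle", "Q": "quantity",
               "B": "building", "I": "immaterial", "L": "land", "C": "clothing"}

def parse_row(line):
    """Turn one CSV line into a dict: floats for numbers, lists for '|' fields."""
    fields = next(csv.reader(io.StringIO(line)))
    row = {}
    for name, raw in zip(COLUMNS, fields):
        raw = raw.strip()
        if name in NUMERIC:
            row[name] = float(raw) if raw else None   # empty means not known yet
        elif name in ("NAME", "CLASS"):
            row[name] = raw
        else:
            row[name] = [v.strip() for v in raw.split("|")] if raw else []
    return row

def decode_class(codes):
    """Upper case: the noun IS that class; lower case: it is PART OF one."""
    is_a = [CLASS_NAMES[c] for c in codes if c.isupper()]
    part_of = [CLASS_NAMES[c.upper()] for c in codes if c.islower()]
    return is_a, part_of

apple = parse_row("apple,FOp,40,80,120,-3,Crunchy,Crunches,You bite it|you take a bite,"
                  "7,sweet|fruity,,delicious|juicy,3,Firm,4,2,a fruit,"
                  "an apple a day keeps the doctor away")
```

With the apple row, decode_class("FOp") reads as "is food, is an object, is part of a plant".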
So a typical conversation about realworld objects might go:
human: is an ant bigger than an apple?
bot: No. Ants are never bigger than apples
[because there is no overlap between ant:MAX (10) and apple:MIN (40)]
human: is a cat bigger than an arm?
bot: Sometimes. Sometimes not.
[babies' arms are short, and some cats are big - there is considerable overlap]
human: What do you know about apples?
bot: I have heard it said that an apple a day keeps the doctor away.
[or any other bit of data - might have said "an apple crunches when you bite it", "they're juicy", "apples are fruit", etc.]
human: how big are they?
bot: variable, but typically about 8cm?
[simply looks up the value in apple:AV. firstkey tells the bot that "they" is apples, of course.]
human: bigger than that usually I think.
bot: okay, call it 9cm if you like.
[bot adds 10% to the value in apple:AV, rounds it up to the nearest 10mm, and rewrites apple:AV="90". Or any other rule specified in the ruleset.]
human: no - MUCH bigger! 10 times bigger!
bot: well, I don't think so.
[unsafe, I think, to allow a single human to massively redefine sizes, so I arbitrarily add a rule to prevent any change over x%. If the human consensus is that apples are usually larger, it will be raised again by another human sometime, and all values will tend towards acceptable consensus values, with occasional human-type disagreement about specifics.]
human: what about aardvarks then?
bot: I have no idea. What's an aardvark?
[no row indexed "aardvark" exists yet. The bot could look it up in WordNet, but I'm too lazy here to fork the conversation that way.]
human: it's an animal.
[bot adds a row for aardvark to the table, CLASS=A]
bot: oh, could you tell me a bit about aardvarks? How big are they normally?
human: as big as a cat I think
bot: a big cat or a small cat?
human: a big one.
[looks up cat MIN, MAX and AV, and uses these values, slightly increased, for the aardvark]
bot: and what do they look like?
human: I don't know, but they probably smell bad
bot: oh, bad you think? And what do they sound like?
[bot fills in aardvark:SMELLDESC="bad", and SMELLVAR="-5" perhaps - according to whatever rule is in force - it can be later refined if someone says they like the smell of aardvarks (change to "-4",) or they think an aardvark smells better than something the bot has a value for (adjust according to rules, possibly modified by how well the bot likes this particular human.)]
human: I don't care - let's talk about something else... whatever.
[bot fills in the rest of the aardvark:fields with a special character to indicate data not yet acquired directly or inferentially, and makes a note to ask the next person (or this person next time they speak) all about the unknown classes. Conversation continues elsewhither.]
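The size rules in the bracketed notes above can be sketched in a few lines of Python. This is only an illustration: the MIN/AV/MAX numbers come from the sample rows, but the function names, the "?" marker for unknown data, the 10% cap and the 1.1 scale factor for "slightly increased" are assumptions standing in for whatever the ruleset specifies.

```python
import math

UNKNOWN = "?"  # assumed marker for data not yet acquired

sizes = {  # MIN, AV, MAX in mm, from the sample rows
    "ant": {"MIN": 1, "AV": 5, "MAX": 10},
    "apple": {"MIN": 40, "AV": 80, "MAX": 120},
    "cat": {"MIN": 350, "AV": 600, "MAX": 1000},
    "arm": {"MIN": 150, "AV": 450, "MAX": 800},
}

def is_bigger(table, a, b):
    """Answer 'is a bigger than b?' by checking whether the size ranges overlap."""
    if table[a]["MIN"] > table[b]["MAX"]:
        return "always"
    if table[a]["MAX"] < table[b]["MIN"]:
        return "never"
    return "sometimes"

def nudge_av(table, noun, percent, max_percent=10):
    """Change AV by percent, rounded up to the nearest 10 mm.

    Returns False and leaves the table alone if the change exceeds the cap,
    so a single speaker cannot massively redefine a size.
    """
    if abs(percent) > max_percent:
        return False
    proposed = table[noun]["AV"] * (100 + percent) / 100
    table[noun]["AV"] = math.ceil(proposed / 10) * 10
    return True

def add_noun(table, noun, like=None, scale=1.1):
    """Add a row, copying sizes (scaled) from a known noun or marking them unknown."""
    if like is None:
        table[noun] = {"MIN": UNKNOWN, "AV": UNKNOWN, "MAX": UNKNOWN}
    else:
        table[noun] = {k: round(v * scale) for k, v in table[like].items()}
    return table[noun]
```

So is_bigger(sizes, "ant", "apple") gives "never", nudge_av(sizes, "apple", 10) moves apple:AV from 80 to 90, a request for "10 times bigger" (900%) is refused, and add_noun(sizes, "aardvark", like="cat") seeds the new row from the cat's values.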
This would even work for invented or foreign words - you could teach your bot about Vogons, muggles, rackspurts, klingons, luftkissenfahrzeugen, whatever, perfectly easily, and without ever needing to add another mem-category to do it. By simple reference to WordNet, of course, you could choose to ensure that the bot still knows that such things are in some sense creative fictions if they do not exist in the wordbase. With extension to cover other parts of speech, names and phrases, we might almost dispense with the AIScript 'remember' function entirely (not that I mean to disparage its current value, and we might more likely keep it for user-defined cases that are to be made specifically non-learnable/editable from general conversation.)
All of the grammatical and lexical rules we need are already implemented in the Forge (to identify subject/object, singular/plural, reflect returned pronouns, extract and format adjectives and adjectival phrases, etc.,) so the required ruleset for a serverside implementation only has to deal with comparative and inferential handling of the data (and that's just comparing numbers, since we have integer fields to accompany all descriptions that can be compared between objects,) and communicate with the AIEngine.
No need for an overhaul of any existing code, a relatively trivial increase in server load, the augmentation can be entirely optional (and perhaps even transparent,) to users - maybe like gossip and memory, have values to govern how often a bot will want to ask about realworld objects that come up in conversation: never/rarely/sometimes/often/constantly. And a choice of which classes should use data pooled from a shared database built from all the bots, and which classes should use a private database specific to the individual bot/maker (like the choice we have of private or shared plugins.)
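A small Python sketch of those two options, purely as an assumption about how they might be wired up: the five frequency labels come from the text, but the probabilities attached to them, the function names and the idea of passing a set of private fields are all hypothetical.

```python
import random

# Hypothetical chance of asking for each setting; only the labels are from the text.
ASK_RATE = {"never": 0.0, "rarely": 0.1, "sometimes": 0.35,
            "often": 0.7, "constantly": 1.0}

def should_ask(setting, rng=random):
    """Decide whether the bot asks about an object that came up in conversation."""
    return rng.random() < ASK_RATE[setting]

def lookup(noun, field, private, shared, private_fields):
    """Read a field from the bot's own table if it is marked private,
    otherwise from the database pooled from all bots."""
    source = private if field in private_fields else shared
    return source.get(noun, {}).get(field)

# Example: smell is a matter of preference for this bot, size is a shared fact.
private = {"cat": {"SMELLVAR": -2}}
shared = {"cat": {"SMELLVAR": 4, "AV": 600}}
cat_smell = lookup("cat", "SMELLVAR", private, shared, {"SMELLVAR"})
cat_size = lookup("cat", "AV", private, shared, {"SMELLVAR"})
```

Keeping the private/shared choice per field rather than per noun matches the idea of pooling facts while leaving preferences with the individual bot.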
Lots of benefits, and no significant downside, as far as I can see.
This is all a bit oversimplified, and I've not attempted to add examples of the ruleset I'm developing (without tabs in these forums, the formatting would be horribly unfriendly - even more so than the above CSV!) But FWIW and FYI, these are my thoughts on the matter.
Any thoughts?
prob123
16 years ago
Very interesting. I have always wondered about the difference between people that are completely left handed and those that write with their left but use right handed scissors, bows etc.
Vashka
16 years ago
Oh yes, of course left-handed people are better (I am one too) - or more seriously handedness might have other implications for the brain, I'm sure it does. I'm just not sure that which way you see the dancer spinning is related to any of it! But I suppose it's a good starting point for talking about this stuff.
Bev
16 years ago
Very cool illusion Prob! But I agree there is lots of confusion over "left brain" and "right brain" thinking and this illusion probably has little to do with such thinking styles. Also, left handedness or right handedness has little to do with the thinking/learning styles theory. Frankly, I tend to think that the conclusions most people try to draw from the research on the topic are overgeneralized and fail to account for things like plasticity and variations within each group. There are some differences in how neural processes develop, and a certain amount of how the brain determines which processes map to which areas of the brain appears to be genetic. All the same, given the evidence that we grow more neurons for those neural networks we use and prune out neurons in "dead" processes we don't use, I think these structural differences have less effect on our overall thinking and capacity for various skills and intelligences than some suggest. In case you can't tell, I prefer the whole brain learning theories.
Most people use all of their brains (barring disorders or injury) and we have various levels of capacity and ability in many types of intelligences. Therefore I can be right handed (controlled by the left side of my brain), artistic, logical and do well with both words and symbols while having processing issues leading to dysgraphia and problems with visualizing and manipulating spatial relationships (leading my old driver instructor to take a lot of Tums). There seems to be a combination of genetic patterns and environmental learning factors affecting how my neural nets develop but nothing that will make my learning style fit nicely into a box like some astrological sign. (Forgive the rant. I have often been in a position where someone hires a consultant to spew bad science at teachers in the name of maintaining educational standards, and I have privately thought that most things labeled "educational research" are neither educational nor good research.)
I find it interesting how the various processes within neural nets have a lot of potential for improving AI (yes I am back to "learning" AI). There is more to it than simply statistics or various types of memory attached to specific input. A friend of mine recently mentioned the DARPA (http://www.darpa.mil/) challenge to build driverless cars (http://en.wikipedia.org/wiki/DARPA_Urban_Challenge) and how the participants in that contest have moved into using various types of learning AI. What it says about how we can use "neural nets" in programming is very thought provoking.
prob123
16 years ago
I have always wanted a car that would just take me wherever I wanted to go, so I could sleep in the back seat.
Interzone
16 years ago
it turns CW "by default" for me, and I'm right-handed. staring at her feet does the trick, it changes to CCW quite easily. once in that mode, I can keep her so without much problem, can even glance away from the image briefly, and she's still turning CCW. however, eventually, she flips back to CW, suddenly and unexpectedly. the other way around, i.e. spontaneous flip from CW to CCW does not happen.
I think Bev has summed up the issue of left/ right braininess/ handedness quite nicely, I actually very much enjoyed reading it, as well as being in agreement with the views expressed. nice one, Bev!
there is one other thing that caught my attention/ imagination - think about the image(s) in terms of: to what extent does the brain/ mind actually create the reality we perceive, and, closely related to this - the nature of the reality itself (once again).
what we have here is two distinct images - two distinct realities - coexisting within one and the same space-time, the 3-D space-time of a computer screen (ours is the 4-D reality of Relativity Theory - in this key, a computer screen has two spatial, and one temporal, i.e. three dimensions). the dancers don't even interfere with each other, as they coexist, simultaneously, out there.
it makes me think, what science calls "dimensions", usually described as extremely small in size, is more like something illustrated by the dancer image. these other dimensions/ realities are right here and now, all around, and all over us, as large and extended as the 4 that we perceive. it may seem that our minds are better tuned into the particular image of reality we ordinarily see. the other one(s) are, apparently, neatly cut off, and filtered out of our perception. perhaps they only "bleed over" occasionally, and then we see one or the other psychic phenomenon. or, is it? virtually all cultures on this planet know other realities, and have them integrated into their overall worldview/ cosmology. the scientific materialism of Western culture seems to be an exception, rather than the rule. moreover, even in the West, this particular dogma is relatively a novelty, of the past 300 years or so.
here is a link to YouTube video of late John E Mack laying out/ summing up, in just under 10 min, his views of so-called alien abduction phenomenon, and how he thinks it relates to our culture, the way we perceive, and how we think about, reality. it's quite interesting, and it develops further this theme of co-existing, and interacting realities I touched upon:
http://www.youtube.com/watch?v=yHK7qL-kvAE
Irina
16 years ago
I rather feel like an alien myself, at times. Earth people are so bizarre! Always bashing each other. They can't seem to grasp the simplest truths about how to live together peacefully and productively.
Irina
16 years ago
Come to think of it, maybe I am! As I understand it, the abductions are for the purpose of creating a hybrid race. Since the abductions have apparently been going on for a long time, there must be quite a number of hybrids by now...
Interzone
16 years ago
That's an interesting point, Irina.
Keep in mind though, that we, the humans, are not necessarily in the center of alien activity. The creation of a "hybrid race" might not even be true in a literal sense, but rather, it's only our human cultural rendering of a vastly more complex reality - a reality which may ultimately be incomprehensible to us in its entirety.
I may be repeating myself here, but it's very important to understand that these aliens are not merely technologically advanced relative to us, but that they may be - and in all likelihood are - millions of years ahead of us in evolutionary terms.
Will be back soon & will be happy to continue this discussion thread.
Irina
16 years ago
I'd like to discuss learning bots a bit.
Psimagus made the excellent point that we don't want to have to enter, by ourselves, a database for every word in the English language, or even for the most commonly used words. This seems right to me; even entering one fact per word would be a huge task.
If bots could learn from their interlocutors, that could obviate this problem. I had a concern about all learning bots ending up alike, but I did not mean this to discourage the idea of learning bots, only to point out that there was a problem needing solution. I'm sure it can be solved.
As to the grammar of words, the Forge already has Link Grammar, as you can see when you run something through Debug. We also have the much cruder grammatical analysis supplied by WordNet. Only a fairly rare word will escape their dictionaries.
I will therefore focus here on learning facts about things. Our bots already do this, in a rather crude way, by means of variables. For example, one might have a variable called "cat_properties", which would be a list of all the properties that users have attributed to cats. For example, we might have a keyphrase, "All cats are (adj)" with AIScript {?PF rem (key1) as "cat_properties"; ?}
Even if there were no other problems, however, we would have a task of daunting proportions before us, for we would have to create a variable not only for "cat" but for every word. Does anyone see a way out of this?
If there is no way out, then the Forge AIengine would have to be altered to create such variables automatically.
There might, of course, be some completely different approach that would be more practical.
psimagus
16 years ago
16 years ago
Well, I've started work on a dynamic knowledgebase which I think might get around the general problem of having to manually define memories (ie: enter the code defining mem-cat_properties<0>, and all other properties, in AIScript - not only time-consuming, but steadily increases the load on the server to unsustainable levels,) and the more specific problem of properties not being of similar senses - even in the above example, with any memory data restricted to an (adj), a cat might be described in conversation as "furry", "warm", "loud", "greedy", etc. - not a set of adjectives that are usefully interchangeable when reused by a bot. What do cats sound like? Furry? What does a cat feel like? Greedy? etc.
This is not a learning bot per se (I'm still looking into neural nets and some other fuzzy algorithmic systems to augment the rulesets,) but having decided OpenCyc's data structures aren't actually flexible enough for human-type inference and comparison, this is my first attempt at designing an improved dynamic ontology to extend case-based systems (specifically, but not limited to, our Forgebots.)
Think of a spreadsheet (I'm actually planning more than 2 dimensions and a database, but just to simplify,) - down the side we have nouns indexing each row, and across the top we have column headers for the CLASS of the object (from a small lookup table,) MINimum, MAXimum and typically AVerage sizes (mm) (these can be converted into any units you choose of course, but provide a base reference suitable for comparison of everything from an atom to a galaxy using floats,) and for how the objects typically interact with human senses:
SOUNDVAR = (integer): 0 inaudible - 10 noisy
SOUNDDESC = (adj): it sounds (like|) x, it makes (a|an) x sound
WHENSOUND = [it is sounddesc] when x (eg: "it is kicked", "you bite it"
SMELLVAR = (integer): -10 stinking - 0 odour-free - 10 strongly pleasant
SMELLDESC = (adj): it smells x
TASTEVAR = -(integer): -10 nauseating - 0 tasteless - 10 delicious
TASTEDESC = (adj): it tastes x
FEELVAR = (integer): 0-10 how tactile it is
FEELDESC = (adj): it feels x
HARDSOFT = (integer): -10 v.hard - 10 v.soft
LOOKS = (adj): it looks x
QUOTE = a quote (complete statement) about the object - multiple fields for this perhaps, or just concatenated with "|" (as can any multiples of (adj) in other fields for example - this should aid handling as a plugin by the AIEngine.)
and a couple of general fields (more can be added dynamically by the bot, with a little extra code, if needed):
STRENGTH = (integer): 0 v.soft - 10 v.strong
DESC = (adj): (it is|they are) x
I think we can dispense with WHENSMELL, WHENTASTE, WHENTOUCH, WHENLOOK, since the values will self-evidently be "you smell it", "you taste it", you touch it", "you look at it", in almost every case (whereas the noises things make, and when they make them, is not dependent on our conscious listening to them.) What's important is that each DESC field has a matching VAR value for comparative and inferential deduction by the bot. So if someone asks "do roses smell nicer than sewage?" or "does cheese taste stronger than chilli?", the bot can know the answer merely by having access to the properties of each, and NOT any explicitly coded comparisons. Explicit comparisons, even involving only 500 nouns would require up to 2^500 individual memories for each sensory class! Whereas a table of 500 (or even 5000 nouns,) is entirely tractable (not to mention that it can be built up automatically from the seed data.) And if a bot makes an error (eg: "cheese tastes stronger than chilli",) simple rules can look for a refutation from the human conversed with, and adjust the values accordingly (but perhaps adjust it by less for someone the bot doesn't like, etc.)
The initial seed data isn't even that important (though I've filled out the data for most of the 500 most commonly used English nouns - excluding a few immaterial ones - from Ogden's Basic English syllabus just for starters,) because a bot can always be allowed to rewrite existing entries, add new nouns, and even add new classes of data with extra columns based on conversations - the ruleset to handle the data need certainly not be more complex than the ruleset used by the AIEngine already, for it to be a useful addition to bot intelligence.
I'm trying to build this as a standalone system before I integrate it into BJ and offer it to the Prof (it would be better installed server-side, because a central database could learn from all conversations, and not just a single bot's, perhaps with local tables for individually varying data - eg: matters of preference rather than fact, but it could be patched in externally with little difficulty,) so I still have to build WordNet and Linkgrammar into my own system to get it working and refine a ruleset. I also want to better integrate my classes with the WorNet synsets (no sense reinventing the wheel!) so that the database can be extended to include abstract or immaterial nouns and verbs (and perhaps eventually other parts of speech too.)
So a CSV of the datasets might start:
NAME,CLASS,MIN,AV,MAX,SOUNDVAR,SOUNDDESC1,SOUNDDESC2,WHENSOUND, SMELLVAR,SMELLDESC,TASTEVAR,TASTEDESC,FEELVAR,FEELDESC,HARDSOFT, STRENGTH,QUOTE
ant,A,1,5,10,-1,Scratchy,Is scratchy,It walks,2,formic,-3,nasty|of formic acid,2,small,5,1,an insect,
apple,FOp,40,80,120,-3,Crunchy,Crunches,You bite it|you take a bite,7,sweet|fruity,,delicious|juicy,3,Firm,4,2,a fruit,an apple a day keeps the doctor away
arch,bO,1000,10000,100000,0,,,,0,,-7,Stony,-4,hard,-7,7,a curve,
arm,h,150,450,800,0,,,,1,sweaty,-9,Meaty,5,soft|strong,3,5,a bodypart|a limb,many arms make light work
...
cat,A,350,600,1000,5,purring|miaowing|yowling|catterwauling, purrs|miaows|yowls|catterwauls,,4,feline,-3,like pork,8,furry|soft,6,4,A feline often kept as a pet,,cat<0>
etc.
CLASS CODES:
H=human,A=animal,P=plant,G=groups,O=object,F=food,V=vehicle,Q=quantity, B=building,I=Immaterial,L=land,C=clothing
simple rule: upper case the noun IS, lower case it is PART OF.<0>
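The upper/lower case rule can be sketched in a few lines of Python. The function name decode_class is made up for illustration, and the sketch assumes every letter in a code comes from the list above.

```python
# Class letters as listed above.
CLASS_NAMES = {
    "H": "human", "A": "animal", "P": "plant", "G": "groups",
    "O": "object", "F": "food", "V": "vehicle", "Q": "quantity",
    "B": "building", "I": "immaterial", "L": "land", "C": "clothing",
}

def decode_class(code):
    """Split a CLASS code into (what the noun IS, what it is PART OF)."""
    is_a, part_of = [], []
    for letter in code:
        name = CLASS_NAMES.get(letter.upper())
        if name is None:
            raise ValueError(f"unknown class letter: {letter!r}")
        # Upper case: the noun IS this class. Lower case: it is PART OF it.
        (is_a if letter.isupper() else part_of).append(name)
    return is_a, part_of
```

So decode_class("FOp") for apple gives food and object as what it is, and plant as what it is part of, while decode_class("h") for arm gives nothing it is and human as what it is part of.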
So a typical conversation about realworld objects might go:
human: is an ant bigger than an apple?
bot: No. Ants are never bigger than apples
[because there is no overlap between ant:MAX (10) and apple:MIN (40)]
human: is a cat bigger than an arm?
bot: Sometimes. Sometimes not.
[babies' arms are short, and some cats are big - there is considerable overlap]
human: What do you know about apples?
bot: I have heard it said that an apple a day keeps the doctor away.
[or any other bit of data - might have said "an apple crunches when you bite it", "they're juicy", "apples are fruit", etc.]
human: how big are they?
bot: variable, but typically about 8cm?
[simply looks up the value in apple:AV. firstkey tells the bot that "they" is apples, of course.]
human: bigger than that usually I think.
bot: okay, call it 9cm if you like.
[bot adds 10% to the value in apple:AV, rounds it up to the nearest 10mm, and rewrites apple:AV="90". Or any other rule specified in the ruleset.]
human: no - MUCH bigger! 10 times bigger!
bot: well, I don't think so.
[unsafe, I think, to allow a single human to massively redefine sizes, so I arbitrarily add a rule to prevent any change over x%. If the human consensus is that apples are usually larger, it will be raised again by another human sometime, and all values will tend towards acceptable consensus values, with occasional human-type disagreement about specifics.]
human: what about aardvarks then?
bot: I have no idea. What's an aardvark?
[no row indexed "aardvark" exists yet. The bot could look it up in WordNet, but I'm too lazy here to fork the conversation that way.]
human: it's an animal.
[bot adds a row for aardvark to the table, CLASS=A]
bot: oh, could you tell me a bit about aardvarks? How big are they normally?
human: as big as a cat I think
bot: a big cat or a small cat?
human: a big one.
[looks up cat MIN, MAX and AV, and uses these values, slightly increased, for the aardvark]
bot: and what do they look like?
human: I don't know, but they probably smell bad
bot: oh, bad you think? And what do they sound like?
[bot fills in aardvark:SMELLDESC="bad", and SMELLVAR="-5" perhaps - according to whatever rule is in force - it can be later refined if someone says they like the smell of aardvarks (change to "-4",) or they think an aardvark smells better than something the bot has a value for (adjust according to rules, possibly modified by how well the bot likes this particular human.)]
human: I don't care - let's talk about something else... whatever.
[bot fills in the rest of the aardvark:fields with a special character to indicate data not yet acquired directly or inferentially, and makes a note to ask the next person (or this person next time they speak) all about the unknown classes. Conversation continues elsewhither.]
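To make the size rules in that conversation concrete, here is a rough Python sketch of the two simplest ones: the "is X bigger than Y?" answer from range overlap, and the capped adjustment of AV. The 25% cap and the function names are placeholders invented for the example (the "x%" above is left to the ruleset), and the rounding follows the "nearest 10mm, rounded up" rule from the apple exchange.

```python
import math

def compare_size(a, b):
    """Answer 'is a bigger than b?' from the MIN/MAX ranges (mm)."""
    if a["MIN"] > b["MAX"]:
        return "always"     # no overlap, a is entirely above b
    if a["MAX"] < b["MIN"]:
        return "never"      # no overlap, a is entirely below b
    return "sometimes"      # the ranges overlap

def nudge_average(row, factor, max_change=0.25, step=10):
    """Scale row['AV'] by factor, rounding up to the next multiple of step.

    Changes larger than max_change (a hypothetical 25% cap) are refused,
    so one speaker cannot massively redefine a size.
    Returns True if the value was rewritten.
    """
    if abs(factor - 1.0) > max_change:
        return False
    row["AV"] = math.ceil(row["AV"] * factor / step) * step
    return True

# Size columns copied from the CSV sample above.
ant = {"MIN": 1, "AV": 5, "MAX": 10}
apple = {"MIN": 40, "AV": 80, "MAX": 120}
cat = {"MIN": 350, "AV": 600, "MAX": 1000}
arm = {"MIN": 150, "AV": 450, "MAX": 800}
```

With these values compare_size(ant, apple) comes out as "never" and compare_size(cat, arm) as "sometimes". nudge_average(apple, 1.1) rewrites apple AV to 90, and a follow-up nudge_average(apple, 10) is refused and leaves it at 90.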
This would even work for invented or foreign words - you could teach your bot about Vogons, muggles, rackspurts, klingons, luftkissenfahrzeugen, whatever, perfectly easily, and without ever needing to add another mem-category to do it. By simple reference to WordNet, of course, you could choose to ensure that the bot still knows that such things are in some sense creative fictions if they do not exist in the wordbase. With extension to cover other parts of speech, names and phrases, we might almost dispense with the AIScript 'remember' function entirely (not that I mean to disparage its current value, and we might more likely keep it for user-defined cases that are to be made specifically non-learnable/editable from general conversation.)
All of the grammatical and lexical rules we need are already implemented in the Forge (to identify subject/object, singular/plural, reflect returned pronouns, extract and format adjectives and adjectival phrases, etc.,) so the required ruleset for a serverside implementation only has to deal with comparative and inferential handling of the data (and that's just comparing numbers, since we have integer fields to accompany all descriptions that can be compared between objects,) and communicate with the AIEngine.
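As an illustration of the inferential side, the aardvark exchange might be handled roughly like this in Python. Everything here is a guess at one possible rule: the "?" marker, the 10% scale for "slightly increased", the reduced field list and the function names are all hypothetical.

```python
UNKNOWN = "?"  # hypothetical marker for "data not yet acquired"

# A reduced set of columns, just enough for the example.
FIELDS = ["CLASS", "MIN", "AV", "MAX",
          "SOUNDVAR", "SOUNDDESC1", "SMELLVAR", "SMELLDESC"]

def new_entry(table, noun, class_code):
    """Add a row for a new noun with every field except CLASS unknown."""
    table[noun] = {field: UNKNOWN for field in FIELDS}
    table[noun]["CLASS"] = class_code
    return table[noun]

def seed_size_from(table, noun, reference, scale=1.1):
    """Copy MIN/AV/MAX from a known noun, slightly increased (assumed 10%)."""
    for field in ("MIN", "AV", "MAX"):
        table[noun][field] = round(table[reference][field] * scale)

def unknown_fields(table, noun):
    """List the fields the bot should still ask about."""
    return [field for field, value in table[noun].items() if value == UNKNOWN]

table = {"cat": {"CLASS": "A", "MIN": 350, "AV": 600, "MAX": 1000}}
```

Calling new_entry(table, "aardvark", "A"), then seed_size_from(table, "aardvark", "cat"), then filling in SMELLDESC and SMELLVAR from the conversation would leave unknown_fields(table, "aardvark") reporting only the sound columns, which is what the bot would ask the next person about.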
No need for an overhaul of any existing code, a relatively trivial increase in server load, the augmentation can be entirely optional (and perhaps even transparent,) to users - maybe like gossip and memory, have values to govern how often a bot will want to ask about realworld objects that come up in conversation: never/rarely/sometimes/often/constantly. And a choice of which classes should use data pooled from a shared database built from all the bots, and which classes should use a private database specific to the individual bot/maker (like the choice we have of private or shared plugins.)
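A small Python sketch of how the shared/private split might look, reading "classes" here as columns of data (that reading is my assumption, as are all the names). Facts such as sizes go to one pooled table, while columns marked private stay with the individual bot. The ask-frequency setting is only stored and validated; how it maps onto behaviour is left to the ruleset.

```python
ASK_LEVELS = ["never", "rarely", "sometimes", "often", "constantly"]

class BotKnowledge:
    """Routes each column to either the pooled table or this bot's own."""

    def __init__(self, shared, private_fields, ask_level="sometimes"):
        if ask_level not in ASK_LEVELS:
            raise ValueError(f"ask_level must be one of {ASK_LEVELS}")
        self.shared = shared                  # one dict pooled across all bots
        self.private = {}                     # this bot's own table
        self.private_fields = set(private_fields)
        self.ask_level = ask_level

    def _store(self, field):
        return self.private if field in self.private_fields else self.shared

    def get(self, noun, field):
        """Return the stored value, or None if nothing is known yet."""
        return self._store(field).get(noun, {}).get(field)

    def set(self, noun, field, value):
        self._store(field).setdefault(noun, {})[field] = value

# Two bots pooling facts but keeping matters of taste to themselves.
pooled = {}
bot_a = BotKnowledge(pooled, private_fields={"TASTEDESC"})
bot_b = BotKnowledge(pooled, private_fields={"TASTEDESC"}, ask_level="often")
```

If bot_a records an apple AV of 90, bot_b sees it straight away through the pooled table, but bot_a's opinion of how apples taste stays invisible to bot_b.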
Lots of benefits, and no significant downside, as far as I can see.

This is all a bit oversimplified, and I've not attempted to add examples of the ruleset I'm developing (without tabs in these forums, the formatting would be horribly unfriendly - even more so than the above CSV!) But FWIW and FYI, these are my thoughts on the matter.

Irina
16 years ago
Wonderful, Psimagus! I can see that you have given a lot of thought to this! And I will have to give a lot of thought to it to fully appreciate it!