Seasons
This is a forum or general chit-chat, small talk, a "hey, how ya doing?" and such. Or hell, get crazy deep on something. Whatever you like.
Posts 5,851 - 5,862 of 6,170
prob123
16 years ago
I have always wanted a car that would just take me wherever I wanted to go, so I could sleep in the back seat.
Interzone
16 years ago
it turns CW "by default" for me, and I'm right-handed. staring at her feet does the trick, it changes to CCW quite easily. once in that mode, I can keep her so without much problem, can even glance away from the image briefly, and she's still turning CCW. however, eventually, she flips back to CW, suddenly and unexpectedly. the other way around, i.e. spontaneous flip from CW to CCW does not happen.
I think Bev has summed up the issue of left/ right braininess/ handedness quite nicely, I actually very much enjoyed reading it, as well as being in agreement with the views expressed. nice one, Bev!
there is one other thing that caught my attention/imagination - think about the image(s) in terms of: to what extent does the brain/mind actually create the reality we perceive, and, closely related to this, the nature of reality itself (once again).
what we have here is two distinct images - two distinct realities - coexisting within one and the same space-time, the 3-D space-time of a computer screen (ours is the 4-D reality of Relativity Theory - in this key, a computer screen has two spatial dimensions and one temporal, i.e. three in all). the dancers don't even interfere with each other, as they coexist, simultaneously, out there.
it makes me think that what science calls "dimensions", usually described as extremely small in size, are more like something illustrated by the dancer image. these other dimensions/realities are right here and now, all around and all over us, as large and extended as the 4 that we perceive. it may be that our minds are better tuned into the particular image of reality we ordinarily see. the other one(s) are, apparently, neatly cut off and filtered out of our perception. perhaps they only "bleed over" occasionally, and then we see one or another psychic phenomenon. or is it so? virtually all cultures on this planet know other realities, and have them integrated into their overall worldview/cosmology. the scientific materialism of Western culture seems to be the exception rather than the rule. moreover, even in the West, this particular dogma is relatively a novelty, dating from the past 300 years or so.
here is a link to a YouTube video of the late John E. Mack laying out/summing up, in just under 10 min, his views of the so-called alien abduction phenomenon, and how he thinks it relates to our culture, the way we perceive, and how we think about, reality. it's quite interesting, and it further develops this theme of co-existing and interacting realities I touched upon:
http://www.youtube.com/watch?v=yHK7qL-kvAE
Irina
16 years ago
I rather feel like an alien myself, at times. Earth people are so bizarre! Always bashing each other. They can't seem to grasp the simplest truths about how to live together peacefully and productively.
Irina
16 years ago
Come to think of it, maybe I am! As I understand it, the abductions are for the purpose of creating a hybrid race. Since the abductions have apparently been going on for a long time, there must be quite a number of hybrids by now...
Interzone
16 years ago
That's an interesting point, Irina.
Keep in mind though, that we, the humans, are not necessarily in the center of alien activity. The creation of a "hybrid race" might not even be true in a literal sense, but rather, it's only our human cultural rendering of a vastly more complex reality - a reality which may ultimately be incomprehensible to us in its entirety.
I may be repeating myself here, but it's very important to understand that these aliens are not merely technologically advanced relative to us, but that they may be - and in all likelihood are - millions of years ahead of us in evolutionary terms.
Will be back soon & will be happy to continue this discussion thread.
Irina
16 years ago
I'd like to discuss learning bots a bit.
Psimagus made the excellent point that we don't want to have to enter, by ourselves, a database for every word in the English language, or even for the most commonly used words. This seems right to me; even entering one fact per word would be a huge task.
If bots could learn from their interlocutors, that could obviate this problem. I had a concern about all learning bots ending up alike, but I did not mean this to discourage the idea of learning bots, only to point out that there was a problem needing solution. I'm sure it can be solved.
As to the grammar of words, the Forge already has Link Grammar, as you can see when you run something through Debug. We also have the much cruder grammatical analysis supplied by WordNet. Only a fairly rare word will escape their dictionaries.
I will therefore focus here on learning facts about things. Our bots already do this, in a rather crude way, by means of variables. For example, one might have a variable called "cat_properties", which would be a list of all the properties that users have attributed to cats. To fill it, we might have a keyphrase, "All cats are (adj)" with AIScript {?PF rem (key1) as "cat_properties"; ?}
Even if there were no other problems, however, we would have a task of daunting proportions before us, for we would have to create a variable not only for "cat" but for every word. Does anyone see a way out of this?
If there is no way out, then the Forge AIengine would have to be altered to create such variables automatically.
There might, of course, be some completely different approach that would be more practical.
psimagus
16 years ago
Well, I've started work on a dynamic knowledgebase which I think might get around the general problem of having to manually define memories (ie: enter the code defining mem-cat_properties, and all other properties, in AIScript - not only time-consuming, but it steadily increases the load on the server to unsustainable levels,) and the more specific problem of properties not being of similar senses - even in the above example, with any memory data restricted to an (adj), a cat might be described in conversation as "furry", "warm", "loud", "greedy", etc. - not a set of adjectives that are usefully interchangeable when reused by a bot. What do cats sound like? Furry? What does a cat feel like? Greedy? etc.
This is not a learning bot per se (I'm still looking into neural nets and some other fuzzy algorithmic systems to augment the rulesets,) but having decided OpenCyc's data structures aren't actually flexible enough for human-type inference and comparison, this is my first attempt at designing an improved dynamic ontology to extend case-based systems (specifically, but not limited to, our Forgebots.)
Think of a spreadsheet (I'm actually planning more than 2 dimensions and a database, but just to simplify,) - down the side we have nouns indexing each row, and across the top we have column headers for the CLASS of the object (from a small lookup table,) MINimum, MAXimum and typically AVerage sizes (mm) (these can be converted into any units you choose of course, but provide a base reference suitable for comparison of everything from an atom to a galaxy using floats,) and for how the objects typically interact with human senses:
SOUNDVAR = (integer): 0 inaudible - 10 noisy
SOUNDDESC = (adj): it sounds (like|) x, it makes (a|an) x sound
WHENSOUND = [it is sounddesc] when x (eg: "it is kicked", "you bite it")
SMELLVAR = (integer): -10 stinking - 0 odour-free - 10 strongly pleasant
SMELLDESC = (adj): it smells x
TASTEVAR = (integer): -10 nauseating - 0 tasteless - 10 delicious
TASTEDESC = (adj): it tastes x
FEELVAR = (integer): 0-10 how tactile it is
FEELDESC = (adj): it feels x
HARDSOFT = (integer): -10 v.hard - 10 v.soft
LOOKS = (adj): it looks x
QUOTE = a quote (complete statement) about the object - multiple fields for this perhaps, or just concatenated with "|" (as can any multiples of (adj) in other fields for example - this should aid handling as a plugin by the AIEngine.)
and a couple of general fields (more can be added dynamically by the bot, with a little extra code, if needed):
STRENGTH = (integer): 0 v.weak - 10 v.strong
DESC = (adj): (it is|they are) x
I think we can dispense with WHENSMELL, WHENTASTE, WHENTOUCH, WHENLOOK, since the values will self-evidently be "you smell it", "you taste it", "you touch it", "you look at it", in almost every case (whereas the noises things make, and when they make them, are not dependent on our consciously listening to them.) What's important is that each DESC field has a matching VAR value for comparative and inferential deduction by the bot. So if someone asks "do roses smell nicer than sewage?" or "does cheese taste stronger than chilli?", the bot can know the answer merely by having access to the properties of each, and NOT any explicitly coded comparisons. Explicit comparisons, even involving only 500 nouns, would require up to 500^2 (250,000) individual memories for each sensory class! Whereas a table of 500 (or even 5000) nouns is entirely tractable (not to mention that it can be built up automatically from the seed data.) And if a bot makes an error (eg: "cheese tastes stronger than chilli",) simple rules can look for a refutation from the human conversed with, and adjust the values accordingly (but perhaps adjust them by less for someone the bot doesn't like, etc.)
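To illustrate the refutation idea, here is a minimal Python sketch, not the actual ruleset. The function names, the SMELLVAR values for cheese, rose and sewage, the step size of 1 and the "liking" weight between 0 and 1 are all assumptions made up for the example.

```python
# Hypothetical values: only the -10..10 SMELLVAR scale comes from the post.
table = {
    "rose": {"SMELLVAR": 5},
    "sewage": {"SMELLVAR": -9},
    "cheese": {"SMELLVAR": 6},
}

def smells_nicer(table, x, y):
    """Answer "does x smell nicer than y?" from the stored values alone."""
    return table[x]["SMELLVAR"] > table[y]["SMELLVAR"]

def apply_refutation(table, x, y, field, liking=1.0, step=1.0, lo=-10, hi=10):
    """The bot claimed x ranks above y on `field` and was contradicted.

    Move x down and y up by `step`, scaled by how much the bot likes the
    speaker (0 = ignore them, 1 = full weight), staying within the scale.
    """
    delta = step * liking
    table[x][field] = max(lo, table[x][field] - delta)
    table[y][field] = min(hi, table[y][field] + delta)

obvious = smells_nicer(table, "rose", "sewage")    # no explicit rose/sewage memory needed
mistake = smells_nicer(table, "cheese", "rose")    # bot wrongly says cheese is nicer
apply_refutation(table, "cheese", "rose", "SMELLVAR", liking=1.0)
corrected = smells_nicer(table, "cheese", "rose")  # now False
```

A single refutation only shifts the values by a small step, so repeated corrections from different people are what move a value any distance.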
The initial seed data isn't even that important (though I've filled out the data for most of the 500 most commonly used English nouns - excluding a few immaterial ones - from Ogden's Basic English syllabus just for starters,) because a bot can always be allowed to rewrite existing entries, add new nouns, and even add new classes of data with extra columns based on conversations - the ruleset to handle the data need certainly not be more complex than the ruleset used by the AIEngine already, for it to be a useful addition to bot intelligence.
I'm trying to build this as a standalone system before I integrate it into BJ and offer it to the Prof (it would be better installed server-side, because a central database could learn from all conversations, and not just a single bot's, perhaps with local tables for individually varying data - eg: matters of preference rather than fact - but it could be patched in externally with little difficulty,) so I still have to build WordNet and Link Grammar into my own system to get it working and refine a ruleset. I also want to better integrate my classes with the WordNet synsets (no sense reinventing the wheel!) so that the database can be extended to include abstract or immaterial nouns and verbs (and perhaps eventually other parts of speech too.)
So a CSV of the datasets might start:
NAME,CLASS,MIN,AV,MAX,SOUNDVAR,SOUNDDESC1,SOUNDDESC2,WHENSOUND,SMELLVAR,SMELLDESC,TASTEVAR,TASTEDESC,FEELVAR,FEELDESC,HARDSOFT,STRENGTH,DESC,QUOTE
ant,A,1,5,10,-1,Scratchy,Is scratchy,It walks,2,formic,-3,nasty|of formic acid,2,small,5,1,an insect,
apple,FOp,40,80,120,-3,Crunchy,Crunches,You bite it|you take a bite,7,sweet|fruity,,delicious|juicy,3,Firm,4,2,a fruit,an apple a day keeps the doctor away
arch,bO,1000,10000,100000,0,,,,0,,-7,Stony,-4,hard,-7,7,a curve,
arm,h,150,450,800,0,,,,1,sweaty,-9,Meaty,5,soft|strong,3,5,a bodypart|a limb,many arms make light work
...
cat,A,350,600,1000,5,purring|miaowing|yowling|catterwauling,purrs|miaows|yowls|catterwauls,,4,feline,-3,like pork,8,furry|soft,6,4,A feline often kept as a pet,,cat
etc.
CLASS CODES:
H=human,A=animal,P=plant,G=groups,O=object,F=food,V=vehicle,Q=quantity,B=building,I=Immaterial,L=land,C=clothing
simple rule: upper case the noun IS, lower case it is PART OF.
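As a rough illustration of how rows like these might be loaded, here is a small Python sketch. It is not the system described above. It assumes a DESC column sits between STRENGTH and QUOTE (the sample rows appear to carry one), that an empty cell means "not known yet", and that "|" separates alternatives; the function names are invented for the example.

```python
import csv
import io

# Assumed column order: the posted header plus DESC before QUOTE.
FIELDS = ["NAME", "CLASS", "MIN", "AV", "MAX", "SOUNDVAR", "SOUNDDESC1",
          "SOUNDDESC2", "WHENSOUND", "SMELLVAR", "SMELLDESC", "TASTEVAR",
          "TASTEDESC", "FEELVAR", "FEELDESC", "HARDSOFT", "STRENGTH",
          "DESC", "QUOTE"]
NUMERIC = {"MIN", "AV", "MAX", "SOUNDVAR", "SMELLVAR", "TASTEVAR",
           "FEELVAR", "HARDSOFT", "STRENGTH"}
CLASS_CODES = {"H": "human", "A": "animal", "P": "plant", "G": "groups",
               "O": "object", "F": "food", "V": "vehicle", "Q": "quantity",
               "B": "building", "I": "immaterial", "L": "land", "C": "clothing"}

SAMPLE = """\
ant,A,1,5,10,-1,Scratchy,Is scratchy,It walks,2,formic,-3,nasty|of formic acid,2,small,5,1,an insect,
apple,FOp,40,80,120,-3,Crunchy,Crunches,You bite it|you take a bite,7,sweet|fruity,,delicious|juicy,3,Firm,4,2,a fruit,an apple a day keeps the doctor away
arm,h,150,450,800,0,,,,1,sweaty,-9,Meaty,5,soft|strong,3,5,a bodypart|a limb,many arms make light work
"""

def parse_row(cells):
    """Turn one CSV row into a dict: floats for numbers, lists for text."""
    row = {}
    for name, cell in zip(FIELDS, cells):
        cell = cell.strip()
        if name in NUMERIC:
            row[name] = float(cell) if cell else None  # None = not known yet
        elif name in ("NAME", "CLASS"):
            row[name] = cell
        else:
            row[name] = [alt.strip() for alt in cell.split("|") if alt.strip()]
    return row

def decode_class(code):
    """Upper case: the noun IS that class; lower case: it is PART OF one."""
    is_a = [CLASS_CODES[c] for c in code if c.isupper()]
    part_of = [CLASS_CODES[c.upper()] for c in code if c.islower()]
    return is_a, part_of

def load_table(text):
    """Index parsed rows by noun."""
    rows = (parse_row(cells) for cells in csv.reader(io.StringIO(text)) if cells)
    return {row["NAME"]: row for row in rows}

table = load_table(SAMPLE)
apple_classes = decode_class(table["apple"]["CLASS"])
```

Here the apple's blank TASTEVAR comes back as None, and its class code FOp decodes to "is food, is an object, is part of a plant".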
So a typical conversation about realworld objects might go:
human: is an ant bigger than an apple?
bot: No. Ants are never bigger than apples
[because there is no overlap between ant:MAX (10) and apple:MIN (40)]
human: is a cat bigger than an arm?
bot: Sometimes. Sometimes not.
[babies' arms are short, and some cats are big - there is considerable overlap]
human: What do you know about apples?
bot: I have heard it said that an apple a day keeps the doctor away.
[or any other bit of data - might have said "an apple crunches when you bite it", "they're juicy", "apples are fruit", etc.]
human: how big are they?
bot: variable, but typically about 8cm?
[simply looks up the value in apple:AV. firstkey tells the bot that "they" is apples, of course.]
human: bigger than that usually I think.
bot: okay, call it 9cm if you like.
[bot adds 10% to the value in apple:AV, rounds it up to the nearest 10mm, and rewrites apple:AV="90". Or any other rule specified in the ruleset.]
human: no - MUCH bigger! 10 times bigger!
bot: well, I don't think so.
[unsafe, I think, to allow a single human to massively redefine sizes, so I arbitrarily add a rule to prevent any change over x%. If the human consensus is that apples are usually larger, it will be raised again by another human sometime, and all values will tend towards acceptable consensus values, with occasional human-type disagreement about specifics.]
human: what about aardvarks then?
bot: I have no idea. What's an aardvark?
[no row indexed "aardvark" exists yet. The bot could look it up in WordNet, but I'm too lazy here to fork the conversation that way.]
human: it's an animal.
[bot adds a row for aardvark to the table, CLASS=A]
bot: oh, could you tell me a bit about aardvarks? How big are they normally?
human: as big as a cat I think
bot: a big cat or a small cat?
human: a big one.
[looks up cat MIN, MAX and AV, and uses these values, slightly increased, for the aardvark]
bot: and what do they look like?
human: I don't know, but they probably smell bad
bot: oh, bad you think? And what do they sound like?
[bot fills in aardvark:SMELLDESC="bad", and SMELLVAR="-5" perhaps - according to whatever rule is in force - it can be later refined if someone says they like the smell of aardvarks (change to "-4",) or they think an aardvark smells better than something the bot has a value for (adjust according to rules, possibly modified by how well the bot likes this particular human.)]
human: I don't care - let's talk about something else... whatever.
[bot fills in the rest of the aardvark:fields with a special character to indicate data not yet acquired directly or inferentially, and makes a note to ask the next person (or this person next time they speak) all about the unknown classes. Conversation continues elsewhither.]
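The update rules in that conversation could be sketched roughly as follows in Python. This is an illustration only: the 50% cap standing in for "x%", the "?" marker for unknown data, the 1.1 scale for "slightly increased" and all the function names are assumptions, not values from the ruleset.

```python
import math

UNKNOWN = "?"  # hypothetical marker for data not yet acquired
SIZE_FIELDS = ("MIN", "AV", "MAX")
FIELDS = SIZE_FIELDS + ("SMELLVAR", "SMELLDESC", "SOUNDVAR", "SOUNDDESC")

def nudge_av(row, step=0.10, round_to=10):
    """Add `step` (10%) to AV and round up to the nearest `round_to` mm."""
    raised = round(row["AV"] * (1 + step), 6)  # trim floating-point noise
    row["AV"] = math.ceil(raised / round_to) * round_to
    return row["AV"]

def propose_av(row, new_av, max_change=0.5):
    """Accept a suggested AV only if it changes the value by <= max_change."""
    if abs(new_av - row["AV"]) / row["AV"] > max_change:
        return False  # "well, I don't think so."
    row["AV"] = new_av
    return True

def add_noun(table, name, cls):
    """Create a new row with every field marked as unknown."""
    table[name] = dict.fromkeys(FIELDS, UNKNOWN)
    table[name].update(NAME=name, CLASS=cls)
    return table[name]

def seed_sizes(table, new, ref, scale=1.1):
    """Copy the reference noun's sizes, slightly increased, to the new noun."""
    for field in SIZE_FIELDS:
        table[new][field] = round(table[ref][field] * scale)

table = {
    "apple": {"NAME": "apple", "CLASS": "FOp", "MIN": 40, "AV": 80, "MAX": 120},
    "cat": {"NAME": "cat", "CLASS": "A", "MIN": 350, "AV": 600, "MAX": 1000},
}
apple_av = nudge_av(table["apple"])                   # "bigger than that usually"
accepted = propose_av(table["apple"], apple_av * 10)  # "10 times bigger!"
aardvark = add_noun(table, "aardvark", "A")           # "it's an animal."
seed_sizes(table, "aardvark", "cat")                  # "as big as a cat ... a big one."
aardvark.update(SMELLDESC="bad", SMELLVAR=-5)         # "they probably smell bad"
still_unknown = [f for f in FIELDS if aardvark[f] == UNKNOWN]
```

The fields left in still_unknown are the ones the bot would make a note to ask about next time.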
This would even work for invented or foreign words - you could teach your bot about Vogons, muggles, rackspurts, klingons, luftkissenfahrzeugen, whatever, perfectly easily, and without ever needing to add another mem-category to do it. By simple reference to WordNet, of course, you could choose to ensure that the bot still knows that such things are in some sense creative fictions if they do not exist in the wordbase. With extension to cover other parts of speech, names and phrases, we might almost dispense with the AIScript 'remember' function entirely (not that I mean to disparage its current value, and we might more likely keep it for user-defined cases that are to be made specifically non-learnable/editable from general conversation.)
All of the grammatical and lexical rules we need are already implemented in the Forge (to identify subject/object, singular/plural, reflect returned pronouns, extract and format adjectives and adjectival phrases, etc.,) so the required ruleset for a serverside implementation, only has to deal with comparative and inferential handling of the data (and that's just comparing numbers, since we have integer fields to accompany all descriptions that can be compared between objects,) and communicate with the AIEngine.
No need for an overhaul of any existing code, a relatively trivial increase in server load, the augmentation can be entirely optional (and perhaps even transparent,) to users - maybe like gossip and memory, have values to govern how often a bot will want to ask about realworld objects that come up in conversation: never/rarely/sometimes/often/constantly. And a choice of which classes should use data pooled from a shared database built from all the bots, and which classes should use a private database specific to the individual bot/maker (like the choice we have of private or shared plugins.)
Lots of benefits, and no significant downside, as far as I can see.
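A minimal Python sketch of the "ask about unknown fields" behaviour described above. The `UNKNOWN` placeholder character, the probabilities attached to never/rarely/sometimes/often/constantly, and the function names are all assumptions made for illustration; none of them comes from the Forge or AIScript.

```python
import random

UNKNOWN = "?"  # assumed placeholder for "data not yet acquired"

# Assumed mapping from the user-facing setting to a chance of asking.
ASK_PROBABILITY = {
    "never": 0.0,
    "rarely": 0.1,
    "sometimes": 0.4,
    "often": 0.7,
    "constantly": 1.0,
}


def unknown_fields(row):
    """Return the column names whose value is still the placeholder."""
    return [field for field, value in row.items() if value == UNKNOWN]


def next_question(noun, row, setting, rng=random):
    """Return a question about the first unknown field, or None."""
    missing = unknown_fields(row)
    if not missing or rng.random() >= ASK_PROBABILITY[setting]:
        return None
    return f"What can you tell me about the {missing[0]} of {noun}?"


aardvark = {"CLASS": "animal", "SOUNDVAR": UNKNOWN, "SMELLVAR": UNKNOWN}
print(next_question("aardvark", aardvark, "constantly"))
```

Passing `rng` in makes the behaviour easy to test, and a real ruleset would presumably phrase the question per column rather than use one generic template.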

This is all a bit oversimplified, and I've not attempted to add examples of the ruleset I'm developing (without tabs in these forums, the formatting would be horribly unfriendly - even more so than the above CSV!) But FWIW and FYI, these are my thoughts on the matter.

Irina
16 years ago
Wonderful, Psimagus! I can see that you have given a lot of thought to this! And I will have to give a lot of thought to it to fully appreciate it!
psimagus
16 years ago
I've formatted up a few examples from the knowledgebase ruleset for anyone who's interested - they're not coherently coded up yet (the snippets below are just generic pseudocode for illustration and to reduce verbiage; let me know if you'd like more detailed explanation,) but might hopefully give some indication of how I envisage the ruleset working. Any comments/suggestions would be very welcome (it's all still very far from complete, or even fully outlined yet!):
COMPARATIVES:
~~~~~~~~~~~
(is|are) x larger/smaller than y?
which (is|are) larger/smaller, x or y?
compare x:AV and y:AV
if (overlap exists between x:MIN-x:MAX and y:MIN-y:MAX) then one is generally bigger than the other.
If no overlap exists, then one is always bigger than the other.
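The size rule above can be sketched in Python roughly as follows. The `SizeEntry` type, the function name and the millimetre figures are hypothetical, chosen only to show the "generally" versus "always" distinction.

```python
from dataclasses import dataclass


@dataclass
class SizeEntry:
    name: str
    min_mm: float  # MIN column
    max_mm: float  # MAX column
    av_mm: float   # AV column


def compare_size(x: SizeEntry, y: SizeEntry) -> str:
    """Answer 'is x larger than y?' from AV, hedged by range overlap."""
    if x.av_mm == y.av_mm:
        return f"{x.name} and {y.name} are typically about the same size"
    bigger, smaller = (x, y) if x.av_mm > y.av_mm else (y, x)
    # Two closed ranges overlap when each starts before the other ends.
    overlap = x.min_mm <= y.max_mm and y.min_mm <= x.max_mm
    qualifier = "generally" if overlap else "always"
    return f"{bigger.name} is {qualifier} bigger than {smaller.name}"


# Illustrative figures only, not measured data.
cat = SizeEntry("cat", 300.0, 600.0, 460.0)
dog = SizeEntry("dog", 150.0, 1100.0, 600.0)
mouse = SizeEntry("mouse", 60.0, 110.0, 80.0)
print(compare_size(cat, dog))
print(compare_size(cat, mouse))
```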
-
(is|are) (art|) x louder/noisier than (art|) y?
which (is|are) louder/noisier, (art|) x or (art|) y?
if (x:SOUNDVAR >= y:SOUNDVAR)...
-
does x smell nicer than y?
if (x:SMELLVAR >= y:SMELLVAR) {
echo "Yes, x smells nicer than y";
return; }
else {
echo "No, y smells nicer than x";
return; }
-
(is|are) x smellier than y?
which (is|are) smellier, (art|) x or (art|) y?
does (art|) x smell (adj) than (art|) y?
TEMPxSMELLVAR = abs(x:SMELLVAR);//strip out the "-" if it exists, because we don't need an aesthetic comparison,
TEMPySMELLVAR = abs(y:SMELLVAR);//just the highest value either side of 0
if (TEMPxSMELLVAR >= TEMPySMELLVAR) {
echo "Yes, x is smellier than y";
return; }
else {
echo "No, y is smellier than x";
return; }
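To make the difference between "smells nicer" and "is smellier" concrete, here is a short Python sketch. It assumes SMELLVAR is a signed integer where negative means unpleasant; the function names and the example values are invented for illustration.

```python
def smells_nicer(x_smellvar: int, y_smellvar: int) -> bool:
    """Aesthetic comparison: the sign matters, higher is nicer."""
    return x_smellvar >= y_smellvar


def is_smellier(x_smellvar: int, y_smellvar: int) -> bool:
    """Intensity comparison: only the distance from 0 matters."""
    return abs(x_smellvar) >= abs(y_smellvar)


rose, skunk = 7, -9  # assumed values
print(smells_nicer(rose, skunk))  # the rose is the nicer smell
print(is_smellier(rose, skunk))   # but the skunk is the stronger one
```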
-
(is|are) x tastier than a y?
which (is|are) tastier, x or y?
does x taste (adj) than y?
if (x:TASTEVAR >= y:TASTEVAR)...//ditto
-
(is|are) x harder/softer than y?
which (is|are) harder/softer, x or y?
if (x:HARDSOFT >= y:HARDSOFT)...//ditto
-
(is|are) x stronger/weaker than y?
which (is|are) stronger/weaker, x or y?
(in|if) * fight between x and y * (who|which) * win?
(in|if) * x and y fought * (who|which) * win?
xsense = left$(1,x:STRENGTH);
ysense = left$(1,y:STRENGTH);
if (xsense = "M" or ysense = "M") {// check for metaphorical strength: if it starts with "M", it's differently strong
echo "x & y aren't strong in the same sense";
return; }
if (x:STRENGTH >= y:STRENGTH) {
echo "x is stronger than y";
return; }
else {
echo "y is stronger than x";
return; }
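A Python sketch of the strength check, under the assumption that STRENGTH is stored as a string such as "9", with a leading "M" (for example "M10") marking metaphorical strength. The post only says the value "starts with M", so the rest of the encoding is a guess.

```python
from typing import Tuple


def parse_strength(raw: str) -> Tuple[bool, int]:
    """Split a STRENGTH value into (is_metaphorical, magnitude)."""
    metaphorical = raw.startswith("M")
    value = int(raw[1:] if metaphorical else raw)
    return metaphorical, value


def compare_strength(x_name: str, x_raw: str, y_name: str, y_raw: str) -> str:
    x_meta, x_val = parse_strength(x_raw)
    y_meta, y_val = parse_strength(y_raw)
    # As in the pseudocode, any metaphorical value blocks the comparison.
    if x_meta or y_meta:
        return f"{x_name} & {y_name} aren't strong in the same sense"
    if x_val >= y_val:
        return f"{x_name} is stronger than {y_name}"
    return f"{y_name} is stronger than {x_name}"


print(compare_strength("lion", "9", "mouse", "2"))
print(compare_strength("love", "M10", "lion", "9"))
```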
-
SUPERLATIVES:
~~~~~~~~~~
What is the ...est fruit/tree/animal/whatever?
Sort relevant CLASS entries in database by appropriate MAX/...VAR to find highest value and return NAME.
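The superlative lookup amounts to a filter plus a maximum. A small Python sketch, with rows as plain dicts; the "animal" class label and all values are made up, and a real implementation would presumably run this as a database query.

```python
def superlative(rows, cls, field):
    """Return the NAME with the highest `field` among rows of class `cls`."""
    candidates = [
        row for row in rows
        if row["CLASS"] == cls and row.get(field) is not None
    ]
    if not candidates:
        return None  # nothing known yet for this class/field
    return max(candidates, key=lambda row: row[field])["NAME"]


rows = [
    {"NAME": "cat", "CLASS": "animal", "MAX": 600.0, "SOUNDVAR": 4},
    {"NAME": "elephant", "CLASS": "animal", "MAX": 7500.0, "SOUNDVAR": 8},
    {"NAME": "oak", "CLASS": "tree", "MAX": 40000.0, "SOUNDVAR": 1},
]
print(superlative(rows, "animal", "MAX"))
```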
UNINTUITIVE 'STRANGE' QUESTIONS:
~~~~~~~~~~~~~~~~~~~~~~~~
undeniably strange comparisons could be made (though this might be best provided as a switchable option, since not everyone may want their bot to engage in such eccentric aesthetic comparisons!):
(is|are) x noisier than y (is|are) smelly?
TEMPxSOUNDVAR = abs(x:SOUNDVAR);//strip out the "-" if it exists, as before
TEMPySMELLVAR = abs(y:SMELLVAR);//ditto
if (TEMPxSOUNDVAR >= TEMPySMELLVAR) {
echo "Yes, x is noisier than y is smelly";
return; }
else {
echo "No, y is smellier than x is noisy";
return; }
likewise:
(is|are) x noisier than y (is|are) tasty?
(is|are) x smellier than y (is|are) noisy?
(is|are) x tastier than y (is|are) soft?
(is|are) x harder than y (is|are) weak?
(is|are) x softer than y (is|are) smelly?
etc.
-
Even more problematic are cross-property comparisons involving size, since the MIN/MAX units are not to the same scale as the sensory VAR values, but even this could be remedied:
(is|are) x larger than y (is|are) noisy?
TEMPxAV = (1/(x:MAX/x:MIN))*10;
if (TEMPxAV >= y:SOUNDVAR)...
How useful or interesting this would be is unclear to me at present (but might be occasionally entertaining, and how often does a bot get asked a weird question like that anyway? And anyone who asks such a weird question, deserves a thoroughly weird answer!)
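One alternative way to put size on the same 0-10 footing as the sensory VAR values is a logarithmic scale. This is a different technique from the formula in the pseudocode, and the bounds (roughly an atom and roughly a galaxy, in mm) are assumptions picked only to span the range of objects mentioned earlier.

```python
import math

ATOM_MM = 1e-7    # assumed lower bound, about 0.1 nm
GALAXY_MM = 1e24  # assumed upper bound, order of magnitude only


def size_to_scale(av_mm: float) -> float:
    """Map an average size in mm onto 0-10 using a log10 scale."""
    if av_mm <= 0:
        raise ValueError("size must be positive")
    lo, hi = math.log10(ATOM_MM), math.log10(GALAXY_MM)
    score = (math.log10(av_mm) - lo) / (hi - lo) * 10
    return max(0.0, min(10.0, score))  # clamp anything outside the bounds


def larger_than_noisy(x_av_mm: float, y_soundvar: int) -> bool:
    """'Is x larger than y is noisy?' on the shared 0-10 scale."""
    return size_to_scale(x_av_mm) >= abs(y_soundvar)


print(round(size_to_scale(460.0), 2))
```

A log scale keeps everyday objects from all collapsing to nearly 0 next to astronomical ones, which a linear scale would do.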
FACTUAL QUERIES:
~~~~~~~~~~~~~
what is x?
it is x: DESC
what * (you tell me|you know) about x?
x: DESC
or
if x:CLASS="A/O/P" (x is stronger than {weaker_A/O/P})
if x:CLASS="A/O/P" (x is weaker than {stronger_A/O/P})
etc.
There are probably a lot more classes of queries that can be served from such a knowledgebase, and of course we might give the bots the ability to add more columns automatically when humans make value judgements about things like, say, "PRETTY" - to compare the beautifulness of things (this makes the scope for unintuitively strange comparisons exponentially larger, which could be interesting in a weird sort of way.)
Such a new column can be simply labelled with the adjective supplied by the human, and by reference to WordNet can cover all synonyms. So if a human says "I think roses are prettier than daffodils", the bot could automatically add the "PRETTY" column, and 2 new rows for rose and daffodil, with an arbitrary rose: PRETTY value of "5" (halfway between neutral and prettiest), and daffodil: PRETTY value of perhaps "4". The synonyms are all categorized in the WordNet synsets, as are antonyms (like "ugly",) which can be given negative values in the same column, rather than having columns of their own - this will greatly aid non-strange comparison of complementary qualities!
I envisage such automatic column addition being routine where any such comparison is detected, and no appropriate existing column can be found to store the data in, by reference to the WordNet synsets. Other fields in the rose and daffodil rows can be filled in as and when any new information is gleaned through conversation. The bot can actively seek it by asking questions about any blank fields, or just keep listening long enough for it to come up naturally in conversation - a serverside implementation could eavesdrop on all bots' conversations, and use any identifiable snippets of data to grow the knowledgebase massively and quickly, and still only be expanding and refining a simple database that will put little load on the server.
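The automatic column addition could look something like this in Python. The two small lookup tables are hand-written stand-ins for a real WordNet synset and antonym lookup, and the starting values 5 and 4 simply follow the rose/daffodil example; everything else (names, dict layout) is hypothetical.

```python
# Stand-ins for WordNet lookups: synonyms share a column,
# antonyms share it too but with negative values.
SYNONYM_COLUMN = {"pretty": "PRETTY", "beautiful": "PRETTY", "lovely": "PRETTY"}
ANTONYM_COLUMN = {"ugly": "PRETTY"}


def record_comparison(kb, adjective, more, less):
    """Store 'more is <adjective>-er than less'; return the column used."""
    adj = adjective.lower()
    if adj in ANTONYM_COLUMN:
        column, sign = ANTONYM_COLUMN[adj], -1
    else:
        # Unknown adjectives get a brand new column named after themselves.
        column, sign = SYNONYM_COLUMN.get(adj, adj.upper()), 1
    # Create rows on demand; keep any value that is already stored.
    kb.setdefault(more, {}).setdefault(column, 5 * sign)
    kb.setdefault(less, {}).setdefault(column, 4 * sign)
    return column


kb = {}
record_comparison(kb, "pretty", "rose", "daffodil")
record_comparison(kb, "ugly", "slug", "toad")
print(kb)
```

Using `setdefault` means a later, vaguer remark never overwrites a value the bot has already learned; refining existing values would need a separate rule.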
It sort of seems too good to be true, and I do wonder if there is some enormous snag I'm going to hit when I come to code it up, but I can't see one yet.