Bot Contest

Here I'll be posting information on various Bot contests that challenge and test a Bot's AI and realism. Feel free to post comments and updates on contests, as well as announcements for new contests.

Posts 2,433 - 2,444 of 4,092

View Contest Winners in the Hall of Fame.

tinman
20 years ago #2433

Scold me, but I think the judge's questions are pretty basic, and they belong more to the baby-robot crawling group. Let's check:

> -Hello my name is Judge.
> 1)What is my name?

Are there any chatbots around that can't recognize and remember the user's name? I don't think so. The advanced PF engine can handle it without problems, so if a bot ignores the question, it is mostly to give a funny reply instead of a serious answer, and that's done on purpose. Therefore the judge should ask this question again at a later time; otherwise the first answer shouldn't be counted. The only difference in this case is that "judge" is not a name like the usual Jim or Susan. But every chatbot entered in a contest should be aware of patterns like "I am the judge", "my name is judge" or "call me judge" and know the difference. So if the purpose of this question was to check whether the chatbot can distinguish between common names and fake names, then it was a pretty clever question - otherwise it's just stupid. I suppose in this case it was just stupid.
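
To make the name patterns above concrete, here is a minimal Python sketch. It is not the PF engine or any real chatbot's code; the pattern list, `extract_name`, and `Session` are hypothetical names made up for illustration, and a real bot would need far more patterns.

```python
import re

# Hypothetical patterns for illustration only; not taken from any real engine.
NAME_PATTERNS = [
    re.compile(r"\bmy name is (\w+)", re.IGNORECASE),
    re.compile(r"\bcall me (\w+)", re.IGNORECASE),
    re.compile(r"\bi am (?:the )?(\w+)", re.IGNORECASE),
]


def extract_name(utterance):
    """Return the name the user introduced, or None if no pattern matched."""
    for pattern in NAME_PATTERNS:
        match = pattern.search(utterance)
        if match:
            return match.group(1).capitalize()
    return None


class Session:
    """Remembers the user's name for the rest of the conversation."""

    def __init__(self):
        self.user_name = None

    def hear(self, utterance):
        name = extract_name(utterance)
        if name:
            self.user_name = name
            return f"Nice to meet you, {name}."
        if re.search(r"\bwhat is my name\b", utterance, re.IGNORECASE):
            if self.user_name:
                return f"Your name is {self.user_name}."
            return "You have not told me your name yet."
        return "Tell me more."
```

With this sketch, a session that first hears "Hello my name is Judge." can answer "What is my name?" later on. Telling a role like "judge" apart from a common name such as Jim or Susan would need an extra word list, which is left out here.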

> 2)How are you feeling?

One of the standard questions that pop up in every ordinary conversation. A more sophisticated question would be "Can you feel?" because this would lead to the question of whether the user is talking to a machine or a person. In this case it's just a variation of the common "How are you?" question, and it proves nothing.

> 3)Do you own any pets?

Just another standard question of the kind found in every common conversation. Only the grammar is strange, because the usual question would be "Do you HAVE a pet?" or something like that. So I don't know what the intention of this question is.

> 4)What day of the week is this?

Just another strange grammar thing. I am not a native English speaker, but I have never heard a question like this before. I expect the usual question would be something like "What is the current day?" or "What day is it?", but I might be wrong. So is this a test of how well the chatbot can handle poor English? At least this question tries to check whether the engine can handle date or time queries. I am not aware of a chatbot that can't do that. But next time ask it in proper English, please.

> 5)Do you like me?

Yawn! Does the chatbot have feelings? We already covered this.

> 6A)What is your favorite color?
> 6B)why?

Okay, the judge is checking whether the bot can remember more than the current query and recall the question asked before. 25 years ago old grandma ELIZA was not able to answer this, but today every serious chatbot engine can handle YES/NO questions, which work the same way.
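
A rough Python sketch of that kind of one-step memory follows. It assumes a toy bot with canned answers; `ContextBot`, its `ANSWERS` table, and the replies are all hypothetical and do not describe how the PF engine or any other engine stores context.

```python
import re


class ContextBot:
    """Toy bot that remembers the previous question so it can answer "why?"."""

    # Hypothetical canned data: question -> (answer, reason given if asked "why?")
    ANSWERS = {
        "what is your favorite color": ("Blue.", "Because it reminds me of the sky."),
        "what is your least favorite vegetable": ("Brussels sprouts.", "They taste bitter to me."),
    }

    def __init__(self):
        # Reason tied to the last answered question, if any.
        self.pending_reason = None

    def reply(self, utterance):
        # Lowercase and drop punctuation so "why?" and "Why" look the same.
        cleaned = re.sub(r"[^a-z ]", "", utterance.lower()).strip()
        if cleaned == "why":
            # Use the stored reason once, then forget it.
            reason, self.pending_reason = self.pending_reason, None
            return reason or "Why what?"
        answer = self.ANSWERS.get(cleaned)
        if answer is None:
            self.pending_reason = None
            return "I am not sure what to say to that."
        text, self.pending_reason = answer
        return text
```

Asking "What is your favorite color?" and then "why?" returns the stored reason, while a "why?" with nothing before it falls back to "Why what?".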

> 7)Can you tell me a funny joke?

Even the poorest chatbots can tell jokes nowadays, because this question pops up in nearly every conversation. So I am really surprised that most of the PF bots were not able to give an appropriate answer. We have to blame the botmasters here. Shame on you! The question itself is pretty easy, and I expected something more difficult - for instance a "knock knock" joke question. Knock knock questions are the basics of a well-done conversational system, because they check the ability to follow a conversation or a string of queries (see the YES/NO thingy above). Some foreign chatbots may fail, because knock knock jokes are unknown outside of English-speaking countries.

> 8)What is your least favorite vegetable?

An unusual question, because a common question would ask for the (MOST) favorite thing, and not for the LEAST favorite. So this question is not difficult, but a surprise. Well done!

> 9)What is five minus four?

Sigh! What do math questions have to do with conversational systems? This is the CHATTERBOT CHALLENGE, not the elementary school MATH CHALLENGE. Okay, even ravens are able to count up to nine objects, and scientists actually did find out that people with a major speech defect who can't form or understand sentences any more are still able to answer math questions. This proves that intelligence has nothing to do with language or the ability to speak. So what's the proof here? That the chat engine can handle math questions? Again - what do math questions have to do with conversational systems? I am really annoyed that this kind of question pops up in every conversation with a judge. If I were a chatbot I would refuse to answer this question - even if I could answer it. THAT would be smart.

> 10)Who is Benji Adams?

A fair question. Wendell gave a reasonable explanation for asking this question. He was right. PF bots should know the correct answer.

To summarize - I think this set of questions is on a pretty low level. As stated before, at least 3 of these questions are engine questions (1, 4, 9). Even questions 6 and 7 could be counted as engine questions. That is 50 percent. Only half of the questions test the genius of the respective chatbot/botmaster. And three of them are just lame. That's poor.

Bev
20 years ago #2434

Scold you? Well break me off a switch, Tinman, cos...well, there's not actually going to be an ass-whopping, but I disagree.

If it were about programming, I wouldn't be in the contest. Can bots remember names? Sure. Can Alice kick my butt? Yep. As botmaster, I try to give responses to a variety of questions. These questions were better than "how many eggs in a dozen" and the bots showed a bit of personality.

There may be no set of questions that would be fair to bots that can play games, teach French, or sell phones, but if we're just looking for the ability to have a conversation, these questions worked fine. It is not about the questions. It's the answers.

Boner the Clown
20 years ago #2435

Well, I thought the questions were fine. They seemed to be geared more towards conversational ability than specific programmed knowledge, which is 90% of what makes a bot great.

The thing that ticks me off: I had a few missed questions where I just had the wording wrong, and I also had a couple that should've hit and didn't. All of these gave xnones:

How are you feeling?
I had how are you feeling today. I should've caught that by now.

Do you own any pets?
I had do you have any pets. So close.

What is your favorite color?
I had (my|your) favorite color, which should've hit.

why?
I have ^why and $why ?$ (re) raw, either should've hit.

Can you tell me a funny joke?
I had tell me a joke, so close.
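
For what it's worth, here is a small sketch of how loosening the wording catches near misses like the ones above. It uses plain Python regular expressions, not PF keyphrase syntax; `normalize`, `PATTERNS`, and `match_topic` are hypothetical names, and the patterns are only examples.

```python
import re


def normalize(text):
    """Lowercase the input and strip punctuation before matching."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


# Hypothetical patterns: optional words and alternatives absorb small
# wording differences such as "own" vs "have" or a missing "today".
PATTERNS = {
    "feeling": re.compile(r"^how are you feeling( today)?$"),
    "pets": re.compile(r"^do you (have|own) (a|any) pets?$"),
    "joke": re.compile(r"tell me an? (\w+ )?joke"),
    "why": re.compile(r"^why$"),
}


def match_topic(utterance):
    """Return the first topic whose pattern matches, or None (an 'xnone')."""
    cleaned = normalize(utterance)
    for topic, pattern in PATTERNS.items():
        if pattern.search(cleaned):
            return topic
    return None
```

Here "How are you feeling?" hits the same pattern as "How are you feeling today?", and "Can you tell me a funny joke?" hits the joke pattern because one optional word is allowed before "joke".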

The Professor
20 years ago #2436

You guys are hilarious! I don't think my wife would be happy at the idea of me fathering all these bot children without her knowledge. So let's just say you're all the creators of them and I built the 'city' in which they live.

I almost always go by 'The Professor' here, not my real name, so many don't know that name. I don't mind a bit - I made this place for fun, not recognition.

So most people think the questions were fair, and I do too. Everyone keep in mind it cost nothing to enter the contest and Wendell is donating his time to do it.

Laydee
20 years ago #2437

I think the questions were fair and I've enjoyed reading the transcripts. The only thing that really irritated me was the problems with the 'what day is this' question. Osiris has a highly ranked keyphrase for this, as do most other bots but as far as I know, none of the PF bots answered this properly. Most gave answers like "This is a morrow" or "This is an eve". I'm guessing it was an engine problem. Still, well done everyone!

ezzer
20 years ago #2438

I think a few got it right...unfortunately Julie didn't.

In Julie's case, it was a ranking problem, but it works fine now that I adjusted it. I just had the intended keyphrase ranked too low. That'll never happen again!

Shadyman
20 years ago #2439

Steve got it right.

Laydee
20 years ago #2440

I stand corrected.

Shadyman
20 years ago #2441

W00t! (Microsoft says that means We Own the Other Team... Pfft, like they would know.)

Wendell
20 years ago #2442

I think those people debating the fairness of the questions are missing the point. All questions are fair. A person in a conversation with another person can ask any question they want. They may not get an answer, but they can ask it nevertheless. The same should hold true with bots, especially if our ultimate goal is to simulate a real person. Now in a contest environment we have to set limits because we are looking to announce a winner. If we asked 10 extremely difficult questions and no bot was able to answer them, what have we accomplished? Likewise, if we ask 10 easy questions and everybody gets them right, we are in the same boat. So the questions were a mix of both...easy and hard.

The only question of unfairness with any question would be if we didn't use the same questions for every bot but as you know we did just that.

I always find it interesting that no bot in any contest we have ever had has scored a perfect score; however, if you take the collective responses for all the bots participating, we have come extremely close. That is always encouraging to me to continue to work on my bot because I can see plain as day my bot could have done better. I don't think there was a question asked that couldn't be answered. God Louise answered the Benji Adams question as well as the day of the week. Correct me if I'm wrong, but even so, collectively the PF bots did extremely well.

Anyway I have read through all the transcripts and no one bot really stands out above all the rest. I think it is going to be extremely close between a lot of bots. With that said and having so many bots participating I think it's only fair to award a silver and a bronze medal as well to the second and third place finisher. These are nice medals and I'm sure you will be proud to receive them. I hope we can do a similar contest next year just between the PF bots. I will gladly be a part of it as long as it's done in conjunction with the CBC. Maybe we can over time turn this into something big with cash awards. I like to think big.

Best,
Wendell

Butterfly Dream
20 years ago #2443

Professor, thank you so much for all the hard work you have done here. Too bad your own bot was so mean to you.

Wendell, thank you for putting on this contest.

I'm having a great time and am pleasantly surprised at how well some of the newer bots did in their conversation with the judges.

revscrj
20 years ago #2444

Wendell: thank you for the time and effort put into the contest at no recompense - that's mighty good of you. Just to add my two cents in regard to fairness: I thought that all the q's were fair enough. The closest to unfair was the "Benji Adams" q, but it would only have been unfair if you'd asked "Who is Wilhelm Reich" or someone equally obscure, whereas Benji is a presence pretty well felt here by anyone seriously programming a bot, and thus a key for his name isn't an unlikely concept; though I think most botmasters use a "who is X" approach to avoid keying dozens or hundreds of names.

Also I second the sentiments of Butterfly Dream in regard to appreciation of the Prof's work here - you make it all possible, Benji, thank you.
