By Namrata Kapoor

Making a simple and fast chatbot in 10 minutes

In the real world, response time for a chatbot matters a lot. Be it the travel industry, banking, or healthcare, if you really want to help your customers, response time should be short, similar to what it is when talking to a customer care representative.

Besides time, it is also important to understand the main purpose of the chatbot. Every industry cannot use the same chatbot, as each has a different purpose and a different corpus to reply from.

While transformers are good at producing a suitable reply, they may take time to respond. Where speed is the concern, other methodologies can be applied, including rule-based systems that return an appropriate reply to the question asked.

Think of how many times you may have contacted a travel agency about a refund for tickets booked during last year's lockdown; I am sure getting an apt reply was far from reality.

Now let's make a simple chatbot. First, install these packages:

pip install nltk
pip install newspaper3k

The newspaper3k package has a few advantages, as below:

  1. Multi-threaded article download framework
  2. News URL can be identified
  3. Text extraction can be done from HTML
  4. Top image extraction from HTML
  5. All image extraction can be done from HTML
  6. Keyword extraction can be done from the text
  7. Summary extraction can be done from the text
  8. Author extraction can be done from the text
  9. Google trending terms extraction
  10. Works in 10+ languages (English, German, Arabic, Chinese, …)

Import libraries as below:

#import libraries
from newspaper import Article
import random
import nltk
import string
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

I have already talked about CountVectorizer in my old blogs.

Cosine similarity, or the cosine kernel, computes similarity as the normalized dot product of X and Y:

sklearn.metrics.pairwise.cosine_similarity(X, Y=None, dense_output=True)

Parameters

- X: {ndarray, sparse matrix} of shape (n_samples_X, n_features). Input data.
- Y: {ndarray, sparse matrix} of shape (n_samples_Y, n_features), default=None. Input data. If None, the output will be the pairwise similarities between all samples in X.
- dense_output: bool, default=True. Whether to return dense output even when the input is sparse. If False, the output is sparse if both input arrays are sparse.

Returns

- kernel matrix: ndarray of shape (n_samples_X, n_samples_Y)

import numpy as np
import warnings
warnings.filterwarnings('ignore')
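Before using sklearn's implementation, it helps to see the formula itself. A minimal stdlib sketch of cosine similarity between two toy term-count vectors (an illustration of the formula, not sklearn's code):

```python
import math

def cosine(x, y):
    # cos(x, y) = (x . y) / (||x|| * ||y||)
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm

# Two toy term-count vectors, e.g. counts of three words in two sentences
print(cosine([1, 1, 0], [1, 1, 1]))  # 2/sqrt(6) ≈ 0.8165
```

Identical vectors score 1.0, vectors with no shared terms score 0.0; sklearn's `cosine_similarity` computes the same quantity for every pair of rows at once.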

Tokenization is already explained in my blog. Here we take data from a healthcare website:

article=Article("https://www.mayoclinic.org/diseases-conditions/chronic-kidney-disease/symptoms-causes/syc-20354521")
article.download()
article.parse()
article.nlp()
corpus=article.text
print(corpus)

#tokenization (sent_tokenize needs the nltk 'punkt' model)
nltk.download('punkt', quiet=True)
text=corpus
sentence_list=nltk.sent_tokenize(text) #A list of sentences

#Print the list of sentences
print(sentence_list)
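nltk's sent_tokenize uses a trained Punkt model to find sentence boundaries. As a rough illustration of what sentence tokenization does, here is a naive stdlib approximation (it mishandles abbreviations like "Dr.", which is exactly why nltk's trained model exists):

```python
import re

def naive_sent_tokenize(text):
    # Split after sentence-ending punctuation followed by whitespace
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

text = "Kidney disease can progress slowly. Symptoms may appear late. See a doctor!"
print(naive_sent_tokenize(text))
```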

Once you have the corpus ready, you have to think about messages a user or customer may send that have no relation to the content we have.

These can be greetings, expressions of gratitude, or goodbye messages. The team needs to brainstorm on such messages and their responses.

I tried to cover a few here.

Greeting bot response:

#Random response to greeting
def greeting_response(text):
    text=text.lower()

    #Bot's greetings
    bot_greetings=["howdy","hi","hola","hey","hello"]

    #User greetings
    user_greetings=["wassup","howdy","hi","hola","hey","hello"]
    for word in text.split():
        if word in user_greetings:
            return random.choice(bot_greetings)

Gratitude bot response:

#Random response to gratitude
def gratitude_response(text):
    text=text.lower()

    #Bot's gratitude
    bot_gratitude=["Glad to help","You are most welcome","Pleasure to be of help"]

    #User gratitude
    user_gratitude=["Thankyou so much","grateful","Thankyou","thankyou","thank you"]

    for word in text.split():
        if word in user_gratitude:
            return random.choice(bot_gratitude)
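Both responders follow the same pattern: scan the user's tokens for a trigger word and, on a hit, return a random canned reply. A self-contained sketch of that pattern (generic trigger/reply lists here, not the bot's exact ones):

```python
import random

def keyword_response(text, triggers, replies):
    # Return a random canned reply if any trigger word appears in the text
    for word in text.lower().split():
        if word in triggers:
            return random.choice(replies)
    return None  # caller falls through to the next responder

greetings = ["hi", "hello", "hey"]
replies = ["howdy", "hello", "hola"]
print(keyword_response("Hey doc", greetings, replies))
```

Note that matching whole whitespace-split tokens means "hello!" (with punctuation) will not match, the same limitation the bot's own responders have.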

Sorting list

#Sort indices of list_var from highest to lowest value
def index_sort(list_var):
    length=len(list_var)
    list_index=list(range(0,length))
    x=list_var
    for i in range(length):
        for j in range(length):
            if x[list_index[i]]>x[list_index[j]]:
                #swap
                temp=list_index[i]
                list_index[i]=list_index[j]
                list_index[j]=temp

    return list_index
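As a quick sanity check, index_sort returns the indices of the scores ordered from highest to lowest (the function is repeated here so the snippet runs on its own):

```python
def index_sort(list_var):
    # Return indices of list_var ordered from highest to lowest value
    length = len(list_var)
    list_index = list(range(0, length))
    x = list_var
    for i in range(length):
        for j in range(length):
            if x[list_index[i]] > x[list_index[j]]:
                temp = list_index[i]
                list_index[i] = list_index[j]
                list_index[j] = temp
    return list_index

scores = [0.1, 0.9, 0.0, 0.5]
print(index_sort(scores))  # [1, 3, 0, 2]
```

The same result could be had with `sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)`; the hand-rolled loop just makes the swap logic explicit.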

Next, the chatbot response function, which uses cosine similarity against the corpus sentences to pick a reply.

#Create bot's response
def bot_response(user_input):
    user_input=user_input.lower()
    sentence_list.append(user_input)
    bot_response=""
    cm=CountVectorizer().fit_transform(sentence_list)
    similarity_scores=cosine_similarity(cm[-1],cm)
    similarity_scores_list=similarity_scores.flatten()
    index=index_sort(similarity_scores_list)
    index=index[1:]  #drop the self-match of the user input with itself
    response_flag=0
    j=0
    for i in range(len(index)):
        if similarity_scores_list[index[i]]>0.0:
            bot_response=bot_response+' '+sentence_list[index[i]]
            response_flag=1
            j=j+1
        if j>2:
            break

    if response_flag==0:
        bot_response=bot_response+" "+"I apologize, I don't understand"

    sentence_list.remove(user_input)

    return bot_response
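The retrieval step above can be illustrated without sklearn. A minimal dependency-free sketch using a hand-rolled bag-of-words cosine on toy sentences (not the Mayo Clinic corpus), showing what CountVectorizer plus cosine_similarity compute together:

```python
import math
import re
from collections import Counter

def bow_cosine(a, b):
    # Cosine similarity between two sentences' bag-of-words count vectors
    va = Counter(re.findall(r"[a-z']+", a.lower()))
    vb = Counter(re.findall(r"[a-z']+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

corpus_sentences = [
    "Chronic kidney disease involves a gradual loss of kidney function.",
    "Symptoms develop over time if kidney damage progresses slowly.",
    "High blood pressure is a common cause of kidney disease.",
]
query = "common cause of kidney disease"
# Like bot_response: score every sentence against the query, answer with the best match
best = max(corpus_sentences, key=lambda s: bow_cosine(query, s))
print(best)
```

bot_response does the same thing at scale, and stitches together the top few matches instead of just one.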

To exit the chat, a list of exit words is defined: 'exit', 'bye', 'see you later', 'quit'.

In response to these words, the chatbot will exit.

Start the chatbot and enjoy!

#Start chat
print("Doc Bot: I am Doc Bot and I will answer your queries about chronic kidney disease. If you want to exit, type bye.")

exit_list=['exit','bye','see you later','quit']

while(True):
    user_input=input()
    if user_input.lower() in exit_list:
        print("Doc Bot: Bye Bye, see you later")
        break
    elif greeting_response(user_input)!=None:
        print("Doc Bot: "+greeting_response(user_input))
    elif gratitude_response(user_input)!=None:
        print("Doc Bot: "+gratitude_response(user_input))
    else:
        print("Doc Bot: "+bot_response(user_input))

See the responses from the chatbot below:



It is important to notice that "Thanks" was not in the user_gratitude list, hence the fallback message. Over time you can enlarge such vocabularies, or use regular expressions to fine-tune the matching.

Conclusion:

This is a small example to get you started with chatbots that are fast and simple. You will need to fine-tune chatbots for different industries, where the corpus comes from live data or from storage in the cloud.

Remember that live data has its own challenges, and the chat has to be answered from the latest data; ticket booking at a travel agency is one example.

Thanks for reading!
