Chatbot to answer previously answered queries for your company
This is an end-to-end LLM project built with the Google Gemini API, LangChain, and Hugging Face. It is a Q&A system that provides a Streamlit-based user interface where people can ask questions and get answers (if their queries are already present in the database).
- Uses a real CSV file of FAQs that the Codebasics company is using right now.
- Their human staff use this file to assist their course learners.
- We will build an LLM-based question-and-answer system that can reduce the workload of their human staff.
- Students should be able to use this system to ask questions directly and get answers within seconds.
- LangChain + Google Gemini API (free): LLM-based Q&A
- Streamlit: UI
- Hugging Face Instructor embeddings: text embeddings
- FAISS: vector database
1. Clone this repository to your local machine:
```bash
git clone https://github.com/OmSDeshmukh/FAQ-Assistant
```
2. Navigate to the project directory:
```bash
cd FAQ-Assistant
```
- Create a virtual environment:
```bash
python3 -m venv venv
```
- Activate the virtual environment:
```bash
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
```
- Install the required dependencies using pip:
```bash
pip install -r requirements.txt
```
- Acquire an API key from makersuite.google.com and put it in a `.env` file:
```
GOOGLE_API_KEY="the_api_key_here"
```
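The app is expected to read this key from the environment at startup (typically via `python-dotenv`). To illustrate the file format, here is a minimal stdlib-only sketch of a `.env` parser; the `load_env` helper is hypothetical and not part of this project:

```python
import os

def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv():
    reads KEY=value lines, ignoring blank lines and # comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip optional surrounding quotes from the value
            os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))

# Usage: load_env(); api_key = os.environ["GOOGLE_API_KEY"]
```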
- Run the Streamlit app by executing:
```bash
streamlit run main.py
```
2. The web app will open in your browser.
- To create a database of FAQs, click the Create Database button. It will take some time before the knowledge base is created, so please wait.
- Once the knowledge base is created, you will see a directory called `faiss_index` in your current folder.
- Now you are ready to ask questions. Type your question in the Question box and hit Enter.
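Under the hood, answering works by embedding the incoming question, finding the most similar stored FAQ question in the FAISS index, and handing that context to the LLM. The idea can be sketched in pure Python, using word-overlap similarity as a crude stand-in for Instructor embeddings; the function names and threshold below are illustrative, not the project's actual code:

```python
def similarity(a: str, b: str) -> float:
    """Jaccard word overlap -- a crude stand-in for embedding cosine similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def answer(question: str, faqs: list, threshold: float = 0.2) -> str:
    """Return the stored answer for the closest FAQ, or a fallback."""
    best_q, best_a = max(faqs, key=lambda qa: similarity(question, qa[0]))
    return best_a if similarity(question, best_q) >= threshold else "I don't know."

faqs = [
    ("Do you provide a certificate?", "Yes, on course completion."),
    ("Is there a refund policy?", "Refunds are available within 30 days."),
]
print(answer("Will I get a certificate?", faqs))  # -> Yes, on course completion.
```

The real system replaces `similarity` with Instructor embeddings indexed by FAISS, and passes the retrieved context through the Gemini LLM rather than returning the stored answer verbatim.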
- `src/main.py`: The main Streamlit application script.
- `src/langchain_helper.py`: Builds the chain used for final inference.
- `src/get_vector_db.py`: Takes the CSV file and saves the vector database to disk for faster retrieval.
- `src/prompt_template.py`: Stores the prompt template given as input to the LLM.
- `notebooks/main.ipynb`: The entire project's code in a single notebook, for testing purposes.
- `data/demo.csv`: The CSV file used for loading data.
- `requirements.txt`: A list of required Python packages for the project.
- `.env`: Configuration file for storing your Google API key.
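For a sense of what `src/prompt_template.py` might hold, here is a hedged sketch of a retrieval-style prompt with `{context}` and `{question}` placeholders; the actual wording in the repository may differ:

```python
# Illustrative sketch only -- the real template in src/prompt_template.py may differ.
PROMPT_TEMPLATE = """Given the following context and a question, generate an answer
based only on this context. If the answer is not found in the context, say
"I don't know." Do not try to make up an answer.

CONTEXT: {context}

QUESTION: {question}"""

prompt = PROMPT_TEMPLATE.format(
    context="Q: Do you provide a certificate? A: Yes, on course completion.",
    question="Will I get a certificate?",
)
```

Grounding the model in retrieved context this way is what lets the system refuse to answer questions that are not in the FAQ database.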
A huge thanks to the codebasics YouTube channel for this valuable project idea. Here are the video link and GitHub repository.