Ask My PDF
Client
Personal Project
Services
Ruby on Rails, React.js + Tailwind, ML Embeddings, ML Completions
Industries
AI
Date
Oct 2023
Preview Project
About This Project:
The "Ask My PDF" app is a Ruby-based tool designed to make PDF files more interactive and searchable.
At its core, the app reads the content of a PDF file and converts it into "embeddings," which are essentially numerical representations of the text. These embeddings are then stored in a CSV file.
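The indexing step described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the caller supplies the embedding function (in practice a call to OpenAI's embeddings endpoint), and each page becomes one CSV row holding the page number, its text, and the vector serialized as a space-separated string.

```ruby
require "csv"

# Store one row per PDF page: page number, raw text, and its embedding.
# The embedding function is passed in as a block (hypothetical wrapper
# around the OpenAI embeddings API in the real app).
def store_embeddings(pages, path, &embed)
  CSV.open(path, "w") do |csv|
    csv << %w[page content embedding]
    pages.each_with_index do |text, i|
      vector = embed.call(text)
      csv << [i + 1, text, vector.join(" ")]
    end
  end
end

# Reload the stored rows, parsing each embedding back into floats.
def load_embeddings(path)
  CSV.read(path, headers: true).map do |row|
    { page: row["page"].to_i,
      content: row["content"],
      embedding: row["embedding"].split.map(&:to_f) }
  end
end
```

Serializing vectors into a single CSV column keeps the whole index in one flat file, which matches the project's "one embeddings + content file" decision below.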
When you ask a question, the app also generates an embedding for that question and compares it to the embeddings from the PDF. Based on this comparison, the app identifies the most relevant pages in the PDF where you're likely to find the answer to your question. Once it finds the most relevant pages, it feeds them into gpt-3.5-turbo-instruct along with the question to get a relevant response. It's like having a search engine tailored specifically for your PDF!
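The comparison step can be sketched like this (assumptions: `rows` is the loaded index of page/content/embedding hashes, and the question's embedding is computed by the same API as the pages'; the final completion call is omitted). Since OpenAI embeddings are normalized to unit length, a plain dot product ranks pages by cosine similarity.

```ruby
# Dot product of two equal-length vectors; with unit-length embeddings
# this is the cosine similarity.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Return the k pages whose embeddings are most similar to the question's.
def top_pages(rows, question_embedding, k = 2)
  rows.max_by(k) { |row| dot(row[:embedding], question_embedding) }
end

# Assemble the prompt sent to gpt-3.5-turbo-instruct: the most relevant
# page contents as context, followed by the user's question.
def build_prompt(relevant_rows, question)
  context = relevant_rows.map { |r| r[:content] }.join("\n\n")
  "Answer the question using the context below.\n\n" \
  "Context:\n#{context}\n\nQuestion: #{question}\nAnswer:"
end
```

`Enumerable#max_by(k)` handles the ranking without sorting the whole index, which is plenty for a per-PDF page count.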
Here are some insights/decisions about the project:
Using a single file for both embeddings and page content
Using gpt-3.5-turbo-instruct instead of text-davinci-003 as it's more recent and has the same capabilities
Tried a few react + rails setups, and ended up having the most success with reactjs/react-rails
Using sqlite3 for caching the questions because I don't need more than that
Using fly.io for hosting because it's free and simple to deploy
Learned about how embeddings work on a basic math level (vectors + dot product)
First time using Ruby or embeddings, so I used GPT-4 for assistance / learning
I use Laravel at my day job so it was easier to understand how Ruby on Rails works
Things I'd do differently next time:
Better Tailwindcss integration: Spend more time configuring tailwind (currently just importing tailwind v2 via <link>)
Paid subscription model for authors: Allow users to login + upload + generate embeddings for their own PDF under a subscription model and then expose it to the public (any author can upload their books to make them searchable by their audience)
More context for completions: Pre-convert each page into a summary using gpt-3.5-turbo-instruct, so each page's content takes up fewer tokens and more pages fit into the final completion. A page that is 2,000 tokens long might only take up 500 tokens as a summary, letting you pass up to 8 pages on a budget of 4,000 tokens per completion (in theory)
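The summarize-then-pack idea above boils down to greedily filling a token budget with page summaries. A rough sketch (everything here is illustrative: the summaries would come from gpt-3.5-turbo-instruct, and tokens are approximated as words / 0.75, the common rule of thumb for English text):

```ruby
# Very rough token estimate: ~0.75 words per token for English prose.
def approx_tokens(text)
  (text.split.size / 0.75).ceil
end

# Greedily pick page summaries, in relevance order, until the token
# budget for the completion's context is exhausted.
def pack_context(summaries, budget)
  picked = []
  used = 0
  summaries.each do |summary|
    cost = approx_tokens(summary)
    break if used + cost > budget
    picked << summary
    used += cost
  end
  picked
end
```

With 500-token summaries and a 4,000-token budget, this admits 8 pages of context instead of 2 full pages, matching the arithmetic in the point above.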