SemanticPDF: Drag, Drop, Semantic Search - SemanticPDF is a simple, privacy-focused application that makes it easy to upload a PDF file and perform a semantic search on its contents.
Updated Apr 4, 2024 - TypeScript
Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval
Python program for searching PDF text, ranking the results, and exporting highlighted search results as a PDF. Uses a trie, a stack, a heap, and a page graph. Converts queries to postfix notation. Supports logical expressions and phrases. Offers "did you mean" suggestions.
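The most full-featured PDF component for Vue, supporting rendering, page number extraction and navigation, file-load-complete listening, page-change listening, text search, keyword highlighting, and table-of-contents extraction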
Use semantic search on PDFs locally
DocuVisQA (Document Visual Question Answering) is a Python project that leverages Google's Generative AI and LangChain for document processing, text splitting, and question answering. It also supports image processing, with Streamlit providing an interactive UI.
CLI for merging PDF contexts.
In Development
A web interface that allows searching for PDFs by their content
PDF Parser built in Rust
A document indexing daemon that can populate Elasticsearch indexes with the contents and metadata of a number of document types, including PDFs, image scans, etc. Used to power Facile Search, but can be reused for anything that requires search indexing for scanned documents.
Given a set of PDFs and a query, finds the most relevant PDF using TF-IDF. The code implements TF-IDF without using any library.
Resume search application using OpenAI RAG and file search. A demo application that shows the power of OpenAI's RAG for simplifying resume screening. Open source VLM model example to follow.
Are you short on time?! Can't you search all the PDFs one by one for the content you want?! Well, PDF-Founder is here...
A tool to search for text in PDF files using multiple methods, including OCR (Optical Character Recognition).
A powerful AI-powered PDF search and question-answering system built with LangChain, Pinecone Vector Store, OpenAI, and Supabase. Upload PDFs, ask questions, and get intelligent answers with persistent conversation memory.
Program that searches for a list of names of parties to legal proceedings in the PDFs of the Diário Oficial (Official Gazette).
A high-performance RAG system for PDFs using multi-vector embeddings (ColPali / ColQwen / ColSmol) with vector search in Qdrant, prefetch optimization, and reranking for improved relevance. Designed for speed, accuracy, and scalability, this system is ideal for building intelligent search, document understanding, and QA applications.
PHP website that indexes all PDF content and provides an easy way to find any text
Repository for the Indexing, Search and Evaluation of UniChemFinder
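For readers unfamiliar with the library-free TF-IDF ranking mentioned in one of the entries above, here is a minimal Python sketch of the idea. It is not taken from any listed repository: the function names `tokenize` and `rank_documents`, the `+ 1.0` IDF smoothing, and the assumption that PDF text has already been extracted into plain strings are all illustrative choices.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def rank_documents(docs, query):
    """Return document names sorted by TF-IDF score for the query, best first.

    docs: dict mapping a document name to its already-extracted text
          (hypothetical input shape; PDF text extraction is out of scope here).
    """
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        score = 0.0
        for term in query_terms:
            if term not in counts:
                continue
            tf = counts[term] / total
            # Assumed smoothing: +1 keeps terms found in every document from scoring 0.
            idf = math.log(n_docs / df[term]) + 1.0
            score += tf * idf
        scores[name] = score

    return sorted(scores, key=scores.get, reverse=True)
```

Calling `rank_documents({"a.pdf": "...", "b.pdf": "..."}, "some query")` returns the document names ordered from most to least relevant. Other TF-IDF variants weight term frequency or smooth IDF differently, so scores from this sketch are only comparable within a single run.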