Add NLP pipeline, PDF highlighting, and CI/CD integration #4

Merged: Ishiezz merged 4 commits into main from feature/nlp-pdf-highlighting on Apr 18, 2026

Ishiezz (Collaborator) commented on Apr 18, 2026

Summary

Implements the NLP Systems & Deployment module.

Features

  • Semantic chunking using RecursiveCharacterTextSplitter
  • PDF spatial highlighting using PyMuPDF
  • Streamlit session state handling for stable UI
  • Export pipeline for highlighted PDFs
  • GitHub Actions CI/CD workflow
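
For readers unfamiliar with the chunking strategy named above, here is a minimal pure-Python sketch of the recursive, separator-based idea behind RecursiveCharacterTextSplitter. It is not LangChain's implementation and not the code in this PR: the function name `recursive_split`, the separator order, and the default `chunk_size` are assumptions chosen for illustration, and unlike the library it drops the separator at chunk boundaries and has no overlap.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Coarse separators (paragraphs) are tried first; finer ones are used
    only for pieces that are still too long. Hypothetical helper.
    """
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to hard cuts at chunk_size.
        cuts = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        return [c for c in cuts if c.strip()]

    sep, rest = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # Keep merging neighbouring pieces while they still fit.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, rest))
    if current:
        chunks.append(current)
    return [c.strip() for c in chunks if c.strip()]
```

Short paragraphs stay together in one chunk, which is what keeps chunks roughly coherent; only oversized pieces are broken at line, sentence, or word boundaries.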

Notes

  • Uses partial matching for robust PDF highlighting
  • Page-level filtering improves performance
  • CI pipeline avoids dependency on large model files

Ishiezz merged commit 331906f into main on Apr 18, 2026
1 check passed
