One-week MVP sprint board for the Sharif OCW Scrapy downloader.
Project Goal: Deliver an MVP Scrapy-based downloader for Sharif OCW that:
- Fetches course metadata and sessions
- Downloads all downloadable files
- Organizes outputs into structured folders
- Provides progress tracking and basic error handling
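As a rough illustration of the "structured folders" goal, the sketch below maps one downloadable file to a course/session/file path. The helper names (`sanitize_name`, `build_output_path`), the `NN - title` session folder pattern, and the 120-character limit are assumptions made for this example, not the project's actual layout.

```python
import re
import unicodedata
from pathlib import Path

# Characters that are invalid in file names on at least one common OS.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_name(name: str, max_length: int = 120) -> str:
    """Return a filesystem-safe version of `name` (hypothetical helper)."""
    name = unicodedata.normalize("NFC", name)
    name = _UNSAFE.sub("_", name)
    # Collapse whitespace runs, truncate, then drop leading/trailing dots and spaces.
    name = re.sub(r"\s+", " ", name)[:max_length].strip(" .")
    return name or "untitled"


def build_output_path(root: Path, course: str, session_no: int,
                      session_title: str, filename: str) -> Path:
    """Map one file to <root>/<course>/<NN - session title>/<filename>."""
    session_dir = f"{session_no:02d} - {sanitize_name(session_title)}"
    return root / sanitize_name(course) / session_dir / sanitize_name(filename)


if __name__ == "__main__":
    print(build_output_path(Path("downloads"), "Algorithms", 3,
                            "Graphs: BFS", "slides.pdf"))
```

Zero-padding the session number keeps folders in lecture order when sorted by name. Note that plain truncation can cut off a file extension on very long names, so a real implementation would probably treat the suffix separately.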
Success Criteria:
- Able to download at least one complete course (videos + PDFs)
- Correct directory structure with sanitized filenames
- Basic duplicate detection + retry handling works
- GitHub issues, milestones, and PRs follow roadmap
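The duplicate-detection and retry criteria can be sketched with the standard library alone. Scrapy already ships a retry middleware and a request-level duplicate filter, so the code below only illustrates the idea; `DuplicateFilter`, `with_retries`, and their parameters are hypothetical names, and hashing file content with SHA-256 is an assumed strategy rather than the project's confirmed one.

```python
import hashlib
import time
from typing import Callable, Set, TypeVar

T = TypeVar("T")


class DuplicateFilter:
    """Remember content hashes so the same payload is stored only once."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def is_duplicate(self, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False


def with_retries(fetch: Callable[[], T], attempts: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fetch` up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            # Network errors such as ConnectionError are OSError subclasses.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter lets tests replace the real delay with a recorder, so the backoff schedule can be checked without waiting.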
Team Size: 1 developer (solo)
Roles & Responsibilities:
- Developer: Implement, test, document, manage repo, and review
Definition of Done (DoD):
- Code compiles and runs without errors
- Passes basic integration tests on one sample course
- Artifacts stored in correct directory structure
- Pull requests merged into main with review checklist passed
Prerequisites:
- Python 3.11+
- pip, uv, or another Python package manager
# Clone the repository
git clone https://github.com/Mdevpro78/sharif-ocw-scrapy-downloader/
cd sharif-ocw-scrapy-downloader
# Install dependencies using UV (recommended)
make uv_sync_docs
# Or using pip
pip install -e .[docs]
# Start the development server
make uv_mkdoc_serve
# Or using UV directly
uv run mkdocs serve
MkDocForge can be easily run using Docker:
# Build the Docker image
make docker-build
# Start the container
make docker-up
# View logs
make docker-logs
# Stop the container
make docker-down
Alternatively, you can use Docker Compose directly:
# Build and start in one command
docker compose up
# Or build and start in detached mode
docker compose up -d
Once running, access the documentation at http://localhost:8000.
Top-level directories:
| Path | Purpose |
|---|---|
| docs/ | 📚 Docs: guidelines, roadmap, static assets |
| src/ | 🧩 Source (Scrapy project + package) |
| .github/ | ⚙️ CI/CD workflows |
| scripts/ | 🧰 Utility scripts |
Scrapy project files:
| Path | Role |
|---|---|
| src/scrapy.cfg | Scrapy config |
| src/sharif_ocw_downloader/config.py | Configuration management |
| src/sharif_ocw_downloader/items.py | Item definitions (data models) |
| src/sharif_ocw_downloader/middlewares.py | Middleware components |
| src/sharif_ocw_downloader/pipelines.py | Item pipelines (process/store) |
| src/sharif_ocw_downloader/settings.py | Scrapy settings |
| src/sharif_ocw_downloader/spiders/ | Spider implementations |
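To make the role of settings.py concrete, here is a hypothetical excerpt. Every key is a standard Scrapy setting, but the values (concurrency, delay, retry count, the `downloads` folder) are assumptions chosen for illustration and are not taken from the repository.

```python
# Hypothetical excerpt of src/sharif_ocw_downloader/settings.py.
# All values below are illustrative assumptions, not the project's real config.
BOT_NAME = "sharif_ocw_downloader"
SPIDER_MODULES = ["sharif_ocw_downloader.spiders"]
NEWSPIDER_MODULE = "sharif_ocw_downloader.spiders"

# Be polite to the server: respect robots.txt and throttle requests.
ROBOTSTXT_OBEY = True
CONCURRENT_REQUESTS = 4
DOWNLOAD_DELAY = 1.0

# Built-in retry middleware: retry failed requests a few times.
RETRY_ENABLED = True
RETRY_TIMES = 3

# Built-in files pipeline: downloads URLs listed in an item's `file_urls`
# field and stores them under FILES_STORE.
ITEM_PIPELINES = {"scrapy.pipelines.files.FilesPipeline": 1}
FILES_STORE = "downloads"
```

With the built-in `FilesPipeline`, downloaded files are named by a hash of their URL by default, so a project that needs a readable folder structure would typically subclass the pipeline in pipelines.py and override its `file_path` method.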
Root files:
| File | Purpose |
|---|---|
| Dockerfile | 🐳 Build Docker image |
| docker-compose.yml | Orchestrate services |
| Makefile | Common automation tasks |
| mkdocs.yml | MkDocs site config |
| pyproject.toml | Python project metadata/config |
| cliff.toml | git-cliff (changelog) config |
| requirements.lock | Locked production deps |
| requirements-dev.lock | Locked development deps |
MkDocForge is highly configurable through the mkdocs.yml file. See the MkDocs documentation for basic configuration and explore our examples for advanced setups.
Contributions are welcome! Please check out our Contributing Guide for guidelines on how to make contributions.
- Development Teams: Create comprehensive documentation for software projects
- Technical Writers: Leverage markdown with powerful extensions for technical content
- Open Source Projects: Provide high-quality documentation with minimal overhead
- Organizations: Maintain consistent documentation standards across projects
This project is licensed under the MIT License - see the LICENSE file for details.