sabinsh/DataScienceCapstone

Next Word Prediction - Data Science Capstone

This repository contains a Next Word Prediction project built in R and Shiny. The goal is to create a predictive text model that suggests the next word given a phrase. The project was completed as the Data Science Capstone of the Johns Hopkins University Data Science Specialization on Coursera.

📅 Completed on October 11, 2016


📂 Repository Structure

.
├── ShinyApp/
│   └── NextWordPrediction/   # R/Shiny app for interactive prediction
├── SlideDeck/                # Project presentation slides
└── README.md                 # Project documentation

🚀 Getting Started

Prerequisites

  • R (version 4.0 or higher recommended)
  • R packages:
    • shiny
    • stringr
    • data.table
    • any other packages the app's code loads

Install packages in R:

install.packages(c("shiny", "stringr", "data.table"))
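
If some of these packages are already installed, a small check avoids reinstalling them. The following is a minimal sketch; `missing_packages` is a hypothetical helper name, not something defined in this repository.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("shiny", "stringr", "data.table")

# Install only what is missing, and only in an interactive session.
if (interactive()) {
  to_install <- missing_packages(required)
  if (length(to_install) > 0) install.packages(to_install)
}
```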

Running the Shiny App

  1. Navigate to the Shiny app folder:
    setwd("ShinyApp/NextWordPrediction")
  2. Run the app:
    library(shiny)
    runApp()

This will launch the interactive web app where you can type a phrase and get next-word predictions.


📖 Project Overview

The app predicts the next word using n-gram models trained on text datasets.
Given the phrase typed so far, it returns the most probable next words according to those models.

Key steps include:

  • Data preprocessing (cleaning, tokenization, removing punctuation/stopwords)
  • Building n-gram frequency models
  • Prediction logic based on most frequent n-grams
  • Shiny app interface for real-time predictions
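
The steps above can be illustrated with a small base-R sketch. It is not the code used in the app: the function names (`tokenize`, `build_bigrams`, `predict_next`) and the toy corpus are hypothetical, it only uses bigrams, and it skips stopword removal to keep the example short. It relies on base R alone so that it runs without extra packages.

```r
# Illustrative only: names and corpus are assumptions, not taken from the app.

# Preprocessing: lowercase, drop punctuation and digits, split on whitespace.
tokenize <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]+", " ", text)
  tokens <- strsplit(trimws(text), "\\s+")[[1]]
  tokens[nzchar(tokens)]
}

# Model building: count how often each word follows each other word.
build_bigrams <- function(sentences) {
  prev <- character(0)
  nxt <- character(0)
  for (s in sentences) {
    tok <- tokenize(s)
    if (length(tok) < 2) next
    prev <- c(prev, head(tok, -1))
    nxt <- c(nxt, tail(tok, -1))
  }
  if (length(prev) == 0) {
    return(data.frame(prev = character(0), nxt = character(0), n = integer(0)))
  }
  counts <- aggregate(
    data.frame(n = rep(1L, length(prev))),
    by = list(prev = prev, nxt = nxt),
    FUN = sum
  )
  # Most frequent continuation first within each preceding word.
  counts[order(counts$prev, -counts$n, counts$nxt), ]
}

# Prediction: look up the last typed word and return its top-k continuations.
predict_next <- function(model, phrase, k = 3) {
  tok <- tokenize(phrase)
  if (length(tok) == 0) return(character(0))
  hits <- model[model$prev == tok[length(tok)], ]
  head(as.character(hits$nxt), k)
}

corpus <- c("I love data science", "I love R", "I like data")
model <- build_bigrams(corpus)
predict_next(model, "I")  # "love" "like"
```

A fuller model would typically also store trigrams or longer n-grams and fall back to shorter ones when the longer context has not been seen.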

🎯 Features

  • Input a phrase and get next word suggestions
  • Lightweight Shiny interface
  • Supports interactive exploration of text prediction
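
As a rough picture of how such an interface can be wired up, here is a minimal Shiny sketch. It is an assumption-laden illustration, not the app in `ShinyApp/NextWordPrediction`: the input and output IDs and the `suggest_words` placeholder (which returns fixed words instead of querying an n-gram model) are hypothetical.

```r
library(shiny)

# Hypothetical stand-in for the real prediction logic: returns fixed
# suggestions for any non-empty phrase.
suggest_words <- function(phrase) {
  if (is.null(phrase) || !nzchar(trimws(phrase))) return(character(0))
  c("the", "to", "and")
}

ui <- fluidPage(
  titlePanel("Next Word Prediction"),
  textInput("phrase", "Enter a phrase:"),
  verbatimTextOutput("suggestions")
)

server <- function(input, output, session) {
  # Re-runs whenever the text input changes, giving real-time suggestions.
  output$suggestions <- renderText({
    words <- suggest_words(input$phrase)
    if (length(words) == 0) "Type a phrase to see suggestions"
    else paste(words, collapse = ", ")
  })
}

app <- shinyApp(ui, server)

# Launch only when run interactively.
if (interactive()) runApp(app)
```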

🖼️ Usage Example

Here’s how the Shiny app looks in action:

Shiny App Screenshot


📊 Slide Deck

The SlideDeck/ folder contains the presentation that explains:

  • Project motivation
  • Methodology
  • Model design
  • Results and challenges

🤝 Contributing

Contributions are welcome!

  1. Fork this repository
  2. Create a new branch (git checkout -b feature-branch)
  3. Commit your changes (git commit -m "Added new feature")
  4. Push to the branch (git push origin feature-branch)
  5. Open a Pull Request

📜 License

This project is licensed under the MIT License (or update with your actual license).
