Welcome to the Word Predictor project...
This project is a simple language modeling application that learns word bigrams, and optionally trigrams, from a given text source. Given a user-provided word or sentence, it predicts the most likely next word. It serves as a starting point for basic language processing tasks such as autocomplete or text generation.
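As a rough illustration of the idea described above, the following minimal sketch cleans a text, counts adjacent word pairs, and looks up the most frequent follower of a word. The function names (`cleanText`, `buildBigrams`, `mostLikelyNext`) and the alphabetical tie-breaking rule are assumptions made for this example; they are not taken from the project's source.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// cleanText lowercases the input and drops punctuation, keeping only
// letters, digits, and whitespace. (Hypothetical helper for illustration.)
func cleanText(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) || unicode.IsSpace(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// buildBigrams counts how often each word is immediately followed by
// each other word.
func buildBigrams(words []string) map[string]map[string]int {
	counts := make(map[string]map[string]int)
	for i := 0; i+1 < len(words); i++ {
		prev, next := words[i], words[i+1]
		if counts[prev] == nil {
			counts[prev] = make(map[string]int)
		}
		counts[prev][next]++
	}
	return counts
}

// mostLikelyNext returns the most frequent follower of word.
// Ties are broken alphabetically (an assumption) so results are deterministic.
func mostLikelyNext(counts map[string]map[string]int, word string) (string, bool) {
	best, bestCount := "", 0
	for next, n := range counts[word] {
		if n > bestCount || (n == bestCount && next < best) {
			best, bestCount = next, n
		}
	}
	return best, bestCount > 0
}

func main() {
	text := "The cat sat. The cat ran! The dog sat."
	counts := buildBigrams(strings.Fields(cleanText(text)))
	if next, ok := mostLikelyNext(counts, "the"); ok {
		fmt.Println("the ->", next) // prints "the -> cat"
	}
}
```

In this sample text, "cat" follows "the" twice and "dog" follows it once, so "cat" is the prediction. A word that never appears in the training text yields no prediction.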
- Text Cleaning: Input text is converted to lowercase and stripped of punctuation, so that variants such as "The" and "the," are counted as the same word. This improves prediction accuracy.
- Bigram Generation: Builds a basic language model by counting how often each pair of adjacent words occurs.
- Trigram Support (Optional): Supports trigrams to make more contextual predictions by considering the two previous words. This feature can be enabled or disabled via `config.json`.
- Chained Prediction: After a user-provided word or sentence, it can predict a chain of words based on the configured `predictionChainLength`.
- Configuration Management: Application settings (input file path, prediction chain length, trigram usage) can be easily managed through the `config.json` file.
- Modular Structure: The code is divided into separate packages (config, textprocessor, languagemodel, utils) with distinct responsibilities, making it easier to read and maintain.
- Robust Error Handling: Unexpected conditions such as file read errors or configuration issues are reported to the user with clear messages.
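The trigram and chained-prediction features above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the `model` type, the `train`, `top`, and `predictChain` names, the space-joined trigram key, the fallback from trigram to bigram when a word pair was never seen, and the alphabetical tie-breaking are all hypothetical choices made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// model holds follower counts keyed by one word (bigrams) or by
// "word1 word2" (trigrams). All names here are illustrative.
type model struct {
	bigrams  map[string]map[string]int
	trigrams map[string]map[string]int
}

// bump increments the count of next following key.
func bump(m map[string]map[string]int, key, next string) {
	if m[key] == nil {
		m[key] = make(map[string]int)
	}
	m[key][next]++
}

// train counts bigram and trigram followers over the word sequence.
func train(words []string) *model {
	m := &model{
		bigrams:  make(map[string]map[string]int),
		trigrams: make(map[string]map[string]int),
	}
	for i := 0; i+1 < len(words); i++ {
		bump(m.bigrams, words[i], words[i+1])
		if i+2 < len(words) {
			bump(m.trigrams, words[i]+" "+words[i+1], words[i+2])
		}
	}
	return m
}

// top returns the most frequent follower; ties break alphabetically.
func top(followers map[string]int) (string, bool) {
	best, bestN := "", 0
	for w, n := range followers {
		if n > bestN || (n == bestN && w < best) {
			best, bestN = w, n
		}
	}
	return best, bestN > 0
}

// predictChain predicts up to chainLength words after the given context.
// With useTrigrams set, it tries the last two words first and falls back
// to the last word alone when that pair was never seen (an assumption).
func (m *model) predictChain(context []string, chainLength int, useTrigrams bool) []string {
	ctx := append([]string(nil), context...) // copy so the caller's slice is untouched
	var out []string
	for i := 0; i < chainLength && len(ctx) > 0; i++ {
		next, ok := "", false
		if useTrigrams && len(ctx) >= 2 {
			next, ok = top(m.trigrams[ctx[len(ctx)-2]+" "+ctx[len(ctx)-1]])
		}
		if !ok {
			next, ok = top(m.bigrams[ctx[len(ctx)-1]])
		}
		if !ok {
			break // no known follower: stop the chain early
		}
		out = append(out, next)
		ctx = append(ctx, next)
	}
	return out
}

func main() {
	text := "i like black coffee and i like black tea and you like green apples " +
		"and we like green pears and they like green grapes"
	m := train(strings.Fields(text))
	fmt.Println(m.predictChain([]string{"i", "like"}, 2, true))  // prints "[black coffee]"
	fmt.Println(m.predictChain([]string{"i", "like"}, 2, false)) // prints "[green apples]"
}
```

With trigrams enabled, the pair "i like" is followed by "black" in both of its occurrences, so the chain starts with "black". With bigrams only, "green" is the most frequent follower of "like" overall, so the same input produces a different chain.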
- Go must be installed (version 1.16 or later is recommended).
- Clone or Download the Project: Download or copy the project files to your local machine.

- Ensure Folder Structure: Make sure your project directory follows this structure:

      project_folder/
      ├── main.go
      ├── internal/
      │   ├── config/
      │   │   └── config.go
      │   ├── textprocessor/
      │   │   └── text_processing.go
      │   ├── languagemodel/
      │   │   └── language_model.go
      │   └── utils/
      │       └── utils.go
      ├── config.json
      └── input.json

- Initialize the Go Module: In a terminal, navigate to the root directory of your project (`project_folder`) and initialize the Go module:

      go mod init bigram

  Here `bigram` is the module name; you can use a different name if preferred. If you do, make sure any import paths in the source that begin with the module name are updated to match. This command creates a `go.mod` file in your project root.

- Download (or Verify) Required Dependencies: While still in the project root (where `go.mod` is located), run:

      go mod tidy

  This command tidies up your `go.mod` file and ensures that all necessary packages are downloaded and up to date.

- Prepare the `input.json` File: The `input.json` file should contain the text used to train the model. A sample `input.json` is included with the project. Its structure should look like this:

      {
        "text": "This is the entire text content that will be placed here. The model will learn from this text."
      }

- Configure the `config.json` File: The `config.json` file controls the behavior of the application:

      {
        "inputFilePath": "input.json",
        "predictionChainLength": 3,
        "useTrigrams": true
      }

  The fields are:

  - `inputFilePath`: Path to the input text file.
  - `predictionChainLength`: Number of words to predict.
  - `useTrigrams`: Whether trigrams are enabled (`true` or `false`).

  You can adjust these values based on your needs. Standard JSON does not allow comments, so keep descriptions like these out of the file itself.

- Run the Application: From the project root (where the `bigram` module is defined), run the application:

      go run .

  The application will prompt you to enter a word or sentence:

      Enter a word or sentence:

  Type your input and press `Enter`. The application then displays the predicted word sequence.
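To make the two JSON files from the steps above more concrete, here is a minimal sketch of how they could be decoded with Go's standard `encoding/json` package. The JSON keys come from the examples above; the `Config` and `Input` struct names, the `parseConfig` and `parseInput` helpers, and the validation rules are assumptions for illustration and may differ from what the project's `config` package actually does.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// Config mirrors the keys shown in config.json. (Struct name is hypothetical.)
type Config struct {
	InputFilePath         string `json:"inputFilePath"`
	PredictionChainLength int    `json:"predictionChainLength"`
	UseTrigrams           bool   `json:"useTrigrams"`
}

// Input mirrors the single "text" key shown in input.json.
type Input struct {
	Text string `json:"text"`
}

// parseConfig decodes config.json content and applies basic validation.
// The validation rules are assumptions, not documented project behavior.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parsing config: %w", err)
	}
	if cfg.InputFilePath == "" {
		return Config{}, errors.New("config: inputFilePath must not be empty")
	}
	if cfg.PredictionChainLength < 1 {
		return Config{}, errors.New("config: predictionChainLength must be at least 1")
	}
	return cfg, nil
}

// parseInput decodes input.json content and rejects an empty text.
func parseInput(data []byte) (Input, error) {
	var in Input
	if err := json.Unmarshal(data, &in); err != nil {
		return Input{}, fmt.Errorf("parsing input: %w", err)
	}
	if in.Text == "" {
		return Input{}, errors.New("input: text must not be empty")
	}
	return in, nil
}

func main() {
	data, err := os.ReadFile("config.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read config.json:", err)
		return
	}
	cfg, err := parseConfig(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	data, err = os.ReadFile(cfg.InputFilePath)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read input file:", err)
		return
	}
	in, err := parseInput(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("loaded %d bytes of text; chain length %d; trigrams %v\n",
		len(in.Text), cfg.PredictionChainLength, cfg.UseTrigrams)
}
```

Note that `json.Unmarshal` rejects `//` comments, which is why a strictly parsed `config.json` should contain only the keys and values.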