Creating a Wordle Bot - Part 3 - Code
The code will consist of 3 main parts:
Part 1: Identifying the correct starting word
Part 2: Constructing the entire decision tree and exporting to JSON
Part 3: Creating a React app to traverse this tree
Part 1: Identifying the correct starting word
Step 1: Download two word lists: the list of all possible Wordle answer words and the list of all valid guess words. The first list is about 2,500 words long and the second is about 14,000 words long.
Step 2: For each guess word, create a nested dictionary by scoring the guess against every word in the list of possible words. Each entry consists of:
The match string, which is the key. It uses the notation “E” (exact match), “P” (present in the word but at a different position) and “N” (not present). For example, if the guess word is “crane” and the possible word is “chest”, the match string is “ENNNP”: the first letter “c” is the same in both words; “r”, “a” and “n” do not occur in the possible word; and “e” is present in the possible word but at a different position.
The count of possible words that generate this match string
The list of possible words that generate this match string
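The scoring in Step 2 can be sketched in a few lines of Python. This is a minimal illustration and not the exact code from the listings further down: the names match_string and bucket_by_match are made up for this example, and it tracks repeated letters with a Counter where the generate_pattern function in the listings uses a list.

```python
from collections import Counter, defaultdict


def match_string(guess, answer):
    """Return the E/P/N match string for `guess` scored against `answer`."""
    result = ["N"] * len(guess)
    unmatched = Counter()
    # First pass: mark exact matches and count the answer letters left over.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "E"
        else:
            unmatched[a] += 1
    # Second pass: a letter is "P" only while unmatched copies of it remain.
    for i, g in enumerate(guess):
        if result[i] == "N" and unmatched[g] > 0:
            result[i] = "P"
            unmatched[g] -= 1
    return "".join(result)


def bucket_by_match(guess, possible_words):
    """Build the nested dictionary for one guess word (hypothetical helper)."""
    buckets = defaultdict(lambda: {"count": 0, "instances": []})
    for word in possible_words:
        entry = buckets[match_string(guess, word)]
        entry["count"] += 1
        entry["instances"].append(word)
    return dict(buckets)


print(match_string("crane", "chest"))  # ENNNP
print(bucket_by_match("crate", ["blond", "dimly", "begin"]))
```

The two passes matter for repeated letters: a letter in the guess is only marked “P” if the possible word still has a copy of it that was not already used by an exact match.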
Step 3: When complete, use the dictionary for each guess word to compute its average count. For example, if the guess word “crate” has the following dictionary: {
“PNNNE”: {“count”: 3, “instances”: [“uncle”, “juice”, “bocce”]},
“NNNNN”: {“count”: 2, “instances”: [“blond”, “dimly”]},
“NNNNP”: {“count”: 3, “instances”: [“begin”, “hymen”, “newly”]}
}, then the average count is (3+2+3)/3 ≈ 2.67.
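As a quick check of the arithmetic in Step 3, here is a small sketch that computes the average count from a dictionary of this shape. The function name average_count is an assumption for this example. The keys shown are the match strings that “crate” produces for those words, and the instance lists are illustrative, not real output from the full word lists.

```python
def average_count(buckets):
    """Average number of possible words per match string for one guess word."""
    counts = [entry["count"] for entry in buckets.values()]
    return sum(counts) / len(counts) if counts else 0.0


crate_buckets = {
    "PNNNE": {"count": 3, "instances": ["uncle", "juice", "bocce"]},
    "NNNNN": {"count": 2, "instances": ["blond", "dimly"]},
    "NNNNP": {"count": 3, "instances": ["begin", "hymen", "newly"]},
}

print(round(average_count(crate_buckets), 2))  # 2.67
```

Because the counts always add up to the number of possible words, the average count is just that number divided by the number of distinct match strings. Minimizing the average is therefore the same as picking the guess that produces the most distinct match strings.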
Step 4: Shortlist the 20 guess words with the lowest average count.
Step 5: For each shortlisted word, construct a decision tree in which the word is the root node and each unique key (match string) is a branch going out of that node. Then, for each branch, create a new nested dictionary using the “instances” for that key as the possible word list, together with the original guess word list.
Step 6: Calculate the average count across all the second-level nodes for each of the 20 trees. The word whose tree has the lowest average is our ideal starting word.
After executing this, I determined that the ideal starting word is “trace”.
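Steps 4 to 6 can be sketched on a toy word list as follows. This is one possible reading of the procedure, not the script used to get the result above: the helper names (split, shortlist, second_level_average) are assumptions, the eight-word list is only for illustration, and the sketch scores each branch by its single best second guess.

```python
from collections import defaultdict


def match_string(guess, answer):
    """E/P/N match string for `guess` scored against `answer`."""
    result = ["N"] * len(guess)
    remaining = list(answer)
    for i, letter in enumerate(guess):
        if letter == answer[i]:
            result[i] = "E"
            remaining[i] = None  # consumed by an exact match
    for i, letter in enumerate(guess):
        if result[i] == "N" and letter in remaining:
            result[i] = "P"
            remaining[remaining.index(letter)] = None  # consume one copy
    return "".join(result)


def split(guess, possible_words):
    """Group the possible words by the match string they produce."""
    groups = defaultdict(list)
    for word in possible_words:
        groups[match_string(guess, word)].append(word)
    return groups


def average_count(guess, possible_words):
    """Average number of possible words per match string."""
    return len(possible_words) / len(split(guess, possible_words))


def shortlist(guess_words, possible_words, top_n=20):
    """Step 4: the `top_n` guesses with the lowest average count."""
    return sorted(guess_words, key=lambda g: average_count(g, possible_words))[:top_n]


def second_level_average(first_guess, guess_words, possible_words):
    """Steps 5 and 6: average group size after the best second guess per branch."""
    sizes = []
    for instances in split(first_guess, possible_words).values():
        best = min(guess_words, key=lambda g: average_count(g, instances))
        sizes.extend(len(group) for group in split(best, instances).values())
    return sum(sizes) / len(sizes)


# Toy data for illustration only; the real lists are far larger.
possible = ["blond", "dimly", "begin", "hymen", "newly", "uncle", "juice", "bocce"]
guesses = possible + ["crate", "trace"]

candidates = shortlist(guesses, possible, top_n=3)
best_start = min(candidates, key=lambda g: second_level_average(g, guesses, possible))
print(candidates, best_start)
```

On the full lists, the inner min over all guess words is the expensive part, which is why the listings below spread that work across processes.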
Part 2: Constructing the entire decision tree and exporting to JSON
Step 1: For each branch, use the dictionary created in the previous step to compute the average count for each guess word, and pick the word with the lowest average count as the ideal word for this node. So if there are 140 branches going out from a shortlisted word, there will be 140 nodes, one attached to each branch, and each node holds the best guess for its branch.
Step 2: Do the following recursively for the remaining tree at the second, third, fourth, fifth and sixth levels: create a nested dictionary as before, using the “instances” of the incoming branch as the possible words.
Step 3: Incorporate parallel processing for more efficient compute times. The code below uses multiprocessing.Pool to score the guess words across all CPU cores.
Step 4: Take this nested dictionary and export it in JSON format.
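The recursion in Part 2 can be sketched as below. This is a simplified, single-process illustration on a toy word list. The function name build_node and the MAX_DEPTH constant are assumptions for this example, and the node layout (word, instances, depth, branches) only approximates the one produced by the full script further down.

```python
import json
from collections import defaultdict

MAX_DEPTH = 6  # assumed cap, mirroring the six guesses Wordle allows


def match_string(guess, answer):
    """E/P/N match string for `guess` scored against `answer`."""
    result = ["N"] * len(guess)
    remaining = list(answer)
    for i, letter in enumerate(guess):
        if letter == answer[i]:
            result[i] = "E"
            remaining[i] = None
    for i, letter in enumerate(guess):
        if result[i] == "N" and letter in remaining:
            result[i] = "P"
            remaining[remaining.index(letter)] = None
    return "".join(result)


def split(guess, possible_words):
    """Group the possible words by the match string they produce."""
    groups = defaultdict(list)
    for word in possible_words:
        groups[match_string(guess, word)].append(word)
    return groups


def build_node(possible_words, guess_words, depth, guess=None):
    """Recursively build one node of the decision tree."""
    # Leaf: only one candidate left, or the depth cap has been reached.
    if len(possible_words) == 1 or depth >= MAX_DEPTH:
        return {"word": possible_words[0], "instances": possible_words, "depth": depth}
    if guess is None:
        # Best guess for this node: lowest average count over its instances.
        guess = min(guess_words,
                    key=lambda g: len(possible_words) / len(split(g, possible_words)))
    branches = {
        pattern: build_node(instances, guess_words, depth + 1)
        for pattern, instances in split(guess, possible_words).items()
    }
    return {"word": guess, "instances": possible_words, "depth": depth,
            "branches": branches}


# Toy data for illustration only.
possible = ["blond", "dimly", "begin", "hymen", "newly", "uncle", "juice", "bocce"]
guesses = possible + ["crate", "trace"]

tree = build_node(possible, guesses, depth=0, guess="trace")
tree_json = json.dumps(tree, indent=2)
print(tree_json[:300])
```

The depth cap guarantees that the recursion stops even if no guess word can separate the remaining instances.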
Part 3: React app to traverse tree
Step 1: Install Node.js and create a new React project. Copy the JSON file created in the previous part into the src folder of this project.
Step 2: Have GPT-4o create the code necessary to implement the following logic: A simple web page that contains 6 rows (each row representing a guess) with 5 cells each. Initially all cells have a light grey background. Only the first row is populated, with the word “TRACE”, and is clickable. The other rows are blank and not clickable. When the user clicks a cell in the current row, the cell changes color so that Green represents the state “E”, Yellow is “P” and Dark Grey is “N”. Once the user finishes setting all 5 cells, a button called “Best Guess” becomes clickable (it is labeled “Next Guess” in the final code). Once this button is clicked, the next row becomes active while the previous row becomes inactive along with the button. The active row is also populated with the word associated with the node in the JSON that corresponds to the match string chosen by the user. So if the user chose “Green”, “Yellow”, “Dark Grey”, “Dark Grey”, “Yellow”, the corresponding match string is EPNNP, and this is used to find the right word and populate the active row. This continues until the right word is found. There is also a Reset button (labeled “Restart” in the final code) that reverts the page to the initial state. This code is then used to replace the contents of App.jsx in the src folder within the project folder.
Step 3: Run the code using npm run dev in the PyCharm terminal after navigating to the src folder within the project folder.
Step 4: After confirming the app runs well, create an account on Vercel and deploy it there.
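The traversal logic described in Step 2 of this part is small enough to sketch on its own. The app does this in JavaScript. The version here is a Python illustration with assumed names (COLOR_TO_CODE, colors_to_match_string, next_guess) and a tiny hand-made tree, not the real exported JSON.

```python
COLOR_TO_CODE = {"green": "E", "yellow": "P", "dark grey": "N"}


def colors_to_match_string(colors):
    """Translate the five tile colors of one row into an E/P/N match string."""
    return "".join(COLOR_TO_CODE[color.lower()] for color in colors)


def next_guess(tree, match_strings):
    """Follow one match string per completed row and return the word to play next."""
    node = tree
    for pattern in match_strings:
        node = node.get("branches", {}).get(pattern)
        if node is None:
            return None  # this combination of colors is not in the tree
    return node["word"]


# Hypothetical miniature tree, shaped like the exported JSON.
tree = {
    "word": "trace",
    "branches": {
        "NNNPE": {"word": "uncle"},
        "NNNNN": {
            "word": "blond",
            "branches": {
                "EEEEE": {"word": "blond"},
                "NPNNP": {"word": "dimly"},
            },
        },
    },
}

row = ["Green", "Yellow", "Dark Grey", "Dark Grey", "Yellow"]
print(colors_to_match_string(row))           # EPNNP
print(next_guess(tree, ["NNNNN"]))           # blond
print(next_guess(tree, ["NNNNN", "NPNNP"]))  # dimly
```

Returning None for an unknown match string mirrors the app's behavior of logging a warning and leaving the grid unchanged when the chosen colors do not match any branch.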
# Identify first word!
import csv
import multiprocessing
import concurrent.futures
from collections import defaultdict
import time
import random

# Configuration: Full dataset
NUM_POSSIBLE_WORDS = 2500  # 🔹 Use 2500 possible words
NUM_GUESS_WORDS = 14000  # 🔹 Use 14000 guess words
TOP_N_GUESSES = 20  # 🔹 Keep top 20 words for decision tree
CSV_FILE = "wordle_results.csv"
RANDOM_SEED = 42  # 🔹 Ensure repeatability


def default_pattern_data():
    """ Returns a dictionary for storing pattern occurrences. """
    return {"pattern_count": 0, "instances": []}


def load_random_subset_from_file(file_path, num_words):
    """ Reads words from a text file and randomly selects a subset. """
    try:
        with open(file_path, "r", encoding="utf-8") as file:
            words = [word.strip() for word in file if word.strip()]
        random.seed(RANDOM_SEED)  # 🔹 Set seed for consistency
        random.shuffle(words)  # 🔹 Shuffle before selection
        return words[:num_words]  # 🔹 Pick `num_words` randomly
    except FileNotFoundError:
        print(f"Error: File '{file_path}' not found.")
        return []
    except Exception as e:
        print(f"An error occurred: {e}")
        return []


def generate_pattern(guess_word, possible_word):
    """ Generates a 5-letter pattern ('E', 'P', 'N'). """
    pattern = ["N"] * 5
    possible_word_list = list(possible_word)

    # Step 1: Mark exact matches ('E')
    for i in range(5):
        if guess_word[i] == possible_word[i]:
            pattern[i] = "E"
            possible_word_list[i] = None

    # Step 2: Mark partial matches ('P')
    for i in range(5):
        if pattern[i] == "E":
            continue
        if guess_word[i] in possible_word_list:
            pattern[i] = "P"
            possible_word_list[possible_word_list.index(guess_word[i])] = None

    return "".join(pattern)


def analyze_guess_word(guess_word, possible_words):
    """ Compares a guess word against all possible words. """
    pattern_data = defaultdict(default_pattern_data)
    for possible_word in possible_words:
        pattern = generate_pattern(guess_word, possible_word)
        pattern_data[pattern]["pattern_count"] += 1
        pattern_data[pattern]["instances"].append(possible_word)

    # 🔹 Compute total occurrences and number of patterns
    total_occurrences = sum(data["pattern_count"] for data in pattern_data.values())
    num_patterns = len(pattern_data)

    # 🔹 Compute avg_occurrences correctly
    avg_occurrences = total_occurrences / num_patterns if num_patterns > 0 else 0
    return guess_word, pattern_data, avg_occurrences


def find_best_initial_guess(guess_words, possible_words):
    """ Finds the 20 best first-guess words using multiprocessing. """
    print(f"🔹 Processing {len(guess_words)} guess words against {len(possible_words)} possible words...")
    start_time = time.time()
    results = []

    # Use multiprocessing for efficiency
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        results = pool.starmap(analyze_guess_word, [(word, possible_words) for word in guess_words])

    avg_occurrences_dict = {word: avg for word, _, avg in results}
    pattern_data_dict = {word: pattern_data for word, pattern_data, _ in results}

    # Get the top 20 words with the least average occurrences
    best_guess_words = sorted(avg_occurrences_dict, key=avg_occurrences_dict.get)[:TOP_N_GUESSES]
    best_guess_data = {word: (pattern_data_dict[word], avg_occurrences_dict[word]) for word in best_guess_words}

    elapsed_time = time.time() - start_time
    print(f"✅ First 20 guesses found in {elapsed_time:.2f} seconds.")
    return best_guess_words, best_guess_data


def process_next_guess(first_guess_word, pattern_dict, guess_words):
    """ Finds the best second guess for each unique pattern in the first guess. """
    next_guess_per_pattern = {}
    for pattern, data in pattern_dict.items():
        possible_words = data["instances"]
        if len(possible_words) > 1:  # Only analyze if more than one word exists
            next_guess_words, next_pattern_data = find_best_initial_guess(guess_words, possible_words)
            next_best_word = next_guess_words[0]  # Best word for this pattern
            total_occurrences = sum(
                sum(pattern_data["pattern_count"] for pattern_data in p[0].values())
                for p in next_pattern_data.values()
            )
            num_patterns = sum(len(p[0]) for p in next_pattern_data.values())
            avg_occurrences = total_occurrences / num_patterns if num_patterns > 0 else total_occurrences
            next_guess_per_pattern[pattern] = (next_best_word, avg_occurrences, next_pattern_data)
    return first_guess_word, next_guess_per_pattern


def find_best_next_guess(best_guess_data, guess_words):
    """ Finds the best next guess for each unique pattern in the first guess. """
    print(f"🔹 Finding best next guess for {TOP_N_GUESSES} words...")
    start_time = time.time()
    results = {}

    with concurrent.futures.ProcessPoolExecutor(max_workers=multiprocessing.cpu_count()) as executor:
        future_to_guess = {
            executor.submit(process_next_guess, word, data[0], guess_words): word
            for word, data in best_guess_data.items()
        }
        for future in concurrent.futures.as_completed(future_to_guess):
            try:
                word, next_guess_per_pattern = future.result()
                results[word] = next_guess_per_pattern  # 🔹 Store per-pattern guesses
            except Exception as e:
                print(f"⚠ Error processing next guess: {e}")

    elapsed_time = time.time() - start_time
    print(f"✅ Best next guesses found in {elapsed_time:.2f} seconds.")
    return results


def save_results_to_csv(best_first_guesses, best_guess_data, best_next_guesses):
    """ Saves results to a CSV file. """
    with open(CSV_FILE, mode="w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow(["First Guess", "Pattern", "Avg Occurrences", "Next Best Guess", "Next Best Avg Occurrences"])
        for first_guess in best_first_guesses:
            pattern_data, avg_occurrences = best_guess_data[first_guess]
            next_best_patterns = best_next_guesses.get(first_guess, {})
            for pattern, (next_best_guess, next_avg_occurrences, _) in next_best_patterns.items():
                writer.writerow([first_guess, pattern, avg_occurrences, next_best_guess, next_avg_occurrences])
    print(f"✅ Results saved to {CSV_FILE}")


if __name__ == "__main__":
    multiprocessing.set_start_method("spawn", force=True)

    # Load full dataset
    possible_words = load_random_subset_from_file("valid-wordle-answers.txt", NUM_POSSIBLE_WORDS)
    guess_words = load_random_subset_from_file("valid-wordle-guesses.txt", NUM_GUESS_WORDS)

    # Find best first-guess candidates
    best_first_guesses, best_guess_data = find_best_initial_guess(guess_words, possible_words)

    # Find best next guess for each first-level word
    best_next_guesses = find_best_next_guess(best_guess_data, guess_words)

    # Save results to CSV
    save_results_to_csv(best_first_guesses, best_guess_data, best_next_guesses)

    # Print confirmation
    print(f"\n✅ Processing complete! Results saved to {CSV_FILE}.")

# Build complete decision tree in a json file
import json
import multiprocessing
import concurrent.futures
from collections import defaultdict
import time

# Configuration
MAX_ITERATIONS = 6  # 🔹 Limit tree depth to 6
JSON_FILE = "wordle_decision_tree.json"
ROOT_GUESS = "trace"


def default_pattern_data():
    """ Returns a dictionary for storing pattern occurrences. """
    return {"pattern_count": 0, "instances": []}


def load_words_from_file(file_path):
    """ Reads all words from a text file. """
    try:
        with open(file_path, "r", encoding="utf-8") as file:
            return [word.strip() for word in file if word.strip()]
    except FileNotFoundError:
        print(f"Error: File '{file_path}' not found.")
        return []
    except Exception as e:
        print(f"An error occurred: {e}")
        return []


def generate_pattern(guess_word, possible_word):
    """ Generates a 5-letter pattern ('E', 'P', 'N'). """
    pattern = ["N"] * 5
    possible_word_list = list(possible_word)

    for i in range(5):
        if guess_word[i] == possible_word[i]:
            pattern[i] = "E"
            possible_word_list[i] = None

    for i in range(5):
        if pattern[i] == "E":
            continue
        if guess_word[i] in possible_word_list:
            pattern[i] = "P"
            possible_word_list[possible_word_list.index(guess_word[i])] = None

    return "".join(pattern)


def analyze_guess_word(guess_word, possible_words):
    """ Compares a guess word against all possible words. """
    pattern_data = defaultdict(default_pattern_data)
    for possible_word in possible_words:
        pattern = generate_pattern(guess_word, possible_word)
        pattern_data[pattern]["pattern_count"] += 1
        pattern_data[pattern]["instances"].append(possible_word)

    total_occurrences = sum(data["pattern_count"] for data in pattern_data.values())
    num_patterns = len(pattern_data)
    avg_occurrences = total_occurrences / num_patterns if num_patterns > 0 else 0
    return guess_word, pattern_data, avg_occurrences


def find_best_next_guess(guess_words, possible_words):
    """ Finds the best next guess word by minimizing avg occurrences. """
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        results = pool.starmap(analyze_guess_word, [(word, possible_words) for word in guess_words])

    avg_occurrences_dict = {word: avg for word, _, avg in results}
    pattern_data_dict = {word: pattern_data for word, pattern_data, _ in results}

    best_guess_word = min(avg_occurrences_dict, key=avg_occurrences_dict.get)
    best_guess_data = {best_guess_word: (pattern_data_dict[best_guess_word], avg_occurrences_dict[best_guess_word])}
    return best_guess_word, best_guess_data


def process_next_guess(guess_word, pattern_dict, guess_words, depth):
    """ Recursively builds the decision tree for each pattern branch. """
    next_level = {}
    for pattern, data in pattern_dict.items():
        possible_words = data["instances"]
        if len(possible_words) == 1 or depth >= MAX_ITERATIONS:
            # Stop if only one word remains
            next_level[pattern] = {"word": possible_words[0], "instances": possible_words, "depth": depth}
            continue

        next_best_word, next_pattern_data = find_best_next_guess(guess_words, possible_words)
        next_level[pattern] = {
            "word": next_best_word,
            "instances": possible_words,
            "depth": depth,
            "branches": process_next_guess(next_best_word, next_pattern_data[next_best_word][0], guess_words, depth + 1)
        }
    return next_level


def build_decision_tree(root_word, possible_words, guess_words):
    """ Builds the full decision tree starting from the root word. """
    print(f"🔹 Starting decision tree with '{root_word}'")
    root_guess_word, root_guess_data = find_best_next_guess(guess_words, possible_words)
    decision_tree = {
        "word": root_guess_word,
        "depth": 0,
        "branches": process_next_guess(root_guess_word, root_guess_data[root_guess_word][0], guess_words, 1)
    }
    return decision_tree


def save_tree_to_json(tree):
    """ Saves the decision tree to a JSON file. """
    with open(JSON_FILE, "w", encoding="utf-8") as file:
        json.dump(tree, file, indent=4)
    print(f"✅ Decision tree saved to {JSON_FILE}")


if __name__ == "__main__":
    multiprocessing.set_start_method("spawn", force=True)

    # Load full datasets (No random sampling)
    possible_words = load_words_from_file("valid-wordle-answers.txt")
    guess_words = load_words_from_file("valid-wordle-guesses.txt")

    # Build the decision tree
    decision_tree = build_decision_tree(ROOT_GUESS, possible_words, guess_words)

    # Save to JSON
    save_tree_to_json(decision_tree)
    print("\n✅ Decision tree generation complete!")

// App.jsx code to render react page using json decision tree
import React, { useState, useEffect } from "react";
import decisionTree from "./decision_tree.json";
import questionMark from "./question_mark.jpeg"; // Ensure this image is in `src/`

const WordleSolver = () => {
  const initialTileState = { letter: "", color: "bg-gray-500" }; // Default Light Gray
  const colorCycle = ["bg-green-500", "bg-yellow-500", "bg-gray-800"]; // Green → Yellow → Dark Gray

  const [rows, setRows] = useState(
    Array(6).fill().map(() => Array(5).fill({ ...initialTileState }))
  );
  const [activeRow, setActiveRow] = useState(0);
  const [showTooltip, setShowTooltip] = useState(false); // Tooltip visibility

  // Set initial row with "TRACE"
  useEffect(() => {
    setRows((prev) => {
      const newRows = [...prev];
      newRows[0] = [
        { letter: "T", color: "bg-gray-500" },
        { letter: "R", color: "bg-gray-500" },
        { letter: "A", color: "bg-gray-500" },
        { letter: "C", color: "bg-gray-500" },
        { letter: "E", color: "bg-gray-500" },
      ];
      return newRows;
    });
  }, []);

  // Handle tile color changes
  const handleTileClick = (rowIndex, colIndex) => {
    if (rowIndex !== activeRow) return; // Only allow clicks on active row
    setRows((prev) => {
      // Copy each row so the previous state is never mutated; otherwise the
      // color can advance twice when React re-runs this updater in StrictMode.
      const newRows = prev.map((row) => [...row]);
      const currentTile = newRows[rowIndex][colIndex];
      // Cycle through colors (Green → Yellow → Dark Gray → Green)
      const nextColor =
        colorCycle[(colorCycle.indexOf(currentTile.color) + 1) % colorCycle.length];
      newRows[rowIndex][colIndex] = { ...currentTile, color: nextColor };
      return newRows;
    });
  };

  // Compute pattern from tile colors
  const computePattern = () => {
    return rows[activeRow]
      .map((tile) => {
        if (tile.color === "bg-green-500") return "E";
        if (tile.color === "bg-yellow-500") return "P";
        return "N"; // Default is dark gray
      })
      .join("");
  };

  // Handle "Next Guess" by traversing the tree
  const handleNextGuess = () => {
    if (activeRow >= 5) return; // Stop at row 6

    const currentPattern = computePattern();
    console.log(`🔍 Pattern entered: ${currentPattern}`);

    let currentBranch = decisionTree;
    let pathTaken = [];

    // Replay the pattern of every completed row to reach the current node
    for (let i = 0; i <= activeRow; i++) {
      const pattern = rows[i]
        .map((tile) => {
          if (tile.color === "bg-green-500") return "E";
          if (tile.color === "bg-yellow-500") return "P";
          return "N";
        })
        .join("");
      pathTaken.push(pattern);

      if (currentBranch.branches && currentBranch.branches[pattern]) {
        currentBranch = currentBranch.branches[pattern];
      } else {
        console.warn(`❌ Pattern '${pattern}' not found at depth ${i}!`);
        console.warn(`📌 Path taken: ${JSON.stringify(pathTaken)}`);
        return;
      }
    }

    if (currentBranch.word) {
      const nextWord = currentBranch.word.toUpperCase();
      console.log(`✅ Next word selected: ${nextWord}`);
      setRows((prev) => {
        const newRows = [...prev];
        // Populate next row with letters in gray
        newRows[activeRow + 1] = nextWord.split("").map((letter) => ({
          letter,
          color: "bg-gray-500",
        }));
        return newRows;
      });
      setActiveRow((prev) => prev + 1);
    } else {
      console.warn(`❌ No word found at end of path: ${JSON.stringify(pathTaken)}`);
    }
  };

  // Handle Restart
  const handleRestart = () => {
    setRows(
      Array(6).fill().map(() => Array(5).fill({ ...initialTileState }))
    );
    setActiveRow(0);
    setRows((prev) => {
      const newRows = [...prev];
      newRows[0] = [
        { letter: "T", color: "bg-gray-500" },
        { letter: "R", color: "bg-gray-500" },
        { letter: "A", color: "bg-gray-500" },
        { letter: "C", color: "bg-gray-500" },
        { letter: "E", color: "bg-gray-500" },
      ];
      return newRows;
    });
  };

  return (
    <div className="flex flex-col items-center justify-center min-h-screen bg-gray-900 text-white relative">
      {/* Header */}
      <div className="w-full max-w-lg p-4 bg-black flex justify-between items-center relative">
        <h1 className="text-3xl font-bold">WORDLE SOLVER</h1>

        {/* Tooltip Container */}
        <div className="relative flex items-center">
          <img
            src={questionMark}
            alt="Help"
            className="w-6 h-6 cursor-pointer fixed top-4 right-4 z-50"
            onMouseEnter={() => setShowTooltip(true)}
            onMouseLeave={() => setShowTooltip(false)}
          />

          {/* Tooltip Content */}
          {showTooltip && (
            <div className="absolute right-0 mt-8 w-64 p-3 bg-gray-800 text-sm rounded-lg shadow-lg z-50">
              <p className="font-bold">HOW TO PLAY:</p>
              <ul className="list-disc pl-4">
                <li>Open Wordle and keep this page open.</li>
                <li>Enter “TRACE” as the first guess.</li>
                <li>Click tiles to match colors from Wordle.</li>
                <li>Press "Next Guess" to get the next word.</li>
                <li>Continue updating tiles and clicking "Next Guess".</li>
                <li>Click "Restart" to start over.</li>
              </ul>
            </div>
          )}
        </div>
      </div>

      {/* Word Grid */}
      <div className="mt-6 flex flex-col items-center w-full">
        {rows.map((row, rowIndex) => (
          <div key={rowIndex} className="flex justify-center space-x-2 mb-2">
            {row.map((tile, colIndex) => (
              <div
                key={colIndex}
                className={`w-14 h-14 flex items-center justify-center text-2xl font-bold border border-gray-400 rounded-lg transition-all duration-200 ${
                  rowIndex <= activeRow ? tile.color : "bg-gray-500"
                }`}
                onClick={() => handleTileClick(rowIndex, colIndex)}
              >
                {/* Ensure letters remain visible */}
                {tile.letter}
              </div>
            ))}
          </div>
        ))}
      </div>

      {/* Buttons */}
      <div className="mt-6 flex space-x-4">
        <button
          className={`px-4 py-2 text-white rounded ${activeRow < 5 ? "bg-blue-600 hover:bg-blue-700" : "bg-gray-600 cursor-not-allowed"}`}
          onClick={handleNextGuess}
          disabled={activeRow >= 5}
        >
          Next Guess
        </button>
        <button
          className="px-4 py-2 bg-red-600 hover:bg-red-700 text-white rounded"
          onClick={handleRestart}
        >
          Restart
        </button>
      </div>
    </div>
  );
};

export default WordleSolver;