Creating a Wordle Bot - Part 3 - Code

The code will consist of 3 main parts:

  • Part 1: Identifying the correct starting word

  • Part 2: Constructing the entire decision tree and exporting to JSON

  • Part 3: Creating a React app to traverse this tree

Part 1: Identifying the correct starting word:

Step 1: Download two lists of words: one with all valid Wordle answers (the possible words) and one with all valid guess words. The first list is about 2500 words long and the second is about 14000 words long.

Step 2: Create a nested dictionary for each guess word by going through the list of possible words and recording:

  • A match string, which is the key, using “E” (exact match), “P” (part of the word but in a different location) and “N” (not part of the word) notation. For example, if the guess word is “crane” and the possible word is “chest”, the match string is “ENNNP”: the first letter “c” is the same in both words; “r”, “a” and “n” do not occur in the possible word; and “e” is present in the possible word but at a different spot

  • Count of possible words that generate this match string

  • List of possible words that generate this match string
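The grouping described in Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the final results: the function names match_string and group_by_match_string are made up for this example, and the key is called "count" to follow the description above.

```python
from collections import defaultdict

def match_string(guess, answer):
    """Return the E/P/N match string for `guess` scored against `answer`."""
    result = ["N"] * len(guess)
    remaining = list(answer)
    # First pass: exact matches use up their letter in the answer.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "E"
            remaining[i] = None
    # Second pass: a letter is "P" only while an unused copy of it remains.
    for i, g in enumerate(guess):
        if result[i] == "N" and g in remaining:
            result[i] = "P"
            remaining[remaining.index(g)] = None
    return "".join(result)

def group_by_match_string(guess, answers):
    """Build {match string: {"count": ..., "instances": [...]}} for one guess."""
    groups = defaultdict(lambda: {"count": 0, "instances": []})
    for answer in answers:
        key = match_string(guess, answer)
        groups[key]["count"] += 1
        groups[key]["instances"].append(answer)
    return dict(groups)

print(match_string("crane", "chest"))  # ENNNP
groups = group_by_match_string("crate", ["uncle", "juice", "bocce", "dimly", "begin"])
print(sorted(groups))  # ['NNNNN', 'NNNNP', 'PNNNE']
```

The two passes matter for repeated letters: scoring “speed” against “abide” gives “NNPNP”, because the answer has only one “e” and the first unmatched “e” in the guess uses it up.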

Step 3: When complete, use the dictionary for each guess word to compute its average count, i.e. the average number of possible words per match string. For example, if guess word “crate” has the following dictionary:

{
“PNNNE”: {“count”: 3, “instances”: [“uncle”, “juice”, “bocce”]},
“NNNNN”: {“count”: 2, “instances”: [“blond”, “dimly”]},
“NNNNP”: {“count”: 3, “instances”: [“begin”, “hymen”, “newly”]}
}

then the average count is (3+2+3)/3 ≈ 2.67. A lower average means the guess splits the possible words into smaller groups.
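As a quick check of that arithmetic, here is a small Python sketch. The helper name average_count is hypothetical, and the dictionary is the illustrative three-group example rather than real output.

```python
def average_count(groups):
    """Average number of possible words per match string for one guess word."""
    total = sum(entry["count"] for entry in groups.values())
    return total / len(groups) if groups else 0.0

# Illustrative three-group dictionary for the guess word "crate".
crate_groups = {
    "PNNNE": {"count": 3, "instances": ["uncle", "juice", "bocce"]},
    "NNNNN": {"count": 2, "instances": ["blond", "dimly"]},
    "NNNNP": {"count": 3, "instances": ["begin", "hymen", "newly"]},
}

print(round(average_count(crate_groups), 2))  # 2.67
```

Because the counts always add up to the number of possible words, this average equals the number of possible words divided by the number of distinct match strings. For a fixed list of possible words, ranking guesses by lowest average is therefore the same as ranking them by how many distinct match strings they produce.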

Step 4: Shortlist 20 guess words with the lowest average count

Step 5: For each shortlisted word, construct a decision tree in which the word is the root node and each unique match string (dictionary key) is a branch going out of that node. Then, for each branch, create a new nested dictionary using the “instances” for that key as the possible word list, together with the original guess word list.

Step 6: For each of the 20 trees, calculate the average count across all the nodes in the second level. The tree with the lowest average gives our ideal starting word.

After executing this, I determined that the ideal starting word is “trace”.

Part 2: Constructing the entire decision tree and exporting to JSON

Step 1: For each branch, use the dictionary created in the previous step to compute the average count for each guess word, and pick the guess word with the lowest average count as the ideal word for this node. So if there are 140 branches going out from the starting word, there will be 140 nodes, one attached to each branch, each holding the best guess for that branch.

Step 2: Do the following recursively for the remaining tree at the second, third, fourth, fifth and sixth levels: create a nested dictionary as before, using the “instances” of the incoming branch as the possible words.

Step 3: Incorporate parallel processing for more efficient compute times.

Step 4: Take this nested dictionary and export it in JSON format.
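To make the exported structure concrete, here is a small Python sketch of the node layout. The field names "word", "depth", "instances" and "branches" follow the tree-building script in this post. The tree itself is a hand-written toy with only two possible words behind the “EPNNP” branch, not real output, and count_leaves is a hypothetical helper.

```python
import json

# Hand-written toy tree (NOT real output of the tree-building script).
toy_tree = {
    "word": "trace",
    "depth": 0,
    "branches": {
        "EEEEE": {"word": "trace", "instances": ["trace"], "depth": 1},
        "EPNNP": {
            "word": "tiger",
            "instances": ["tiger", "tower"],
            "depth": 1,
            "branches": {
                "EEEEE": {"word": "tiger", "instances": ["tiger"], "depth": 2},
                "ENNEE": {"word": "tower", "instances": ["tower"], "depth": 2},
            },
        },
    },
}

def count_leaves(node):
    """Count nodes without outgoing branches, i.e. fully resolved words."""
    branches = node.get("branches")
    if not branches:
        return 1
    return sum(count_leaves(child) for child in branches.values())

# Export to JSON text and read it back, as the React app will do on import.
text = json.dumps(toy_tree, indent=4)
restored = json.loads(text)
print(count_leaves(restored))  # 3
```

Each interior node holds the word to guess next, and each key under "branches" is the match string that leads to the following node, so the whole strategy is a plain nested dictionary that survives a JSON round trip unchanged.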

Part 3: React app to traverse tree

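Step 1: Install Node.js and create a new React project. Copy the JSON created in the previous part into the src folder in this project. Note that the Python script writes wordle_decision_tree.json while the App.jsx code imports decision_tree.json, so rename the file (or adjust the import) so the two match.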

Step 2: Have GPT-4o create the code necessary to implement the logic: a simple web page that contains 6 rows (each row representing a guess) with 5 cells each. Initially all cells have a light grey background. Only the first row is populated, with the word “TRACE”, and is clickable; the other rows are blank and not clickable. When the user clicks a cell in the current row, the cell changes color so that Green represents the state “E”, Yellow “P” and Dark Grey “N”. Once the user finishes setting all 5 cells, a button called “Best Guess” becomes clickable. When this button is clicked, the next row becomes active while the previous row becomes inactive, along with the button. The active row is populated with the word associated with the node in the JSON that corresponds to the match string chosen by the user. So if the user chose “Green”, “Yellow”, “Dark Grey”, “Dark Grey”, “Yellow”, the corresponding match string is EPNNP, and this is used to find the right word and populate the active row. This continues until the right word is found. There is also a Reset button that reverts the page to the initial state. The generated code then replaces the contents of App.jsx in the src folder within the project folder.
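The lookup the page performs can be sketched independently of React. The Python below is a minimal illustration under a few assumptions: the names COLOR_TO_STATE, to_match_string and next_guess are hypothetical, and the tree is a hand-written toy that uses the same "word" and "branches" fields as the exported JSON.

```python
# Map the tile colors described above to the E/P/N states.
COLOR_TO_STATE = {"green": "E", "yellow": "P", "dark grey": "N"}

def to_match_string(colors):
    """Turn five tile colors into a match string such as 'EPNNP'."""
    return "".join(COLOR_TO_STATE[color.lower()] for color in colors)

def next_guess(tree, match_strings):
    """Follow one branch per completed row; return the next word or None."""
    node = tree
    for pattern in match_strings:
        child = node.get("branches", {}).get(pattern)
        if child is None:
            return None  # this combination of colors is not in the tree
        node = child
    return node["word"]

# Hand-written toy tree (NOT real solver output).
toy_tree = {
    "word": "trace",
    "branches": {
        "EPNNP": {
            "word": "tiger",
            "branches": {"ENNEE": {"word": "tower"}},
        },
    },
}

first_row = to_match_string(["Green", "Yellow", "Dark Grey", "Dark Grey", "Yellow"])
print(first_row)                                  # EPNNP
print(next_guess(toy_tree, [first_row]))          # tiger
print(next_guess(toy_tree, [first_row, "ENNEE"])) # tower
```

Walking from the root on every click, using the match strings of all rows filled in so far, means the page only needs to store the grid itself and no separate pointer into the tree.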

Step 3: Run the code using npm run dev in the PyCharm terminal after navigating to the src folder within the project folder.

Step 4: After confirming the app runs well, create an account on Vercel and deploy it there.

# Identify first word!
import csv
import multiprocessing
import concurrent.futures
from collections import defaultdict
import time
import random

# Configuration: Full dataset
NUM_POSSIBLE_WORDS = 2500  # 🔹 Use 2500 possible words
NUM_GUESS_WORDS = 14000  # 🔹 Use 14000 guess words
TOP_N_GUESSES = 20  # 🔹 Keep top 20 words for decision tree
CSV_FILE = "wordle_results.csv"
RANDOM_SEED = 42  # 🔹 Ensure repeatability

def default_pattern_data():
    """ Returns a dictionary for storing pattern occurrences. """
    return {"pattern_count": 0, "instances": []}

def load_random_subset_from_file(file_path, num_words):
    """ Reads words from a text file and randomly selects a subset. """
    try:
        with open(file_path, "r", encoding="utf-8") as file:
            words = [word.strip() for word in file if word.strip()]
        random.seed(RANDOM_SEED)  # 🔹 Set seed for consistency
        random.shuffle(words)  # 🔹 Shuffle before selection
        return words[:num_words]  # 🔹 Pick `num_words` randomly
    except FileNotFoundError:
        print(f"Error: File '{file_path}' not found.")
        return []
    except Exception as e:
        print(f"An error occurred: {e}")
        return []

def generate_pattern(guess_word, possible_word):
    """ Generates a 5-letter pattern ('E', 'P', 'N'). """
    pattern = ["N"] * 5
    possible_word_list = list(possible_word)

    # Step 1: Mark exact matches ('E')
    for i in range(5):
        if guess_word[i] == possible_word[i]:
            pattern[i] = "E"
            possible_word_list[i] = None

    # Step 2: Mark partial matches ('P')
    for i in range(5):
        if pattern[i] == "E":
            continue
        if guess_word[i] in possible_word_list:
            pattern[i] = "P"
            possible_word_list[possible_word_list.index(guess_word[i])] = None

    return "".join(pattern)

def analyze_guess_word(guess_word, possible_words):
    """ Compares a guess word against all possible words. """
    pattern_data = defaultdict(default_pattern_data)

    for possible_word in possible_words:
        pattern = generate_pattern(guess_word, possible_word)
        pattern_data[pattern]["pattern_count"] += 1
        pattern_data[pattern]["instances"].append(possible_word)

    # 🔹 Compute total occurrences and number of patterns
    total_occurrences = sum(data["pattern_count"] for data in pattern_data.values())
    num_patterns = len(pattern_data)

    # 🔹 Compute avg_occurrences correctly
    avg_occurrences = total_occurrences / num_patterns if num_patterns > 0 else 0

    return guess_word, pattern_data, avg_occurrences

def find_best_initial_guess(guess_words, possible_words):
    """ Finds the 20 best first-guess words using multiprocessing. """
    print(f"🔹 Processing {len(guess_words)} guess words against {len(possible_words)} possible words...")

    start_time = time.time()
    results = []

    # Use multiprocessing for efficiency
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        results = pool.starmap(analyze_guess_word, [(word, possible_words) for word in guess_words])

    avg_occurrences_dict = {word: avg for word, _, avg in results}
    pattern_data_dict = {word: pattern_data for word, pattern_data, _ in results}

    # Get the top 20 words with the least average occurrences
    best_guess_words = sorted(avg_occurrences_dict, key=avg_occurrences_dict.get)[:TOP_N_GUESSES]
    best_guess_data = {word: (pattern_data_dict[word], avg_occurrences_dict[word]) for word in best_guess_words}

    elapsed_time = time.time() - start_time
    print(f"✅ First 20 guesses found in {elapsed_time:.2f} seconds.")

    return best_guess_words, best_guess_data

def process_next_guess(first_guess_word, pattern_dict, guess_words):
    """ Finds the best second guess for each unique pattern in the first guess. """
    next_guess_per_pattern = {}

    for pattern, data in pattern_dict.items():
        possible_words = data["instances"]

        if len(possible_words) > 1:  # Only analyze if more than one word exists
            next_guess_words, next_pattern_data = find_best_initial_guess(guess_words, possible_words)
            next_best_word = next_guess_words[0]  # Best word for this pattern

            total_occurrences = sum(
                sum(pattern_data["pattern_count"] for pattern_data in p[0].values())
                for p in next_pattern_data.values()
            )
            num_patterns = sum(len(p[0]) for p in next_pattern_data.values())

            avg_occurrences = total_occurrences / num_patterns if num_patterns > 0 else total_occurrences

            next_guess_per_pattern[pattern] = (next_best_word, avg_occurrences, next_pattern_data)

    return first_guess_word, next_guess_per_pattern




def find_best_next_guess(best_guess_data, guess_words):
    """ Finds the best next guess for each unique pattern in the first guess. """
    print(f"🔹 Finding best next guess for {TOP_N_GUESSES} words...")

    start_time = time.time()
    results = {}

    with concurrent.futures.ProcessPoolExecutor(max_workers=multiprocessing.cpu_count()) as executor:
        future_to_guess = {
            executor.submit(process_next_guess, word, data[0], guess_words): word for word, data in best_guess_data.items()
        }
        for future in concurrent.futures.as_completed(future_to_guess):
            try:
                word, next_guess_per_pattern = future.result()
                results[word] = next_guess_per_pattern  # 🔹 Store per-pattern guesses
            except Exception as e:
                print(f"⚠ Error processing next guess: {e}")

    elapsed_time = time.time() - start_time
    print(f"✅ Best next guesses found in {elapsed_time:.2f} seconds.")

    return results

def save_results_to_csv(best_first_guesses, best_guess_data, best_next_guesses):
    """ Saves results to a CSV file. """
    with open(CSV_FILE, mode="w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow(["First Guess", "Pattern", "Avg Occurrences", "Next Best Guess", "Next Best Avg Occurrences"])

        for first_guess in best_first_guesses:
            pattern_data, avg_occurrences = best_guess_data[first_guess]
            next_best_patterns = best_next_guesses.get(first_guess, {})

            for pattern, (next_best_guess, next_avg_occurrences, _) in next_best_patterns.items():
                writer.writerow([first_guess, pattern, avg_occurrences, next_best_guess, next_avg_occurrences])

    print(f"✅ Results saved to {CSV_FILE}")


if __name__ == "__main__":
    multiprocessing.set_start_method("spawn", force=True)

    # Load full dataset
    possible_words = load_random_subset_from_file("valid-wordle-answers.txt", NUM_POSSIBLE_WORDS)
    guess_words = load_random_subset_from_file("valid-wordle-guesses.txt", NUM_GUESS_WORDS)

    # Find best first-guess candidates
    best_first_guesses, best_guess_data = find_best_initial_guess(guess_words, possible_words)

    # Find best next guess for each first-level word
    best_next_guesses = find_best_next_guess(best_guess_data, guess_words)

    # Save results to CSV
    save_results_to_csv(best_first_guesses, best_guess_data, best_next_guesses)

    # Print confirmation
    print(f"\n✅ Processing complete! Results saved to {CSV_FILE}.")

# Build complete decision tree in a json file
import json
import multiprocessing
import concurrent.futures
from collections import defaultdict
import time

# Configuration
MAX_ITERATIONS = 6  # 🔹 Limit tree depth to 6
JSON_FILE = "wordle_decision_tree.json"
ROOT_GUESS = "trace"

def default_pattern_data():
    """ Returns a dictionary for storing pattern occurrences. """
    return {"pattern_count": 0, "instances": []}

def load_words_from_file(file_path):
    """ Reads all words from a text file. """
    try:
        with open(file_path, "r", encoding="utf-8") as file:
            return [word.strip() for word in file if word.strip()]
    except FileNotFoundError:
        print(f"Error: File '{file_path}' not found.")
        return []
    except Exception as e:
        print(f"An error occurred: {e}")
        return []

def generate_pattern(guess_word, possible_word):
    """ Generates a 5-letter pattern ('E', 'P', 'N'). """
    pattern = ["N"] * 5
    possible_word_list = list(possible_word)

    for i in range(5):
        if guess_word[i] == possible_word[i]:
            pattern[i] = "E"
            possible_word_list[i] = None

    for i in range(5):
        if pattern[i] == "E":
            continue
        if guess_word[i] in possible_word_list:
            pattern[i] = "P"
            possible_word_list[possible_word_list.index(guess_word[i])] = None

    return "".join(pattern)

def analyze_guess_word(guess_word, possible_words):
    """ Compares a guess word against all possible words. """
    pattern_data = defaultdict(default_pattern_data)

    for possible_word in possible_words:
        pattern = generate_pattern(guess_word, possible_word)
        pattern_data[pattern]["pattern_count"] += 1
        pattern_data[pattern]["instances"].append(possible_word)

    total_occurrences = sum(data["pattern_count"] for data in pattern_data.values())
    num_patterns = len(pattern_data)
    avg_occurrences = total_occurrences / num_patterns if num_patterns > 0 else 0

    return guess_word, pattern_data, avg_occurrences

def find_best_next_guess(guess_words, possible_words):
    """ Finds the best next guess word by minimizing avg occurrences. """
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        results = pool.starmap(analyze_guess_word, [(word, possible_words) for word in guess_words])

    avg_occurrences_dict = {word: avg for word, _, avg in results}
    pattern_data_dict = {word: pattern_data for word, pattern_data, _ in results}

    best_guess_word = min(avg_occurrences_dict, key=avg_occurrences_dict.get)
    best_guess_data = {best_guess_word: (pattern_data_dict[best_guess_word], avg_occurrences_dict[best_guess_word])}

    return best_guess_word, best_guess_data

def process_next_guess(guess_word, pattern_dict, guess_words, depth):
    """ Recursively builds the decision tree for each pattern branch. """
    next_level = {}

    for pattern, data in pattern_dict.items():
        possible_words = data["instances"]

        if len(possible_words) == 1 or depth >= MAX_ITERATIONS:  # Stop when one word remains or the depth limit is reached
            next_level[pattern] = {"word": possible_words[0], "instances": possible_words, "depth": depth}
            continue

        next_best_word, next_pattern_data = find_best_next_guess(guess_words, possible_words)
        next_level[pattern] = {
            "word": next_best_word,
            "instances": possible_words,
            "depth": depth,
            "branches": process_next_guess(next_best_word, next_pattern_data[next_best_word][0], guess_words, depth + 1)
        }

    return next_level

def build_decision_tree(root_word, possible_words, guess_words):
    """ Builds the full decision tree starting from the root word. """
    print(f"🔹 Starting decision tree with '{root_word}'")
    # Score the chosen root directly instead of searching for a new best guess,
    # so the tree really starts from `root_word` (the React app hard-codes it).
    _, root_pattern_data, _ = analyze_guess_word(root_word, possible_words)
    decision_tree = {
        "word": root_word,
        "depth": 0,
        "branches": process_next_guess(root_word, root_pattern_data, guess_words, 1)
    }
    return decision_tree

def save_tree_to_json(tree):
    """ Saves the decision tree to a JSON file. """
    with open(JSON_FILE, "w", encoding="utf-8") as file:
        json.dump(tree, file, indent=4)
    print(f"✅ Decision tree saved to {JSON_FILE}")

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn", force=True)

    # Load full datasets (No random sampling)
    possible_words = load_words_from_file("valid-wordle-answers.txt")
    guess_words = load_words_from_file("valid-wordle-guesses.txt")

    # Build the decision tree
    decision_tree = build_decision_tree(ROOT_GUESS, possible_words, guess_words)

    # Save to JSON
    save_tree_to_json(decision_tree)

    print("\n✅ Decision tree generation complete!")

//App.jsx code to render react page using json decision tree
import React, { useState, useEffect } from "react";
import decisionTree from "./decision_tree.json";
import questionMark from "./question_mark.jpeg"; // Ensure this image is in `src/`

const WordleSolver = () => {
    const initialTileState = { letter: "", color: "bg-gray-500" }; // Default Light Gray
    const colorCycle = ["bg-green-500", "bg-yellow-500", "bg-gray-800"]; // Green → Yellow → Dark Gray

    const [rows, setRows] = useState(
        Array(6).fill().map(() => Array(5).fill({ ...initialTileState }))
    );
    const [activeRow, setActiveRow] = useState(0);
    const [showTooltip, setShowTooltip] = useState(false); // Tooltip visibility

    // Set initial row with "TRACE"
    useEffect(() => {
        setRows((prev) => {
            const newRows = [...prev];
            newRows[0] = [
                { letter: "T", color: "bg-gray-500" },
                { letter: "R", color: "bg-gray-500" },
                { letter: "A", color: "bg-gray-500" },
                { letter: "C", color: "bg-gray-500" },
                { letter: "E", color: "bg-gray-500" },
            ];
            return newRows;
        });
    }, []);

    // Handle tile color changes
    const handleTileClick = (rowIndex, colIndex) => {
        if (rowIndex !== activeRow) return; // Only allow clicks on active row

        setRows((prev) => {
            // Copy each row so the updater never mutates the previous state
            // (React StrictMode calls updaters twice in development, which would skip a color)
            const newRows = prev.map((row) => [...row]);
            const currentTile = newRows[rowIndex][colIndex];

            // Cycle through colors (Green → Yellow → Dark Gray → Green)
            const nextColor = colorCycle[(colorCycle.indexOf(currentTile.color) + 1) % colorCycle.length];

            newRows[rowIndex][colIndex] = { ...currentTile, color: nextColor };

            return newRows;
        });
    };

    // Compute pattern from tile colors
    const computePattern = () => {
        return rows[activeRow].map((tile) => {
            if (tile.color === "bg-green-500") return "E";
            if (tile.color === "bg-yellow-500") return "P";
            return "N"; // Default is dark gray
        }).join("");
    };

    // Handle "Next Guess" by traversing the tree
    const handleNextGuess = () => {
        if (activeRow >= 5) return; // Stop at row 6

        const currentPattern = computePattern();
        console.log(`🔍 Pattern entered: ${currentPattern}`);

        let currentBranch = decisionTree;
        let pathTaken = [];

        for (let i = 0; i <= activeRow; i++) {
            const pattern = rows[i].map(tile => {
                if (tile.color === "bg-green-500") return "E";
                if (tile.color === "bg-yellow-500") return "P";
                return "N";
            }).join("");

            pathTaken.push(pattern);

            if (currentBranch.branches && currentBranch.branches[pattern]) {
                currentBranch = currentBranch.branches[pattern];
            } else {
                console.warn(`❌ Pattern '${pattern}' not found at depth ${i}!`);
                console.warn(`📌 Path taken: ${JSON.stringify(pathTaken)}`);
                return;
            }
        }

        if (currentBranch.word) {
            const nextWord = currentBranch.word.toUpperCase();
            console.log(`✅ Next word selected: ${nextWord}`);

            setRows(prev => {
                const newRows = [...prev];
                newRows[activeRow + 1] = nextWord.split("").map(letter => ({
                    letter,
                    color: "bg-gray-500",
                })); // Populate next row with letters in gray

                return [...newRows]; // Ensure re-render
            });

            setActiveRow(prev => prev + 1);
        } else {
            console.warn(`❌ No word found at end of path: ${JSON.stringify(pathTaken)}`);
        }
    };

    // Handle Restart
    const handleRestart = () => {
        // Rebuild the empty grid and put "TRACE" back in the first row in a single update
        const freshRows = Array(6).fill().map(() => Array(5).fill({ ...initialTileState }));
        freshRows[0] = "TRACE".split("").map((letter) => ({ letter, color: "bg-gray-500" }));
        setRows(freshRows);
        setActiveRow(0);
    };

    return (
        <div className="flex flex-col items-center justify-center min-h-screen bg-gray-900 text-white relative">
            {/* Header */}
            <div className="w-full max-w-lg p-4 bg-black flex justify-between items-center relative">
                <h1 className="text-3xl font-bold">WORDLE SOLVER</h1>

                {/* Tooltip Container */}
                <div className="relative flex items-center">
                    <img
                        src={questionMark}
                        alt="Help"
                        className="w-6 h-6 cursor-pointer fixed top-4 right-4 z-50"
                        onMouseEnter={() => setShowTooltip(true)}
                        onMouseLeave={() => setShowTooltip(false)}
                    />

                    {/* Tooltip Content */}
                    {showTooltip && (
                        <div className="absolute right-0 mt-8 w-64 p-3 bg-gray-800 text-sm rounded-lg shadow-lg z-50">
                            <p className="font-bold">HOW TO PLAY:</p>
                            <ul className="list-disc pl-4">
                                <li>Open Wordle and keep this page open.</li>
                                <li>Enter “TRACE” as the first guess.</li>
                                <li>Click tiles to match colors from Wordle.</li>
                                <li>Press "Next Guess" to get the next word.</li>
                                <li>Continue updating tiles and clicking "Next Guess".</li>
                                <li>Click "Restart" to start over.</li>
                            </ul>
                        </div>
                    )}
                </div>
            </div>

            {/* Word Grid */}
            <div className="mt-6 flex flex-col items-center w-full">
                {rows.map((row, rowIndex) => (
                    <div key={rowIndex} className="flex justify-center space-x-2 mb-2">
                        {row.map((tile, colIndex) => (
                            <div
                                key={colIndex}
                                className={`w-14 h-14 flex items-center justify-center text-2xl font-bold border border-gray-400 rounded-lg transition-all duration-200 ${
                                    rowIndex <= activeRow ? tile.color : "bg-gray-500"
                                }`}
                                onClick={() => handleTileClick(rowIndex, colIndex)}
                            >
                                {/* Ensure letters remain visible */}
                                {tile.letter}
                            </div>
                        ))}
                    </div>
                ))}
            </div>

            {/* Buttons */}
            <div className="mt-6 flex space-x-4">
                <button
                    className={`px-4 py-2 text-white rounded ${activeRow < 5 ? "bg-blue-600 hover:bg-blue-700" : "bg-gray-600 cursor-not-allowed"}`}
                    onClick={handleNextGuess}
                    disabled={activeRow >= 5}
                >
                    Next Guess
                </button>
                <button
                    className="px-4 py-2 bg-red-600 hover:bg-red-700 text-white rounded"
                    onClick={handleRestart}
                >
                    Restart
                </button>
            </div>
        </div>
    );
};

export default WordleSolver;