Uno AI Agent

I built two autonomous AI agents that win at Uno.

THE PROJECT

I built two AI agents that play and win Uno, a multiplayer card game driven by uncertainty, hidden information, and shifting odds. Both use Monte Carlo search methods that I implemented from scratch in Java. Rather than following hand-written rules for every situation, the agents discover what works by simulating thousands of possible futures, tracking which moves lead to wins, and acting on that evidence. The result is a pair of agents that reason under uncertainty, adapt to different opponents, and play competitive Uno without ever being told what a good move looks like.

ALGORITHMIC ARCHITECTURE

I engineered two progressively sophisticated MCTS implementations, each exploring the game tree in a fundamentally different way.

01 · EXPECTED OUTCOME AGENT

Flat Monte Carlo evaluation

One child per legal move. Equal rollouts on each. Root Q-values updated after every simulation.
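
The flat evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name, the integer move indexing, and the `toyRollout` stand-in (which replaces a real Uno playout with a coin flip per move) are all assumptions made for the example.

```java
import java.util.Random;
import java.util.function.IntToDoubleFunction;

/** Flat Monte Carlo sketch: equal rollouts per root move, then pick the best running average. */
class FlatMonteCarloSketch {

    /**
     * @param numMoves        number of legal moves at the root (hypothetical indexing 0..numMoves-1)
     * @param rolloutsPerMove equal rollout budget given to every move
     * @param rollout         hypothetical simulator: plays move i, finishes the game, returns the reward
     * @return index of the move with the highest average reward
     */
    static int bestMove(int numMoves, int rolloutsPerMove, IntToDoubleFunction rollout) {
        double[] q = new double[numMoves]; // running average reward per root move
        int[] n = new int[numMoves];       // visit count per root move

        for (int r = 0; r < rolloutsPerMove; r++) {
            for (int move = 0; move < numMoves; move++) {
                double reward = rollout.applyAsDouble(move);
                n[move]++;
                // Incremental mean: the root Q-value is updated after every simulation.
                q[move] += (reward - q[move]) / n[move];
            }
        }

        int best = 0;
        for (int move = 1; move < numMoves; move++) {
            if (q[move] > q[best]) {
                best = move;
            }
        }
        return best;
    }

    /** Toy stand-in for a game rollout: move i wins (+1) with probability winProb[i], else loses (-1). */
    static IntToDoubleFunction toyRollout(double[] winProb, Random rng) {
        return move -> rng.nextDouble() < winProb[move] ? 1.0 : -1.0;
    }
}
```

In a real agent the `rollout` function would copy the game state, apply the candidate move, and play the game out with the rollout policy.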

02 · UCT AGENT

Upper Confidence Trees

Dynamic tree built using UCB1 to balance exploration and exploitation at every node level.
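
A minimal sketch of UCB1 child selection, under two assumptions: unvisited children are always tried first, and the parent visit count is taken as the sum of its children's visits. The class and method names and the exploration constant `c` are illustrative; the project's actual constant is not stated here.

```java
/** UCB1 sketch: score = average reward + exploration bonus. */
class Ucb1Sketch {

    /**
     * @param q            average reward observed for the child
     * @param parentVisits number of times the parent node was visited
     * @param childVisits  number of times this child was visited
     * @param c            exploration constant (sqrt(2) is a common default; assumed here)
     */
    static double ucb1(double q, int parentVisits, int childVisits, double c) {
        if (childVisits == 0) {
            return Double.POSITIVE_INFINITY; // untried children are selected first
        }
        return q + c * Math.sqrt(Math.log(parentVisits) / childVisits);
    }

    /** Returns the index of the child with the highest UCB1 score. */
    static int select(double[] q, int[] visits, double c) {
        int parentVisits = 0;
        for (int v : visits) {
            parentVisits += v; // assumption: parent visits = sum of child visits
        }
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < q.length; i++) {
            double score = ucb1(q[i], parentVisits, visits[i], c);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }
}
```

With equal visit counts the child with the higher average wins; a child with far fewer visits can overtake it through the bonus term alone.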

03 · BACKPROPAGATION

Q-value propagation up the path

After each rollout, the outcome flows back through every node visited, root included. This is UCT's key advantage over flat Monte Carlo, which updates only root-level statistics.
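
Backpropagation along the visited path might look like the sketch below. It assumes each node keeps a parent pointer and that the reward is always expressed from the searching agent's point of view; the `Node` fields are hypothetical names, not the project's.

```java
/** Backpropagation sketch: walk from the leaf to the root, updating running averages. */
class BackpropSketch {

    static final class Node {
        final Node parent; // null for the root
        int visits;        // number of rollouts that passed through this node
        double q;          // running average reward of those rollouts

        Node(Node parent) {
            this.parent = parent;
        }
    }

    /** Applies one rollout result to every node on the path, root included. */
    static void backpropagate(Node leaf, double reward) {
        for (Node node = leaf; node != null; node = node.parent) {
            node.visits++;
            node.q += (reward - node.q) / node.visits; // incremental mean
        }
    }
}
```

The incremental-mean update avoids storing per-rollout results: each node needs only a count and an average.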

04 · ROLLOUT POLICY

Uniform random simulation

Every player draws from their legal moves at random. With enough rollouts, the signal rises above the noise.
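
The uniform policy itself is small. A sketch, assuming the engine supplies a non-empty list of legal moves (the generic move type `M` and the method name are illustrative, and how the engine represents "draw a card" is not specified here):

```java
import java.util.List;
import java.util.Random;

/** Uniform random rollout policy sketch: every legal move is equally likely. */
class RandomRolloutPolicySketch {

    static <M> M pickMove(List<M> legalMoves, Random rng) {
        if (legalMoves.isEmpty()) {
            throw new IllegalArgumentException("no legal moves to choose from");
        }
        return legalMoves.get(rng.nextInt(legalMoves.size()));
    }
}
```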

05 · REWARD SHAPING

Heuristic when depth cap hits

Win = +1, loss = -1 at game end. At the mid-game depth cap: r = opponents' cards minus my cards. Only minimal domain knowledge needed.
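
One way to write this reward, as a sketch. It assumes a game ends when any hand is empty and, for games with several opponents, reads "opponents' cards" as the total across all opponents; that interpretation and the names are mine for the example.

```java
/** Reward sketch: terminal win/loss, otherwise a hand-size differential. */
class RewardSketch {

    /**
     * @param myCards       number of cards in the agent's hand
     * @param opponentCards number of cards in each opponent's hand
     * @return +1 on a win, -1 on a loss, else (total opponent cards - my cards)
     */
    static double reward(int myCards, int[] opponentCards) {
        if (myCards == 0) {
            return 1.0; // agent emptied its hand: win
        }
        int total = 0;
        for (int cards : opponentCards) {
            if (cards == 0) {
                return -1.0; // an opponent emptied their hand: loss
            }
            total += cards;
        }
        return total - myCards; // depth cap reached: heuristic value
    }
}
```

The raw differential is not on the same scale as the terminal values of +1 and -1, so an implementation may want to scale or clip it. Whether the project does so is not stated here.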

06 · ANYTIME ALGORITHM

Budget-driven search

The outer loop runs until the time budget expires. More compute = better Q-value estimates. Stop whenever.
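
A budget-driven outer loop can be sketched like this. The `iteration` and `bestSoFar` callbacks are hypothetical hooks standing in for one search iteration and the current best-move lookup; the sketch always runs at least one iteration so that some answer exists.

```java
import java.util.function.IntSupplier;

/** Anytime search sketch: iterate until the wall-clock budget expires, then answer. */
class AnytimeSearchSketch {

    /**
     * @param budgetMillis wall-clock budget for this move
     * @param iteration    one search iteration (select, expand, simulate, backpropagate)
     * @param bestSoFar    returns the best known move index at any time
     */
    static int search(long budgetMillis, Runnable iteration, IntSupplier bestSoFar) {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        do {
            iteration.run();
        } while (System.nanoTime() - deadline < 0); // overflow-safe comparison
        return bestSoFar.getAsInt();
    }
}
```

`System.nanoTime()` is used rather than `System.currentTimeMillis()` because it is monotonic and intended for measuring elapsed time.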

RESULTS AND PERFORMANCE

Both agents consistently outperform random opponents across 2-player and 4-player configurations, with the UCT agent demonstrating sharper decision-making as the game tree deepens over successive iterations. The Q-value estimates converge quickly enough within the time budget to produce reliably strong move selection.

SKILLS AND CONCEPTS

Core AI algorithms

Monte Carlo Tree Search (MCTS)

Implemented the full MCTS loop (selection, expansion, simulation, and backpropagation) tailored to Uno's branching legal moves and multi-player turns.

UCT / UCB1 selection

Coded UCB1 at each decision node so the agent balances exploiting strong lines with exploring under-sampled branches.

Flat Monte Carlo baseline

Built a root-only Monte Carlo agent that allocates rollouts evenly across legal moves for comparison against UCT.

Q-value estimation from simulations

Maintained running averages and visit counts per (state, action) to rank moves from empirical win rates and shaped rewards.

Reward shaping at non-terminal depth

When rollouts hit the depth cap, used a hand-size differential heuristic so partial games still produce a useful evaluation signal.

Search and decision-making concepts

Anytime search under a wall-clock budget

Structured the outer loop so the agent returns the best move known when the per-move time budget expires.

Exploration vs exploitation

UCB1 formalizes the tradeoff between playing what looks best so far and probing moves with uncertain value.

Incremental tree growth

Unlike flat MC, UCT adds one child per iteration so effort concentrates along promising lines of play.

Root move ordering by empirical value

After search, selected the legal move at the root with the strongest aggregated Q-value or visit-weighted score.

Probability and statistics

Monte Carlo averaging

Treated each rollout as a sample; averaged outcomes so estimates stabilize as visit counts grow.

Law of large numbers (intuition)

With enough random rollouts, the empirical mean reward at a node approaches the expectation under the rollout policy.

Log term in UCB for confidence width

The √(log N / N_child) term shrinks as a child accumulates visits and grows, slowly, for children that stay rarely tried while the parent's visit count N keeps rising, which drives exploration.

Game theory and multi-agent reasoning

Stochastic multi-player games

Extended search and rollouts to 2- and 4-player Uno with rotating turns and different opponent models in simulation.

Hidden information

Opponents' hands are unknown; rollouts use the engine's information model so the agent reasons under uncertainty without cheating.

Opponents as part of the environment

Randomized opponent play in rollouts approximates a wide distribution of behaviors without hand-coded strategies.

Equilibrium concepts (background)

The agents do not solve for Nash equilibria explicitly; MCTS instead finds strong empirical responses via sampling.

Software engineering and Java

Java agent and game engine integration

Implemented agents against the course game API: legal move enumeration, state copy for simulation, and clean separation of search from rules.

Java Swing visualizer

Built a real-time display of hands, piles, and play so experiments and debugging are observable.

Data structures for the search tree

Used maps and node records to track children, visit counts, and Q accumulators per expanded state.
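
A node record along these lines would support that bookkeeping. This is a sketch with hypothetical names: children are keyed by move in a map, and untried moves are kept in a queue so that exactly one child can be added per search iteration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Search-tree node sketch: children by move, plus visit count and Q accumulator. */
class SearchTreeSketch {

    static final class Node<M> {
        final Map<M, Node<M>> children = new HashMap<>(); // expanded moves only
        final Deque<M> untried;                            // legal moves not yet expanded
        int visits;                                        // rollouts through this node
        double q;                                          // running average reward

        Node(List<M> legalMoves) {
            this.untried = new ArrayDeque<>(legalMoves);
        }

        boolean fullyExpanded() {
            return untried.isEmpty();
        }

        /**
         * Expands one untried move and returns the new child.
         * childLegalMoves would come from the game engine after applying the move.
         * Throws NoSuchElementException if the node is already fully expanded.
         */
        Node<M> expandOne(List<M> childLegalMoves) {
            M move = untried.pop();
            Node<M> child = new Node<>(childLegalMoves);
            children.put(move, child);
            return child;
        }
    }
}
```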

Profiling and time budgets

Bounded search by wall time so the agent respects engine limits and stays competitive in timed matches.

Key Metrics

Implementation: Java
Core Algorithm: MCTS + UCT
Game Support: 4-Player

Technology Stack

Java, Monte Carlo Tree Search, UCT / UCB1, Backpropagation, Q-value Estimation, Java Swing, Game Tree Search, Heuristic Design