Pygame Basics for Building a Customized Reinforcement Learning Environment (Grid World)

Oct 28, 2021

When studying Reinforcement Learning (RL), we usually start with the OpenAI Gym library. However, I needed a simple grid-world environment and found that Gym did not provide one. At first, I printed out a matrix to visualize the environment; then I came across Pygame as a solution.

Game Logic

The grid-world game logic is simple. A tile starts at the starting point and needs to reach the target point. Reaching the target is a win and earns points (100 in the code later in this post); moving out of the window or crashing into the boundary is a loss with a score of 0. We can then add obstacles to make it a little more challenging: avoiding obstacles earns more points, and crashing into one is a loss. The player with the highest score is the champion.
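
The rules above can be written down as a small pure-Python function, which is handy for checking the logic before any drawing happens. This is a minimal sketch, not the code from the repository: the grid size assumes a 640x480 window with 40-pixel tiles, and the OBSTACLES positions and the outcome helper are hypothetical names chosen for illustration.

```python
# Grid dimensions: a 640x480 window split into 40-pixel tiles.
COLS, ROWS = 16, 12

START = (0, ROWS - 1)    # bottom-left cell
TARGET = (COLS - 1, 0)   # top-right cell

# Hypothetical obstacle cells, chosen only for illustration.
OBSTACLES = {(5, 5), (5, 6), (10, 2)}


def outcome(col, row):
    """Classify a cell as 'win', 'lose', or 'continue'."""
    if not (0 <= col < COLS and 0 <= row < ROWS):
        return "lose"        # fell off the grid
    if (col, row) in OBSTACLES:
        return "lose"        # crashed into an obstacle
    if (col, row) == TARGET:
        return "win"         # reached the target
    return "continue"
```

For example, outcome(*START) returns "continue" and outcome(*TARGET) returns "win".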

Game Init & UI (Env, State)

We need to set up the game: a representation of the player and a target, visualized in a window. Pygame's display and draw modules let us set up the interface.

import pygame

pygame.init()

### UI ###
# screen size
w = 640
h = 480

# tile size
BLOCK_SIZE = 40

# colors
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
RED = (200, 0, 0)
BLUE = (0, 0, 255)

# set up the screen
screen = pygame.display.set_mode((w, h))
pygame.display.set_caption('Grid World')

# set up the timer that caps the frame rate
clock = pygame.time.Clock()
# ticks (frames) per second
SPEED = 60

# agent position: bottom-left tile
x = 0
y = h - BLOCK_SIZE

### Game Loop/Progress ###
# Run until the player quits
playing = True
while playing:
    # Player events: mouse click, keyboard
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            playing = False

    screen.fill(BLACK)

    # Draw the agent in the bottom left and the target in the top right
    pygame.draw.rect(screen, RED, pygame.Rect(x, y, BLOCK_SIZE, BLOCK_SIZE))
    pygame.draw.rect(screen, BLUE, pygame.Rect(w - BLOCK_SIZE, 0, BLOCK_SIZE, BLOCK_SIZE))

    # Flip the display
    pygame.display.flip()
    # Limit the loop to SPEED frames per second
    clock.tick(SPEED)

pygame.quit()
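
Because every move is exactly one tile, it can help to think in grid cells and convert to pixels only when drawing. The following is a minimal sketch assuming the same window and tile sizes as the code above; cell_to_pixel and pixel_to_cell are illustrative helper names, not part of Pygame.

```python
W, H = 640, 480         # window size in pixels (same values as w and h)
BLOCK_SIZE = 40         # tile size in pixels

COLS = W // BLOCK_SIZE  # number of columns
ROWS = H // BLOCK_SIZE  # number of rows


def cell_to_pixel(col, row):
    """Return the top-left pixel of a grid cell; row 0 is the top row."""
    return col * BLOCK_SIZE, row * BLOCK_SIZE


def pixel_to_cell(x, y):
    """Return the grid cell that contains pixel (x, y)."""
    return x // BLOCK_SIZE, y // BLOCK_SIZE
```

With these helpers, the agent's start tile is cell_to_pixel(0, ROWS - 1), which gives (0, 440), the same position as x = 0 and y = h - BLOCK_SIZE.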

Progress & Events (Actions) → Update UI

Pygame is a wrapper around a library called Simple DirectMedia Layer (SDL), which gives us access to hardware such as the keyboard and mouse. This bridges the screen and the player: the interface listens for player actions/events such as key presses and mouse clicks, and we change the environment/screen according to those actions.

    # inside the game loop: handle quit and arrow-key events
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            playing = False
        if event.type == pygame.KEYDOWN:
            if event.key == pygame.K_LEFT:
                x -= BLOCK_SIZE
            elif event.key == pygame.K_RIGHT:
                x += BLOCK_SIZE
            elif event.key == pygame.K_UP:
                y -= BLOCK_SIZE
            elif event.key == pygame.K_DOWN:
                y += BLOCK_SIZE
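
For an RL agent, the four arrow keys become four discrete actions. The sketch below shows one way to express the same moves without the keyboard; the action numbering and the apply_action helper are assumptions made for illustration, not something Pygame or the original code defines.

```python
BLOCK_SIZE = 40  # tile size in pixels, as in the UI setup

# Hypothetical action ids mapped to (dx, dy) in pixels.
# Pygame's y axis points down, so "up" is a negative dy.
ACTIONS = {
    0: (-BLOCK_SIZE, 0),   # left
    1: (BLOCK_SIZE, 0),    # right
    2: (0, -BLOCK_SIZE),   # up
    3: (0, BLOCK_SIZE),    # down
}


def apply_action(x, y, action):
    """Return the new (x, y) after taking one action."""
    dx, dy = ACTIONS[action]
    return x + dx, y + dy
```

A keyboard handler could then translate each arrow key to one of these ids, so the human player and the agent share the same movement code.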

Update Score (Getting Reward)

Here we define no reward until the player arrives at the target, and no penalty for moving off-screen. When an AI agent is playing, however, we will need to design a dense reward that is given at every step. The reward is granted on arrival at the destination, as shown in the Game Over section below.
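
As one possible illustration of a dense reward, the sketch below charges a small cost on every step and adds a bonus for getting closer to the target, measured in tiles by Manhattan distance. The function names and the specific values (a step cost of -1, and 100 on arrival to match the score used in this post) are assumptions, not the design from the follow-up post, and off-screen positions are not handled here.

```python
BLOCK_SIZE = 40
W, H = 640, 480
TARGET = (W - BLOCK_SIZE, 0)  # top-right tile, in pixels


def distance_to_target(x, y):
    """Manhattan distance to the target, counted in tiles."""
    return (abs(TARGET[0] - x) + abs(TARGET[1] - y)) // BLOCK_SIZE


def dense_reward(old_pos, new_pos):
    """Reward for one move: 100 on arrival, otherwise a step cost of -1
    plus the number of tiles the move brought the agent closer."""
    if new_pos == TARGET:
        return 100
    progress = distance_to_target(*old_pos) - distance_to_target(*new_pos)
    return -1 + progress
```

Under these assumed values, a step toward the target costs nothing overall, while a step away from it is penalized.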

You might want to visualize the score by adding the code below:

font = pygame.font.Font(None, 20)
...
# display the score
text = font.render("Score: " + str(score), True, WHITE)
screen.blit(text, [0, 0])

Game Over

The game is over when the player quits, when the agent arrives at the destination, or when the agent moves off the screen/boundary.

# game state
off_screen = False
arrived = False
score = 0
...
    # moves off the screen
    if x < 0 or x > w - BLOCK_SIZE or y < 0 or y > h - BLOCK_SIZE:
        off_screen = True

    # arrives at the destination
    if x == w - BLOCK_SIZE and y == 0:
        arrived = True
        score = 100
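
The same flags can also be checked without opening a window, which makes the game-over logic easy to test. The following is a minimal sketch, assuming moves are given as (dx, dy) steps counted in tiles; run_moves is a hypothetical helper, and the two conditions are the ones from the snippet above.

```python
W, H = 640, 480
BLOCK_SIZE = 40


def run_moves(moves):
    """Apply (dx, dy) tile moves from the start tile until the game ends.

    Returns (off_screen, arrived, score).
    """
    x, y = 0, H - BLOCK_SIZE          # start in the bottom-left tile
    off_screen, arrived, score = False, False, 0

    for dx, dy in moves:
        x += dx * BLOCK_SIZE
        y += dy * BLOCK_SIZE

        # moved off the screen
        if x < 0 or x > W - BLOCK_SIZE or y < 0 or y > H - BLOCK_SIZE:
            off_screen = True
        # arrived at the destination
        if x == W - BLOCK_SIZE and y == 0:
            arrived = True
            score = 100

        if off_screen or arrived:
            break                     # game over, stop processing moves

    return off_screen, arrived, score
```

In the Pygame loop, the equivalent step would be to set playing = False once off_screen or arrived becomes true.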

The complete code is here: https://github.com/reneelin1712/gridWorld/blob/main/demo.py

In the next post, I will adapt the code from a human player to an AI player by implementing an RL algorithm.