Learning Progress (Figure 8)
This video shows the entire learning process: the reward, the sender and receiver policies, the permissibility regions, and the symbols (one per column) for both agents (top and bottom rows). The individual steps are also listed in the table below, with links to the PDFs.