edelwace-0.1.0.0: HaskTorch Reinforcement Learning Agents for GACE
Safe Haskell: None
Language: Haskell2010

RPB.HER

Description

Hindsight Experience Replay

Documentation

data Strategy Source #

Hindsight Experience Replay Strategies for choosing Goals

Constructors

Final

Only the final state of each episode is used as an additional target.

Random

Replay with k random states encountered so far (essentially a plain replay buffer, RPB).

Episode

Replay with k random states from the same episode.

Future

Replay with k random states from the same episode that were observed after the transition being replayed.

Instances

Eq Strategy Source # 

Defined in RPB.HER

Show Strategy Source # 

Defined in RPB.HER
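
The four strategies differ only in which stored states may serve as replacement goals for a given transition. The following is a minimal, list-based sketch of that difference and not the RPB.HER implementation: the local `Strategy` type merely mirrors the constructors above, and `candidates`, the buffer size `n`, the inclusive episode bounds and the transition index `t` are hypothetical names introduced for illustration.

```haskell
-- Hypothetical stand-in that mirrors the documented constructors.
data Strategy = Final | Random | Episode | Future
  deriving (Eq, Show)

-- | Indices from which replacement goals may be drawn for the transition
-- at index t, given the buffer size n and the (inclusive) first and last
-- index of the episode that contains t.
candidates :: Strategy -> Int -> (Int, Int) -> Int -> [Int]
candidates Final   _ (_, end)     _ = [end]            -- only the final state
candidates Random  n _            _ = [0 .. n - 1]     -- anything stored so far
candidates Episode _ (start, end) _ = [start .. end]   -- same episode
candidates Future  _ (_, end)     t = [t + 1 .. end]   -- same episode, later steps
```

Under these assumptions the last transition of an episode has no `Future` candidates, so a sampler built on top of this would have to skip it or fall back to another strategy.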

data Buffer a Source #

Strict Simple/Naive Replay Buffer

Constructors

Buffer 

Fields

Instances

Functor Buffer Source # 

Defined in RPB.HER

Methods

fmap :: (a -> b) -> Buffer a -> Buffer b #

(<$) :: a -> Buffer b -> Buffer a #

Eq a => Eq (Buffer a) Source # 

Defined in RPB.HER

Methods

(==) :: Buffer a -> Buffer a -> Bool #

(/=) :: Buffer a -> Buffer a -> Bool #

Show a => Show (Buffer a) Source # 

Defined in RPB.HER

Methods

showsPrec :: Int -> Buffer a -> ShowS #

show :: Buffer a -> String #

showList :: [Buffer a] -> ShowS #

mkBuffer :: Buffer Tensor Source #

Create a new, empty HER Buffer on the GPU

empty :: Buffer Tensor Source #

Create an empty HER Buffer

size :: Buffer Tensor -> Int Source #

Number of trajectories currently stored in the buffer

push :: Int -> Tensor -> Buffer Tensor -> Tensor -> Tensor -> Tensor -> Tensor -> Tensor -> Buffer Tensor Source #

Calculate the reward and done flags, then push the new memories into the buffer

push' :: Int -> Buffer Tensor -> Buffer Tensor -> Buffer Tensor Source #

Push one buffer into another one

push'' :: Int -> Buffer Tensor -> Tensor -> Tensor -> Tensor -> Tensor -> Tensor -> Tensor -> Tensor -> Buffer Tensor Source #

Alternative Push if tensors are not in a buffer yet

drop' :: Buffer Tensor -> Buffer Tensor Source #

Drop everything after the last done flag (used for a single episode)
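
As a rough illustration of the trimming described here, the sketch below works on a plain list of transitions paired with their done flags instead of a `Buffer Tensor`. The name `dropAfterLastDone` and the list representation are assumptions made for this example only.

```haskell
import Data.List (dropWhileEnd)

-- | Keep everything up to and including the last transition whose done
-- flag is set; the unfinished tail behind it is discarded.
dropAfterLastDone :: [(a, Bool)] -> [(a, Bool)]
dropAfterLastDone = dropWhileEnd (not . snd)
```

In this sketch a list without any done flag is trimmed to the empty list.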

drop :: Int -> Buffer Tensor -> Buffer Tensor Source #

Drop the given number of entries from the beginning of the buffer

envSplit :: Int -> Buffer Tensor -> [Buffer Tensor] Source #

Split a buffer collected from a pool of environments into one buffer per environment
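
How the rows of a pooled buffer map to environments is not stated here. The sketch below assumes, purely for illustration, that rows are interleaved so that row i belongs to environment i `mod` n; `splitByEnv` and the list representation are hypothetical.

```haskell
-- | Split an interleaved list into n per-environment lists.
-- ASSUMPTION: row i was produced by environment (i `mod` n).
splitByEnv :: Int -> [a] -> [[a]]
splitByEnv n xs =
  [ [ x | (i, x) <- zip [0 ..] xs, i `mod` n == e ]  -- rows of environment e
  | e <- [0 .. n - 1]
  ]
```

If the real layout is blocked by environment instead of interleaved, chunking the list into n equal parts would be the analogous operation.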

epsSplit :: Buffer Tensor -> [Buffer Tensor] Source #

Split a buffer into episodes, dropping the last, unfinished one
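
The episode split can be pictured on a plain list of transitions tagged with done flags. This is a sketch under that assumption, with the hypothetical name `splitEpisodes`; it is not the tensor-based implementation.

```haskell
-- | Split transitions into complete episodes. Each episode ends with the
-- transition whose done flag is set; a trailing run without such a
-- transition is discarded.
splitEpisodes :: [(a, Bool)] -> [[(a, Bool)]]
splitEpisodes xs = case break snd xs of
  (_, [])              -> []                               -- no terminal step left
  (steps, final : rest) -> (steps ++ [final]) : splitEpisodes rest
```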

sample :: Tensor -> Buffer Tensor -> Buffer Tensor Source #

Select the entries at the given indices from the buffer

sampleTargets :: Strategy -> Int -> Tensor -> Buffer Tensor -> IO (Buffer Tensor) Source #

Sample additional goals according to Strategy (drop first). Random is basically the same as Episode; you just have to give it the entire buffer, not just the episode.
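
Once goal indices have been sampled, hindsight relabeling amounts to copying a transition and substituting its goal with a state that was actually achieved. The sketch below shows only that substitution step on a hypothetical `Transition` record with scalar states; the record, its fields and `relabel` are assumptions for illustration and do not reflect the layout of `Buffer Tensor`.

```haskell
-- Hypothetical transition record with scalar states, for illustration only.
data Transition = Transition
  { achieved :: Double  -- state actually reached after the step
  , goal     :: Double  -- goal the agent was pursuing
  } deriving (Eq, Show)

-- | For the transition at index t, emit one copy per sampled index, with
-- the goal replaced by the state achieved at that index.
relabel :: [Transition] -> Int -> [Int] -> [Transition]
relabel episode t goalIdxs =
  [ (episode !! t) { goal = achieved (episode !! g) } | g <- goalIdxs ]
```

Rewards and done flags of the relabeled copies would still have to be recomputed against the substituted goal.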

asRPB :: Buffer Tensor -> Buffer Tensor Source #

Convert HER Buffer to RPB for training

targetCriterion :: Map String Bool -> Tensor Source #

Convert target predicate map to boolean mask tensor
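
A list-based analogue of this conversion, assuming the mask follows the ascending key order of the map (an assumption, since the ordering is not documented here), could look like the sketch below. `maskFromMap` is a hypothetical name and returns a list of `Bool` in place of a `Tensor`.

```haskell
import qualified Data.Map as Map

-- | Flatten a predicate map into a boolean mask.
-- ASSUMPTION: entries are ordered by ascending key, which is the order
-- in which Map.elems returns the values.
maskFromMap :: Map.Map String Bool -> [Bool]
maskFromMap = Map.elems
```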