gymwipe.envs.core module

class BaseEnv(frequencyBand, deviceCount)

Bases: gym.core.Env

A subclass of the OpenAI gym environment that models the Radio Resource Manager (RRM) frequency band assignment problem. It sets a frequency band and an action space whose size depends on the number of devices that the frequency band can be assigned to.

The action space is a dict space of two discrete spaces: the device number and the assignment duration.

Parameters:	frequencyBand – The physical frequency band to be used for the simulation
	deviceCount (`int`) – The number of devices to be included in the environment’s action space
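The shape of such an action can be sketched in plain Python, without depending on gym. The key names `"device"` and `"duration"`, the zero-based ranges, and the helper functions are assumptions made for illustration; they are not taken from the gymwipe source.

```python
import random

# Hypothetical stand-ins: key names and value ranges are assumptions,
# not the identifiers used by gymwipe itself.
DEVICE_COUNT = 3
MAX_ASSIGN_DURATION = 20  # value documented for BaseEnv


def sample_action(rng):
    """Draw one action: a device number and an assignment duration."""
    return {
        "device": rng.randrange(DEVICE_COUNT),           # 0 .. DEVICE_COUNT - 1
        "duration": rng.randrange(MAX_ASSIGN_DURATION),  # 0 .. MAX_ASSIGN_DURATION - 1
    }


def is_valid_action(action):
    """Check that both entries fall inside their discrete ranges."""
    return (
        set(action) == {"device", "duration"}
        and 0 <= action["device"] < DEVICE_COUNT
        and 0 <= action["duration"] < MAX_ASSIGN_DURATION
    )


action = sample_action(random.Random(0))
print(action, is_valid_action(action))
```

Each entry is drawn from its own discrete range, so the number of distinct actions in this sketch is `DEVICE_COUNT * MAX_ASSIGN_DURATION`.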

metadata = {'render.modes': ['human']}

MAX_ASSIGN_DURATION = 20

ASSIGNMENT_DURATION_FACTOR = 1000

seed(seed=None): Sets the seed for this environment’s random number generator and returns it in a single-item list.
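The contract described for `seed()` can be illustrated with a minimal sketch based on the standard library. `SeedableEnv` and its `rng` attribute are hypothetical stand-ins; the actual implementation may use a different random number generator.

```python
import random


class SeedableEnv:
    """Hypothetical stand-in for an environment that offers seed()."""

    def __init__(self):
        self.rng = random.Random()

    def seed(self, seed=None):
        # Choose a seed if none was given, so the caller can log and reuse it.
        if seed is None:
            seed = random.SystemRandom().randrange(2**32)
        self.rng.seed(seed)
        # Return the seed in a single-item list, as described above.
        return [seed]


env = SeedableEnv()
used_seeds = env.seed(42)
first_draw = env.rng.random()

# Re-seeding with the returned value reproduces the same random sequence.
env.seed(used_seeds[0])
print(env.rng.random() == first_draw)
```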

render(mode='human', close=False): Renders the environment to stdout.

class Interpreter

Bases: abc.ABC

An Interpreter is an instance that observes the system’s behavior by sniffing the packets received by the RRM’s physical layer and infers observations and rewards for a frequency band assignment learning agent. Thus, the RRM and the learning agent can be used in any domain by swapping only the interpreter.

This class serves as an abstract base class for all Interpreter implementations.

When implementing an interpreter, the following three methods have to be overridden:

onPacketReceived()

getReward()

getObservation()

The following methods provide default implementations that you might also want to override depending on your use case:

reset()

onFrequencyBandAssignment()

getDone()

getInfo()
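As an illustration of this split between required and optional overrides, here is a minimal sketch. To stay runnable without gymwipe, it defines its own stand-in `Interpreter` base class that only mirrors the method names documented here. `PacketCountInterpreter`, its reward rule, and the empty dict returned by the default `getInfo()` are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Interpreter(ABC):
    """Stand-in that mirrors the documented interface; not the gymwipe class."""

    @abstractmethod
    def onPacketReceived(self, senderIndex, receiverIndex, payload):
        """Must be overridden."""

    @abstractmethod
    def getReward(self):
        """Must be overridden."""

    @abstractmethod
    def getObservation(self):
        """Must be overridden."""

    def reset(self):
        pass

    def onFrequencyBandAssignment(self, deviceIndex, duration):
        pass

    def getDone(self):
        return False  # documented default: no episodes

    def getInfo(self):
        return {}  # assumed default for this sketch


class PacketCountInterpreter(Interpreter):
    """Hypothetical interpreter that rewards packets seen since the last assignment."""

    def __init__(self, deviceCount):
        self.deviceCount = deviceCount
        self.reset()

    def reset(self):
        self.packetsPerSender = [0] * self.deviceCount
        self.packetsSinceAssignment = 0

    def onFrequencyBandAssignment(self, deviceIndex, duration):
        # A new assignment starts a fresh reward window.
        self.packetsSinceAssignment = 0

    def onPacketReceived(self, senderIndex, receiverIndex, payload):
        self.packetsPerSender[senderIndex] += 1
        self.packetsSinceAssignment += 1

    def getReward(self):
        return float(self.packetsSinceAssignment)

    def getObservation(self):
        return list(self.packetsPerSender)


interpreter = PacketCountInterpreter(deviceCount=2)
interpreter.onFrequencyBandAssignment(deviceIndex=0, duration=5)
interpreter.onPacketReceived(0, 1, payload="hello")
print(interpreter.getObservation(), interpreter.getReward())
```

Because the three required methods are abstract in the stand-in, instantiating a subclass that omits one of them raises a `TypeError`.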

onPacketReceived(senderIndex, receiverIndex, payload)

Is invoked whenever the RRM receives a packet that is not addressed to it.

Parameters:	senderIndex (`int`) – The device index of the received packet’s sender (as in the gym environment’s action space)
	receiverIndex (`int`) – The device index of the received packet’s receiver (as in the gym environment’s action space)
	payload (`Transmittable`) – The received packet’s payload

onFrequencyBandAssignment(deviceIndex, duration)

Is invoked whenever the RRM assigns the frequency band.

Parameters:	deviceIndex (`int`) – The index (as in the gym environment’s action space) of the device that the frequency band is assigned to
	duration (`int`) – The duration of the assignment in multiples of `TIME_SLOT_LENGTH`

getReward()

Returns a reward that depends on the last channel assignment.

Return type:	`float`

getObservation()

Returns an observation of the system’s state.

Return type:	`Any`

getDone()

Returns whether an episode has ended.

Note

Reinforcement learning problems do not have to be split into episodes. In this case, you do not have to override the default implementation as it always returns False.

Return type:	`bool`

getInfo()

Returns a dict providing additional information on the environment’s state that may be useful for debugging but must not be used by a learning agent.

Return type:	`Dict`

getFeedback()

You may want to call this at the end of a frequency band assignment to get feedback for your learning agent. The return values are in the order in which the step() method of a gym environment has to return them.

Return type:	`Tuple`[`Any`, `float`, `bool`, `Dict`]
Returns:	A 4-tuple with the results of `getObservation()`, `getReward()`, `getDone()`, and `getInfo()`
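The composition of the 4-tuple can be sketched as follows. `FixedInterpreter` is a hypothetical stand-in with constant return values, used only to show the ordering; it is not the gymwipe implementation.

```python
class FixedInterpreter:
    """Hypothetical stand-in with constant return values."""

    def getObservation(self):
        return [0, 1]

    def getReward(self):
        return 1.0

    def getDone(self):
        return False

    def getInfo(self):
        return {"note": "debug only"}

    def getFeedback(self):
        # Same order as the return value of a gym environment's step():
        # observation, reward, done, info.
        return (
            self.getObservation(),
            self.getReward(),
            self.getDone(),
            self.getInfo(),
        )


observation, reward, done, info = FixedInterpreter().getFeedback()
print(observation, reward, done, info)
```

An environment's `step()` could then return the tuple unchanged once the assignment has finished.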

reset(): This method is invoked when the environment is reset. Override it to perform any initialization tasks your interpreter needs.