gymwipe.envs.core module

class BaseEnv(frequencyBand, deviceCount)

Bases: gym.core.Env

A subclass of the OpenAI gym environment that models the Radio Resource Manager frequency band assignment problem. It sets a frequency band and an action space (depending on the number of devices to be used for frequency band assignment).

The action space is a dict space of two discrete spaces: the device number and the assignment duration.
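As a rough illustration of the action structure described above, the following stdlib-only sketch models an action as a plain dict and checks it against two discrete ranges. The key names `device` and `duration`, the helper `is_valid_action`, and the assumption that the duration range has `MAX_ASSIGN_DURATION` values starting at zero are illustrative guesses, not taken from the gymwipe source; the environment itself presumably builds its action space from gym's space classes.

```python
from typing import Dict

MAX_ASSIGN_DURATION = 20  # value documented for BaseEnv


def is_valid_action(action: Dict[str, int], device_count: int) -> bool:
    """Check a dict action against two discrete ranges (hypothetical helper)."""
    # Assumed key names; a discrete space of size n holds the integers 0..n-1.
    device_ok = 0 <= action.get("device", -1) < device_count
    # Assumption: the duration space has MAX_ASSIGN_DURATION possible values.
    duration_ok = 0 <= action.get("duration", -1) < MAX_ASSIGN_DURATION
    return device_ok and duration_ok


print(is_valid_action({"device": 2, "duration": 5}, device_count=3))  # True
print(is_valid_action({"device": 3, "duration": 5}, device_count=3))  # False
```

The second call fails because, with three devices, the valid device numbers are 0, 1, and 2.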

Parameters:
  • frequencyBand (FrequencyBand) – The physical frequency band to be used for the simulation
  • deviceCount (int) – The number of devices to be included in the environment’s action space
metadata = {'render.modes': ['human']}
MAX_ASSIGN_DURATION = 20
ASSIGNMENT_DURATION_FACTOR = 1000
seed(seed=None)

Sets the seed for this environment’s random number generator and returns it in a single-item list.
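The "single-item list" return convention can be sketched as follows. This is a stand-in built on the standard library's `random` module, not the gymwipe implementation (which presumably relies on gym's own seeding utilities); the class name `SeedableSketch` and the attribute `_rng` are hypothetical.

```python
import random
from typing import List, Optional


class SeedableSketch:
    """Hypothetical stand-in showing the seed() return convention."""

    def __init__(self) -> None:
        self._rng = random.Random()

    def seed(self, seed: Optional[int] = None) -> List[int]:
        # Without an explicit seed, draw one so that it can still be reported
        if seed is None:
            seed = random.SystemRandom().randrange(2**32)
        self._rng.seed(seed)
        # Return the seed that was actually used, wrapped in a list
        return [seed]


env = SeedableSketch()
print(env.seed(42))  # [42]
```

Returning the seed makes a run reproducible even when the caller did not choose the seed.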

render(mode='human', close=False)

Renders the environment to stdout.

class Interpreter

Bases: abc.ABC

An Interpreter is an instance that observes the system’s behavior by sniffing the packets received by the RRM’s physical layer and infers observations and rewards for a frequency band assignment learning agent. Thus, the RRM and the learning agent can be used in any domain by swapping only the interpreter.

This class serves as an abstract base class for all Interpreter implementations.

When implementing an interpreter, the following three methods have to be overridden:

The following methods provide default implementations that you might also want to override depending on your use case:

onPacketReceived(senderIndex, receiverIndex, payload)

Is invoked whenever the RRM receives a packet that is not addressed to it.

Parameters:
  • senderIndex (int) – The device index of the received packet’s sender (as in the gym environment’s action space)
  • receiverIndex (int) – The device index of the received packet’s receiver (as in the gym environment’s action space)
  • payload (Transmittable) – The received packet’s payload
onFrequencyBandAssignment(deviceIndex, duration)

Is invoked whenever the RRM assigns the frequency band.

Parameters:
  • deviceIndex (int) – The index (as in the gym environment’s action space) of the device that the frequency band is assigned to.
  • duration (int) – The duration of the assignment in multiples of TIME_SLOT_LENGTH
getReward()

Returns a reward that depends on the last channel assignment.

Return type: float
getObservation()

Returns an observation of the system’s state.

Return type: Any
getDone()

Returns whether an episode has ended.

Note

Reinforcement learning problems do not have to be split into episodes. For a non-episodic problem, you do not have to override the default implementation, as it always returns False.

Return type: bool
getInfo()

Returns a dict providing additional information on the environment’s state that may be useful for debugging but must not be used by a learning agent.

Return type: Dict
getFeedback()

You may want to call this at the end of a frequency band assignment to get feedback for your learning agent. The return values are in the order in which the step() method of a gym environment has to return them.

Return type: Tuple[Any, float, bool, Dict]
Returns: A 4-tuple with the results of getObservation(), getReward(), getDone(), and getInfo()
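The ordering described for getFeedback() can be sketched like this. `SketchInterpreter` is a hypothetical stand-in with placeholder return values, not the gymwipe class; it only shows how the four getters might be combined into the tuple a gym-style step() returns.

```python
from typing import Any, Dict, Tuple


class SketchInterpreter:
    """Hypothetical stand-in for an Interpreter implementation."""

    def getObservation(self) -> Any:
        return [0, 0]  # placeholder observation

    def getReward(self) -> float:
        return 0.0  # placeholder reward

    def getDone(self) -> bool:
        return False  # non-episodic behavior, as the documented default

    def getInfo(self) -> Dict:
        return {}  # debugging information only

    def getFeedback(self) -> Tuple[Any, float, bool, Dict]:
        # Same order as the 4-tuple returned by a gym environment's step()
        return (self.getObservation(), self.getReward(),
                self.getDone(), self.getInfo())


observation, reward, done, info = SketchInterpreter().getFeedback()
```

An environment's step() could then simply return the result of getFeedback() once the assignment has ended.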
reset()

This method is invoked when the environment is reset. Override it to perform your own initialization tasks, if you have any.
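To show how the callbacks and reset() might work together, here is a minimal sketch of an interpreter that counts packets. `PacketCountInterpreter`, its attributes, and its reward rule (the number of packets sent by the current assignee since the assignment) are all hypothetical; a real implementation would subclass Interpreter and choose observations and rewards that suit its domain.

```python
from typing import Any, List, Optional


class PacketCountInterpreter:
    """Hypothetical sketch; in gymwipe it would subclass Interpreter."""

    def __init__(self, deviceCount: int) -> None:
        self._deviceCount = deviceCount
        self.reset()

    def reset(self) -> None:
        # Initialization tasks: forget everything observed so far
        self._packetCounts: List[int] = [0] * self._deviceCount
        self._assignedDevice: Optional[int] = None
        self._packetsSinceAssignment = 0

    def onFrequencyBandAssignment(self, deviceIndex: int, duration: int) -> None:
        # Remember the assignee and restart its per-assignment counter
        self._assignedDevice = deviceIndex
        self._packetsSinceAssignment = 0

    def onPacketReceived(self, senderIndex: int, receiverIndex: int,
                         payload: Any) -> None:
        self._packetCounts[senderIndex] += 1
        if senderIndex == self._assignedDevice:
            self._packetsSinceAssignment += 1

    def getObservation(self) -> List[int]:
        # Total number of sniffed packets per sender
        return list(self._packetCounts)

    def getReward(self) -> float:
        # Assumed reward rule: packets the assignee has sent since assignment
        return float(self._packetsSinceAssignment)


interpreter = PacketCountInterpreter(deviceCount=2)
interpreter.onFrequencyBandAssignment(deviceIndex=1, duration=3)
interpreter.onPacketReceived(senderIndex=1, receiverIndex=0, payload=b"data")
interpreter.onPacketReceived(senderIndex=0, receiverIndex=1, payload=b"data")
print(interpreter.getObservation(), interpreter.getReward())  # [1, 1] 1.0
```

Keeping all mutable state in reset() means the constructor and an environment reset share one initialization path.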