Parameter usage


Usage of neorl.make()

NeoRL follows the OpenAI Gym API, allowing users to create an environment via neorl.make().

The parameters of neorl.make() are listed below:

| param | type | description |
| --- | --- | --- |
| task | str | The task name you want to create. A full list of tasks is available here. |
| reward_func | func | A customized reward function; provide it if you want to compute the reward yourself instead of using the built-in reward of the dataset. |
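
A minimal call looks like the snippet below, which creates an env with the built-in reward of its dataset ("citylearn" is one of the available tasks).

import neorl

env = neorl.make("citylearn")  # no reward_func, so the built-in data reward is used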

The following code segment shows how to use neorl.make() with a customized reward function.

Example

import neorl

def customized_reward_func(data):
    # data is a transition (or a batch of transitions) with keys "obs", "action" and "next_obs"
    obs = data["obs"]
    action = data["action"]
    obs_next = data["next_obs"]

    # treat a single 1-D transition as a batch of size one
    single_reward = False
    if len(obs.shape) == 1:
        single_reward = True
        obs = obs.reshape(1, -1)
    if len(action.shape) == 1:
        action = action.reshape(1, -1)
    if len(obs_next.shape) == 1:
        obs_next = obs_next.reshape(1, -1)

    # weights of the fatigue and consumption terms in the cost
    CRF = 3.0
    CRC = 1.0

    # fatigue and consumption are the last two dimensions of the next observation
    fatigue = obs_next[:, -2]
    consumption = obs_next[:, -1]

    cost = CRF * fatigue + CRC * consumption

    # the reward is the negative weighted cost
    reward = -cost

    if single_reward:
        # a single transition returns a scalar
        reward = reward[0].item()
    else:
        # a batch returns a column vector of rewards
        reward = reward.reshape(-1, 1)

    return reward

env = neorl.make("ib", reward_func=customized_reward_func)  # create the industrial benchmark env

Usage of get_dataset()

The parameters of get_dataset() are listed below:

| param | type | description |
| --- | --- | --- |
| task_name_version | str | The name and version (if applicable) of the task; defaults to the task used when making the env. |
| data_type | str | The type of policy used to collect the data. One of ["high", "medium", "low"]; defaults to "high". |
| train_num | int | The number of training trajectories. Note that it should be less than 10,000; defaults to 100. |
| need_val | bool | Whether to also download validation data; defaults to True. |
| val_ratio | float | The ratio of validation data to training data; defaults to 0.1. |
| path | str | The directory to load data from or download data to; defaults to ./data/. |
| use_data_reward | bool | Whether to use the built-in data reward. If False, a customized reward function should be provided when making the env. |
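
For example, calling get_dataset() with all defaults loads 100 training trajectories collected by the "high" policy plus 10 validation trajectories (val_ratio 0.1), downloading them into ./data/ if necessary:

import neorl

env = neorl.make("citylearn")
train_data, val_data = env.get_dataset()  # data_type="high", train_num=100, need_val=True, val_ratio=0.1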

Note that task_name_version defaults to the task used when making the env. For instance, env = neorl.make("citylearn") binds citylearn to both the env and its dataset, so env.get_dataset() will obtain citylearn data by default. For flexibility, task_name_version can be set to another task, since some users only intend to obtain data through an existing env instead of creating a new one (see the second example below).

When get_dataset() is called, it first looks in the local path for an appropriate dataset ("appropriate" means the data type matches the target and the number of trajectories is not less than the target's). MD5 checksums are used to ensure the dataset is complete and correct. If no suitable local dataset is found, the smallest appropriate dataset is downloaded from the remote server into path, according to the local data_map.json.
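
This lookup can be pictured roughly as below. It is only an illustrative sketch of the behavior, not NeoRL's actual implementation; find_local_dataset, md5_matches, and download_smallest_match are hypothetical helper names.

def resolve_dataset(path, data_type, train_num):
    # 1. look for a local dataset whose policy type matches and whose number of
    #    trajectories is at least train_num (an "appropriate" dataset)
    local = find_local_dataset(path, data_type, min_traj=train_num)  # hypothetical helper
    # 2. verify completeness and correctness with an MD5 checksum
    if local is not None and md5_matches(local):  # hypothetical helper
        return local
    # 3. otherwise download the smallest appropriate dataset listed in data_map.json
    return download_smallest_match(path, data_type, train_num)  # hypothetical helper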

Example

import neorl

env = neorl.make("finance")
train_data, val_data = env.get_dataset(data_type="medium", train_num=100, need_val=True, val_ratio=0.2, use_data_reward=True)

It will load 100 trajectories for train_data and 20 trajectories for val_data (val_ratio=0.2), both collected by the "medium" policy and using the built-in data reward.

import neorl

env = neorl.make("citylearn")
train_data, _ = env.get_dataset(task_name_version="HalfCheetah-v3", data_type="low", train_num=50, need_val=False, use_data_reward=True)

It will load 50 trajectories of HalfCheetah-v3 data for train_data without val_data, using the "low" policy and the built-in data reward.
