Releases: AVSLab/bsk_rl

v1.2.0

24 Jul 00:09

Release Notes

  • Add an example script where reward is based on the probability of successfully observing targets covered by clouds in the Cloud Environment with Re-imaging example.
  • Add a conjunction checking dynamics model in ConjunctionDynModel.
  • Add utilities for relative motion state setup, cd2hill, hill2cd, and relative_to_chief.
  • Add a dtype argument to the environment (or individual satellites) and set the default dtype to np.float64.
  • Add support for continuous action spaces (e.g. for control problems) with ContinuousAction.
  • Add models and action for impulsive thrust and drift with a continuous action space (ImpulsiveThrust).
  • Change inconsistent uses of datastore to data_store.
  • Add property data_store_kwargs to GlobalReward that is unpacked in the DataStore constructor.
  • Implement ResourceReward to reward based on the level of a property in the satellite multiplied by some coefficient.
  • Allow rewarders to mark a satellite as truncated or terminated with the is_truncated and is_terminated methods.
  • Add an example script for using curriculum learning with RLlib in the Curriculum Learning example.
  • Update the list of publications.
  • Add the option to compute value with sMDP rewards at the start of the step in the RLlib configuration.
  • Add the ability to observe remaining time in Time.
  • Allow for the time_limit to be randomized.
  • Add an observation for arbitrary relative states between two satellites in RelativeProperties.
  • Allow for the transmitterPacketSize to be specified. By default, it is set to the instrument’s baud rate.
  • Add a maximum range checking dynamics model in MaxRangeDynModel. Useful for keeping an agent in the vicinity of a target early in training.
  • Add properties in spacecraft dynamics for orbital element observations.
  • Fix an issue with failure penalties in the PettingZoo environment when the rewarder does not return a reward for a satellite.
  • Allow for per-episode randomization of ResourceReward weights and observation of those weights with ResourceRewardWeight.
  • Add ImpulsiveThrustHill for impulsive thrust in the Hill frame.
  • Separate random_circular_orbit and random_orbit to avoid a misleading altitude argument.
  • Add fault modeling example script using four reaction wheels in the Fault Environment example.
  • Introduce a new RSO inspection environment, primarily consisting of RSOInspectionReward, RSOPoints, RSOInspectorFSWModel, and RSODynModel. An example environment setup is described in the RSO Inspection example.
  • Add a maximum duration option to Image.
  • Fix a bug where a satellite’s initial data was never added to the rewarder.
  • Fix a bug where using multiple instances of the same rewarder would cause some settings to be overwritten.
  • Add the ability to define metaagents that concatenate satellite action and observation spaces in the environment.
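
As a rough illustration of the ResourceReward item above (reward equal to the level of a satellite property multiplied by a coefficient), here is a minimal standalone sketch. It does not use the bsk_rl API: ToySatellite, resource_reward, level_fn, and coefficient are hypothetical names chosen for this example, and the actual class may be configured differently.

```python
from dataclasses import dataclass


@dataclass
class ToySatellite:
    """Hypothetical stand-in for a satellite; not a bsk_rl class."""

    name: str
    battery_fraction: float


def resource_reward(satellites, level_fn, coefficient):
    """Return {satellite name: coefficient * level} for each satellite.

    level_fn extracts the resource level (assumed here to be a float)
    from a satellite; coefficient scales it into a reward.
    """
    return {sat.name: coefficient * level_fn(sat) for sat in satellites}


sats = [ToySatellite("sat-a", 0.5), ToySatellite("sat-b", 0.25)]
rewards = resource_reward(sats, lambda sat: sat.battery_fraction, coefficient=2.0)
print(rewards)
```

Passing the property as a function keeps the sketch agnostic about which resource is rewarded; a negative coefficient would turn the same calculation into a penalty.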

Full Changelog: v1.1.0...v1.2.0

v1.1.0

27 Feb 04:45

  • Add ability in SatProperties to define new observations with a custom function.
  • Add deepcopy to mutable inputs to the environment so that an environment argument dictionary can be copied without being modified by what happens in the environment. This fixes compatibility with RLlib 2.33.0+. Note that this means the satellite object passed to the environment is not the same object as the one used in the environment; the same is true of rewarders and communication objects.
  • Add additional observation properties for satellites and opportunities.
  • Add connectors for multiagent semi-MDPs, as demonstrated in new single-agent and multiagent examples.
  • Add a min_period option to CommunicationMethod.
  • Cache agents in the ConstellationTasking environment to improve performance.
  • Add the generate_obs_retasking_only option to prevent computing observations for satellites that are continuing their current action.
  • Allow for ImagingSatellite to default to a different type of opportunity than target. Also allow access filters to include an opportunity type.
  • Improve performance of Eclipse observations by about 95%.
  • Log a warning if the initial battery charge or buffer level is incompatible with its capacity.
  • Optimize communication when all satellites are communicating with each other.
  • Enable Vizard visualization of the environment by setting the vizard_dir and vizard_settings options in the environment.
  • Allow for the specification of multiple rewarders in the environment.
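
The deepcopy item above has a practical consequence that a small standalone sketch may make clearer: objects passed in through an argument dictionary are left untouched, and the environment works on its own copies. ToyEnv and the dictionary-based "satellite" below are hypothetical stand-ins for illustration only, not bsk_rl classes.

```python
from copy import deepcopy


class ToyEnv:
    """Hypothetical environment that deep-copies its mutable inputs."""

    def __init__(self, satellites):
        # Copy on construction so later mutation stays inside the environment.
        self.satellites = deepcopy(satellites)

    def step(self):
        # Mutates only the environment's private copies.
        for sat in self.satellites:
            sat["steps"] += 1


env_args = {"satellites": [{"name": "sat-a", "steps": 0}]}
env = ToyEnv(**env_args)
env.step()

print(env_args["satellites"][0]["steps"])  # the caller's object is unchanged
print(env.satellites[0]["steps"])          # the environment's copy advanced
```

Under this assumption, the same env_args dictionary can be reused to build several environments, but state has to be read from the environment's own objects rather than from the ones that were passed in.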

v1.0.1

29 Aug 18:04

Issue #0: Release 1.0.1

v1.0.0

12 Jun 16:51

Version 1.0.0 release.