There is also a different way to proceed from here: we can estimate the distances between the mobile tag and the receiver units using two-way ranging (TWR). TWR builds on a very simple idea: we exchange messages between the mobile tag and a stationary receiver and measure the round trip time (RTT) of that exchange. From the RTT we can directly calculate the distance between the two. The hardware footprint is low because we do not need to precisely synchronize the clocks of the receiver and the mobile tag; we only need to coordinate channel access. However, this is also the general drawback of the approach: to estimate a single position we need to exchange two messages (poll and response) with each of the four receivers, i.e. eight messages per fix. Since the channel is potentially shared to localise several mobile objects, the positioning update rate quickly drops.
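The range estimate itself is a one-liner once the round trip time and the responder's reply delay are known. The sketch below is a minimal illustration of single-sided TWR in plain Python, assuming the receiver's reply delay is known to the tag; the function name, timing variables, and numeric values are illustrative assumptions and not taken from any particular UWB stack.

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def twr_distance(t_round: float, t_reply: float) -> float:
    """Estimate the tag-receiver distance from one poll/response exchange.

    t_round: time from sending the poll until receiving the response,
             measured on the tag's local clock (seconds).
    t_reply: the receiver's known processing delay between receiving the
             poll and sending its response (seconds).
    """
    time_of_flight = (t_round - t_reply) / 2.0  # one-way travel time
    return SPEED_OF_LIGHT * time_of_flight

# One position fix needs a poll + response per receiver, i.e. 2 * 4 = 8
# messages for four receivers; the resulting ranges then feed the
# trilateration step. The timing values here are hypothetical examples.
example_timings = [
    (6.70e-8, 2.0e-8),
    (7.35e-8, 2.0e-8),
    (8.01e-8, 2.0e-8),
    (6.02e-8, 2.0e-8),
]
ranges = [twr_distance(t_rnd, t_rep) for (t_rnd, t_rep) in example_timings]

Because the tag measures t_round on its own clock and only subtracts the responder's reply delay, no clock synchronization between tag and receiver is required; the price is the eight-message exchange per position fix mentioned above.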
Literature for Multi-Agent Reinforcement Learning:
Literature for Asynchronous Advantage Actor-Critic (A3C) (+ A2C):
Literature for DD-PPO: Near-Perfect PointGoal Navigators:
Literature for Explainable RL:
Literature for Curiosity:
Literature for Simulation-to-Reality Transfer:
Literature for MAML:
Literature for Hierarchical Reinforcement Learning:
Literature for Option Critics:
Literature for Batch Reinforcement Learning:
Literature for World Models:
Literature for Model-Based RL:
Literature for Generative Adversarial Reinforcement Learning:
Literature for Imitation Learning:
Literature for Deep Deterministic Policy Gradient:
Literature for Trust-Region Policy Optimization (TRPO):
Literature for Proximal Policy Optimization (PPO):