Abstract |
<p>In one embodiment, a method for optimization of network protocol options with reinforcement learning and propagation is disclosed. The method comprises: interacting, by a learning component of a server of a network, with one or more clients and an environment of the network; conducting, by the learning component, different trials of one or more options in different states for network communication via a protocol of the network; receiving, by the learning component, performance feedback for the different trials as rewards; and utilizing, by the learning component, the different trials and associated resulting rewards to improve a decision-making policy associated with the server for negotiation of the one or more options. Other embodiments are also described.</p> |
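The trial-and-reward loop described in the abstract can be illustrated with a minimal sketch. It assumes a tabular epsilon-greedy learner, which is one simple reinforcement learning technique and not necessarily the one used in the disclosed embodiments. The class `OptionLearner`, the states `lossy` and `clean`, the options `small_window` and `large_window`, and the `simulated_reward` function are all hypothetical names introduced for illustration; a real server would obtain rewards from measured performance feedback rather than a simulation.

```python
import random
from collections import defaultdict


class OptionLearner:
    """Tabular epsilon-greedy learner over (state, option) pairs (illustrative only)."""

    def __init__(self, options, epsilon=0.1, seed=0):
        self.options = list(options)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # running mean reward per (state, option)
        self.count = defaultdict(int)    # number of trials per (state, option)

    def choose(self, state):
        # Explore with probability epsilon; otherwise pick the best-known option.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.options)
        return max(self.options, key=lambda o: self.value[(state, o)])

    def update(self, state, option, reward):
        # Incremental mean update: v <- v + (r - v) / n
        key = (state, option)
        self.count[key] += 1
        self.value[key] += (reward - self.value[key]) / self.count[key]

    def policy(self, states):
        # Greedy decision-making policy: best-known option for each state.
        return {s: max(self.options, key=lambda o: self.value[(s, o)]) for s in states}


def simulated_reward(state, option, rng):
    # Hypothetical stand-in for performance feedback from clients and the network.
    base = {
        ("lossy", "small_window"): 1.0,
        ("lossy", "large_window"): 0.2,
        ("clean", "small_window"): 0.4,
        ("clean", "large_window"): 1.0,
    }
    return base[(state, option)] + rng.gauss(0, 0.05)


def train(trials=2000, seed=0):
    rng = random.Random(seed)
    states = ["lossy", "clean"]
    learner = OptionLearner(["small_window", "large_window"], epsilon=0.1, seed=seed)
    for _ in range(trials):
        state = rng.choice(states)                      # observed network state
        option = learner.choose(state)                  # trial of one option
        reward = simulated_reward(state, option, rng)   # performance feedback
        learner.update(state, option, reward)           # improve the policy
    return learner.policy(states)


if __name__ == "__main__":
    print(train())
```

In this sketch the learned policy is simply a mapping from each state to the option with the highest average reward so far; the exploration rate `epsilon` controls how often the learner tries an option other than the current best.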