Reinforcement learning (强化学习)

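Reinforcement learning is an area of machine learning concerned with how to act in an environment so as to maximize expected reward. It is inspired by behaviorist theory in psychology, which describes how an organism, stimulated by rewards or punishments from its environment, gradually forms expectations about those stimuli and develops the habitual behavior that yields the greatest benefit. Because the problem is so general, it is studied in many other fields, including game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and genetic algorithms. In the operations research and control theory literature, reinforcement learning is called "approximate dynamic programming" (ADP). The problem is also studied in optimal control theory, although most of that work concerns the existence and characterization of optimal solutions rather than learning or approximation. In economics and game theory, reinforcement learning is used to explain how equilibrium can arise under bounded rationality.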
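As a rough illustration of the reward-driven loop described above, the Python sketch below runs an epsilon-greedy agent on a small multi-armed bandit. The agent keeps a running expectation of the reward for each action and mostly repeats the action it currently expects to pay best. The function name `run_bandit`, the reward probabilities, and the choice of epsilon-greedy exploration are all assumptions made for illustration; the entry itself does not prescribe any particular algorithm.

```python
import random


def run_bandit(reward_probs, steps=10000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (hypothetical, illustrative only)."""
    rng = random.Random(seed)       # fixed seed so runs are repeatable
    n_arms = len(reward_probs)
    estimates = [0.0] * n_arms      # the agent's expected reward for each action
    counts = [0] * n_arms           # how often each action has been tried
    total_reward = 0.0

    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)
        else:
            action = max(range(n_arms), key=lambda a: estimates[a])

        # The environment answers with a reward of 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        total_reward += reward

        # Incremental sample-average update of the expectation for this action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 0.8])
    print("estimated rewards:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
    print("average reward:", round(total / sum(counts), 3))
```

With the arguments shown, the agent should end up choosing the third action most often, since it has the highest reward probability; the exact numbers depend on the seed.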

Word	Reinforcement learning
Definition	Reinforcement learning

Example sentences from original audio
Two Minute Papers: This is about multiplayer reinforcement learning, if you will.
Two Minute Papers: This algorithm was based on a combination of a neural network and reinforcement learning.
Two Minute Papers: The goal was to learn to perform a backflip through reinforcement learning.
TED-Ed Talks, bilingual selection: The DeepMind researchers worked out an ingenious way to plug this preference for novelty into reinforcement learning.
One Hundred Thousand Whys: Researchers used a technique called reinforcement learning where they gave robots cooperative tasks instead of competitive ones.
The Economist, Technology: That software was a piece of artificial intelligence called a deep evolutionary reinforcement learning algorithm, or DERL.
TED Talks (video edition), bilingual selection: We don't know how the RLHF reinforcement learning works, we don't know what other gadgets are in there.
Intermediate English Short Passages: By applying the evolved neural circuits, the researchers construct spiking neural networks for image classification and reinforcement learning tasks.
TED Talks (video edition), bilingual selection: It's an example that doesn't work two weeks later because they're constantly changing things with reinforcement learning and so forth.
60-Second Science, Scientific American, March 2021 collection: Reinforcement learning is great for that but it isn't perfect in every situation.
Two Minute Papers: Everything is learned from scratch with a few small modifications to the reinforcement learning algorithm.
Two Minute Papers: The neural network was used to understand the video feed, and reinforcement learning is there to come up with the appropriate actions.
Two Minute Papers: A really cool piece of work that can potentially open up new ways of thinking about reinforcement learning.
Intermediate English Short Passages: Combined with on-policy and off-policy deep reinforcement learning algorithms, NeuEvo achieves comparable performance with artificial neural networks, as shown in the study.
60-Second Science, Scientific American, March 2021 collection: This is where deep reinforcement learning comes in.
Two Minute Papers: This work is a collaboration between OpenAI and DeepMind's security team and is about introducing more human control in reinforcement learning problems.
Two Minute Papers: The key factors to make this happen is to apply two modifications to the original reinforcement learning algorithm.
Q&A in Progress: In a reinforcement learning problem, he is our agent, and he's trying to learn a policy - that is, how to interact with his environment.
The Economist, Technology: The Meta team's crucial contribution was therefore to augment reinforcement learning with natural-language processing.
Lex Fridman Podcast collection: We will not be able to use reinforcement learning with human feedback to hardwire its values into it.

English encyclopedia
Reinforcement learning (强化学习)
Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. The problem, due to its generality, is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and genetic algorithms. In the operations research and control literature, the field where reinforcement learning methods are studied is called approximate dynamic programming. The problem has been studied in the theory of optimal control, though most studies are concerned with the existence of optimal solutions and their characterization, and not with the learning or approximation aspects. In economics and game theory, reinforcement learning may be used to explain how equilibrium may arise under bounded rationality.