diff --git a/jose.00306/10.21105.jose.00306.crossref.xml b/jose.00306/10.21105.jose.00306.crossref.xml new file mode 100644 index 0000000..2156a01 --- /dev/null +++ b/jose.00306/10.21105.jose.00306.crossref.xml @@ -0,0 +1,357 @@ + + + + 20251029140344-e3dfa4e41078c0a8e498df4eba0998a1024e19e9 + 20251029140344 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Education + JOSE + 2577-3569 + + 10.21105/jose + https://jose.theoj.org + + + + + 10 + 2025 + + + 8 + + 92 + + + + Reinforcement Learning: A Comprehensive Open-Source Course + + + + Ali Hassan Ali + Abdelwanis + + Department of Interconnected Automation Systems, University of Siegen, Germany + + https://orcid.org/0009-0001-5853-5900 + + + Barnabas + Haucke-Korber + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0000-0003-0862-2069 + + + Darius + Jakobeit + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0009-0002-1576-2465 + + + Wilhelm + Kirchgässner + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0000-0001-9490-1843 + + + Marvin + Meyer + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0009-0008-2879-7118 + + + Maximilian + Schenke + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0000-0001-5427-9527 + + + Hendrik + Vater + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0009-0005-0654-8741 + + + Oliver + Wallscheid + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0000-0001-9362-8777 + + + Daniel + Weber + + Department of Power Electronics and Electrical Drives, Paderborn University, Germany + + https://orcid.org/0000-0003-3367-5998 + + + + 10 + 29 + 2025 + + + 306 + + + 10.21105/jose.00306 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.17347442 + + + GitHub review issue + https://github.com/openjournals/jose-reviews/issues/306 + + + + 10.21105/jose.00306 + https://jose.theoj.org/papers/10.21105/jose.00306 + + + https://jose.theoj.org/papers/10.21105/jose.00306.pdf + + + + + + Reinforcement learning: An introduction + Sutton + IEEE Transactions on Neural Networks + 16 + 2005 + Sutton, R. S., & Barto, A. G. (2005). Reinforcement learning: An introduction. IEEE Transactions on Neural Networks, 16, 285–286. https://api.semanticscholar.org/CorpusID:9166388 + + + Lectures on reinforcement learning + Silver + 2015 + Silver, D. (2015). Lectures on reinforcement learning. url: https://www.davidsilver.uk/teaching/. + + + The hugging face deep reinforcement learning class + Simonini + GitHub repository + 2023 + Simonini, T., & Sanseviero, O. (2023). The hugging face deep reinforcement learning class. In GitHub repository. https://github.com/huggingface/deep-rl-class; GitHub. + + + CS234: Reinforcement learning winter 2025 + Brunskill + 2025 + Brunskill, E. (2025). CS234: Reinforcement learning winter 2025. url: https://web.stanford.edu/class/cs234/. + + + Spinning up in deep reinforcement learning + Achiam + 2018 + Achiam, J. (2018). Spinning up in deep reinforcement learning. url: https://spinningup.openai.com/. 
+ + + Jupyter notebooks – a publishing format for reproducible computational workflows + Kluyver + 2016 + Kluyver, T., Ragan-Kelley, B., & Pérez, F. et al. (2016). Jupyter notebooks – a publishing format for reproducible computational workflows (F. Loizides & B. Schmidt, Eds.; pp. 87–90). IOS Press. + + + Pandas-dev/pandas: pandas + pandas + 10.5281/zenodo.3509134 + 2020 + pandas. (2020). Pandas-dev/pandas: pandas (latest). Zenodo. https://doi.org/10.5281/zenodo.3509134 + + + Data Structures for Statistical Computing in Python + McKinney + Proceedings of the 9th Python in Science Conference + 10.25080/Majora-92bf1922-00a + 2010 + McKinney, Wes. (2010). Data Structures for Statistical Computing in Python. In Stéfan van der Walt & Jarrod Millman (Eds.), Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a + + + Gymnasium + Towers + 10.5281/zenodo.8127026 + 2023 + Towers, M., Terry, J. K., & Kwiatkowski, A. et al. (2023). Gymnasium. Zenodo. https://doi.org/10.5281/zenodo.8127026 + + + PyTorch: An imperative style, high-performance deep learning library + Paszke + Advances in neural information processing systems 32 + 2019 + Paszke, A., Gross, S., & Massa, F. et al. (2019). PyTorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems 32 (pp. 8024–8035). Curran Associates, Inc. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf + + + OpenAI gym + Brockman + 2016 + Brockman, G., Cheung, V., & Pettersson, L. et al. (2016). OpenAI gym. + + + Stable-Baselines3: Reliable reinforcement learning implementations + Raffin + Journal of Machine Learning Research + 268 + 22 + 2021 + Raffin, A., Hill, A., & Gleave, A. et al. (2021). Stable-Baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research, 22(268), 1–8. http://jmlr.org/papers/v22/20-1364.html + + + Mastering the game of Go with deep neural networks and tree search + Silver + Nature + 7587 + 529 + 10.1038/nature16961 + 0028-0836 + 2016 + Silver, D., Huang, A., & Maddison, C. J. et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489. https://doi.org/10.1038/nature16961 + + + Mastering chess and shogi by self-play with a general reinforcement learning algorithm + Silver + CoRR + abs/1712.01815 + 2017 + Silver, D., Hubert, T., & Schrittwieser, J. et al. (2017). Mastering chess and shogi by self-play with a general reinforcement learning algorithm. CoRR, abs/1712.01815. http://arxiv.org/abs/1712.01815 + + + Playing atari with deep reinforcement learning + Mnih + CoRR + abs/1312.5602 + 2013 + Mnih, V., Kavukcuoglu, K., & Silver, D. et al. (2013). Playing atari with deep reinforcement learning. CoRR, abs/1312.5602. http://arxiv.org/abs/1312.5602 + + + Grandmaster level in StarCraft II using multi-agent reinforcement learning + Vinyals + Nat. + 7782 + 575 + 10.1038/s41586-019-1724-z + 2019 + Vinyals, O., Babuschkin, I., & M. Czarnecki, W. et al. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nat., 575(7782), 350–354. https://doi.org/10.1038/s41586-019-1724-z + + + Transferring online reinforcement learning for electric motor control from simulation to real-world experiments + Book + IEEE Open Journal of Power Electronics + 2 + 10.1109/OJPEL.2021.3065877 + 2021 + Book, G., Traue, A., & Balakrishna, P. et al. (2021). 
Transferring online reinforcement learning for electric motor control from simulation to real-world experiments. IEEE Open Journal of Power Electronics, 2, 187–201. https://doi.org/10.1109/OJPEL.2021.3065877 + + + Reinforcement learning in robotics: A survey + Kober + The International Journal of Robotics Research + 11 + 32 + 10.1177/0278364913495721 + 2013 + Kober, J., Bagnell, J. A., & Peters, J. (2013). Reinforcement learning in robotics: A survey. The International Journal of Robotics Research, 32(11), 1238–1274. https://doi.org/10.1177/0278364913495721 + + + Training a helpful and harmless assistant with reinforcement learning from human feedback + Bai + 2022 + Bai, Y., Jones, A., & Ndousse, K. et al. (2022). Training a helpful and harmless assistant with reinforcement learning from human feedback. https://arxiv.org/abs/2204.05862 + + + Applications of reinforcement learning in finance – trading with a double deep q-network + Zejnullahu + 2022 + Zejnullahu, F., Moser, M., & Osterrieder, J. (2022). Applications of reinforcement learning in finance – trading with a double deep q-network. https://arxiv.org/abs/2206.14267 + + + Reinforcement learning for intelligent healthcare applications: A survey + Coronato + Artificial Intelligence in Medicine + 109 + 10.1016/j.artmed.2020.101964 + 0933-3657 + 2020 + Coronato, A., Naeem, M., De Pietro, G., & Paragliola, G. (2020). Reinforcement learning for intelligent healthcare applications: A survey. Artificial Intelligence in Medicine, 109, 101964. https://doi.org/10.1016/j.artmed.2020.101964 + + + An efficient deep reinforcement learning model for urban traffic control + Lin + CoRR + abs/1808.01876 + 2018 + Lin, Y., Dai, X., Li, L., & Wang, F.-Y. (2018). An efficient deep reinforcement learning model for urban traffic control. CoRR, abs/1808.01876. http://arxiv.org/abs/1808.01876 + + + API design for machine learning software: Experiences from the scikit-learn project + Buitinck + ECML PKDD workshop: Languages for data mining and machine learning + 2013 + Buitinck, L., Louppe, G., & Blondel, M. et al. (2013). API design for machine learning software: Experiences from the scikit-learn project. ECML PKDD Workshop: Languages for Data Mining and Machine Learning, 108–122. + + + Safe reinforcement learning-based control in power electronic systems + Weber + 2023 international conference on future energy solutions (FES) + 10.1109/FES57669.2023.10182718 + 2023 + Weber, D., Schenke, M., & Wallscheid, O. (2023). Safe reinforcement learning-based control in power electronic systems. 2023 International Conference on Future Energy Solutions (FES), 1–6. https://doi.org/10.1109/FES57669.2023.10182718 + + + A deep q-learning direct torque controller for permanent magnet synchronous motors + Schenke + IEEE Open Journal of the Industrial Electronics Society + 2 + 10.1109/OJIES.2021.3075521 + 2021 + Schenke, M., & Wallscheid, O. (2021). A deep q-learning direct torque controller for permanent magnet synchronous motors. IEEE Open Journal of the Industrial Electronics Society, 2, 388–400. 
https://doi.org/10.1109/OJIES.2021.3075521 + + + + + + diff --git a/jose.00306/10.21105.jose.00306.pdf b/jose.00306/10.21105.jose.00306.pdf new file mode 100644 index 0000000..907e200 Binary files /dev/null and b/jose.00306/10.21105.jose.00306.pdf differ diff --git a/jose.00306/paper.jats/10.21105.jose.00306.jats b/jose.00306/paper.jats/10.21105.jose.00306.jats new file mode 100644 index 0000000..9dea027 --- /dev/null +++ b/jose.00306/paper.jats/10.21105.jose.00306.jats @@ -0,0 +1,865 @@ + + +
+ + + + +Journal of Open Source Education +JOSE + +2577-3569 + +Open Journals + + + +306 +10.21105/jose.00306 + +Reinforcement Learning: A Comprehensive Open-Source +Course + + + +https://orcid.org/0009-0001-5853-5900 + +Abdelwanis +Ali Hassan Ali + + + + +https://orcid.org/0000-0003-0862-2069 + +Haucke-Korber +Barnabas + + + + +https://orcid.org/0009-0002-1576-2465 + +Jakobeit +Darius + + + + +https://orcid.org/0000-0001-9490-1843 + +Kirchgässner +Wilhelm + + + + +https://orcid.org/0009-0008-2879-7118 + +Meyer +Marvin + + + + +https://orcid.org/0000-0001-5427-9527 + +Schenke +Maximilian + + + + +https://orcid.org/0009-0005-0654-8741 + +Vater +Hendrik + + + + +https://orcid.org/0000-0001-9362-8777 + +Wallscheid +Oliver + + + + +https://orcid.org/0000-0003-3367-5998 + +Weber +Daniel + + + + + +Department of Power Electronics and Electrical Drives, +Paderborn University, Germany + + + + +Department of Interconnected Automation Systems, University +of Siegen, Germany + + + + +19 +7 +2023 + +8 +92 +306 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2025 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +data science +Python +TensorFlow +PyTorch +Jupyter notebook +reproducible workflow +open science +reinforcement learning +exploratory data analysis +machine learning +supervised learning + + + + + + Summary +

We present an open-source repository of an extensive course on reinforcement learning, specifically designed for master's students in engineering and computer science. The course introduces beginners to the fundamentals of reinforcement learning and progresses towards advanced algorithms, using examples that span many classic control engineering tasks. By first covering the basics of Python, it remains accessible to students with limited prior programming experience.

+

The course spans 14 weeks, comprising 14 lectures and 12 exercises. Accompanying video materials recorded from real lectures and exercises are provided to aid in understanding the course content. They are available on a YouTube channel under a Creative Commons license. The open-source nature of the course allows other teachers to freely adapt the materials for their own teaching purposes. The primary goal is to equip learners with a solid theoretical understanding of reinforcement learning principles, as well as the practical tools to solve real-world engineering problems from different domains, such as electrical engineering.

+

The lecture follows Richard S. Sutton and Andrew G. Barto's foundational book on reinforcement learning (Sutton & Barto, 2005) and takes inspiration from the reinforcement learning lectures delivered by David Silver (Silver, 2015). The exercises are programmed in Python and presented in Jupyter notebooks (Kluyver et al., 2016). Important libraries for machine learning and reinforcement learning are introduced, such as pandas (McKinney, 2010; pandas, 2020), gymnasium (Towers et al., 2023), PyTorch (Paszke et al., 2019), scikit-learn (Buitinck et al., 2013), and stable-baselines3 (Raffin et al., 2021).

+
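To give an impression of the programming style of the exercises, the following minimal sketch shows the agent-environment interaction loop offered by the gymnasium API. It is an illustrative assumption about typical usage, not code taken from the course notebooks; the environment name and the random policy are placeholders for the tasks and agents developed throughout the course.

import gymnasium as gym

# Instantiate a classic control task; the exercises use comparable
# environments such as the inverted pendulum and mountain car.
env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)

for _ in range(200):
    # A random action stands in for a learned policy here.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()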

The authors of this course have experience working with reinforcement learning in the domain of electrical engineering, in particular in electric drive control (Schenke & Wallscheid, 2021) and grid control (Weber et al., 2023). The course was first held under the constraints of the COVID-19 pandemic in 2020, resorting to an online, asynchronous learning experience. It was extended with a session on more contemporary algorithms in 2022. In subsequent years, the course has been revised to incorporate teaching experience and to align the structure of the exercises. All versions (one for each year's revision) are available in the publicly available GitHub repository.

+
+ + Statement of Need +

Recent developments in (deep) reinforcement learning have caused considerable excitement in both academia and popular science media. Starting with beating champions in complex board games such as chess (Silver et al., 2017) and Go (Silver et al., 2016), breaking human records in a wide variety of video games (Mnih et al., 2013; Vinyals et al., 2019), and, more recently, providing solutions for real-world (control) applications (Bai et al., 2022; Book et al., 2021; Coronato et al., 2020; Kober et al., 2013; Lin et al., 2018; Zejnullahu et al., 2022), reinforcement learning agents have proven to be a viable control and decision-making solution for a wide variety of application domains. Reinforcement learning offers an elegant, data-driven path to a control solution with minimal expert knowledge involved, which makes it highly attractive for many different research domains. A similar development has already been observed in recent years with regard to deep supervised learning.

+

An increasing number of educational resources has become available due to the traction reinforcement learning has gained in recent years. However, most courses lack either a continuous progression of topics from the foundations up to advanced deep reinforcement learning, practical programming exercises accompanying each theoretical lecture, testing at university level, or free availability. Alternative courses often focus on games (Simonini & Sanseviero, 2023) or use a mix of theoretical and practical questions for their exercises (Achiam, 2018; Brunskill, 2025). In contrast, our course utilizes practical application scenarios from a wide variety of domains with a strong focus on classical control engineering tasks. This course can therefore help accelerate the adoption of reinforcement learning solutions in real-world applications.

+
+ + Target Audience and Learning Goals +

The target audience of this course is master's students in engineering and computer science, as well as anyone interested in the concepts of reinforcement learning. The exercises are designed to be solvable by students without a (strong) programming background when worked through in the presented order. Students learn to utilize reinforcement learning methods appropriate to the problem at hand. They learn how to incorporate expert knowledge into their reinforcement learning solution, e.g., by designing the features or reward functions (a sketch of such reward shaping follows the list below). The exercises start with a very low-level introduction to the programming language Python. Later exercises introduce advanced techniques that can be utilized in more comprehensive environments, such as electric drive state prediction or vehicle control. Students should have experience with algorithmic notation to be able to practically implement the algorithms presented in the lectures. A basic understanding of stochastics is advised in order to follow the mathematical background. At the end of the course, students should have gained the following skills:

+ + +

Understand basic concepts and functionalities of reinforcement + learning methods.

+
+ +

Be able to understand and evaluate state-of-the-art + algorithms.

+
+ +

Have the ability to implement basic and advanced algorithms + using open-source libraries in Python.

+
+ +

Be able to select a fitting solution when presented with a new + task.

+
+ +

Be able to critically interpret and evaluate results and performance.

+
+
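As referenced above, expert knowledge often enters a reinforcement learning solution through the reward function. The following sketch illustrates one way to do this with a gymnasium reward wrapper; the wrapper name, the penalty weight, and the choice of environment are hypothetical and serve only to demonstrate the idea.

import gymnasium as gym

class ControlEffortPenalty(gym.RewardWrapper):
    # Hypothetical reward shaping: penalize large actions to encode
    # the expert preference for low control effort.
    def __init__(self, env, penalty=0.1):
        super().__init__(env)
        self.penalty = penalty
        self._last_action = None

    def step(self, action):
        self._last_action = action
        return super().step(action)

    def reward(self, reward):
        # Subtract a quadratic control-effort penalty from the task reward.
        return reward - self.penalty * float((self._last_action ** 2).sum())

env = ControlEffortPenalty(gym.make("Pendulum-v1"))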
+
+ + Content +

The course is structured as a one-semester, university-level course with two sessions each week: one lecture and one exercise. The contents of the latest iteration of the course (summer term 2025) are presented below.

+

A summary of the lectures and exercises can be found in Table 1 and Table 2, respectively.

+ + +

Summary of course lectures.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Lecture  Content
01       Introduction to Reinforcement Learning
02       Markov Decision Processes
03       Dynamic Programming
04       Monte Carlo Methods
05       Temporal-Difference Learning
06       Multi-Step Bootstrapping
07       Planning and Learning with Tabular Methods
08       Function Approximation with Supervised Learning
09       On-Policy Prediction with Function Approximation
10       Value-Based Control with Function Approximation
11       Stochastic Policy Gradient Methods
12       Deterministic Policy Gradient Methods
13       Further Contemporary RL Algorithms (TRPO, PPO)
14       Outlook and Research Insights
+
+ + +

Summary of course exercises.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Exercise  Content
01        Basics of Python for Scientific Computing
02        Basic Markov Chain, Reward and Decision Problems
03        Dynamic Programming
04        Race Track with Monte Carlo Learning
05        Race Track with Temporal-Difference Learning
06        Inverted Pendulum with Tabular Multi-Step Methods
07        Inverted Pendulum within Dyna Framework
08        Predicting Electric Drive with Supervised Learning
09        Evaluate Given Agents in Mountain Car Problem
10        Mountain Car Valley Using Semi-Gradient Sarsa
11        Moon Landing with Actor-Critic Methods
12        Shoot for the Moon with DDPG & PPO
+
+

Lectures and exercises that share the same number deal with the same topics. The theoretical basics are provided in the lecture and are then implemented and evaluated in the corresponding exercise on the basis of specific application examples taken from third-party open-source libraries (Brockman et al., 2016; Towers et al., 2023). This allows learners to practically internalize the learned content. However, for self-learning, the lecture can be studied independently of the exercises, and the exercises independently of the lecture.

+
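To illustrate how these third-party environments combine with the libraries introduced in the exercises, the following sketch trains a stable-baselines3 agent on the mountain car task that appears in the later exercises. It is a minimal illustration under the assumption of a recent stable-baselines3 release that supports the gymnasium API, not code from the exercise notebooks, and the tiny training budget is for demonstration only.

import gymnasium as gym
from stable_baselines3 import PPO

# The continuous mountain car task from gymnasium's classic control suite.
env = gym.make("MountainCarContinuous-v0")

# PPO with a small multilayer-perceptron policy; all hyperparameters are defaults.
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=10_000)  # deliberately short, for illustration

# Roll out the trained policy for one episode.
obs, info = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.close()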

The lecture slides were created in LaTeX and published accordingly to allow for consistent display and easy adaptation of the material by other instructors. The practical exercises were implemented in Jupyter notebooks (Kluyver et al., 2016), which also allow quick implementation of new content or modification of existing content.

+
+ + Conclusion +

The presented course provides a complete introduction to the fundamentals and contemporary applications of reinforcement learning. By combining theory and practice, learners are enabled to analyze and solve (even intricate control engineering) problems in the context of reinforcement learning. Both the lecture content and the exercises are open-source and designed to be easily adapted by other instructors. Thanks to the recorded explanatory videos, this course can also easily be used by self-learners.

+
+ + Author’s Contribution +

Authors are listed in alphabetical order. Wilhelm Kirchgässner, Maximilian Schenke, Oliver Wallscheid, and Daniel Weber created this course and have held it since the summer term of 2020. Barnabas Haucke-Korber, Darius Jakobeit, and Marvin Meyer joined Paderborn University at a later date and supported revising and holding the exercises in 2023. In 2024, Hendrik Vater contributed by aligning the exercises to a common format. In 2025, Ali Hassan Ali Abdelwanis supported updating the exercises to the newest library versions and contributed to their revision.

+
+ + Acknowledgements +

We would like to thank all of the students who helped improve the course by attending lectures, solving the exercises, and giving valuable feedback, as well as the open-source community for asking questions and suggesting changes on GitHub.

+
+ + + + + + + + SuttonRichard S. + BartoAndrew G. + + Reinforcement learning: An introduction + IEEE Transactions on Neural Networks + 2005 + 16 + https://api.semanticscholar.org/CorpusID:9166388 + 285 + 286 + + + + + + SilverDavid + + Lectures on reinforcement learning + url: https://www.davidsilver.uk/teaching/ + 2015 + + + + + + SimoniniThomas + SansevieroOmar + + The hugging face deep reinforcement learning class + GitHub repository + https://github.com/huggingface/deep-rl-class; GitHub + 2023 + + + + + + BrunskillEmma + + CS234: Reinforcement learning winter 2025 + url: https://web.stanford.edu/class/cs234/ + 2025 + + + + + + AchiamJosh + + Spinning up in deep reinforcement learning + url: https://spinningup.openai.com/ + 2018 + + + + + + KluyverThomas + Ragan-KelleyBenjamin + PérezFernando et al + + Jupyter notebooks – a publishing format for reproducible computational workflows + + LoizidesF. + SchmidtB. + + IOS Press + 2016 + 87 + 90 + + + + + + pandas + + Pandas-dev/pandas: pandas + Zenodo + 202002 + https://doi.org/10.5281/zenodo.3509134 + 10.5281/zenodo.3509134 + + + + + + McKinney + + Data Structures for Statistical Computing in Python + Proceedings of the 9th Python in Science Conference + + Walt + Millman + + 2010 + 10.25080/Majora-92bf1922-00a + 56 + 61 + + + + + + TowersMark + TerryJordan K. + KwiatkowskiAriel et al + + Gymnasium + Zenodo + 202303 + 20230708 + https://zenodo.org/record/8127025 + 10.5281/zenodo.8127026 + + + + + + PaszkeAdam + GrossSam + MassaFrancisco et al + + PyTorch: An imperative style, high-performance deep learning library + Advances in neural information processing systems 32 + Curran Associates, Inc. + 2019 + http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf + 8024 + 8035 + + + + + + BrockmanGreg + CheungVicki + PetterssonLudwig et al + + OpenAI gym + 2016 + + + + + + RaffinAntonin + HillAshley + GleaveAdam et al + + Stable-Baselines3: Reliable reinforcement learning implementations + Journal of Machine Learning Research + 2021 + 22 + 268 + http://jmlr.org/papers/v22/20-1364.html + 1 + 8 + + + + + + SilverDavid + HuangAja + MaddisonChris J. et al + + Mastering the game of Go with deep neural networks and tree search + Nature + 2016 + 529 + 7587 + 0028-0836 + 10.1038/nature16961 + 484 + 489 + + + + + + SilverDavid + HubertThomas + SchrittwieserJulian et al + + Mastering chess and shogi by self-play with a general reinforcement learning algorithm + CoRR + 2017 + abs/1712.01815 + http://arxiv.org/abs/1712.01815 + + + + + + MnihVolodymyr + KavukcuogluKoray + SilverDavid et al + + Playing atari with deep reinforcement learning + CoRR + 2013 + abs/1312.5602 + http://arxiv.org/abs/1312.5602 + + + + + + VinyalsOriol + BabuschkinIgor + M. CzarneckiWojciech et al + + Grandmaster level in StarCraft II using multi-agent reinforcement learning + Nat. + 2019 + 575 + 7782 + https://doi.org/10.1038/s41586-019-1724-z + 10.1038/s41586-019-1724-z + 350 + 354 + + + + + + BookGerrit + TraueArne + BalakrishnaPraneeth et al + + Transferring online reinforcement learning for electric motor control from simulation to real-world experiments + IEEE Open Journal of Power Electronics + 202103 + 2 + 10.1109/OJPEL.2021.3065877 + 187 + 201 + + + + + + KoberJens + BagnellJ. 
Andrew + PetersJan + + Reinforcement learning in robotics: A survey + The International Journal of Robotics Research + 2013 + 32 + 11 + https://doi.org/10.1177/0278364913495721 + 10.1177/0278364913495721 + 1238 + 1274 + + + + + + BaiYuntao + JonesAndy + NdousseKamal et al + + Training a helpful and harmless assistant with reinforcement learning from human feedback + 2022 + https://arxiv.org/abs/2204.05862 + + + + + + ZejnullahuFrensi + MoserMaurice + OsterriederJoerg + + Applications of reinforcement learning in finance – trading with a double deep q-network + 2022 + https://arxiv.org/abs/2206.14267 + + + + + + CoronatoAntonio + NaeemMuddasar + De PietroGiuseppe + ParagliolaGiovanni + + Reinforcement learning for intelligent healthcare applications: A survey + Artificial Intelligence in Medicine + 2020 + 109 + 0933-3657 + https://www.sciencedirect.com/science/article/pii/S093336572031229X + 10.1016/j.artmed.2020.101964 + 101964 + + + + + + + LinYilun + DaiXingyuan + LiLi + WangFei-Yue + + An efficient deep reinforcement learning model for urban traffic control + CoRR + 2018 + abs/1808.01876 + http://arxiv.org/abs/1808.01876 + + + + + + BuitinckLars + LouppeGilles + BlondelMathieu et al + + API design for machine learning software: Experiences from the scikit-learn project + ECML PKDD workshop: Languages for data mining and machine learning + 2013 + 108 + 122 + + + + + + WeberDaniel + SchenkeMaximilian + WallscheidOliver + + Safe reinforcement learning-based control in power electronic systems + 2023 international conference on future energy solutions (FES) + 2023 + + 10.1109/FES57669.2023.10182718 + 1 + 6 + + + + + + SchenkeMaximilian + WallscheidOliver + + A deep q-learning direct torque controller for permanent magnet synchronous motors + IEEE Open Journal of the Industrial Electronics Society + 202104 + 2 + 10.1109/OJIES.2021.3075521 + 388 + 400 + + + + +