Blood glucose control, for example in diabetes mellitus or severe illness, requires strict adherence to a protocol of food intake, insulin administration and exercise that is personalized to each patient. An artificial pancreas for automated treatment could improve both the quality of glucose control and patients' independence. An artificial pancreas requires three components: i) continuous glucose monitoring (CGM), ii) smart controllers and iii) insulin pumps that deliver the optimal amount of insulin. In recent years, medical devices for CGM and insulin administration have progressed rapidly and are now commercially available. Yet clinically available devices still require regular attention from patients or caregivers, because they operate in open-loop control with frequent user intervention. Dosage-calculating algorithms are currently being studied in intensive care patients [1], for short overnight control to supplement conventional insulin delivery [2], and for short periods during which patients rest and follow a prescribed food regime [3]. Fully automated algorithms that can respond to the varying activity levels seen in outpatients, cope with unpredictable and unreported food intake, and provide the necessary personalized control for individuals are currently beyond the state of the art. Here, we review and discuss reinforcement learning algorithms that control insulin delivery in a closed loop to provide individual insulin dosing regimens that are reactive to the immediate needs of the patient.