Building safe AI: specification, robustness and assurance
This article was written by the safety team at DeepMind.

Building a rocket is hard. Every component requires careful design and testing, with safety and reliability at the core. Rocket scientists and engineers come together to design every system, from navigation to control, engines and landing gear. Only once all the parts are assembled and the systems are tested can we put astronauts on board with confidence that everything will go well.

If artificial intelligence (AI) is a rocket, then one day we will all get tickets on board. And, as with rockets, safety is a crucial part of building AI systems. Ensuring safety requires designing the system carefully from the ground up, so that its components work together as intended, while also building the tools to oversee the system's operation after deployment.

At a high level, safety research at DeepMind focuses on designing systems that work reliably as intended, while discovering and mitigating possible near-term and long-term risks. Technical AI safety is a relatively new but rapidly developing field, whose content ranges from high-level theory to empirical and concrete research. The goal of this blog is to contribute to the development of the field and to encourage substantive conversation about technical ideas, thereby advancing our collective understanding of AI safety.

There are many more such examples of AI systems finding loopholes in their objective specification.

Robustness: designing systems that withstand perturbations

Robustness ensures that an AI system continues to operate safely under perturbation
In the real-world conditions where AI systems operate, a certain level of risk, unpredictability and volatility is always present. AI systems must be robust to unforeseen events and to adversarial attacks that can damage or manipulate them. Research on the robustness of AI systems aims to ensure that our agents stay within safe bounds regardless of the conditions they encounter. This can be achieved by avoiding risks (prevention) or by self-stabilisation and graceful degradation (recovery). Safety problems arising from distributional shift, adversarial inputs and unsafe exploration can all be classified as robustness problems.

To illustrate the problem of distributional shift, consider a home cleaning robot that usually cleans rooms without pets. The robot is then deployed in a home with a pet, and encounters it while cleaning. A robot that has never seen a cat or dog before would start washing the pet with soap, leading to undesirable results (Amodei and Olah et al., 2016). This is an example of a robustness problem that can arise when the data distribution at test time differs from the distribution during training.

From the AI Safety Gridworlds paper: the agent learns to avoid the lava, but when tested in a new situation where the location of the lava has changed, it fails to generalise and runs straight into the lava.

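The same kind of failure can be reproduced in a few lines. Below is a minimal sketch (our toy example, not code from the paper) in which a nearest-centroid classifier trained on one data distribution collapses when the test inputs are shifted away from that distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training distribution: two well-separated Gaussian classes in 2-D.
X_train = np.concatenate([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y_train = np.array([0] * 100 + [1] * 100)

# Nearest-centroid classifier: predict the class whose training mean is closest.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Fresh test data drawn from the training distribution: accuracy is high.
X_test = np.concatenate([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y_test = y_train
acc_iid = (predict(X_test) == y_test).mean()

# Shifted test data: every input is translated by +6 along x, so most
# points of both classes now sit closer to centroid 1 and accuracy collapses.
acc_shift = (predict(X_test + np.array([6.0, 0.0])) == y_test).mean()

print(acc_iid, acc_shift)
```

Nothing about the classifier changed between the two evaluations; only the input distribution did.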
Adversarial inputs are a specific case of distributional shift where the input is deliberately designed to fool the AI system.

An adversarial perturbation superimposed on an ordinary image can cause the classifier to mistake a sloth for a race car. The two images differ by at most ??? in each pixel. The first is classified as a three-toed sloth with more than 99% probability; the second as a race car with more than 99% probability.

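For intuition, here is a minimal sketch (our toy model, not the image classifier above) of how such a perturbation works against a linear model: moving each input coordinate by at most epsilon in the direction of the model's gradient, which for a linear model is simply its weight vector, flips the predicted class while keeping the per-coordinate change tiny:

```python
import numpy as np

# A toy linear "classifier": score = w . x, class 1 if score > 0.
w = np.array([0.5, -0.3, 0.8, 0.1])
x = np.array([1.0, 2.0, -1.0, 3.0])  # clean input

def score(x):
    return w @ x

# FGSM-style perturbation: step each coordinate by epsilon in the
# direction of the gradient of the score (for a linear model, sign(w)).
epsilon = 1.0
x_adv = x + epsilon * np.sign(w)

clean_score, adv_score = score(x), score(x_adv)
print(clean_score, adv_score)
```

The clean input scores negative (class 0), the perturbed one scores positive (class 1), even though no coordinate moved by more than epsilon.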
Unsafe exploration can arise in a system that seeks to maximise its performance and attain its goals without any guarantee that safety will not be compromised while it learns and explores in its environment. An example is a cleaning robot sticking a wet mop into an electrical outlet while learning optimal mopping strategies (García and Fernández, 2015; Amodei and Olah et al., 2016).

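One common mitigation is to constrain exploration with a "shield" that filters out actions known in advance to be unsafe. A minimal sketch, with a hypothetical one-dimensional corridor standing in for the robot's environment and a hazard cell playing the role of the electrical outlet:

```python
import random

random.seed(0)

# A 1-D corridor: cells 0..5. Cell 0 is the hazard, cell 5 is the goal.
HAZARD, GOAL = 0, 5

def safe_actions(state):
    # Shield: remove any action that would step into the hazard.
    return [a for a in (-1, 1) if state + a != HAZARD]

def explore(n_steps, shielded):
    state, hazard_visits = 2, 0
    for _ in range(n_steps):
        actions = safe_actions(state) if shielded else [-1, 1]
        state += random.choice(actions)  # pure random exploration
        if state == HAZARD:
            hazard_visits += 1
            state = 2                    # reset after the accident
        elif state == GOAL:
            state = 2                    # reset after success
    return hazard_visits

unsafe_visits = explore(1000, shielded=False)
safe_visits = explore(1000, shielded=True)
print(unsafe_visits, safe_visits)
```

The unshielded random walk stumbles into the hazard many times; the shielded one never does, at the cost of a slightly restricted action set.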
Assurance: monitoring and controlling system activity

Assurance ensures that we can understand and control AI systems during operation
Although careful safety engineering can rule out many risks, it is hard to get everything right from the start. Once AI systems are deployed, we need tools to continuously monitor and adjust them. Our last category, assurance, addresses these problems from two angles: monitoring and enforcement.

Monitoring comprises all the methods for inspecting systems in order to analyse and predict their behaviour, both via human inspection (of summary statistics) and via automated inspection (to sweep through vast amounts of logs). Enforcement, on the other hand, involves designing mechanisms for controlling and restricting the behaviour of systems. Problems such as interpretability and interruptibility fall into the subcategories of monitoring and enforcement, respectively.

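As a toy illustration of automated monitoring, the sketch below (the log values are invented for illustration) flags episodes whose logged reward deviates from the batch statistics by more than two standard deviations, the kind of simple check one might run over deployment logs:

```python
import statistics

# Logged per-episode rewards from a deployed agent (hypothetical values).
log = [9.8, 10.1, 9.9, 10.0, 10.2, 9.7, 10.0, 3.1, 10.1, 9.9]

mean = statistics.mean(log)
stdev = statistics.stdev(log)

# Flag episodes whose reward deviates from the mean by more than 2 sigma.
anomalies = [i for i, r in enumerate(log) if abs(r - mean) > 2 * stdev]
print(anomalies)
```

In a real system the flagged episodes would be escalated for human inspection, the other half of the monitoring picture described above.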
AI systems are unlike us, both in their embodiment and in the way they process data. This creates problems of interpretability. Well-designed measurement tools and protocols allow us to assess the quality of the decisions made by an AI system (Doshi-Velez and Kim, 2017). For example, a medical AI system would ideally issue a diagnosis together with an explanation of how it reached that conclusion, so that doctors could check its reasoning process from beginning to end (De Fauw et al., 2018). Furthermore, to understand more complex AI systems we might even employ automated methods for constructing models of behaviour using machine theory of mind (Rabinowitz et al., 2018).

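A minimal sketch of a prediction that comes with an explanation: a linear model whose score decomposes exactly into per-feature contributions, so the "diagnosis" can be audited term by term (the feature names and weights are invented for illustration):

```python
# A linear "diagnostic" model whose prediction can be decomposed into
# per-feature contributions -- a minimal form of explanation.
weights = {"fever": 2.0, "cough": 1.0, "age": 0.03}
patient = {"fever": 1.0, "cough": 0.0, "age": 70}

contributions = {k: weights[k] * patient[k] for k in weights}
score = sum(contributions.values())
diagnosis = "positive" if score > 2.5 else "negative"

print(diagnosis, contributions)
```

A doctor reviewing this output can see not just the verdict but exactly which inputs drove it; deep models need far more sophisticated machinery to offer anything comparable.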

ToMNet discovers two subspecies of agents and predicts their behaviour (from "Machine Theory of Mind")

Finally, we want to be able to switch off an AI system whenever necessary. This is the problem of interruptibility. Designing a reliable off-switch is very difficult: for example, because a reward-maximising AI system typically has strong incentives to prevent this (Hadfield-Menell et al., 2017); and because such interruptions, especially frequent ones, ultimately change the original task, leading the AI system to draw the wrong conclusions from experience (Orseau and Armstrong, 2016).


The problem with interruptions: human intervention (pressing the stop button) can change the task. In the figure, the interruption adds a transition (in red) to the Markov decision process that changes the original problem (in black). See Orseau and Armstrong, 2016.

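The effect in the figure can be reproduced numerically. In the sketch below (our toy numbers, not from the paper), adding an interruption with probability 0.5 on the short route makes the previously suboptimal long route the better choice, i.e. the interruption changes which policy the agent should prefer:

```python
def route_value(step_rewards, goal_reward, interrupt_prob=0.0, interrupt_step=None):
    # Expected return of a fixed route. With probability interrupt_prob,
    # the episode is terminated (zero reward thereafter) right after the
    # step indexed by interrupt_step.
    value, p_alive = 0.0, 1.0
    for i, r in enumerate(step_rewards):
        value += p_alive * r
        if i == interrupt_step:
            p_alive *= 1.0 - interrupt_prob
    return value + p_alive * goal_reward

# Short route: 2 steps of -1, passing through the interruptible cell, then +10.
# Long route: 3 steps of -1, no interruption, then +10.
short_clean = route_value([-1, -1], 10)        # 8.0
long_route = route_value([-1, -1, -1], 10)     # 7.0
short_interrupted = route_value([-1, -1], 10, interrupt_prob=0.5, interrupt_step=0)

print(short_clean, long_route, short_interrupted)
```

Without interruptions the short route is optimal (8 > 7); with them, its expected return drops to 3.5 and the long route wins, so an agent that learns from interrupted experience is effectively solving a different task.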
Looking to the future

We are laying the foundations of a technology that will be used for many important applications in the future. It is worth bearing in mind that design decisions which are not safety-critical at the time of deployment may become so once the technology is widely used. Although such modules may have been integrated into a system for convenience, the resulting problems would be hard to fix without a complete redesign.

Two examples from the history of computer science: the null pointer, which Tony Hoare called his "billion-dollar mistake", and the gets() routine in C. If early programming languages had been designed with security in mind, progress might have been slower, but it would very likely have had a very positive effect on modern information security.

Now, with careful forethought and planning, we can avoid similar problems and vulnerabilities. We hope that the categorisation of problems in this article will serve as a useful framework for such methodical planning. We strive to ensure that AI systems of the future are not merely "hopefully safe", but robustly and verifiably safe, because we built them that way!

We look forward to continuing to make exciting progress in these areas, in close collaboration with the broader AI research community, and we encourage people from different disciplines to consider contributing to AI safety research.

Resources

For further reading, below is a selection of other articles, agendas and taxonomies that helped us compile our categorisation, or that offer a useful alternative view of the technical safety problems of AI:

- Annotated bibliography of recommended materials (Center for Human-Compatible AI, 2018)
- Safety and Control for Artificial General Intelligence (UC Berkeley, 2018)
- AI Safety Resources (Victoria Krakovna, 2018)
- AGI Safety Literature Review (Everitt et al., 2018)
- Preparing for Malicious Uses of AI (2018)
- Specification gaming examples in AI (Victoria Krakovna, 2018)
- Directions and desiderata for AI alignment (Paul Christiano, 2017)
- Funding for Alignment Research (Paul Christiano, 2017)
- Agent Foundations for Aligning Machine Interests: A Technical Research Agenda (Machine Intelligence Research Institute, 2017)
- AI Safety Gridworlds (Leike et al., 2017)
- Interaction between the AI Control Problem and the Governance Problem (Nick Bostrom, 2017)
- Alignment for Advanced Machine Learning Systems (Machine Intelligence Research Institute, 2017)
- AI safety issues (Stuart Armstrong, 2017)
- Concrete Problems in AI Safety (Dario Amodei et al., 2016)
- The Value Learning Problem (Machine Intelligence Research Institute, 2016)
- (Future of Life Institute, 2015)
- Research Priorities for Robust and Beneficial Artificial Intelligence (Future of Life Institute, 2015)

Author: weber. Published: 4-10-2018, 23:24. Category: Development / Machine learning.