Creating safe AI: specification, reliability, and assurance

The authors of this article include members of the DeepMind safety team.

Building a rocket is hard. Each component requires careful thought and rigorous testing, with safety and reliability at the core of the design. Rocket scientists and engineers come together to design every system, from navigation to control, engines, and landing gear. Only once all the parts are assembled and the systems tested can we put astronauts on board with confidence that things will go well.

If artificial intelligence (AI) is a rocket, then someday we will all have tickets on board. And, as with rockets, safety is a crucial part of building AI systems. Guaranteeing safety requires carefully designing a system from the ground up so that its components work together as intended, while also developing all the tools needed to oversee the successful operation of the system after it is deployed.

At a high level, safety research at DeepMind focuses on designing reliable systems while detecting and mitigating possible short-term and long-term risks. Technical AI safety is a relatively new but rapidly evolving field, whose content ranges from high-level and theoretical to empirical and concrete. The goal of this blog is to contribute to the development of the field and encourage substantive engagement with technical ideas, thereby advancing our collective understanding of AI safety.

[...] many more such examples of AI systems finding loopholes in their objective specification.

Reliability (robustness): developing systems that withstand disruption

Reliability ensures that an AI system continues to operate safely when perturbed

In the real-world conditions where AI systems operate, some level of risk, unpredictability, and volatility is always present. AI systems must be resistant to unforeseen events and to adversarial attacks that can damage or manipulate them. Research on the reliability of AI systems aims to ensure that our agents stay within safe limits, regardless of the conditions encountered. This can be achieved by avoiding risks (prevention) or by self-stabilization and graceful degradation (recovery). Safety problems arising from distribution shift, adversarial inputs, and unsafe exploration can all be classified as reliability problems.

To illustrate the problem of distribution shift, consider a household cleaning robot that usually cleans rooms without pets. The robot is then deployed in a home with a pet and encounters it while cleaning. A robot that has never seen cats or dogs before may proceed to wash them with soap, with undesirable results (Amodei and Olah et al., 2016). This is an example of a reliability problem that can arise when the data distribution at test time differs from the distribution during training.

From the AI Safety Gridworlds paper: the agent learns to avoid lava, but when tested in a new situation where the location of the lava has changed, it fails to generalize and runs straight into the lava.

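The failure described in the caption can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: the two layouts are made up, and a breadth-first search over the training layout stands in for learning. It is not the environment or the agent from the AI Safety Gridworlds paper. The resulting table maps positions to actions and never looks at where the lava actually is, which is exactly why it breaks when the layout shifts.

```python
from collections import deque

# Hypothetical layouts: A = start, G = goal, L = lava, . = free cell.
TRAIN = ["A.L.G",
         "..L..",
         "....."]
TEST = ["A.L.G",
        ".....",
        "..L.."]  # same task, but the safe gap in the lava wall has moved

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find(grid, symbol):
    """Return the (row, col) of the first cell containing `symbol`."""
    for r, row in enumerate(grid):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not in grid")


def learn_policy(grid):
    """Breadth-first search outward from the goal, standing in for training.

    Returns a position -> action table that is only valid for `grid`.
    """
    goal = find(grid, "G")
    policy, queue, seen = {}, deque([goal]), {goal}
    while queue:
        r, c = queue.popleft()
        for action, (dr, dc) in MOVES.items():
            # The neighbour (pr, pc) reaches (r, c) by taking `action`.
            pr, pc = r - dr, c - dc
            inside = 0 <= pr < len(grid) and 0 <= pc < len(grid[0])
            if inside and (pr, pc) not in seen and grid[pr][pc] != "L":
                seen.add((pr, pc))
                policy[(pr, pc)] = action
                queue.append((pr, pc))
    return policy


def run(policy, grid, max_steps=50):
    """Execute the table on `grid`; return 'goal', 'lava' or 'stuck'."""
    pos = find(grid, "A")
    for _ in range(max_steps):
        cell = grid[pos[0]][pos[1]]
        if cell == "G":
            return "goal"
        if cell == "L":
            return "lava"
        if pos not in policy:
            return "stuck"
        dr, dc = MOVES[policy[pos]]
        pos = (pos[0] + dr, pos[1] + dc)
    return "stuck"


policy = learn_policy(TRAIN)
print("train:", run(policy, TRAIN))  # train: goal
print("test: ", run(policy, TEST))   # test:  lava
```

Re-running `learn_policy` on `TEST` yields a table that reaches the goal again, so the task itself is still solvable. What fails is the transfer of behavior learned under one distribution to another.
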
An adversarial input is a special case of distribution shift in which the input is specifically designed to trick the AI system.

An adversarial input overlaid on an ordinary image can cause a classifier to recognize a sloth as a race car. The two images differ by at most ??? in each pixel. The first is classified as a three-toed sloth with a probability of more than 99%. The second is classified as a race car with a probability of more than 99%.

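To see how a change that is tiny in every pixel can still flip a prediction, consider a linear classifier. The sketch below is a toy under stated assumptions: the weights, the "image", and the labels are invented for illustration, and the perturbation is the sign-of-the-weights step (the fast-gradient-sign idea applied to a linear model), not the attack used to produce the figure. Each pixel moves by at most `epsilon`, but the shifts add up across all 1000 pixels.

```python
def score(weights, bias, pixels):
    """Linear classifier score; positive means 'sloth', otherwise 'race car'."""
    return sum(w * x for w, x in zip(weights, pixels)) + bias


def label(weights, bias, pixels):
    return "sloth" if score(weights, bias, pixels) > 0 else "race car"


def sign(value):
    """Return -1, 0 or 1 depending on the sign of `value`."""
    return (value > 0) - (value < 0)


def adversarial(weights, pixels, epsilon):
    """Move every pixel by at most `epsilon` in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, pixels)]


# Hypothetical toy model: 1000 "pixels" with small alternating weights.
weights = [0.01 if i % 2 == 0 else -0.01 for i in range(1000)]
bias = 0.0
image = [0.52 if i % 2 == 0 else 0.48 for i in range(1000)]

epsilon = 0.03
attacked = adversarial(weights, image, epsilon)

print(label(weights, bias, image))     # sloth
print(label(weights, bias, attacked))  # race car
print(max(abs(a - b) for a, b in zip(image, attacked)))  # about 0.03
```

The clean score is about 0.2, while the perturbation can lower it by `epsilon` times the sum of the absolute weights, here 0.03 × 10 = 0.3, which is enough to cross the decision boundary.
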
Unsafe exploration can occur in a system that seeks to maximize its performance and achieve its goals with no guarantee that safety will not be compromised while it learns and explores in its environment. An example is a cleaning robot that sticks a wet mop into an electrical outlet while learning the best cleaning strategies (García and Fernández, 2015; Amodei and Olah et al., 2016).

Assurance: monitoring and control of system activity

Assurance ensures that we can understand and control AI systems during operation

Although careful safety engineering can rule out many risks, it is difficult to get everything right from the start. Once AI systems are deployed, we need tools to continuously monitor and adjust them. Our last category, assurance, addresses these problems from two angles: monitoring and enforcing.

Monitoring comprises all the methods for inspecting systems in order to analyze and predict their behavior, both through human inspection (of summary statistics) and through automated inspection (to sift through a vast number of logs). Enforcing, on the other hand, involves designing mechanisms for controlling and restricting the behavior of systems. Problems such as interpretability and interruptibility fall under monitoring and enforcing, respectively.

AI systems are unlike us, both in their embodiment and in the way they process data. This creates problems of interpretability. Well-designed measurement tools and protocols allow us to assess the quality of the decisions made by an AI system (Doshi-Velez and Kim, 2017). For example, a medical AI system would ideally issue its diagnosis together with an explanation of how it reached that conclusion, so that doctors can inspect the reasoning process from beginning to end (De Fauw et al., 2018). In addition, to understand more complex AI systems, we could even employ automated methods for constructing models of behavior using machine theory of mind (Rabinowitz et al., 2018).

ToMNet discovers two subspecies of agents and predicts their behavior (from “Machine Theory of Mind”).

Finally, we want to be able to turn off an AI system whenever necessary. This is the problem of interruptibility. Designing a reliable off-switch is very difficult: for example, because a reward-maximizing AI system typically has strong incentives to prevent this from happening (Hadfield-Menell et al., 2017); and because such interruptions, especially when they are frequent, end up changing the original task, leading the AI system to draw the wrong conclusions from experience (Orseau and Armstrong, 2016).

The problem with interruptions: human intervention (i.e. pressing the stop button) can change the task. In the figure, the interruption adds a transition (in red) to the Markov decision process, which changes the original task (in black). See Orseau and Armstrong, 2016.

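The effect described in the caption can be sketched with a very small Markov decision process. Everything in the example below is a hypothetical construction for illustration (the state names, the rewards, and the interruption probability are assumptions), and it shows only the problem, not the safe-interruptibility method proposed by Orseau and Armstrong. The `stopped` outcome plays the role of the added red transition: once it is part of what the agent experiences, a plain reward maximizer prefers the route on which it cannot be interrupted.

```python
def build_mdp(p_interrupt):
    """Hypothetical task: reach 'goal' from 'start'.

    transitions[state][action] = [(probability, next_state, reward), ...]
    The 'stopped' outcome is the extra transition added by the interruption.
    """
    return {
        "start": {
            "short": [(1.0, "corridor", 0.0)],
            "long": [(1.0, "detour", 0.0)],
        },
        "corridor": {
            "go": [(1.0 - p_interrupt, "goal", 10.0),
                   (p_interrupt, "stopped", 0.0)],
        },
        "detour": {"go": [(1.0, "goal", 6.0)]},
        "goal": {},     # terminal
        "stopped": {},  # terminal: the human pressed the stop button
    }


def q_value(mdp, values, state, action, gamma):
    """Expected return of taking `action` in `state` under `values`."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in mdp[state][action])


def value_iteration(mdp, gamma=1.0, sweeps=20):
    """Repeated Bellman backups; 20 sweeps are plenty for this tiny acyclic MDP."""
    values = {s: 0.0 for s in mdp}
    for _ in range(sweeps):
        for s, actions in mdp.items():
            if actions:
                values[s] = max(q_value(mdp, values, s, a, gamma) for a in actions)
    return values


def best_action(mdp, state, gamma=1.0):
    values = value_iteration(mdp, gamma)
    return max(mdp[state], key=lambda a: q_value(mdp, values, state, a, gamma))


for p in (0.0, 0.5):
    print(p, best_action(build_mdp(p), "start"))
# 0.0 short
# 0.5 long
```

With no interruptions, the short route is worth 10 and the detour 6. With a 50% chance of being stopped in the corridor, the short route is worth only 5 in expectation, so the learned behavior shifts to the detour even though the designer's intended task never changed.
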
Looking to the future

We are building the foundations of a technology that will be used for many important applications in the future. It is worth bearing in mind that design decisions that are not safety-critical when a system is first deployed can become so once the technology is widely used. Although such modules may have been integrated into the system for convenience at the time, the resulting problems would be hard to fix without a complete rebuild.

Two examples from the history of computer science are the null pointer, which Tony Hoare called his “billion-dollar mistake”, and the gets() routine in C. If early programming languages had been designed with security in mind, progress might have been slower, but computer security today would probably be in a much better position.

With careful thought and planning now, we can avoid building in analogous problems and vulnerabilities. We hope that the categorization of problems in this article will serve as a useful framework for such methodical planning. Our aim is for the AI systems of the future to be not just “hopefully safe” but reliably and verifiably safe, because we built them that way!

We look forward to continuing to make exciting progress in these areas, in close collaboration with the wider AI research community, and we encourage people from different disciplines to consider contributing to AI safety research.

Resources

For further reading on this topic, below is a selection of other articles, agendas, and taxonomies that informed our categorization or that offer a useful alternative view of technical AI safety problems:

  • Annotated bibliography of recommended materials (Center for Human-Compatible AI, 2018)
  • Safety and Control for Artificial General Intelligence (UC Berkeley, 2018)
  • AI Safety Resources (Victoria Krakovna, 2018)
  • AGI Safety Literature Review (Everitt et al., 2018)
  • Preparing for Malicious Uses of AI (2018)
  • Specification gaming examples in AI (Victoria Krakovna, 2018)
  • Directions and desiderata for AI alignment (Paul Christiano, 2017)
  • Funding for Alignment Research (Paul Christiano, 2017)
  • Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda (Machine Intelligence Research Institute, 2017)
  • AI Safety Gridworlds (Leike et al., 2017)
  • Interaction between the AI Control Problem and the Governance Problem (Nick Bostrom, 2017)
  • Alignment for Advanced Machine Learning Systems (Machine Intelligence Research Institute, 2017)
  • AI safety issues (Stuart Armstrong, 2017)
  • Concrete Problems in AI Safety (Dario Amodei et al., 2016)
  • The Value Learning Problem (Machine Intelligence Research Institute, 2016)
  • (Future of Life Institute, 2015)
  • Research Priorities for Robust and Beneficial Artificial Intelligence (Future of Life Institute, 2015)

Author: weber
Publication Date: 4-10-2018, 23:24
Category: Development / Machine learning