Sequence-to-Sequence Models, Part 1

Good day, everyone!

We have once again opened a new stream of the revised "Data Scientist" course, with another excellent teacher and a program slightly modified based on recent updates. As usual, there are interesting open lessons and selections of interesting materials. Today we begin our analysis of seq2seq models in TensorFlow.

Let's go.

As discussed in the RNN tutorial (we recommend reading it before this article), recurrent neural networks can be taught to model language. An interesting question arises: can we train a network on certain data so that it generates a meaningful response? For example, can we teach a neural network to translate from English to French? It turns out that we can.

This guide shows how to build and train such a system end to end. Clone the main TensorFlow repository and the TensorFlow models repository from GitHub. Then you can start by running the translation program:

cd models/tutorials/rnn/translate
python translate.py --data_dir [your_data_directory]

It will download the English-to-French translation data from the WMT'15 website, prepare it for training, and train the model. This requires about 20 GB of disk space and quite a lot of time for downloading and preparation, so you can start the process now and continue reading this tutorial.

This tutorial refers to the following files:

File                                               What is in it?
tensorflow/tensorflow/python/ops/seq2seq.py        Library for building sequence-to-sequence models
models/tutorials/rnn/translate/seq2seq_model.py    Neural translation sequence-to-sequence model
models/tutorials/rnn/translate/data_utils.py       Helper functions for preparing translation data
models/tutorials/rnn/translate/translate.py        Binary that trains and runs the translation model

The basics of sequence-to-sequence

The basic sequence-to-sequence model, as presented in Cho et al., 2014 (pdf), consists of two recurrent neural networks (RNNs): an encoder that processes the input and a decoder that generates the output. The basic architecture is shown below:

[Figure: basic encoder-decoder architecture. The encoder reads A, B, C; the decoder reads GO, W, X, Y, Z and produces W, X, Y, Z, EOS.]

Each box in the picture above represents an RNN cell, usually a GRU (gated recurrent unit) cell or an LSTM (long short-term memory) cell (see the RNN tutorial for details). The encoder and decoder can share weights or, more commonly, use different sets of parameters. Multi-layer cells have been used successfully in sequence-to-sequence models, for example for translation in Sutskever et al., 2014 (pdf).

In the basic model described above, every input has to be encoded into a fixed-size state vector, since that is the only thing passed to the decoder. To give the decoder more direct access to the input, an attention mechanism was introduced in Bahdanau et al., 2014 (pdf). We will not go into the details of the attention mechanism (see the paper for that); suffice it to say that it lets the decoder peek into the input at every decoding step. A multi-layer sequence-to-sequence network with LSTM cells and an attention mechanism in the decoder looks like this:

[Figure: multi-layer sequence-to-sequence network with LSTM cells and attention in the decoder.]
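
To give a feel for what "peeking into the input" means, here is a minimal sketch of a single attention step in plain Python. It is an illustration under simplifying assumptions, not the mechanism from the paper: it scores encoder states with a dot product, whereas Bahdanau et al. use a small learned network for scoring. The function names and the toy vectors are made up for this example.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """One attention step: weigh every encoder state by its relevance.

    Relevance here is a plain dot product (a simplification of the
    learned scoring function used in the paper).
    """
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    size = len(encoder_states[0])
    # The context vector is the weighted average of all encoder states.
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(size)]
    return weights, context

# Toy encoder states for inputs A, B, C and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

The decoder recomputes the weights at every step, so it is no longer limited to the single fixed-size state produced by the encoder.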

TensorFlow seq2seq library

As you can see above, there are many different sequence-to-sequence models. Each of them can use different RNN cells, but all of them accept encoder inputs and decoder inputs. This is the basis of the TensorFlow seq2seq library interface (tensorflow/tensorflow/python/ops/seq2seq.py). The basic RNN encoder-decoder sequence-to-sequence model works as follows.

outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)

In the call above, encoder_inputs is a list of tensors representing the inputs to the encoder, corresponding to the letters A, B, C in the first picture. Similarly, decoder_inputs is a list of tensors representing the inputs to the decoder: GO, W, X, Y, Z in the first picture.

The cell argument is an instance of the tf.contrib.rnn.RNNCell class that determines which cell will be used in the model. You can use existing cells, such as GRUCell or LSTMCell, or write your own. In addition, tf.contrib.rnn provides wrappers for building multi-layer cells, adding dropout to cell inputs and outputs, or applying other transformations. See the RNN tutorial for examples.
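
The idea of a wrapper can be sketched without TensorFlow. The snippet below is only an illustration of the pattern, assuming a cell is any function that maps (input, state) to (output, new_state). The names stack_cells, with_input_dropout and toy_cell are hypothetical and are not part of tf.contrib.rnn.

```python
import random

def stack_cells(cells):
    """Combine single-layer cells into one multi-layer cell.

    The output of each layer becomes the input of the next one,
    and every layer keeps its own state.
    """
    def stacked(x, states):
        new_states = []
        for cell, state in zip(cells, states):
            x, new_state = cell(x, state)
            new_states.append(new_state)
        return x, new_states
    return stacked

def with_input_dropout(cell, keep_prob, rng):
    """Wrap a cell so that each input value is kept with probability keep_prob."""
    def wrapped(x, state):
        dropped = [v / keep_prob if rng.random() < keep_prob else 0.0 for v in x]
        return cell(dropped, state)
    return wrapped

def toy_cell(x, state):
    """Hypothetical cell: decays the old state and adds the input."""
    new_state = [0.5 * s + v for s, v in zip(state, x)]
    return new_state, new_state

two_layers = stack_cells([toy_cell, toy_cell])
output, states = two_layers([1.0, 2.0], [[0.0, 0.0], [2.0, 2.0]])
print(output, states)

# With keep_prob=1.0 nothing is dropped, so the wrapper is transparent.
no_drop = with_input_dropout(toy_cell, keep_prob=1.0, rng=random.Random(0))
print(no_drop([1.0, 2.0], [0.0, 0.0]))
```

Because a wrapped cell has the same call signature as a plain one, wrappers can be nested freely, which is what makes this style of composition convenient.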

The call to basic_rnn_seq2seq returns two values: outputs and states. Both are lists of tensors of the same length as decoder_inputs. outputs corresponds to the decoder output at each time step; in the first picture these are W, X, Y, Z, EOS. The returned states represent the internal state of the decoder at each time step.
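
To make the shape of this interface concrete, here is a toy stand-in written in plain Python. It is a sketch, not the library code: the cell is a hypothetical tanh RNN with random weights, inputs are one-hot vectors instead of tensors, the symbol ids are invented, and unlike the library call it takes the encoder and decoder cells as two explicit arguments.

```python
import math
import random

def make_tanh_cell(input_size, state_size, seed):
    """Build a toy RNN cell: new_state = tanh(W_x * x + W_h * state)."""
    rng = random.Random(seed)
    w_x = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
           for _ in range(state_size)]
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(state_size)]
           for _ in range(state_size)]

    def cell(x, state):
        new_state = [
            math.tanh(sum(w * v for w, v in zip(w_x[i], x)) +
                      sum(w * s for w, s in zip(w_h[i], state)))
            for i in range(state_size)
        ]
        return new_state, new_state  # output equals state in this toy cell
    return cell

def toy_basic_seq2seq(encoder_inputs, decoder_inputs,
                      encoder_cell, decoder_cell, state_size):
    """Encode the whole input into one state, then unroll the decoder."""
    state = [0.0] * state_size
    for x in encoder_inputs:
        _, state = encoder_cell(x, state)  # only the final state survives
    outputs, states = [], []
    for x in decoder_inputs:
        output, state = decoder_cell(x, state)
        outputs.append(output)
        states.append(state)
    return outputs, states

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

VOCAB, STATE = 8, 4  # invented ids: GO=0, A=1, B=2, C=3, W=4, X=5, Y=6, Z=7
encoder_inputs = [one_hot(i, VOCAB) for i in (1, 2, 3)]        # A, B, C
decoder_inputs = [one_hot(i, VOCAB) for i in (0, 4, 5, 6, 7)]  # GO, W, X, Y, Z

outputs, states = toy_basic_seq2seq(
    encoder_inputs, decoder_inputs,
    make_tanh_cell(VOCAB, STATE, seed=0),  # encoder parameters
    make_tanh_cell(VOCAB, STATE, seed=1),  # separate decoder parameters
    STATE)
print(len(outputs), len(states))
```

Both returned lists have one entry per decoder input, which mirrors the description above.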

In many applications of sequence-to-sequence models, the decoder output at time t is fed back as the decoder input at time t+1. At test time, when decoding a sequence, this is how the sequence is constructed. During training, on the other hand, it is common to provide the correct input to the decoder at every time step, even if the decoder made a mistake earlier. The functions in seq2seq.py support both modes via the feed_previous argument. For example, consider the following use of an embedding RNN model.

    outputs, states = embedding_rnn_seq2seq(
        encoder_inputs, decoder_inputs, cell,
        num_encoder_symbols, num_decoder_symbols,
        embedding_size, output_projection=None,
        feed_previous=False)

In the embedding_rnn_seq2seq model, all inputs (both encoder_inputs and decoder_inputs) are integer tensors that represent discrete values. They will be embedded into a dense representation (see the Vector Representations tutorial for details on embeddings), but to construct these embeddings you need to specify the maximum number of discrete symbols: num_encoder_symbols on the encoder side and num_decoder_symbols on the decoder side.
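
The reason the symbol counts are needed can be seen in a minimal sketch of an embedding lookup. This is a plain-Python illustration, not the library implementation: make_embedding and embed are hypothetical helpers, and the sizes are made up.

```python
import random

def make_embedding(num_symbols, embedding_size, seed=0):
    """Create a table with one dense vector per discrete symbol."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(embedding_size)]
            for _ in range(num_symbols)]

def embed(table, token_ids):
    """Replace each integer id with its row of the table."""
    for token in token_ids:
        if not 0 <= token < len(table):
            raise ValueError("token id %d is outside the table" % token)
    return [table[token] for token in token_ids]

num_encoder_symbols, embedding_size = 10, 4  # made-up sizes
table = make_embedding(num_encoder_symbols, embedding_size)
embedded = embed(table, [1, 2, 3])
print(len(embedded), len(embedded[0]))
```

The table has exactly one row per symbol, so its size has to be fixed before training, and any id outside that range has no vector to look up.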

In the call above, we set feed_previous to False. This means that the decoder will use the decoder_inputs tensors as they are provided. If we set feed_previous to True, the decoder will use only the first element of decoder_inputs. All other tensors in the list will be ignored, and the previous output of the decoder will be used instead. This is used for decoding translations in our translation model, but it can also be used during training to make the model more robust to its own mistakes, similar to Bengio et al., 2015 (pdf).
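
The difference between the two modes can be sketched as a small decoding loop. This is an illustration of the idea only, not code from seq2seq.py: run_decoder and toy_step are hypothetical, and the "model" is simple arithmetic chosen so the two modes visibly diverge.

```python
def run_decoder(step, decoder_inputs, initial_state, feed_previous=False):
    """Unroll a decoder where step(token, state) returns (prediction, new_state)."""
    state = initial_state
    outputs = []
    prediction = None
    for i, token in enumerate(decoder_inputs):
        if feed_previous and i > 0:
            # Ignore the provided input and reuse the previous prediction.
            token = prediction
        prediction, state = step(token, state)
        outputs.append(prediction)
    return outputs

def toy_step(token, state):
    """Hypothetical model: predicts (token + state) % 10 and bumps the state."""
    return (token + state) % 10, state + 1

decoder_inputs = [0, 3, 5, 7]  # GO followed by the correct target tokens
teacher_forced = run_decoder(toy_step, decoder_inputs, initial_state=1,
                             feed_previous=False)
free_running = run_decoder(toy_step, decoder_inputs, initial_state=1,
                           feed_previous=True)
print(teacher_forced)  # [1, 5, 8, 1]
print(free_running)    # [1, 3, 6, 0]
```

Both runs agree on the first step, because only the first element of decoder_inputs is used in either mode. After that, the free-running decoder depends entirely on its own predictions.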

Another important argument used above is output_projection. If it is not specified, the outputs of the embedding model will be tensors of shape batch size by num_decoder_symbols, since they represent the logits for each generated symbol. When training models with large output vocabularies, that is, with a large num_decoder_symbols, storing these large tensors becomes impractical. Instead, it is better to return smaller tensors, which are later projected onto the large tensor using output_projection. This allows us to use our seq2seq models with a sampled softmax loss, as described in Jean et al., 201? (pdf).
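
A rough sketch of the projection and of the sizes involved is given below. It is plain Python for illustration, not the library code: the project helper is hypothetical, and the batch size, hidden size and vocabulary size are made-up numbers chosen only to show the order of magnitude.

```python
def project(hidden, weights, bias):
    """Map a small hidden vector to one logit per output symbol.

    hidden:  list of length size
    weights: size rows, each with num_symbols columns
    bias:    list of length num_symbols
    """
    return [sum(h * weights[i][j] for i, h in enumerate(hidden)) + bias[j]
            for j in range(len(bias))]

# Tiny worked example: size=2, num_symbols=3.
logits = project([1.0, 2.0],
                 [[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]],
                 [0.0, 0.0, 0.5])
print(logits)  # [1.0, 2.0, 3.5]

# Hypothetical sizes: values stored per decoding step.
batch_size, size, num_decoder_symbols = 64, 1024, 40000
full_outputs = batch_size * num_decoder_symbols  # logits for every symbol
small_outputs = batch_size * size                # hidden vectors only
print(full_outputs, small_outputs)  # 2560000 65536
```

With these assumed numbers, keeping the small outputs stores about 39 times fewer values per step, and the full logits are computed only when they are actually needed.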

Besides basic_rnn_seq2seq and embedding_rnn_seq2seq, there are several more sequence-to-sequence models in seq2seq.py. Take a look at them. They all have a similar interface, so we will not go into their details. For our translation model below, we use embedding_attention_seq2seq.

To be continued.

Author: weber
Publication date: 23-11-2018, 06:18
Category: Big Data / Data Mining / Machine learning