Sequence-to-Sequence Models, Part 1
Good day, everyone!

We have once again opened a new stream of the revised "Data Scientist" course: yet another excellent teacher, and the program has been slightly updated. As usual, there are also interesting open lessons and selections of useful materials. Today we begin our look at seq2seq models in TensorFlow.

Let's go.
As discussed in the RNN tutorial (we recommend reading it before this article), recurrent neural networks can be taught to model a language. An interesting question arises: can we train a network on certain data so that it generates a meaningful response? For example, can we teach a neural network to translate from English to French? It turns out that we can.

This guide will show you how to build and train such an end-to-end system. Copy the main TensorFlow repository and the TensorFlow models repository from GitHub. Then you can start by running the translation program:

cd models/tutorials/rnn/translate
python translate.py --data_dir [your_data_directory]
It will download the English-French translation data from WMT'15, prepare it for training, and train on it. This requires about 20 GB of disk space and a fair amount of time for downloading and preparation, so you can start the process now and continue reading this tutorial in the meantime.
The guide will refer to the following files:

File | What is in it?
tensorflow/tensorflow/python/ops/seq2seq.py | Library for creating sequence-to-sequence models
models/tutorials/rnn/translate/seq2seq_model.py | Sequence-to-sequence neural translation model
models/tutorials/rnn/translate/data_utils.py | Auxiliary functions for preparing translation data
models/tutorials/rnn/translate/translate.py | A binary that trains and runs the translation model
The basics of sequence-to-sequence

The basic sequence-to-sequence model, as presented in Cho et al., 2014 (PDF), consists of two recurrent neural networks (RNNs): an encoder that processes the input and a decoder that generates the output. The basic architecture is shown below:
[Figure: the basic encoder-decoder sequence-to-sequence architecture]
Each box in the picture above represents an RNN cell, usually a GRU cell (gated recurrent unit) or an LSTM cell (long short-term memory); see the RNN tutorial for details. The encoder and the decoder can share weights or, more commonly, use separate sets of parameters. Multi-layered cells have also been used successfully in sequence-to-sequence models, for example for translation in Sutskever et al., 2014 (PDF).
In the basic model described above, every input has to be encoded into a single fixed-size state vector, since that is the only thing passed to the decoder. To give the decoder more direct access to the input, an attention mechanism was introduced in Bahdanau et al., 2014 (PDF). We will not go into the details of the attention mechanism (for that, see the linked paper); suffice it to say that it allows the decoder to peek into the input at every decoding step. A multi-layered sequence-to-sequence network with LSTM cells and attention in the decoder looks like this:
[Figure: a multi-layered sequence-to-sequence network with LSTM cells and attention in the decoder]
The TensorFlow seq2seq library

As you can see above, there are several different sequence-to-sequence models. Each of them can use a different RNN cell, but all of them accept encoder inputs and decoder inputs. This is the basis of the TensorFlow seq2seq library interface (tensorflow/tensorflow/python/ops/seq2seq.py). The basic RNN encoder-decoder sequence-to-sequence model works as follows:
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
In the call above, encoder_inputs is a list of tensors representing the encoder inputs, corresponding to the letters A, B, C in the first picture. Similarly, decoder_inputs is a list of tensors representing the decoder inputs: GO, W, X, Y, Z from the first picture.
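For orientation, here is a minimal sketch of how these two lists might be built and passed in. It assumes a TensorFlow 1.x environment where the library is reachable as tf.contrib.legacy_seq2seq; the batch size, input size, and sequence lengths are made-up example values, not anything prescribed by the tutorial.

import tensorflow as tf
from tensorflow.contrib.legacy_seq2seq import basic_rnn_seq2seq

batch_size, input_size = 32, 64          # illustrative sizes
encoder_steps, decoder_steps = 3, 5      # A, B, C and GO, W, X, Y, Z

# One placeholder per time step, each of shape [batch_size, input_size].
encoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size], name="enc%d" % t)
                  for t in range(encoder_steps)]
decoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size], name="dec%d" % t)
                  for t in range(decoder_steps)]

cell = tf.contrib.rnn.GRUCell(num_units=128)
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
# outputs is a list with one tensor per decoder time step (W, X, Y, Z, EOS).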
The cell argument is an instance of the tf.contrib.rnn.RNNCell class and determines which cell will be used inside the model. You can use an existing cell, such as GRUCell or LSTMCell, or write your own. In addition, tf.contrib.rnn provides wrappers for building multi-layered cells, adding dropout to cell inputs and outputs, or applying other transformations. See the RNN tutorial for examples.
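As a hedged illustration of those wrappers (assuming TF 1.x and the tf.contrib.rnn module mentioned above; the sizes are arbitrary), a stacked LSTM with dropout can be built like this and passed as the cell argument of any of the seq2seq functions:

import tensorflow as tf

num_units, num_layers, keep_prob = 256, 2, 0.8   # illustrative values

def make_cell():
    cell = tf.contrib.rnn.LSTMCell(num_units)     # or tf.contrib.rnn.GRUCell(num_units)
    # DropoutWrapper applies dropout to the cell outputs (and optionally inputs).
    return tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)

# MultiRNNCell stacks several cells into one multi-layered cell.
cell = tf.contrib.rnn.MultiRNNCell([make_cell() for _ in range(num_layers)])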
Calling basic_rnn_seq2seq returns two arguments: outputs and states. Both of them are lists of tensors of the same length as decoder_inputs. outputs corresponds to the decoder outputs at each time step; in the first picture these are W, X, Y, Z, EOS. The returned states represents the internal state of the decoder at each time step.
In many applications of the sequence-to-sequence model, the decoder output at time t is fed back as the decoder input at time t + 1. At test time, when decoding a sequence, this is how the new sequence is constructed. During training, on the other hand, it is common to feed the decoder the correct input at every time step, even if the decoder made a mistake earlier. Functions in seq2seq.py support both modes via the feed_previous argument. For example, consider the following usage of the embedding RNN model.
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    embedding_size, output_projection=None,
    feed_previous=False)
In the embedding_rnn_seq2seq model, all inputs (both encoder_inputs and decoder_inputs) are integer tensors representing discrete values. They will be embedded into a dense representation (see the Vector Representations guide for details on embeddings), but to create these embeddings you need to specify the maximum number of discrete symbols: num_encoder_symbols on the encoder side and num_decoder_symbols on the decoder side.
In the call above, we set feed_previous to False. This means that the decoder will use the decoder_inputs tensors exactly as they are provided. If we set feed_previous to True, the decoder will use only the first element of decoder_inputs. All other tensors from the list will be ignored, and the previous output of the decoder will be used instead. This is used for decoding translations in our translation model, but it can also be used during training to make the model more robust to its own mistakes, roughly as in Bengio et al., 2015 (PDF).
Another important argument used above is output_projection. If it is not specified, the outputs of the embedding model will be tensors of shape batch-size by num_decoder_symbols, since they represent the logits for every generated symbol. When training models with a large output vocabulary, that is, with a large num_decoder_symbols, storing these large tensors becomes impractical. Instead, it is better to return smaller output tensors, which are later projected onto the large tensor using output_projection. This allows us to use our seq2seq models with a sampled softmax loss, as described in Jean et al., 2014 (PDF).
Besides basic_rnn_seq2seq and embedding_rnn_seq2seq, there are several more sequence-to-sequence models in seq2seq.py; take a look at them. They all have a similar interface, so we will not go into their details. For our translation model below we use embedding_attention_seq2seq.
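Its call has the same shape as embedding_rnn_seq2seq, with attention over the encoder states added on top. A hedged sketch, using the same assumed TF 1.x module path and the names from the earlier examples:

from tensorflow.contrib.legacy_seq2seq import embedding_attention_seq2seq

# Attention-based counterpart of embedding_rnn_seq2seq, used by the translation model.
outputs, states = embedding_attention_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    embedding_size, output_projection=output_projection,
    feed_previous=False)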
To be continued.