# Overview of Deep Domain Adaptation Basic Methods (Part 1)

The development of deep neural networks for image recognition is breathing new life into well-known areas of machine learning research. One such area is domain adaptation. Its essence is to train a model on data from a source domain so that it shows comparable quality on a target domain. For example, the source domain can be synthetic data that is "cheap" to generate, and the target domain can be user photos. The task of domain adaptation is then to train a model on synthetic data that will work well on "real" objects.

In the computer vision group [email protected] we work on a variety of applied tasks, and many of them come with little training data. In these cases, generating synthetic data and adapting a model trained on it can help a great deal. A good applied example is detecting and recognizing products on store shelves. Collecting and annotating photos of such shelves is laborious, but they can be generated fairly easily. So we decided to dive deeper into the topic of domain adaptation.

Research in domain adaptation addresses how a neural network can reuse previously accumulated experience in a new task. Will the network be able to extract features from the source domain and use them in the target domain? Although neural networks in machine learning are only distantly related to the neural networks of the human brain, teaching them abilities that people possess remains the Holy Grail of artificial intelligence research. And people are able to use previous experience and accumulated knowledge to understand new concepts.
In addition, domain adaptation can help with one of the fundamental problems of deep learning: training large networks with high recognition quality requires a very large amount of data, which is not always available in practice. One solution is to apply domain adaptation methods to synthetic data, which can be generated in virtually unlimited quantities.

Quite often in applied tasks, only data from one domain is available for training, while the model has to be applied to another. For example, a network that rates the aesthetic quality of a photograph can be trained on a publicly available dataset collected from a site for amateur photographers. The network is then meant to be used on ordinary photos, whose quality on average differs from that of photos on a specialized photo site. One option is to adapt the model to ordinary unlabeled photos.

Such theoretical and applied questions are the subject of domain adaptation. In this article, I will cover the main current deep-learning-based research in this area and the datasets used to compare different methods. The main idea of deep domain adaptation is to train a deep neural network on the source domain so that it maps an image to a vector representation (an embedding, usually taken from the last layer of the network) that gives high quality when used on the target domain.

## The main benchmarks

As in any field of machine learning, a body of research accumulates in domain adaptation over time, and the results have to be compared with each other. For this, the community produces datasets: models are trained on the training part and compared on the test part. Although deep domain adaptation is still a relatively young research area, there is already a fairly large number of papers and of datasets used in them. I will list the main ones, focusing on adaptation from synthetic data to "real" data.

### Digits

Apparently following the tradition established by Yann LeCun (one of the pioneers of deep learning, director of Facebook AI Research), the simplest datasets in computer vision involve handwritten digits or letters. There are several digit datasets that were originally created for experiments with image recognition models. Papers on domain adaptation use them in a wide variety of source-target pairs. Among them:

- MNIST: handwritten digits; needs no further introduction;
- USPS: handwritten digits in low resolution;
- SVHN: house numbers from Google Street View;
- Synth Numbers: synthetic digits, as the name suggests.

From the point of view of learning on synthetic data for use in the "real" world, the most interesting pairs are:

- Source: MNIST, Target: SVHN;
- Source: USPS, Target: MNIST;
- Source: Synth Numbers, Target: SVHN.

### Office

This dataset contains 31 categories of different objects, each presented in 3 domains: images from Amazon, webcam photos, and photos from a digital camera.

It is useful for checking how a model reacts to changes in background and image quality in the target domain.

### Road signs

Another pair of datasets for training a model on synthetic data and applying it to "real" data:

- Source: Synth Signs, images of road signs generated to look like real signs on the street;
- Target: GTSRB, a fairly well-known recognition dataset containing signs from German roads.

A feature of this pair is that the Synth Signs data is made quite similar to the "real" data, so the domains are fairly close.

### From the car window

Datasets for segmentation. This is quite an interesting pair, the closest to real-world conditions. The source data is generated with a game engine (GTA 5), and the target data comes from real life. Similar approaches are used to train models for autonomous cars.

- SYNTHIA or GTA 5: images of city views from a car window, generated with a game engine;
- Cityscapes: photos taken from a car in 50 different cities.

### VisDA

This dataset is used in the Visual Domain Adaptation Challenge, which is held as part of workshops at ECCV and ICCV. The source domain contains 12 categories of labeled objects rendered from CAD models, such as an airplane, a horse, a person, etc. The target domain contains unlabeled images of the same 12 categories taken from ImageNet. In the competition held in 201?, a 13th category was added: Unknown.

As the above shows, there are quite a few interesting and diverse datasets for domain adaptation. They make it possible to train and test models for various tasks (classification, segmentation, detection) and under various conditions (synthetic data, photos, street views).

## Deep Domain Adaptation

There is a fairly extensive and varied classification of domain adaptation methods (see, for example, here). In this article I will give a simplified division of methods by their key features. Modern deep domain adaptation methods can be divided into 3 large groups:

- Discrepancy-based: approaches based on minimizing the distance between vector representations on the source and target domains by adding this distance to the loss function.
- Adversarial-based: these approaches use an adversarial loss function, which first appeared in GANs, to train a network that is invariant with respect to the domain. Methods of this family have been actively developed in the last couple of years.
- Mixed methods that do not use an adversarial loss but apply ideas from the discrepancy-based family together with recent developments in deep learning: self-ensembling, new layers, loss functions, etc. These approaches show the best results in the VisDA competition.

From each group we will review several results from the last 1-3 years that are, in my opinion, the most important.

### Discrepancy-based

When a model has to be adapted to new data, the first thing that comes to mind is fine-tuning, i.e. additional training of the model on the new data. To do this, the measure of discrepancy between the domains has to be taken into account. This type of domain adaptation can be divided into three approaches: Class Criterion, Statistical Criterion and Architecture Criterion.

#### Class Criterion

Methods from this family are mainly used when labeled data from the target domain is available. One popular variant of the class criterion is deep transfer metric learning. As the name implies, it is based on metric learning, the essence of which is to learn a vector representation, produced by a neural network, in which members of the same class are close to each other according to a given metric (most often the $L_2$ or cosine metric). To implement this approach, the Deep transfer metric learning (DTML) paper uses a loss that is the sum of the following terms:

- closeness of members of the same class to each other (intraclass compactness);
- larger distances between members of different classes (interclass separability);
- the Maximum Mean Discrepancy (MMD) metric between domains. This metric belongs to the statistical criterion family (see below), but is also used in the class criterion.

MMD between domains is written as

$$
\mathrm{MMD}^2(X_s, X_t) = \left\| \frac{1}{N_s} \sum_{i=1}^{N_s} \phi(x_i^s) - \frac{1}{N_t} \sum_{j=1}^{N_t} \phi(x_j^t) \right\|^2 ,
$$

where $\phi$ is some kernel mapping (in our case, the vector representation of the network), $X_s = \{x_i^s\}_{i=1}^{N_s}$ is the data from the source domain and $X_t = \{x_j^t\}_{j=1}^{N_t}$ is the data from the target domain. Thus, minimizing the MMD metric during training selects a network $\phi$ whose mean vector representations on both domains are close.

[Figure: the basic idea of DTML]
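As a rough illustration of the MMD term defined above, here is a minimal sketch in plain Python. It assumes the embeddings have already been computed by the network, so $\phi$ is simply the identity on the given vectors (a linear kernel). The function names `mean_vector` and `linear_mmd2` and the toy data are hypothetical and are not taken from the DTML paper.

```python
def mean_vector(embeddings):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(vec[k] for vec in embeddings) / n for k in range(dim)]


def linear_mmd2(source_emb, target_emb):
    """Squared Euclidean distance between the mean embeddings of two domains."""
    mu_s = mean_vector(source_emb)
    mu_t = mean_vector(target_emb)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Toy 2-D embeddings; in practice these would be network outputs.
source = [[0.0, 0.0], [2.0, 0.0]]  # mean is (1, 0)
target = [[1.0, 2.0], [1.0, 4.0]]  # mean is (1, 3)
mmd_value = linear_mmd2(source, target)  # (1 - 1)^2 + (0 - 3)^2
```

During training, a term of this kind would be added to the loss so that the optimizer pulls the two domain means together. With a non-linear kernel, the same quantity is computed from pairwise kernel values instead of explicit means.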

If the data in the target domain is unlabeled (unsupervised domain adaptation), the method described in Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation proposes training the model on the source domain and using it to obtain pseudo-labels on the target domain. That is, the target-domain data is run through the network, and its predictions are called pseudo-labels. They are then used as labels for the target domain, which makes it possible to apply the MMD criterion in the loss function (with different weights for the components responsible for different domains).
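The pseudo-labeling step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the softmax outputs are made up, the helper names `pseudo_labels` and `class_frequencies` are hypothetical, and using the resulting class frequencies as weights is shown as one possible choice, not as the exact weighting scheme of the paper.

```python
def pseudo_labels(probabilities):
    """Index of the most probable class for each target example."""
    return [max(range(len(p)), key=lambda k: p[k]) for p in probabilities]


def class_frequencies(labels, num_classes):
    """Share of examples assigned to each class."""
    counts = [0] * num_classes
    for label in labels:
        counts[label] += 1
    return [c / len(labels) for c in counts]


# Hypothetical softmax outputs of a source-trained model on 4 target images.
target_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]
labels = pseudo_labels(target_probs)
freqs = class_frequencies(labels, num_classes=3)
```

Here `labels` plays the role of the missing target annotation, and `freqs` is an estimate of the target class distribution that a weighted criterion could take into account.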

#### Statistical Criterion

Methods from this family are used to solve the problem of unsupervised domain adaptation. The case where the target domain is unlabeled occurs in many tasks, and all the domain adaptation methods discussed in the rest of this article address exactly this problem.

Approaches based on a statistical criterion try to measure the difference between the distributions of the network's vector representations obtained from source and target data. They then use the computed difference to bring the two distributions closer together.

One such criterion, Maximum Mean Discrepancy (MMD), has already been described above. Its variants are used in several methods:

- Deep adaptation network (DAN);
- Joint adaptation network (JAN);
- Residual transfer network (RTN). RTN shows good results on the MNIST -> SVHN pair: ???% accuracy on the target domain.

The schemes of these three methods are presented below. Each of them uses an MMD variant as a loss between layers of the convolutional networks applied to the source and target domains (the yellow shapes in the diagrams) in order to measure the difference between the distributions.

[Figure: schemes of DAN, JAN and RTN]

The CORAL (CORrelation ALignment) criterion and its extension to deep networks, Deep CORAL, aim to learn a representation of the data in which the second-order statistics of the two domains match as closely as possible. For this, the covariance matrices of the network's vector representations are used. Matching second-order statistics across the domains in some cases gives better adaptation results than MMD.

$$
L_{CORAL} = \frac{1}{4d^2} \left\| C_S - C_T \right\|_F^2 ,
$$

where $\|\cdot\|_F^2$ is the squared Frobenius matrix norm, $C_S$ and $C_T$ are the covariance matrices of the data from the source and target domains respectively, and $d$ is the dimension of the vector representation.

On the Office data, the average adaptation quality of Deep CORAL over pairs of the Amazon and Webcam domains is 72.1%. On the Synth Signs -> GTSRB road sign domains the result is also quite average: 86.9% accuracy on the target domain.

A further development of the ideas of MMD and CORAL is the Central Moment Discrepancy (CMD) criterion, which compares the central moments of the source and target data of all orders up to $K$ inclusive ($K$ is a parameter of the algorithm). On Office, the average adaptation quality of CMD over pairs of the Amazon and Webcam domains is 77.0%.

#### Architecture Criterion

Algorithms of this type are built on the assumption that the basic information responsible for adaptation to a new domain is contained in the parameters of the neural network.

In a number of works, when networks for the source and target domains are trained, loss functions on each pair of layers are used so that the weights of these layers learn domain-invariant information. An example of such architectures is given below.

[Figure: example of such architectures]

Another work proposed the idea that the weights of the network contain information about the classes the network is trained on, while domain information is stored in the statistics (mean and standard deviation) of the Batch Normalization (BN) layers. Therefore, for adaptation, these statistics have to be recomputed on data from the target domain. Using this technique together with CORAL improves the adaptation quality on the Office dataset for pairs of the Amazon and Webcam domains to 95.0%. It was then shown that using Instance Normalization (IN) layers instead of BN improves the adaptation quality further. Unlike BN, which normalizes the input tensor over the batch, IN computes the normalization statistics per channel for each example and therefore does not depend on the batch.

### Adversarial-Based Approaches

In the last 1-2 years, most results in deep domain adaptation have come from the adversarial-based approach. This is largely due to the rapid development and growing popularity of Generative Adversarial Networks (GAN), since the adversarial-based approach to domain adaptation trains with the same adversarial objective function as GANs. By optimizing it, such deep domain adaptation methods minimize the distance between the empirical distributions of the vector representations of data on the source and target domains. Training the network in this way is an attempt to make it invariant with respect to the domain.

A GAN consists of two models: the generator $G$, whose output is data from some target distribution, and the discriminator $D$, which determines whether its input came from the training sample or was generated by $G$. The two models are trained with the adversarial objective function:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)} \left[ \log D(x) \right] + \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
$$

With such training, the generator learns to "deceive" the discriminator, which makes it possible to bring the distributions of the target and source domains closer together.

There are two major approaches in adversarial-based domain adaptation, which differ in whether or not the generator $G$ is used.
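To make the adversarial objective above more tangible, the sketch below evaluates an empirical version of $V(D, G)$ from discriminator outputs alone. It is a toy calculation, not a training loop: the function name `gan_value` and the probability values are assumptions chosen for illustration.

```python
import math


def gan_value(d_real, d_fake):
    """Empirical V(D, G): mean log D(x) plus mean log(1 - D(G(z)))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator outputs are probabilities that the input is "real".
# A discriminator that separates real from generated data well:
confident_d = gan_value(d_real=[0.9, 0.9], d_fake=[0.1, 0.1])
# A discriminator that has been fooled and answers 0.5 everywhere:
fooled_d = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

The discriminator tries to push this value up, while the generator tries to push it down. In domain adaptation the same game is played over source and target representations, so a "fooled" discriminator corresponds to representations that can no longer be told apart by domain.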

#### Non-Generative Models

A key feature of methods from this family is training a neural network whose vector representation is invariant with respect to the source and target domains. A network trained on the labeled source domain can then be used on the target domain, ideally with practically no loss of classification quality.

The Domain-Adversarial Training of Neural Networks (DANN) algorithm (code), presented in 201?, consists of 3 parts:

- the main network, which produces the vector representation (the feature extractor; the green part in the illustration below);
- a "head" responsible for classification on the source domain (the blue part in the illustration);
- a "head" that learns to distinguish source-domain data from target-domain data (the red part in the illustration).

During training with gradient descent (SGD) (the arrows pointing towards the input in the illustration), the classification loss and the domain loss are minimized. In addition, when the error is backpropagated from the domain "head", a gradient reversal layer is used (the black part in the illustration): it multiplies the gradient passing through it by a negative constant, so the feature extractor is updated in the direction that increases the domain loss. This makes the distributions of vector representations on the two domains become close.

[Figure: DANN architecture]

DANN results on benchmarks:

- On the digit domain pair Synth Numbers -> SVHN: ???%.
- On the road signs Synth Signs -> GTSRB, it surpasses CORAL with a score of 88.7%.
- On the Office data, the average adaptation quality over pairs of the Amazon and Webcam domains: 73.0%.

The next important representative of the non-generative family is Adversarial Discriminative Domain Adaptation (ADDA) (code), which uses separate networks for the source domain and for the target domain. The algorithm consists of the following steps:

1. First, we train a classification network on the source domain. We denote its vector representation by $M_s$ and the source domain by $X_s$.
2. Next, we initialize the network for the target domain with the trained network from the previous step. We denote it by $M_t$ and the target domain by $X_t$.
3. Now we turn to adversarial training: we train the discriminator $D$ with $M_s$ and $M_t$ fixed, using the following objective function:

   $$
   \min_D L_{adv_D}(X_s, X_t, M_s, M_t) = -\mathbb{E}_{x_s \sim X_s} \left[ \log D(M_s(x_s)) \right] - \mathbb{E}_{x_t \sim X_t} \left[ \log \left( 1 - D(M_t(x_t)) \right) \right]
   $$

4. We freeze the discriminator and train $M_t$ on the target domain:

   $$
   \min_{M_t} L_{adv_M}(X_t, D) = -\mathbb{E}_{x_t \sim X_t} \left[ \log D(M_t(x_t)) \right]
   $$

Steps 3 and 4 are repeated several times. The essence of ADDA is that we first train a good classifier on the labeled source domain and then, with the help of adversarial training, adapt it so that the classifier's vector representations on the two domains are close. Graphically, the algorithm can be represented as follows:

[Figure: ADDA training scheme]

On the USPS -> MNIST pair, ADDA showed 90.1% accuracy on the target domain.

A modification of ADDA was presented this year at the ICML-2018 conference: M-ADDA: Unsupervised Domain Adaptation with Deep Metric Learning (code).

Since the main idea of the original algorithm is to bring the vector representations on the different domains closer together, the authors of M-ADDA use metric learning to make the classes better separated in terms of the $L_2$ metric. To do this, in step 1 of ADDA the network on the source domain is trained with the triplet loss (it simultaneously minimizes the distance between positive examples, i.e. those from the same class, and maximizes the distance between negative ones). As a result of such training, the vector representations of the data tend to split into $K$ clusters (where $K$ is the number of classes). For each cluster, its center $C_j$ is computed.

Training then proceeds as in ADDA, i.e. steps 2-4 are performed. The only difference is that after step 4 a regularization term is added that pulls the vector representations on the target domain towards the nearest cluster center $C_j$, thus providing better class separability in the target domain:

$$
L_C = \sum_{x_t \in X_t} \min_{j} \left\| M_t(x_t) - C_j \right\|^2
$$

The model training scheme on the target domain is presented below.

[Figure: M-ADDA training on the target domain]

With this regularization, the adaptation quality on the USPS -> MNIST pair rises to 94.0%.

A rather atypical representative of the non-generative family is the Maximum Classifier Discrepancy for Unsupervised Domain Adaptation method (code). It also trains the vector representations (the generator) to be as close as possible on the source and target domains. However, as the discriminator this method uses the disagreement between the predictions of two classifiers trained on top of the generator.

Let the generator $G$ be some convolutional network, and let $F_1$ and $F_2$ be two classifiers that take the generator output as their input feature vector. The idea of the method is that $G$, $F_1$ and $F_2$ are first trained on the source domain; then the classifiers are trained to maximize their disagreement on the target domain; after that the generator is retrained so that the disagreement is minimized; and finally $F_1$ and $F_2$ are updated.

As the description shows, the algorithm is based on a minimax adversarial procedure, which should result in a network $G$ that is invariant with respect to the domain.

The measure of disagreement (the discrepancy loss) is

$$
d(p_1, p_2) = \frac{1}{K} \sum_{k=1}^{K} \left| p_{1k} - p_{2k} \right| ,
$$

where $K$ is the number of classes, and $p_{1k}$ and $p_{2k}$ are the softmax values for the $k$-th class from classifiers $F_1$ and $F_2$ respectively.

More formally, the method consists of 3 steps:

- A. $G$, $F_1$ and $F_2$ are trained on the source domain.
- B. The generator is fixed, and the disagreement of the classifiers is maximized on data from the target domain.
- C. Now the classifiers are fixed, and the generator parameters are trained to minimize the discrepancy loss.

All three steps are repeated $n$ times ($n$ is a parameter of the algorithm). Steps B and C:

[Figure: steps B and C of the method]

Results of the method on benchmarks:

- On the digit pair USPS -> MNIST: 94.1%.
- On the road signs Synth Signs -> GTSRB, the method surpasses all the previous ones: 94.4%.
- On VisDA, the average quality over 12 categories without the Unknown class: 71.9%.
- On the GTA 5 -> Cityscapes pair: Mean IoU = 39.7%; on Synthia -> Cityscapes: Mean IoU = 37.3%.
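The disagreement measure used by the Maximum Classifier Discrepancy method, the mean absolute difference between the softmax outputs of two classifiers, is simple enough to sketch directly. The snippet below is an illustration in plain Python; the helper names `softmax` and `discrepancy` and the example logits are assumptions and are not taken from the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discrepancy(p1, p2):
    """Mean absolute difference between two class-probability vectors."""
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


# Hypothetical logits of the two classifiers for one target example.
p_f1 = softmax([2.0, 0.0, 0.0])  # first classifier prefers class 0
p_f2 = softmax([0.0, 2.0, 0.0])  # second classifier prefers class 1
disagree = discrepancy(p_f1, p_f2)
agree = discrepancy(p_f1, p_f1)
```

In the step where the generator is fixed, the classifiers would be updated to make a value like `disagree` larger on target data; in the step where the classifiers are fixed, the generator would be updated to make it smaller.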

The following interesting algorithms from the non-generative family are also worth a look:

- Domain Separation Networks
- A DIRT-T Approach to Unsupervised Domain Adaptation

Let's pause here.

We have covered the basic datasets for domain adaptation, the discrepancy-based approaches (class criterion, statistical criterion and architecture criterion), and the first family of adversarial-based methods: the non-generative ones. Models from these approaches perform well on benchmarks and are applicable to many adaptation tasks. In the next part, we will look at the most complex and effective approaches: generative models and mixed non-adversarial methods.


Figures

3r33861. 3r33876. 3r33838. Apparently, according to the tradition, established 3r348. By Yann LeKun 3r3r5353. (one of the pioneers of deep learning, director of Facebook AI Research), in computer vision, the simplest datasets are associated with handwritten numbers or letters. There are several data sets with numbers that originally appeared to experiment with models for image recognition. In articles on domain adaptation, one can find their most diverse combinations in pairs of source - target domain. Among these:

3r33861. 3r33876. 3r33844. 3r33876. 3r33851. MNIST 3r33853. - handwritten numbers, does not need additional representation; 3r33854. 3r33876. 3r33851. 3r361. USPS 3r35353. - handwritten numbers in low resolution; 3r33854. 3r33876. 3r33851. 3r366. SVHN 3r33853. - house numbers with Google Street View; 3r33854. 3r33876. 3r33851.

Synth Numbers - synthetic numbers, as the name suggests. 3r33854. 3r33876. 3r33856. 3r33861. 3r33876. 3r33838. From the point of view of the task of learning on synthetic data for use in the "real" world, the pairs of the most interest are: 3r-3864. 3r33861. 3r33876. 3r33844. 3r33876. 3r33851. Source: MNIST, Target: SVHN; 3r33854. 3r33876. 3r33851. Source: USPS, Target: MNIST; 3r33854. 3r33876. 3r33851. Source: Synth Numbers, Target: SVHN. 3r33854. 3r33876. 3r33856. 3r33861. 3r33876. 3r33838. Office

3r33861. 3r33876. 3r33838. This Dataset contains 31 categories of different items, each of which is presented in 3 domains: an image from Amazon, a photo from a webcam and a photo from a digital camera. 3r33838. 3r33861. 3r33876. 3r33838. 3r3114. 3r33838. 3r33861. 3r33876. 3r33838. It is useful for checking how the model will react to the addition of the background and the quality of the shooting to the target domain. 3r33838. 3r33861. 3r33876. 3r3122. Road signs

3r33861. 3r33876. 3r33838. Another pair of datasets for teaching the model on synthetic data and applying it on “real” dаta: 3r36464. 3r33861. 3r33876. 3r33844. 3r33876. 3r33851. Source: Synth Signs - images of road signs generated so that they look like real signs on the street; 3r33854. 3r33876. 3r33851. Target: 3r3138. GTSRB 3r33853. - a fairly well-known base for recognition, containing signs from German roads. 3r33854. 3r33876. 3r33856. 3r33861. 3r33876. 3r33838. 3r3146. 3r33838. 3r33861. 3r33876. 3r33838. A feature of this pair of databases is that the data from Synth Signs are made quite similar to “real” data, so the domains are quite close. 3r33838. 3r33861. 3r33876. 3r3154. From the window of the car

3r33861. 3r33876. 3r33838. Datasets for segmentation. Quite an interesting couple, most close to real conditions. Baseline data is obtained using the game engine (GTA 5), and target data from real life. Similar approaches are used to train models that are used in autonomous cars. 3r33838. 3r33861. 3r33876. 3r33844. 3r33876. 3r33851. 3r3165. SYNTHIA

or the GTA 5 engine - pictures with city views from the car window, generated using the game engine; 3r33854. 3r33876. 3r33851. 3r33170. Cityscapes

- Photos of the car, made in 50 different cities. 3r33854. 3r33876. 3r33856. 3r33861. 3r33876. 3r33838. 3r3178. 3r33838. 3r33861. 3r33876. 3r3182. VisDA

3r33861. 3r33876. 3r33838. This dataset is used in competition 3r3187. Visual Domain Adaptation Challenge

which is held as part of the workshop at the ECCV and ICCV. In the source domain, there are 12 categories of labeled objects generated using CAD, such as an airplane, a horse, a person, etc. The target domain contains unallocated images from the same 12 categories taken from ImageNet. In the competition, which was held in 201? the 13th category was added: Unknown. 3r33838. 3r33861. 3r33876. 3r33838. 3r3193. 3r33838. 3r33861. 3r33876. 3r33838. As can be seen from all of the above, there are quite a lot of interesting and diverse datasets for domain adaptation, you can train and test models for various tasks (classification, segmentation, detection) and various conditions (synthetic data, photos, types of streets). 3r33838. 3r33861. 3r33876.

Deep Domain Adaptation

There is a rather extensive and varied classification of domain adaptation methods (see, for example, here). In this article I will give a simplified division of the methods by their key features. Modern deep domain adaptation methods can be divided into 3 large groups:

- Discrepancy-based: approaches that minimize the distance between vector representations on the source and target domains by adding this distance to the loss function.
- Adversarial-based: approaches that use the adversarial loss function introduced in GANs to train a network that is invariant with respect to the domain. Methods of this family have been actively developed in the last couple of years.
- Mixed methods that do not use an adversarial loss but apply ideas from the discrepancy-based family together with recent developments in deep learning: self-ensembling, new layers, loss functions, etc. These approaches show the best results in the VisDA competition.

From each group we will review several results from the last 1-3 years that are, in my opinion, the most important.

Discrepancy-based

When a model has to be adapted to new data, the first thing that comes to mind is fine-tuning, i.e. additional training of the model on the new data. To do this, the measure of discrepancy between the domains must be taken into account. This type of domain adaptation can be divided into three approaches: Class Criterion, Statistical Criterion and Architecture Criterion.

Class Criterion

Methods from this family are mainly used when labeled data from the target domain is available. One popular variant of Class Criterion is deep transfer metric learning. As the name implies, it is based on metric learning, whose essence is to learn a vector representation, produced by a neural network, in which representatives of one class are close to each other under a given metric (most often the $L_2$ or cosine metric). In the paper Deep transfer metric learning (DTML), this approach is implemented with a loss that is the sum of the following terms:

- the proximity of representatives of one class to each other (intraclass compactness);
- a larger distance between representatives of different classes (interclass separability);
- the Maximum Mean Discrepancy (MMD) metric between domains. This metric belongs to the Statistical Criterion family (see below) but is also used in Class Criterion.

MMD between domains is written as

$$\mathrm{MMD}^2(X_s, X_t) = \left\| \frac{1}{N_s} \sum_{i=1}^{N_s} \phi(x_i^s) - \frac{1}{N_t} \sum_{j=1}^{N_t} \phi(x_j^t) \right\|^2$$

where $\phi(\cdot)$ is some kernel (in our case, the vector representation of the network), $x_i^s$ are the $N_s$ samples from the source domain, and $x_j^t$ are the $N_t$ samples from the target domain. Thus, minimizing MMD during training selects a network $\phi$ whose mean vector representations on the two domains are close. The basic idea of DTML:

[Figure: DTML scheme]

If the target-domain data is unlabeled (unsupervised domain adaptation), the method described in Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation proposes training the model on the source domain and using it to obtain pseudo-labels on the target domain. That is, the target-domain data is run through the network, and its predictions are called pseudo-labels. They are then used as labels for the target domain, which allows the MMD criterion to be applied in the loss function (with different weights for the terms responsible for different domains).
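As a rough illustration of the MMD term used by the methods above, here is a minimal NumPy sketch. It assumes the simplest possible feature map (the identity, i.e. a linear kernel), so MMD reduces to the squared distance between the mean embeddings of the two domains. The random arrays stand in for hypothetical network outputs, and `mmd_squared` is an illustrative name, not a function from any of the cited papers.

```python
import numpy as np


def mmd_squared(source_features, target_features):
    """Squared MMD with an identity feature map (an assumption of this sketch):
    the squared Euclidean distance between the mean embeddings of two domains."""
    mean_source = source_features.mean(axis=0)
    mean_target = target_features.mean(axis=0)
    diff = mean_source - mean_target
    return float(diff @ diff)


rng = np.random.default_rng(0)
# Hypothetical embeddings: 200 source and 150 target samples, 16 dimensions each.
source = rng.normal(loc=0.0, scale=1.0, size=(200, 16))
target_shifted = rng.normal(loc=0.5, scale=1.0, size=(150, 16))
target_aligned = rng.normal(loc=0.0, scale=1.0, size=(150, 16))

print("shifted domains:", mmd_squared(source, target_shifted))
print("aligned domains:", mmd_squared(source, target_aligned))
```

In a real training loop this value would be computed on batches of network activations and added to the task loss, so that gradient descent pulls the two mean representations together.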

Statistical criterion

Methods from this family are used to solve the unsupervised domain adaptation problem. The case where the target domain is unlabeled occurs in many tasks, and all domain adaptation methods discussed later in this article solve exactly this problem.

Approaches based on statistical criteria try to measure the difference between the distributions of the network's vector representations obtained from source-domain and target-domain data. They then use the computed difference to bring the two distributions closer together.

One such criterion, Maximum Mean Discrepancy (MMD), was already described above. Its variants are used in several methods:

- Deep adaptation network (DAN);
- Joint adaptation network (JAN);
- Residual transfer network (RTN). RTN shows good results for the MNIST -> SVHN pair: ???% accuracy on the target domain.

The schemes of these three methods are presented below. In them, MMD variants are used to measure the difference between the distributions on layers of a convolutional neural network applied to the source and target domains. Note that each of them uses a modification of MMD as a loss between layers of convolutional networks (the yellow shapes in the diagram).

[Figure: schemes of DAN, JAN and RTN]

The CORAL (CORrelation ALignment) criterion and its extension to deep networks, Deep CORAL, aim to learn a data representation in which the second-order statistics of the two domains match as closely as possible. For this, the covariance matrices of the network's vector representations are used. Bringing the second-order statistics of the two domains together in some cases gives better adaptation results than MMD.

$$L_{CORAL} = \frac{1}{4d^2} \left\| C_S - C_T \right\|_F^2$$

where $\|\cdot\|_F^2$ is the squared Frobenius matrix norm, $C_S$ and $C_T$ are the covariance matrices of the data from the source and target domains respectively, and $d$ is the dimension of the vector representation.

On the Office data, the average adaptation quality of Deep CORAL over pairs of the Amazon and Webcam domains is 72.1%. On the Synth Signs -> GTSRB road sign domains the result is also fairly average: 86.9% accuracy on the target domain.

A further development of the ideas of MMD and CORAL is the Central Moment Discrepancy (CMD) criterion, which compares the central moments of the source-domain and target-domain data of all orders up to $K$ inclusive ($K$ is a parameter of the algorithm). On Office, the average adaptation quality of CMD over pairs of the Amazon and Webcam domains is 77.0%.

Architecture Criterion

Algorithms of this type are built on the assumption that the basic information responsible for adaptation to a new domain is embedded in the parameters of the neural network.

In a number of works, when networks for the source and target domains are trained, loss functions on each pair of layers are used to learn domain-invariant information in the weights of these layers. An example of such architectures is given below.

[Figure: example of such architectures]

Another work proposed that the weights of the network contain information related to the classes on which the network is trained, while the domain information is embedded in the statistics (mean and standard deviation) of the Batch Normalization (BN) layers. Therefore, for adaptation these statistics must be recomputed on data from the target domain. Using this technique together with CORAL improves the adaptation quality on the Office dataset for pairs of the Amazon and Webcam domains to 95.0%. It was then shown that using Instance Normalization (IN) layers instead of BN improves the adaptation quality further. Unlike BN, which normalizes the input tensor over the batch, IN computes the normalization statistics over channels and therefore does not depend on the batch.

Adversarial-Based Approaches

In the last 1-2 years, most results in deep domain adaptation have been associated with the adversarial-based approach. This is largely due to the rapid development and growing popularity of Generative Adversarial Networks (GANs), because the adversarial-based approach to domain adaptation uses the same adversarial objective function in training as GANs do. By optimizing it, such deep domain adaptation methods minimize the distance between the empirical distributions of the vector representations of data on the source and target domains. Training the network in this way is an attempt to make it invariant with respect to the domain.

A GAN consists of two models: the generator $G$, whose output is data from some target distribution, and the discriminator $D$, which determines whether its input came from the training sample or was generated by $G$. These two models are trained with the adversarial objective function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

With such training, the generator learns to "deceive" the discriminator, which makes it possible to bring the distributions of the target and source domains together.

There are two major approaches in adversarial-based domain adaptation, which differ in whether or not the generator $G$ is used.
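To make the adversarial objective more tangible, the sketch below evaluates the standard GAN value function, E[log D(x)] + E[log(1 - D(G(z)))], directly from discriminator outputs. It is only an illustration: the probabilities are made-up numbers rather than outputs of a trained model, and `gan_value` is a hypothetical helper name.

```python
import numpy as np


def gan_value(d_real, d_fake, eps=1e-12):
    """Standard GAN value function computed from discriminator outputs.

    d_real: D(x) on real samples, d_fake: D(G(z)) on generated samples,
    both interpreted as the probability that the input is real.
    """
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))


# Hypothetical discriminator outputs, chosen by hand for illustration.
confident = gan_value(np.array([0.9, 0.95, 0.85]), np.array([0.1, 0.05, 0.2]))
fooled = gan_value(np.array([0.5, 0.5, 0.5]), np.array([0.5, 0.5, 0.5]))

print("discriminator separates real from generated:", confident)
print("discriminator cannot tell them apart:", fooled)
```

The discriminator tries to push this value up, while the generator tries to push it down. When the discriminator is reduced to guessing (0.5 everywhere), the value equals -2 log 2, which is the situation the domain adaptation methods below aim for with respect to the source and target domains.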

Non-Generative Models

A key feature of the methods in this family is training a neural network whose vector representation is invariant with respect to the source and target domains. The network trained on the labeled source domain can then be used on the target domain, ideally with practically no loss in classification quality.

The Domain-Adversarial Training of Neural Networks (DANN) algorithm (code), presented in 201?, consists of 3 parts:

- the main network, which produces the vector representation (feature extractor; the green part in the illustration below);
- a "head" responsible for classification on the source domain (the blue part in the illustration);
- a "head" that learns to distinguish source-domain data from target-domain data (the red part in the illustration).

During training with gradient descent (SGD) (the arrows pointing to the input in the illustration), the classification and domain losses are minimized. In addition, when the error is backpropagated from the "head" responsible for the domains, a gradient reversal layer is used (the black part in the illustration). It multiplies the gradient passing through it by a negative constant, which increases the domain loss. This ensures that the distributions of vector representations on the two domains become close.

[Figure: DANN architecture]

DANN results on benchmarks:

- On the Synth Numbers -> SVHN pair of digit domains: ???%.
- On the Synth Signs -> GTSRB road signs, it surpasses CORAL with 88.7%.
- On the Office data, the average adaptation quality over pairs of the Amazon and Webcam domains: 73.0%.

The next important representative of the non-generative family is the Adversarial Discriminative Domain Adaptation (ADDA) method (code), which uses separate networks for the source domain and the target domain. The algorithm consists of the following steps:

1. First, we train the classification network on the source domain. We denote its vector representation as $M_s$, and the source domain as $X_s$.
2. Now we initialize the neural network for the target domain with the trained network from the previous step. We denote it as $M_t$, and the target domain as $X_t$.
3. We move on to adversarial training: we train the discriminator $D$ with $M_s$ and $M_t$ fixed, using the following objective function:

   $$\min_D L_{adv_D} = -\mathbb{E}_{x_s \sim X_s}[\log D(M_s(x_s))] - \mathbb{E}_{x_t \sim X_t}[\log(1 - D(M_t(x_t)))]$$

4. We freeze the discriminator and train $M_t$ on the target domain:

$$\min_{M_t} L_{adv_M} = -\mathbb{E}_{x_t \sim X_t}[\log D(M_t(x_t))]$$

Steps 3 and 4 are repeated several times. The essence of ADDA is that we first train a good classifier on the labeled source domain and then use adversarial training to adapt it so that the classifier's vector representations on the two domains are close. Graphically, the algorithm can be represented as follows:

[Figure: ADDA scheme]

For the domain pair involving MNIST, ADDA showed 90.1% accuracy on the target domain.

A modification of ADDA was presented at the ICML-2018 conference: M-ADDA: Unsupervised Domain Adaptation with Deep Metric Learning (code).

Since the main idea of the original algorithm is to bring the vector representations on different domains closer together, the authors of M-ADDA use metric learning to make the classes better separated under the $L_2$ metric. To do this, the ADDA step that trains the network on the source domain uses Triplet loss (it simultaneously minimizes the distance between positive examples, i.e. those from one class, and maximizes the distance between negative ones). As a result of such training, the vector representations of the data tend to split into $K$ clusters (where $K$ is the number of classes). For each cluster, its center $C_j$ is computed.

Training then proceeds as in ADDA, i.e. steps 2-4 are performed. The only difference is that after step 4 a regularization term is added that pulls the vector representations on the target domain toward the nearest cluster center $C_j$, thus providing better class separability on the target domain:

$$L_C = \sum_{x_t \in X_t} \min_j \left\| M_t(x_t) - C_j \right\|^2$$

where $M_t(x_t)$ is the vector representation of a target-domain sample $x_t$. The model training scheme on the target domain is presented below.

[Figure: M-ADDA training on the target domain]

For the domain pair involving MNIST, this raised the accuracy to 94.0%.

A rather atypical representative of the non-generative family is the Maximum Classifier Discrepancy for Unsupervised Domain Adaptation method (code). It also trains the vector representations (the generator) to be as close as possible on the source and target domains. However, in place of a discriminator, this method uses the difference between the predictions of two classifiers trained on top of the generator.

Let the generator $G$ be some convolutional network, and let $F_1$ and $F_2$ be two classifiers that take the generator output as their input feature vector. The idea of the method is as follows:

The generator $G$ and the classifiers $F_1$ and $F_2$ are first trained on the source domain; then the classifiers are trained to maximize their disagreement on the target domain; after that the generator is retrained so that the disagreement is minimized; and finally $F_1$ and $F_2$ are updated.

As can be seen from the description, the algorithm is based on a minimax adversarial procedure, whose result should be a network $G$ that is invariant with respect to the domain.

The following measure of disagreement (Discrepancy Loss) is used:

$$d(p_1, p_2) = \frac{1}{K} \sum_{k=1}^{K} \left| p_{1k} - p_{2k} \right|$$

where $K$ is the number of classes, and $p_{1k}$ and $p_{2k}$ are the softmax values for class $k$ for the classifiers $F_1$ and $F_2$ respectively.

More formally, the method consists of 3 steps:

- A. The generator $G$ and the classifiers $F_1$ and $F_2$ are trained on the source domain.
- B. The generator is fixed, and the disagreement of the classifiers is maximized on data from the target domain.
- C. Now the classifiers are fixed, and the generator parameters are trained to minimize the Discrepancy Loss.

All three steps are repeated a set number of times (a parameter of the algorithm). Steps B and C:

[Figure: steps B and C]

Results of the method on benchmarks:

- On the domain pair involving MNIST: 94.1%.
- On the Synth Signs -> GTSRB road signs, the method surpasses all previous ones: 94.4%.
- On VisDA, the average quality over the 12 categories without the Unknown class: 71.9%.
- On the GTA 5 -> Cityscapes pair: Mean IoU = 39.7%; on Synthia -> Cityscapes: Mean IoU = 37.3%.
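The Discrepancy Loss that drives steps B and C of the method above can be sketched in a few lines of NumPy. The sketch assumes the loss is the mean absolute difference between the softmax outputs of the two classifiers (the form used in the Maximum Classifier Discrepancy paper). The logits are made-up values, and `softmax` and `discrepancy_loss` are illustrative names rather than functions from the authors' code.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def discrepancy_loss(logits_1, logits_2):
    """Mean absolute difference between two classifiers' class probabilities,
    averaged over classes and over the samples in the batch."""
    p1 = softmax(logits_1)
    p2 = softmax(logits_2)
    return float(np.mean(np.abs(p1 - p2)))


# Hypothetical logits of two classifiers for one target sample and 3 classes.
agree = np.array([[2.0, 0.0, 0.0]])
disagree = np.array([[0.0, 2.0, 0.0]])

print("same predictions:", discrepancy_loss(agree, agree))
print("different predictions:", discrepancy_loss(agree, disagree))
```

In step B this quantity would be maximized with respect to the classifier parameters, and in step C minimized with respect to the generator parameters. Both require automatic differentiation, which this sketch deliberately leaves out.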

You can also take a look at the following interesting algorithms from the non-generative family:

- Domain Separation Networks
- A DIRT-T Approach to Unsupervised Domain Adaptation

Let's pause at this point.

We have covered the basic datasets for domain adaptation, the discrepancy-based approaches (class criterion, statistical criterion and architecture criterion), and the first family of adversarial-based methods, the non-generative ones. Models from these approaches perform well on benchmarks and are applicable to many adaptation tasks. In the next part, we will look at the most complex and effective approaches: generative models and mixed non-adversarial-based methods.
