# A small study of the properties of a simple U-net, a classical convolutional network for segmentation

This article grew out of analyzing and studying materials from a competition on finding ships at sea.

Let's try to understand how the network searches, what it looks for, and what it finds. This article is purely the result of curiosity and idle interest: none of it comes up in practice, and there is nothing here to copy and paste for practical tasks. Still, the result is not quite what one would expect. The Internet is full of descriptions of how networks work, in which the authors explain, beautifully and with pictures, how networks detect primitives (corners, circles, whiskers, tails, and so on) and then use them for segmentation or classification. Many competitions are won using weights from other large and wide networks. It is interesting to understand and see which primitives the network builds, and how.

Let's do a little research and consider some variants. The author's reasoning and the code are laid out below, so you can check, extend, or change everything yourself.

A Kaggle competition on finding ships at sea ended recently. Airbus offered satellite images of the sea, both with and without vessels, for analysis. A total of ???x768x3 pictures is ??? 960 bytes as uint8 and four times as many as float32 (by the way, float32 is faster than float64 because of fewer memory accesses), and you need to find ships in 15606 pictures. As usual, all the top places were taken by people involved in ODS (ods.ai), which is natural and expected, and I hope we will soon be able to study the reasoning and the code of the winners and prize-winners.

We will consider a similar task but simplify it significantly: take the sea as np.random.sample * 0.5, since we do not need waves, wind, shores, and other hidden patterns and faces. Let's make the image of the sea truly random, in the RGB range from 0.0 to 0.5. We will color the ships with the same kind of noise and, to distinguish them from the sea, place them in the range from 0.5 to 1.0. They will all have the same shape: ellipses of different sizes and orientations.

We take a very common version of the network (you can take your favorite one) and run all the experiments with it.

Next, we will change the image parameters, add interference, and build hypotheses, and in this way isolate the main features by which the network finds ellipses. Perhaps the reader will draw their own conclusions and refute the author's.

Load the libraries and define the sizes of the image arrays:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import math

from tqdm import tqdm_notebook, tqdm
from skimage.draw import ellipse, polygon

from keras import Model
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from keras.models import load_model
from keras.optimizers import Adam
from keras.layers import Input, Conv2D, Conv2DTranspose, MaxPooling2D, concatenate, Dropout
from keras.losses import binary_crossentropy
import tensorflow as tf
import keras as keras
from keras import backend as K

w_size = 256       # width and height of every picture
train_num = 8192   # number of picture/mask pairs

train_x = np.zeros((train_num, w_size, w_size, 3), dtype='float32')
train_y = np.zeros((train_num, w_size, w_size, 1), dtype='float32')

# two fixed noise fields: a dark one in [0.0, 0.5) and a light one in [0.5, 1.0)
img_l = np.random.sample((w_size, w_size, 3)) * 0.5
img_h = np.random.sample((w_size, w_size, 3)) * 0.5 + 0.5

# bounds for the ellipse radii, in pixels
radius_min = 10
radius_max = 30
```

Define the loss and accuracy functions:

```python
def dice_coef(y_true, y_pred):
    # Dice coefficient of the prediction thresholded at 0.5
    y_true_f = K.flatten(y_true)
    y_pred = K.cast(y_pred, 'float32')
    y_pred_f = K.cast(K.greater(K.flatten(y_pred), 0.5), 'float32')
    intersection = y_true_f * y_pred_f
    score = 2. * K.sum(intersection) / (K.sum(y_true_f) + K.sum(y_pred_f))
    return score


def dice_loss(y_true, y_pred):
    # soft Dice loss; `smooth` keeps the ratio defined for empty masks
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = y_true_f * y_pred_f
    score = (2. * K.sum(intersection) + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return 1. - score


def bce_dice_loss(y_true, y_pred):
    return binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)


def get_iou_vector(A, B):
    # Numpy version
    batch_size = A.shape[0]
    metric = 0.0
    for batch in range(batch_size):
        t, p = A[batch], B[batch]
        true = np.sum(t)
        pred = np.sum(p)

        # deal with empty mask first
        if true == 0:
            metric += (pred == 0)
            continue

        # non empty mask case. Union is never empty,
        # hence it is safe to divide by it
        intersection = np.sum(t * p)
        union = true + pred - intersection
        iou = intersection / union

        # iou metric is a stepwise approximation of the real iou.
        # Assumption: the constants 0 and 0.45 were garbled in the source and are
        # restored from the widely shared Kaggle form of this metric.
        iou = np.floor(max(0, (iou - 0.45) * 20)) / 10

        metric += iou

    # take the average over all images in batch
    metric /= batch_size
    return metric


def my_iou_metric(label, pred):
    # Tensorflow version
    return tf.py_func(get_iou_vector, [label, pred > 0.5], tf.float64)


from keras.utils.generic_utils import get_custom_objects

get_custom_objects().update({'bce_dice_loss': bce_dice_loss})
get_custom_objects().update({'dice_loss': dice_loss})
get_custom_objects().update({'dice_coef': dice_coef})
get_custom_objects().update({'my_iou_metric': my_iou_metric})
```

We use a classic metric for image segmentation. There are many articles, commented code, and write-ups about it, and Kaggle itself has plenty of variants with comments and explanations. We will predict a pixel mask, where each pixel is either "sea" or "ship", and evaluate whether the prediction is true or false. That is, four outcomes are possible: we correctly predicted that a pixel is "sea", correctly predicted that a pixel is "ship", or made a mistake in predicting "sea" or "ship". For all the pictures and all the pixels we count each of the four outcomes and compute the score from them; this is the result of the network. The fewer erroneous predictions and the more correct ones, the more accurate the result and the better the network works.

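
As a rough illustration of the four outcomes, the sketch below counts them for one predicted mask and derives IoU and Dice from the counts. The helper names `confusion_counts` and `iou_and_dice` and the toy 4x4 masks are assumptions made for illustration; they are not part of the author's code, whose metric is `my_iou_metric`.

```python
import numpy as np

def confusion_counts(true_mask, pred_mask):
    """Count the four pixel outcomes for binary masks (hypothetical helper)."""
    t = np.asarray(true_mask).astype(bool)
    p = np.asarray(pred_mask).astype(bool)
    tp = int(np.sum(t & p))      # "ship" predicted as "ship"
    tn = int(np.sum(~t & ~p))    # "sea" predicted as "sea"
    fp = int(np.sum(~t & p))     # "sea" predicted as "ship"
    fn = int(np.sum(t & ~p))     # "ship" predicted as "sea"
    return tp, tn, fp, fn

def iou_and_dice(true_mask, pred_mask):
    """IoU and Dice computed from the outcome counts."""
    tp, tn, fp, fn = confusion_counts(true_mask, pred_mask)
    union = tp + fp + fn
    if union == 0:               # both masks empty: treat as a perfect match
        return 1.0, 1.0
    iou = tp / union
    dice = 2.0 * tp / (2.0 * tp + fp + fn)
    return iou, dice

# Toy example: a 2x2 "ship" and a prediction shifted one column to the right
true_mask = np.zeros((4, 4), dtype=np.uint8)
true_mask[1:3, 1:3] = 1
pred_mask = np.zeros((4, 4), dtype=np.uint8)
pred_mask[1:3, 2:4] = 1

print(confusion_counts(true_mask, pred_mask))
print(iou_and_dice(true_mask, pred_mask))
```

Note that the true negatives do not enter IoU or Dice at all, which is why these scores stay informative even when the "sea" covers almost the whole picture.
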
For the study we take the well-studied U-net, an excellent network for image segmentation. It is very common in such competitions, and there are many descriptions of it, of the subtleties of its use, and so on. We chose a variant of the classic U-net; it could of course be modernized with residual blocks and the like, but you cannot embrace the boundless and run every experiment and test at once. U-net performs a very simple operation on pictures: step by step it reduces the spatial size of the picture with some transformations and then tries to restore the mask from the compressed image. That is, in our case the picture is reduced to 32x32, and then we try to restore the mask using data from all the previous compression stages.

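
To make the compression concrete, here is a small arithmetic sketch of the feature-map side length at each stage. It assumes a 256-pixel input, convolutions with `padding="same"`, and four 2x2 max-poolings mirrored by four stride-2 transposed convolutions; the function name `unet_sizes` is hypothetical.

```python
def unet_sizes(input_size=256, n_pools=4, pool=2):
    """Side length of the feature maps at each encoder stage and in the middle block."""
    sizes = [input_size]
    for _ in range(n_pools):
        # "same" convolutions keep the size; each pooling divides it by `pool`
        sizes.append(sizes[-1] // pool)
    return sizes

encoder = unet_sizes()
# each stride-2 transposed convolution doubles the side again on the way up
decoder = encoder[-2::-1]
print("down:", encoder)
print("up:  ", decoder)
```

Under these assumptions the deepest encoder stage that feeds a skip connection works at 32x32, and the middle block after the last pooling works at 16x16.
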
The picture shows the U-net diagram from the original article. We have altered the network slightly, but the essence remains the same: compress the picture, then expand it into the mask.

The U-net itself:

```python
def build_model(input_layer, start_neurons):
    # Assumption: the channel multipliers (1, 2, 4, 8, 16), the 3x3 kernels, the 2x2
    # pools/strides and the first dropout rate (0.25) were garbled in the source and
    # are restored here following the common Kaggle U-net that this code matches.

    # Encoder
    conv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(input_layer)
    conv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(conv1)
    pool1 = MaxPooling2D((2, 2))(conv1)
    pool1 = Dropout(0.25)(pool1)

    conv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(pool1)
    conv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(conv2)
    pool2 = MaxPooling2D((2, 2))(conv2)
    pool2 = Dropout(0.5)(pool2)

    conv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(pool2)
    conv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(conv3)
    pool3 = MaxPooling2D((2, 2))(conv3)
    pool3 = Dropout(0.5)(pool3)

    conv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(pool3)
    conv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(conv4)
    pool4 = MaxPooling2D((2, 2))(conv4)
    pool4 = Dropout(0.5)(pool4)

    # Middle
    convm = Conv2D(start_neurons * 16, (3, 3), activation="relu", padding="same")(pool4)
    convm = Conv2D(start_neurons * 16, (3, 3), activation="relu", padding="same")(convm)

    # Decoder: upsample, concatenate with the matching encoder stage, convolve
    deconv4 = Conv2DTranspose(start_neurons * 8, (3, 3), strides=(2, 2), padding="same")(convm)
    uconv4 = concatenate([deconv4, conv4])
    uconv4 = Dropout(0.5)(uconv4)
    uconv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(uconv4)
    uconv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(uconv4)

    deconv3 = Conv2DTranspose(start_neurons * 4, (3, 3), strides=(2, 2), padding="same")(uconv4)
    uconv3 = concatenate([deconv3, conv3])
    uconv3 = Dropout(0.5)(uconv3)
    uconv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(uconv3)
    uconv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(uconv3)

    deconv2 = Conv2DTranspose(start_neurons * 2, (3, 3), strides=(2, 2), padding="same")(uconv3)
    uconv2 = concatenate([deconv2, conv2])
    uconv2 = Dropout(0.5)(uconv2)
    uconv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(uconv2)
    uconv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(uconv2)

    deconv1 = Conv2DTranspose(start_neurons * 1, (3, 3), strides=(2, 2), padding="same")(uconv2)
    uconv1 = concatenate([deconv1, conv1])
    uconv1 = Dropout(0.5)(uconv1)
    uconv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(uconv1)
    uconv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(uconv1)
    # the result is bound to `uncov1`, so this dropout never reaches the output layer
    uncov1 = Dropout(0.5)(uconv1)

    output_layer = Conv2D(1, (1, 1), padding="same", activation="sigmoid")(uconv1)
    return output_layer
```

## The first experiment: the easiest

The first variant of our experiment was deliberately chosen to be very simple, for clarity: the sea is lighter, the vessels are darker. Everything is simple and obvious, and we hypothesize that the network will find the ships/ellipses without problems and with any desired accuracy. The next_pair function generates a picture/mask pair in which the location, size, and rotation angle are chosen at random. All further changes will be made to this function: changing the coloring, shape, noise, and so on. For now we take the easiest variant and check the hypothesis of dark ships on a light background.

```python
def next_pair():
    p = np.random.sample() - 0.5  # not used yet

    # r, c - coordinates of the center of the ellipse
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    # large and small radii of the ellipse
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360  # tilt of the ellipse

    # get all the points of the ellipse
    rr, cc = ellipse(r, c, r_radius, c_radius,
                     rotation=np.deg2rad(rot), shape=img_l.shape)

    # paint the pixels of the sea/background with noise from 0.5 to 1.0
    img = img_h.copy()
    # paint the pixels of the ellipse with noise from 0.0 to 0.5
    img[rr, cc] = img_l[rr, cc]

    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.  # paint the pixels of the ellipse mask

    return img, msk
```

We generate the whole training set and see what comes out. It looks quite like ships at sea, with nothing superfluous. Everything is clearly visible and understandable. The placement is random, and there is only one ellipse in each picture.

```python
for k in range(train_num):  # generate all the training pictures
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

# look at the first 10 pictures with their masks
fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())
```

There is no doubt that the network will learn successfully and find the ellipses. Still, let's check our hypothesis that the network learns to find ellipses/ships, and does so with high accuracy.

```python
input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])

# save the initial weights so that every experiment starts from the same point
model.save_weights('./keras.weights')

while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)  # 7372 training / 820 validation samples
    # the stopping threshold is lost in the source ("???"); substitute your own value
    if history.history['my_iou_metric'][0] > ???:
        break
```

    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 55s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 53s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???

The network successfully finds the ellipses. But this does not prove that it looks for ellipses in the human sense, as a region bounded by the ellipse equation and filled with content that differs from the background; there is no certainty that some network weights will resemble the coefficients of the quadratic ellipse equation. It is also obvious that the ellipse is darker than the background, so there is no secret or riddle here; let us assume that we have simply checked the code. Let's fix this obvious flaw and make the background and ellipse colors random as well.

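
For reference, "an ellipse in the human sense" can be written down directly: a pixel belongs to a rotated ellipse when a quadratic form of its offsets from the center is at most 1. This is a minimal NumPy sketch; the function name `ellipse_mask` and its angle convention are assumptions for illustration and need not match `skimage.draw.ellipse` pixel for pixel.

```python
import numpy as np

def ellipse_mask(size, r0, c0, r_radius, c_radius, rot_rad):
    """Boolean mask of pixels (r, c) inside a rotated ellipse (hypothetical helper)."""
    rr, cc = np.mgrid[0:size, 0:size]
    dr, dc = rr - r0, cc - c0
    # rotate the offsets into the ellipse's own axes
    u = dr * np.cos(rot_rad) + dc * np.sin(rot_rad)
    v = -dr * np.sin(rot_rad) + dc * np.cos(rot_rad)
    # quadratic ellipse equation: (u / a)^2 + (v / b)^2 <= 1
    return (u / r_radius) ** 2 + (v / c_radius) ** 2 <= 1.0

mask = ellipse_mask(256, 128, 128, 30, 10, np.deg2rad(40))
print(int(mask.sum()), "pixels, analytic area is about", round(np.pi * 30 * 10))
```

Nothing in the training setup asks the network to recover these coefficients; it only has to reproduce the mask, which is exactly the point the text makes.
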
## The second option

Now we take the same ellipses on the same sea, but the color of the sea, and accordingly of the ship, is chosen at random. If the sea is darker, the ship is lighter, and vice versa. That is, the brightness of a group of points no longer tells us whether they lie outside the ellipse (the sea) or inside it. Again we test our hypothesis that the network will find ellipses regardless of color.

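
To see why random polarity matters, consider a baseline that is not a network at all: mark every pixel darker than 0.5 as "ship". The sketch below uses a hypothetical `toy_pair` generator with a square instead of an ellipse; it suggests that this rule is exact when ships are always dark and produces the inverted mask when the colors are swapped. All names here are illustrative assumptions, not the author's code.

```python
import numpy as np

rng = np.random.default_rng(0)
SIZE = 64
img_l = rng.random((SIZE, SIZE, 3)) * 0.5          # dark noise, [0.0, 0.5)
img_h = rng.random((SIZE, SIZE, 3)) * 0.5 + 0.5    # light noise, [0.5, 1.0)

def toy_pair(dark_ship):
    """Square 'ship' on a noise 'sea'; dark_ship picks the polarity (hypothetical)."""
    msk = np.zeros((SIZE, SIZE), dtype=bool)
    msk[20:40, 24:44] = True
    if dark_ship:
        img = np.where(msk[..., None], img_l, img_h)
    else:
        img = np.where(msk[..., None], img_h, img_l)
    return img, msk

def threshold_rule(img):
    # "ship" = every pixel whose mean brightness is below 0.5
    return img.mean(axis=-1) < 0.5

def iou(a, b):
    return np.sum(a & b) / np.sum(a | b)

img, msk = toy_pair(dark_ship=True)
iou_dark = iou(threshold_rule(img), msk)     # first variant: ships are always dark
img, msk = toy_pair(dark_ship=False)
iou_light = iou(threshold_rule(img), msk)    # swapped colors
print(iou_dark, iou_light)
```

A fixed brightness threshold therefore explains the first experiment completely, but not the second one.
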
```python
def next_pair():
    p = np.random.sample() - 0.5  # choice of the background/ellipse coloring

    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360

    rr, cc = ellipse(r, c, r_radius, c_radius,
                     rotation=np.deg2rad(rot), shape=img_l.shape)

    if p > 0:  # the background is darker than the ellipse
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
    else:      # the background is lighter than the ellipse
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]

    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.

    return img, msk
```

Now a pixel and its surroundings no longer tell us whether it is background or ellipse. We again generate the pictures and masks and look at the first 10.

Build the pictures and masks:

```python
for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())
```

```python
input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])

# start from the same initial weights as in the first experiment
model.load_weights('./keras.weights', by_name=False)

while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)  # 7372 training / 820 validation samples
    # the stopping threshold is lost in the source ("???"); substitute your own value
    if history.history['my_iou_metric'][0] > ???:
        break
```

    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 56s 8ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 55s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???

The network copes easily and finds all the ellipses. But even here the setup has a flaw, and an obvious one: the smaller of the two regions in the picture is the ellipse, and the other is the background. Perhaps this hypothesis is wrong, but let's correct it anyway and add one more polygon of the same color as the ellipse to the picture.

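
The suspected shortcut can be spelled out as a rule without any learning: split the pixels at brightness 0.5 and call the smaller group "ship". The sketch below uses a hypothetical toy picture with a rectangular ship and suggests that this rule works for either polarity as long as the ship is the only object. Names such as `minority_rule` and `toy_picture` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
SIZE = 64

def minority_rule(img):
    """Mark the smaller of the two brightness classes as 'ship' (hypothetical baseline)."""
    dark = img.mean(axis=-1) < 0.5
    return dark if dark.sum() < (~dark).sum() else ~dark

def toy_picture(light_ship):
    """One rectangular 'ship' on a noise 'sea' with the requested polarity."""
    msk = np.zeros((SIZE, SIZE), dtype=bool)
    msk[10:25, 30:50] = True
    low = rng.random((SIZE, SIZE, 3)) * 0.5    # dark noise, [0.0, 0.5)
    high = low + 0.5                           # light noise, [0.5, 1.0)
    ship, sea = (high, low) if light_ship else (low, high)
    return np.where(msk[..., None], ship, sea), msk

results = []
for light_ship in (True, False):
    img, msk = toy_picture(light_ship)
    results.append(bool(np.array_equal(minority_rule(img), msk)))
print(results)
```

A second object of the ship's color breaks this rule, because the minority class then contains both shapes.
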
## The third option

In each picture we randomly choose the color of the sea from the two options and add an ellipse and a rectangle, both colored differently from the sea. We get the same "sea" and the same kind of "ship", but in the same picture we add a rectangle of the same color as the "ship", also with a randomly chosen size. Now our assumption is harder to satisfy: the picture contains two identically colored objects, but we hypothesize that the network will still learn to choose the right one.

The function that draws ellipses and rectangles:

```python
def next_pair():
    # choose the parameters of the ellipse as before
    p = np.random.sample() - 0.5
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360

    rr, cc = ellipse(r, c, r_radius, c_radius,
                     rotation=np.deg2rad(rot), shape=img_l.shape)

    # choose the parameters of the rectangle (the interference): corner and side lengths
    p1 = np.rint(np.random.sample() * (w_size - 2 * radius_max) + radius_max)
    p2 = np.rint(np.random.sample() * (w_size - 2 * radius_max) + radius_max)
    p3 = np.rint(np.random.sample() * (2 * radius_max - radius_min) + radius_min)
    p4 = np.rint(np.random.sample() * (2 * radius_max - radius_min) + radius_min)

    # set the four corners (the first one is repeated to close the outline)
    poly = np.array((
        (p1, p2),
        (p1, p2 + p4),
        (p1 + p3, p2 + p4),
        (p1 + p3, p2),
        (p1, p2),
    ))
    rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    # make sure that the rectangle does not intersect the ellipse
    # and move it to the side if necessary
    in_sc = list(set(rr) & set(rr_p))
    if len(in_sc) > 0:
        if np.mean(rr_p) > np.mean(in_sc):
            poly += np.max(in_sc) - np.min(in_sc)
        else:
            poly -= np.max(in_sc) - np.min(in_sc)
        rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    if p > 0:
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
        img[rr_p, cc_p] = img_h[rr_p, cc_p]
    else:
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
        img[rr_p, cc_p] = img_l[rr_p, cc_p]

    # only the ellipse goes into the mask
    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.

    return img, msk
```

As before, we generate the pictures and masks and look at the first 10 pairs.

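
The generator keeps the two shapes apart by comparing row indices only, which treats shapes that merely share some rows as colliding. If a stricter, pixel-level test were wanted, it might look like the sketch below; `shapes_overlap` is a hypothetical helper and is not used in the author's code.

```python
import numpy as np

def shapes_overlap(rr_a, cc_a, rr_b, cc_b, size):
    """True if two pixel sets share at least one pixel (hypothetical helper)."""
    a = np.zeros((size, size), dtype=bool)
    b = np.zeros((size, size), dtype=bool)
    a[rr_a, cc_a] = True
    b[rr_b, cc_b] = True
    return bool(np.any(a & b))

def square(r0, c0, side=10):
    """Row/column index arrays of a filled square, shaped like skimage.draw output."""
    rr, cc = np.mgrid[r0:r0 + side, c0:c0 + side]
    return rr.ravel(), cc.ravel()

rr1, cc1 = square(10, 10)
rr2, cc2 = square(15, 40)   # shares rows 15..19 with the first square, but no pixels
rr3, cc3 = square(15, 15)   # really overlaps the first square

rows_shared = len(set(rr1) & set(rr2)) > 0
far_apart = shapes_overlap(rr1, cc1, rr2, cc2, 64)
touching = shapes_overlap(rr1, cc1, rr3, cc3, 64)
print(rows_shared, far_apart, touching)
```

The row-based test is more conservative than the pixel-based one: it may move a rectangle that did not actually touch the ellipse, which does no harm to the experiment.
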
Build the pictures and masks with ellipses and rectangles:

```python
for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())
```

```python
input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])

# start from the same initial weights as in the first experiment
model.load_weights('./keras.weights', by_name=False)

while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)  # 7372 training / 820 validation samples
    # the stopping threshold is lost in the source ("???"); substitute your own value
    if history.history['my_iou_metric'][0] > ???:
        break
```

    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 57s 8ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 55s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???

We did not manage to confuse the network with rectangles, and our hypothesis is confirmed. In the Airbus competition, judging by the examples and discussions, there were both single vessels and several vessels standing close together. The network tells an ellipse from a rectangle, that is, a vessel from a house on the shore, even though the polygons are the same color as the ellipses. So color is not what matters, because the ellipse and the rectangle are colored in the same random way.

## The fourth option

Perhaps the network singles out the rectangles; let's fix that by distorting them. That is, perhaps the network easily finds both closed regions regardless of shape and discards the one that is a rectangle. This is the author's hypothesis, so let's check it: instead of rectangles we add quadrilateral polygons of arbitrary shape. Again our hypothesis is that the network will tell an ellipse from an arbitrary quadrilateral of the same coloring.

You can of course go inside the network, look at its layers, and analyze the meaning of the weights and biases. The author is interested in the resulting behavior of the network, so the judgment will be based on the result of its work, although it is always interesting to look inside.

Change the picture generation:

```python
def next_pair():
    p = np.random.sample() - 0.5
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360

    rr, cc = ellipse(r, c, r_radius, c_radius,
                     rotation=np.deg2rad(rot), shape=img_l.shape)

    # p0 - base side of the quadrilateral, (p1, p2) - its first corner,
    # p3..p8 - random offsets of the other corners in [-radius_min, radius_min]
    p0 = np.rint(np.random.sample() * (radius_max - radius_min) + radius_min)
    p1 = np.rint(np.random.sample() * (w_size - radius_max))
    p2 = np.rint(np.random.sample() * (w_size - radius_max))
    p3 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p4 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p5 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p6 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p7 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p8 = np.rint(np.random.sample() * 2. * radius_min - radius_min)

    poly = np.array((
        (p1, p2),
        (p1 + p3, p2 + p4 + p0),
        (p1 + p5 + p0, p2 + p6 + p0),
        (p1 + p7 + p0, p2 + p8),
        (p1, p2),
    ))
    rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    # move the polygon aside if it shares rows with the ellipse
    in_sc = list(set(rr) & set(rr_p))
    if len(in_sc) > 0:
        if np.mean(rr_p) > np.mean(in_sc):
            poly += np.max(in_sc) - np.min(in_sc)
        else:
            poly -= np.max(in_sc) - np.min(in_sc)
        rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    if p > 0:
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
        img[rr_p, cc_p] = img_h[rr_p, cc_p]
    else:
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
        img[rr_p, cc_p] = img_l[rr_p, cc_p]

    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.

    return img, msk
```

We generate the pictures and masks and look at the first 10 pairs.

Build the pictures and masks with ellipses and polygons:

```python
for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())
```

We start our network. As a reminder, it is the same for all variants.

```python
input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])

# start from the same initial weights as in the first experiment
model.load_weights('./keras.weights', by_name=False)

while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)  # 7372 training / 820 validation samples
    # the stopping threshold is lost in the source ("???"); substitute your own value
    if history.history['my_iou_metric'][0] > ???:
        break
```

    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 56s 8ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 53s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 53s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???

The hypothesis is confirmed: polygons and ellipses are easily told apart. The attentive reader will remark here that of course they differ, what a silly question, any normal AI can tell a second-order curve from a first-order line. That is, the network easily detects the presence of a boundary in the form of a second-order curve. We will not argue; we replace the oval with a heptagon and check.

## The fifth experiment: the most difficult

There are no curves, only the straight edges of regular heptagons, tilted and rotated, and of arbitrary quadrilateral polygons. We change the picture/mask generator function so that it produces only projections of regular heptagons and arbitrary quadrilateral polygons of the same color.

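
One way to obtain such a tilted and rotated heptagon is to take the vertices of a regular n-gon, squeeze one axis, and apply a rotation. The NumPy sketch below is an illustration under assumed names (`projected_ngon`, `squeeze`); it is not the author's generator.

```python
import numpy as np

def projected_ngon(c_x, c_y, radius, n=7, squeeze=1.0, angle_rad=0.0):
    """Vertices of a regular n-gon, squeezed along x and rotated about its center."""
    k = np.arange(n)
    # regular n-gon centered at the origin
    x = radius * np.sin(2.0 * np.pi * k / n)
    y = radius * np.cos(2.0 * np.pi * k / n)
    # squeezing along x imitates looking at the polygon from an angle
    x = x / squeeze
    # rotate both coordinates from the same, unrotated inputs
    x_rot = x * np.cos(angle_rad) - y * np.sin(angle_rad)
    y_rot = x * np.sin(angle_rad) + y * np.cos(angle_rad)
    return np.column_stack((x_rot + c_x, y_rot + c_y))

verts = projected_ngon(128.0, 128.0, radius=20.0, n=7, squeeze=1.3,
                       angle_rad=np.deg2rad(25))
print(verts.shape)
print(np.round(verts, 1))
```

The resulting array of (x, y) pairs can be passed column by column to a polygon rasterizer; changing `n` gives the octagons, pentagons, or hexagons suggested later in the text.
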
The final version of the picture generation function:

```python
def next_pair(_n=7):
    p = np.random.sample() - 0.5
    # c_x, c_y - coordinates of the center
    c_x = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c_y = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    radius = np.random.sample() * (radius_max - radius_min) + radius_min
    d = np.random.sample() * 0.5 + 1   # compression factor, from 1.0 to 1.5
    a_deg = np.random.sample() * 360
    a_rad = np.deg2rad(a_deg)

    poly = []  # build the points of the heptagon
    for k in range(_n):
        # offsets of a vertex of the regular heptagon from its center,
        # with the first coordinate compressed by the factor d
        dx = radius * math.sin(2. * k * math.pi / _n) / d
        dy = radius * math.cos(2. * k * math.pi / _n)
        # rotate by a random angle, using the unrotated offsets for both coordinates
        poly.append(dx * math.cos(a_rad) - dy * math.sin(a_rad) + c_x)
        poly.append(dx * math.sin(a_rad) + dy * math.cos(a_rad) + c_y)

    poly = np.rint(poly).reshape(-1, 2)
    rr, cc = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    # the arbitrary quadrilateral, as in the fourth option
    p0 = np.rint(np.random.sample() * (radius_max - radius_min) + radius_min)
    p1 = np.rint(np.random.sample() * (w_size - radius_max))
    p2 = np.rint(np.random.sample() * (w_size - radius_max))
    p3 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p4 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p5 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p6 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p7 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p8 = np.rint(np.random.sample() * 2. * radius_min - radius_min)

    poly = np.array((
        (p1, p2),
        (p1 + p3, p2 + p4 + p0),
        (p1 + p5 + p0, p2 + p6 + p0),
        (p1 + p7 + p0, p2 + p8),
        (p1, p2),
    ))
    rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    # move the quadrilateral aside if it shares rows with the heptagon
    in_sc = list(set(rr) & set(rr_p))
    if len(in_sc) > 0:
        if np.mean(rr_p) > np.mean(in_sc):
            poly += np.max(in_sc) - np.min(in_sc)
        else:
            poly -= np.max(in_sc) - np.min(in_sc)
        rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)

    if p > 0:
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
        img[rr_p, cc_p] = img_h[rr_p, cc_p]
    else:
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
        img[rr_p, cc_p] = img_l[rr_p, cc_p]

    # only the heptagon goes into the mask
    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.

    return img, msk
```

As before, we build the arrays and look at the first 10.

Build the pictures and masks:

```python
for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())
```

```python
input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
# note: this run uses dice_loss alone rather than bce_dice_loss
model.compile(loss=dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])

# start from the same initial weights as in the first experiment
model.load_weights('./keras.weights', by_name=False)

while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)  # 7372 training / 820 validation samples
    # the stopping threshold is lost in the source ("???"); substitute your own value
    if history.history['my_iou_metric'][0] > ???:
        break
```

    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 54s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 52s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 52s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 52s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 52s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 53s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???
    Train on 7372 samples, validate on 820 samples
    Epoch 1/1
    7372/7372 [==============================] - 52s 7ms/step - loss: ??? - my_iou_metric: ??? - val_loss: ??? - val_my_iou_metric: ???

## Results

As we can see, the network distinguishes projections of regular heptagons from arbitrary quadrilateral polygons with an accuracy of ??? on the test set. Training was stopped at an arbitrary value of ???, and with more training the accuracy would most likely be much better. If we start from the thesis that the network finds primitives and that their combinations define an object, then in our case there are two regions whose mean differs from the background, and no primitives in the human sense. There are no lines, no monochrome lines, and consequently no corners, only regions with very similar boundaries. Even if lines could be constructed, both objects in the picture are built from identical primitives.

A question for connoisseurs: what feature does the network use to tell "ships" from "interference"? Evidently it is neither the color nor the shape of the ships' boundaries. We could of course keep studying this abstract "sea"/"ships" construction; we are not an Academy of Sciences and can do research purely out of curiosity. We could replace heptagons with octagons, or fill the picture with regular pentagons and hexagons, and see whether the network tells them apart. I leave this to the readers, although I have also become curious whether the network can count the corners of a polygon; as a test, one could place in the picture not regular polygons but their random projections.

There are other, no less interesting properties of such ships. Experiments like these are useful because we ourselves set all the probabilistic characteristics of the set under study, so unexpected behavior of well-studied networks adds knowledge and is of benefit.

The background is random, the color is random, the position of the ship/ellipse is random. There are no lines in the pictures; there are regions with different characteristics, but no monochrome lines! Of course there are simplifications here, and the task can be made harder, for example by choosing color ranges such as 0.0 to 0.9 and 0.1 to 1.0, but for the network it makes no difference. The network can find, and does find, patterns different from those a person clearly sees.

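
The harder coloring mentioned here can be sketched by making the two noise ranges overlap, so that a single pixel no longer reveals its class and only the region statistics differ. The array names mirror `img_l` and `img_h` from the setup code, and the ranges are the example values from the text; whether the network copes with them is the author's claim, not something this sketch verifies.

```python
import numpy as np

w_size = 256
rng = np.random.default_rng(0)

# overlapping ranges: "dark" noise in [0.0, 0.9), "light" noise in [0.1, 1.0)
img_l = rng.random((w_size, w_size, 3)) * 0.9
img_h = rng.random((w_size, w_size, 3)) * 0.9 + 0.1

# fraction of "light" values that are darker than the average "dark" value
overlap = float(np.mean(img_h < img_l.mean()))
print(round(float(img_l.mean()), 2), round(float(img_h.mean()), 2), round(overlap, 2))
```

Swapping these two arrays into the setup code leaves every `next_pair` variant unchanged, since the generators only index into `img_l` and `img_h`.
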
If any reader is interested, you can continue the research and keep tinkering with networks. If something does not work or is unclear, or if a new and good idea suddenly appears and amazes you with its beauty, you can always share it with us or ask the masters (and grandmasters too) for qualified help in the ODS community.