A small study of the properties of a simple U-net, a classical convolutional network for segmentation

This article is based on an analysis and study of materials from a competition on finding ships at sea.

Let's try to understand how and what the network looks for and what it finds. This article is simply the result of curiosity and idle interest; nothing in it comes from real practice, and there is nothing here to copy and paste into practical tasks. But the result is not quite what you would expect. The Internet is full of descriptions of how networks work, in which the authors beautifully and with pictures explain how networks detect primitives - corners, circles, whiskers, tails, etc. - which are then used for segmentation/classification. Many competitions are won using weights from other large and wide networks. It is interesting to understand and see how and from what primitives a network builds its answer.

Let's do a little research and consider the options - the author's reasoning and code are laid out below, so you can check/extend/change everything yourself.

A kaggle competition on finding ships at sea recently ended. Airbus offered satellite images of the sea for analysis, both with and without vessels. A single 768x768x3 picture is 1,769,472 bytes as uint8 and four times as much as float32 (by the way, float32 is faster than float64 - fewer memory accesses), and ships had to be found in 15606 pictures. As usual, all the top places were taken by people from ODS (ods.ai), which is natural and expected, and I hope we will soon be able to study the reasoning and code of the winners and prize-winners.

We will consider a similar task, but simplify it significantly: we take np.random.sample() * 0.5 as the sea - we do not need waves, wind, shores and other hidden patterns. Let's make the image of the sea really random in the RGB range from 0.0 to 0.5. We will color the ships with the same kind of noise, and to distinguish them from the sea we will place them in the range from 0.5 to 1.0; they will all be of the same shape - ellipses of different sizes and orientations.

Take a very common version of the network (you can take your favorite one instead); we will do all the experiments with it.

Next, we will change the parameters of the images, add interference and build hypotheses - this is how we will isolate the main features by which the network finds the ellipses. Perhaps the reader will draw his own conclusions and refute the author's.

We load the libraries and define the sizes of the image arrays:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import math
from tqdm import tqdm_notebook, tqdm
from skimage.draw import ellipse, polygon
from keras import Model
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from keras.models import load_model
from keras.optimizers import Adam
from keras.layers import Input, Conv2D, Conv2DTranspose, MaxPooling2D, concatenate, Dropout
from keras.losses import binary_crossentropy
import tensorflow as tf
import keras as keras
from keras import backend as K

w_size = 256
train_num = 8192
train_x = np.zeros((train_num, w_size, w_size, 3), dtype='float32')
train_y = np.zeros((train_num, w_size, w_size, 1), dtype='float32')

img_l = np.random.sample((w_size, w_size, 3)) * 0.5        # "sea" noise: 0.0 .. 0.5
img_h = np.random.sample((w_size, w_size, 3)) * 0.5 + 0.5  # "ship" noise: 0.5 .. 1.0

radius_min = 10
radius_max = 30

We define the loss and accuracy functions:

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred = K.cast(y_pred, 'float32')
    y_pred_f = K.cast(K.greater(K.flatten(y_pred), 0.5), 'float32')
    intersection = y_true_f * y_pred_f
    score = 2. * K.sum(intersection) / (K.sum(y_true_f) + K.sum(y_pred_f))
    return score

def dice_loss(y_true, y_pred):
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = y_true_f * y_pred_f
    score = (2. * K.sum(intersection) + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return 1. - score

def bce_dice_loss(y_true, y_pred):
    return binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)

def get_iou_vector(A, B):
    # Numpy version
    batch_size = A.shape[0]
    metric = 0.0
    for batch in range(batch_size):
        t, p = A[batch], B[batch]
        true = np.sum(t)
        pred = np.sum(p)

        # deal with empty mask first
        if true == 0:
            metric += (pred == 0)
            continue

        # non-empty mask case: the union is never empty,
        # hence it is safe to divide by it
        intersection = np.sum(t * p)
        union = true + pred - intersection
        iou = intersection / union

        # the iou metric is a stepwise approximation of the real iou over 0.5
        iou = np.floor(max(0, (iou - 0.45) * 20)) / 10

        metric += iou

    # take the average over all images in the batch
    metric /= batch_size
    return metric

def my_iou_metric(label, pred):
    # Tensorflow version
    return tf.py_func(get_iou_vector, [label, pred > 0.5], tf.float64)

from keras.utils.generic_utils import get_custom_objects

get_custom_objects().update({'bce_dice_loss': bce_dice_loss})
get_custom_objects().update({'dice_loss': dice_loss})
get_custom_objects().update({'dice_coef': dice_coef})
get_custom_objects().update({'my_iou_metric': my_iou_metric})

We use a metric that is classic in image segmentation; there are many articles about it, code with comments and explanatory text, and on the same kaggle there are lots of variants with comments and explanations. We predict a mask pixel by pixel - each pixel is either "sea" or "ship" - and evaluate whether the prediction is true or false. So four outcomes are possible: we correctly predicted that a pixel is a "sea", correctly predicted that a pixel is a "ship", or made a mistake and predicted "sea" instead of "ship" or vice versa. Over all the pictures and all the pixels we count all four outcomes and compute the result - this will be the score of the network. The fewer erroneous predictions and the more true ones, the more accurate the result and the better the network works.

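As a toy illustration (the arrays and counts here are invented for the example, not taken from the competition), this is how the four outcomes and the resulting IoU are counted for one tiny mask pair:

import numpy as np

y_true = np.array([[0, 1, 1],
                   [0, 1, 0],
                   [0, 0, 0]], dtype='float32')  # ground truth: 3 "ship" pixels
y_pred = np.array([[0, 1, 0],
                   [0, 1, 1],
                   [0, 0, 0]], dtype='float32')  # prediction: 3 "ship" pixels

tp = np.sum((y_true == 1) & (y_pred == 1))  # "ship" predicted as "ship": 2
tn = np.sum((y_true == 0) & (y_pred == 0))  # "sea" predicted as "sea": 5
fp = np.sum((y_true == 0) & (y_pred == 1))  # "sea" predicted as "ship": 1
fn = np.sum((y_true == 1) & (y_pred == 0))  # "ship" predicted as "sea": 1

print(tp / (tp + fp + fn))  # intersection over union: 2 / 4 = 0.5
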
For the research we take the well-studied u-net, an excellent network for image segmentation. The network is very common in such competitions, and there are many descriptions of it, subtleties of its application, etc. A variant of the classic U-net was chosen; of course it could be modernized, residual blocks could be added, etc. But "you cannot embrace the unembraceable" and run all the experiments and tests at once. U-net performs a very simple operation on pictures: step by step it reduces the dimension of the picture with some transformations and then tries to restore the mask from the compressed image. That is, in our case the picture is brought down to 16x16, and then we try to restore the mask using the data from all the previous compression steps.

The picture below shows the U-net diagram from the original article; we altered it slightly, but the essence remains the same: compress the picture → expand it into a mask.

[Figure: U-net architecture diagram from the original paper]

Just a U-net:

def build_model(input_layer, start_neurons):
    conv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(input_layer)
    conv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(conv1)
    pool1 = MaxPooling2D((2, 2))(conv1)
    pool1 = Dropout(0.25)(pool1)

    conv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(pool1)
    conv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(conv2)
    pool2 = MaxPooling2D((2, 2))(conv2)
    pool2 = Dropout(0.5)(pool2)

    conv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(pool2)
    conv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(conv3)
    pool3 = MaxPooling2D((2, 2))(conv3)
    pool3 = Dropout(0.5)(pool3)

    conv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(pool3)
    conv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(conv4)
    pool4 = MaxPooling2D((2, 2))(conv4)
    pool4 = Dropout(0.5)(pool4)

    # Middle
    convm = Conv2D(start_neurons * 16, (3, 3), activation="relu", padding="same")(pool4)
    convm = Conv2D(start_neurons * 16, (3, 3), activation="relu", padding="same")(convm)

    deconv4 = Conv2DTranspose(start_neurons * 8, (3, 3), strides=(2, 2), padding="same")(convm)
    uconv4 = concatenate([deconv4, conv4])
    uconv4 = Dropout(0.5)(uconv4)
    uconv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(uconv4)
    uconv4 = Conv2D(start_neurons * 8, (3, 3), activation="relu", padding="same")(uconv4)

    deconv3 = Conv2DTranspose(start_neurons * 4, (3, 3), strides=(2, 2), padding="same")(uconv4)
    uconv3 = concatenate([deconv3, conv3])
    uconv3 = Dropout(0.5)(uconv3)
    uconv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(uconv3)
    uconv3 = Conv2D(start_neurons * 4, (3, 3), activation="relu", padding="same")(uconv3)

    deconv2 = Conv2DTranspose(start_neurons * 2, (3, 3), strides=(2, 2), padding="same")(uconv3)
    uconv2 = concatenate([deconv2, conv2])
    uconv2 = Dropout(0.5)(uconv2)
    uconv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(uconv2)
    uconv2 = Conv2D(start_neurons * 2, (3, 3), activation="relu", padding="same")(uconv2)

    deconv1 = Conv2DTranspose(start_neurons * 1, (3, 3), strides=(2, 2), padding="same")(uconv2)
    uconv1 = concatenate([deconv1, conv1])
    uconv1 = Dropout(0.5)(uconv1)
    uconv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(uconv1)
    uconv1 = Conv2D(start_neurons * 1, (3, 3), activation="relu", padding="same")(uconv1)
    uconv1 = Dropout(0.5)(uconv1)

    output_layer = Conv2D(1, (1, 1), padding="same", activation="sigmoid")(uconv1)
    return output_layer

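To see the compress → expand path in numbers before training, one can simply assemble the model and print its summary; a minimal sketch, using nothing beyond the code above:

input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
Model(input_layer, output_layer).summary()
# the spatial size halves at every MaxPooling2D: 256 -> 128 -> 64 -> 32 -> 16,
# so the bottleneck convolutions work on 16x16 maps,
# and every Conv2DTranspose doubles the size back up to 256
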
The first experiment. The simplest

The first version of the experiment is chosen deliberately simple, for clarity: the sea is lighter, the vessels are darker. Everything is simple and obvious, and we hypothesize that the network will find the ships/ellipses without problems and with any required accuracy. The next_pair function generates a picture/mask pair in which the location, size and rotation angle are chosen randomly. All further changes will be made to this function - changing the coloring, the shape, the noise, etc. But for now the simplest option: we check the hypothesis about dark ships on a light background.

def next_pair():
    p = np.random.sample() - 0.5  # not used yet
    # r, c - coordinates of the center of the ellipse
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    # major and minor radii of the ellipse
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360  # rotation angle of the ellipse
    rr, cc = ellipse(
        r, c,
        r_radius, c_radius,
        rotation=np.deg2rad(rot),
        shape=img_l.shape
    )  # get all the points of the ellipse

    # the sea/background pixels carry noise from 0.5 to 1.0
    img = img_h.copy()
    # paint the ellipse pixels with noise from 0.0 to 0.5
    img[rr, cc] = img_l[rr, cc]

    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.  # mark the ellipse pixels in the mask
    return img, msk

We generate the whole training set and look at what we got. It looks quite like ships at sea and nothing more. Everything is clearly visible, crisp and understandable. The placement is random, and there is only one ellipse per picture.

for k in range(train_num):  # generate the whole train set
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))  # look at the first 10 with masks
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())

[Figure: the first 10 generated images and their masks]

There is no doubt that the network will train successfully and will find the ellipses. But let's check the hypothesis that the network learns to find ellipses/ships, and does so with high accuracy.

input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])
model.save_weights('./keras.weights')
while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)
    if history.history['my_iou_metric'][0] > 0.98:  # stop once the metric exceeds the chosen threshold
        break

Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 55s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 53s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …

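Besides the metric itself, it is worth looking at the masks the trained network actually outputs; a quick sketch (the 0.5 threshold is the same one used inside my_iou_metric):

pred = model.predict(train_x[:10])  # predicted probability masks for the first 10 pictures
fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_y[k].squeeze())                         # ground truth mask
    axes[1, k].set_axis_off()
    axes[1, k].imshow((pred[k].squeeze() > 0.5).astype('float32'))  # binarized prediction
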
The network successfully finds the ellipses. But it is not at all proven that it looks for ellipses in the human sense - a region bounded by the ellipse equation and filled with content different from the background - and there is no certainty that the network weights resemble the coefficients of the quadratic equation of an ellipse. And since the brightness of the ellipse is obviously lower than the brightness of the background, there is no secret or riddle here - essentially we have only checked that the code works. Let's fix this obvious flaw and make the colors of the background and the ellipse random as well.

The second option

Now the same ellipses on the same sea, but the color of the sea, and accordingly of the ship, is chosen randomly. If the sea is darker, the ship is lighter, and vice versa. That is, from the brightness of a group of points it is impossible to tell whether they lie outside the ellipse (the sea) or inside it. Again we test the hypothesis that the network will find the ellipses regardless of coloring.

def next_pair():
    p = np.random.sample() - 0.5  # the choice of the background/ellipse coloring
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360
    rr, cc = ellipse(
        r, c,
        r_radius, c_radius,
        rotation=np.deg2rad(rot),
        shape=img_l.shape
    )
    if p > 0:  # a darker background with a lighter ellipse
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
    else:      # a lighter background with a darker ellipse
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.
    return img, msk

Now it is impossible to tell background from ellipse by a pixel and its neighborhood. As before, we generate the pictures and masks and look at the first 10 on the screen.

We build the pictures and masks:

for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())

[Figure: the first 10 generated images and their masks]

input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])
model.load_weights('./keras.weights', by_name=False)
while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)
    if history.history['my_iou_metric'][0] > 0.98:
        break

Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 56s 8ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 55s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …

The network copes easily and finds all the ellipses. But even here the setup has a flaw, and an obvious one: the smaller of the two areas in the picture is always the ellipse, and the rest is background. Perhaps that is a wrong hypothesis, but it should be ruled out anyway, so let's add to the image another polygon of the same color as the ellipse.

The third option

In each picture we randomly choose the color of the sea from the two options and add an ellipse and a rectangle, both differing from the color of the sea. We get the same "sea" and the same painted "ship", but in the same picture we add a rectangle of the same color as the "ship", also with a randomly chosen size. Now our assumption is harder to satisfy: the picture contains two identically colored objects, but we hypothesize that the network will still learn to pick the right one.

The drawing program for the ellipses and rectangles:

def next_pair():
    # choose the parameters of the ellipse as before
    p = np.random.sample() - 0.5
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360
    rr, cc = ellipse(
        r, c,
        r_radius, c_radius,
        rotation=np.deg2rad(rot),
        shape=img_l.shape
    )
    # choose the rectangle/clutter parameters: position and side lengths
    p1 = np.rint(np.random.sample() * (w_size - 2 * radius_max) + radius_max)
    p2 = np.rint(np.random.sample() * (w_size - 2 * radius_max) + radius_max)
    p3 = np.rint(np.random.sample() * (2 * radius_max - radius_min) + radius_min)
    p4 = np.rint(np.random.sample() * (2 * radius_max - radius_min) + radius_min)
    # set the four corners of the rectangle
    poly = np.array((
        (p1, p2),
        (p1, p2 + p4),
        (p1 + p3, p2 + p4),
        (p1 + p3, p2),
        (p1, p2),
    ))
    rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    # make sure the rectangle does not intersect the ellipse,
    # and move it aside if necessary
    in_sc = list(set(rr) & set(rr_p))
    if len(in_sc) > 0:
        if np.mean(rr_p) > np.mean(in_sc):
            poly += np.max(in_sc) - np.min(in_sc)
        else:
            poly -= np.max(in_sc) - np.min(in_sc)
        rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    if p > 0:
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
        img[rr_p, cc_p] = img_h[rr_p, cc_p]
    else:
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
        img[rr_p, cc_p] = img_l[rr_p, cc_p]
    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.
    return img, msk

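The collision handling above is deliberately crude - it compares only row indices - so here is a small sanity check, not part of the original pipeline, which relies only on the fact that object and background pixels lie in disjoint brightness ranges:

suspicious = 0
for _ in range(100):
    img, msk = next_pair()
    dark_bg = np.median(img) < 0.5  # the background occupies most of the picture
    # object pixels: all three channels on the opposite side of 0.5 from the background
    obj = (img >= 0.5).all(axis=-1) if dark_bg else (img < 0.5).all(axis=-1)
    m = msk.squeeze() > 0
    assert obj[m].all()                      # the mask lies strictly inside object pixels
    suspicious += int(obj.sum() <= m.sum())  # no unmasked object pixels would mean the rectangle vanished
print('suspicious pairs:', suspicious)
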
Just as before, we generate the pictures and masks and look at the first 10 pairs.

We build the picture/mask pairs with ellipses and rectangles:

for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())

[Figure: the first 10 generated images and their masks]

input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])
model.load_weights('./keras.weights', by_name=False)
while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)
    if history.history['my_iou_metric'][0] > 0.98:
        break

Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 57s 8ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 55s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …

It was not possible to confuse the network with rectangles, and our hypothesis is confirmed. In the Airbus competition, judging by the examples and discussions, there were both single vessels and several vessels quite close to each other. The network distinguishes an ellipse from a rectangle - i.e. a vessel from a house on the shore - even though the rectangles are colored exactly like the ellipses. The point is not color, since both the ellipse and the rectangle are colored equally randomly.

The fourth option

Perhaps the network keys on the rectangles being regular. Fine, let's distort them. That is, perhaps the network easily finds both closed regions regardless of shape and simply discards the one that is a rectangle. This is the author's hypothesis, so let's check it: we will add not rectangles but quadrilateral polygons of arbitrary shape. And again the hypothesis is that the network will distinguish the ellipse from an arbitrary quadrilateral of the same coloring.

One could of course climb inside the network, look at the layers and analyze the meaning of the weights and biases. But the author is interested in the resulting behavior of the network, so the judgment will be based on the result of its work, although looking inside is always interesting.

We make changes to the image generation:

def next_pair():
    p = np.random.sample() - 0.5
    r = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    r_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    c_radius = np.random.sample() * (radius_max - radius_min) + radius_min
    rot = np.random.sample() * 360
    rr, cc = ellipse(
        r, c,
        r_radius, c_radius,
        rotation=np.deg2rad(rot),
        shape=img_l.shape
    )
    # an arbitrary quadrilateral instead of the rectangle:
    # p0 sets its scale, p1, p2 its position, p3..p8 jitter the corners
    p0 = np.rint(np.random.sample() * (radius_max - radius_min) + radius_min)
    p1 = np.rint(np.random.sample() * (w_size - radius_max))
    p2 = np.rint(np.random.sample() * (w_size - radius_max))
    p3 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p4 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p5 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p6 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p7 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p8 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    poly = np.array((
        (p1, p2),
        (p1 + p3, p2 + p4 + p0),
        (p1 + p5 + p0, p2 + p6 + p0),
        (p1 + p7 + p0, p2 + p8),
        (p1, p2),
    ))
    rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    in_sc = list(set(rr) & set(rr_p))
    if len(in_sc) > 0:
        if np.mean(rr_p) > np.mean(in_sc):
            poly += np.max(in_sc) - np.min(in_sc)
        else:
            poly -= np.max(in_sc) - np.min(in_sc)
        rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    if p > 0:
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
        img[rr_p, cc_p] = img_h[rr_p, cc_p]
    else:
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
        img[rr_p, cc_p] = img_l[rr_p, cc_p]
    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.
    return img, msk

We generate the pictures and masks and look at the first 10 pairs.

We build the picture/mask pairs with ellipses and polygons:

for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())

[Figure: the first 10 generated images and their masks]

We start our network. Let me remind you that it is the same in all the experiments.

input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=bce_dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])
model.load_weights('./keras.weights', by_name=False)
while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)
    if history.history['my_iou_metric'][0] > 0.98:
        break

Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 56s 8ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 53s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 53s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …

The hypothesis is confirmed: polygons and ellipses are easily distinguished. An attentive reader will object here - of course they are distinguishable, it is a silly question, any normal AI can tell a second-order curve from a first-order line. That is, the network easily detects the presence of a boundary in the form of a second-order curve. We will not argue; let's replace the oval with a heptagon and check.

The fifth experiment, the hardest

No more curves - only the straight edges of regular heptagons, tilted and rotated, and of arbitrary quadrilaterals. Let's make the final change to the image/mask generator: only projections of regular heptagons and arbitrary quadrilaterals of the same color.

The final edit of the picture generation function:

def next_pair(_n=7):
    p = np.random.sample() - 0.5
    # the center, radius and orientation of the heptagon
    c_x = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    c_y = np.random.sample() * (w_size - 2 * radius_max) + radius_max
    radius = np.random.sample() * (radius_max - radius_min) + radius_min
    d = np.random.sample() * 0.5 + 1
    a_deg = np.random.sample() * 360
    a_rad = np.deg2rad(a_deg)
    poly = []  # build the vertices of the heptagon
    for k in range(_n):
        # start from a vertex of the regular heptagon,
        # c_x, c_y are the coordinates of the center
        poly.append(c_x + radius * math.sin(2. * k * math.pi / _n))
        poly.append(c_y + radius * math.cos(2. * k * math.pi / _n))
        # squeeze the heptagon along one axis by the random factor d
        poly[-2] = (poly[-2] - c_x) / d + c_x
        # rotate the vertex around the center by a random angle
        x, y = poly[-2] - c_x, poly[-1] - c_y
        poly[-2] = x * math.cos(a_rad) - y * math.sin(a_rad) + c_x
        poly[-1] = x * math.sin(a_rad) + y * math.cos(a_rad) + c_y
    poly = np.rint(poly).reshape(-1, 2)
    rr, cc = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    # the quadrilateral clutter, exactly as in the previous experiment
    p0 = np.rint(np.random.sample() * (radius_max - radius_min) + radius_min)
    p1 = np.rint(np.random.sample() * (w_size - radius_max))
    p2 = np.rint(np.random.sample() * (w_size - radius_max))
    p3 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p4 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p5 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p6 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p7 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    p8 = np.rint(np.random.sample() * 2. * radius_min - radius_min)
    poly = np.array((
        (p1, p2),
        (p1 + p3, p2 + p4 + p0),
        (p1 + p5 + p0, p2 + p6 + p0),
        (p1 + p7 + p0, p2 + p8),
        (p1, p2),
    ))
    rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    in_sc = list(set(rr) & set(rr_p))
    if len(in_sc) > 0:
        if np.mean(rr_p) > np.mean(in_sc):
            poly += np.max(in_sc) - np.min(in_sc)
        else:
            poly -= np.max(in_sc) - np.min(in_sc)
        rr_p, cc_p = polygon(poly[:, 0], poly[:, 1], img_l.shape)
    if p > 0:
        img = img_l.copy()
        img[rr, cc] = img_h[rr, cc]
        img[rr_p, cc_p] = img_h[rr_p, cc_p]
    else:
        img = img_h.copy()
        img[rr, cc] = img_l[rr, cc]
        img[rr_p, cc_p] = img_l[rr_p, cc_p]
    msk = np.zeros((w_size, w_size, 1), dtype='float32')
    msk[rr, cc] = 1.
    return img, msk

Just as before, we build the arrays and look at the first 10 pairs.

We build the pictures and masks:

for k in range(train_num):
    img, msk = next_pair()
    train_x[k] = img
    train_y[k] = msk

fig, axes = plt.subplots(2, 10, figsize=(20, 5))
for k in range(10):
    axes[0, k].set_axis_off()
    axes[0, k].imshow(train_x[k])
    axes[1, k].set_axis_off()
    axes[1, k].imshow(train_y[k].squeeze())

[Figure: the first 10 generated images and their masks]

input_layer = Input((w_size, w_size, 3))
output_layer = build_model(input_layer, 16)
model = Model(input_layer, output_layer)
model.compile(loss=dice_loss, optimizer=Adam(lr=1e-3), metrics=[my_iou_metric])
model.load_weights('./keras.weights', by_name=False)
while True:
    history = model.fit(train_x, train_y,
                        batch_size=32,
                        epochs=1,
                        verbose=1,
                        validation_split=0.1)
    if history.history['my_iou_metric'][0] > 0.98:
        break

Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 54s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 52s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 52s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 52s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 52s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 53s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …
Train on 7372 samples, validate on 820 samples
Epoch 1/1
7372/7372 [==============================] - 52s 7ms/step - loss: … - my_iou_metric: … - val_loss: … - val_my_iou_metric: …

Results

As we can see, the network reliably distinguishes the projections of regular heptagons from arbitrary quadrilaterals on the validation set. Training was stopped when the metric exceeded an arbitrarily chosen threshold, and most likely the accuracy could be pushed much further. If we start from the thesis that a network detects primitives and their combinations define an object, then in our case there are simply two areas whose mean brightness differs from the background, and there are no primitives in the human sense. There are no lines, no monochrome edges, and accordingly no corners - only regions with very similar boundaries. And even if one insists on drawing lines, both objects in the picture are built from identical primitives.

A question for connoisseurs: what does the network take as the feature that distinguishes the "ships" from the "clutter"? Obviously it is neither the color nor the shape of the boundary. One can of course continue studying this abstract "sea"/"ships" construction; we are not an Academy of Sciences and may do research purely out of curiosity. We could change the heptagons to octagons, or fill the picture with regular pentagons and hexagons, and see whether the network tells them apart. I leave this to the readers, although I too became curious whether the network can count the corners of a polygon - and, as a test, one could place in the picture not regular polygons but their random projections.

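Since the final version of next_pair already takes the number of vertices as the _n parameter, this experiment is a one-line change; a sketch for octagons (everything else stays the same):

for k in range(train_num):
    img, msk = next_pair(_n=8)  # regular octagons instead of heptagons
    train_x[k] = img
    train_y[k] = msk
# then retrain the same model and compare the metrics
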
There are other, no less interesting properties of such ships, and such experiments are useful because we ourselves set all the probabilistic characteristics of the set under study, so any unexpected behavior of a well-studied network adds knowledge and benefit.

The background is random, the color is random, the position of the ship/ellipse is random. There are no lines in the pictures; there are areas with different characteristics, but there are no monochrome lines! Of course there are simplifications here, and the task can be complicated - for example, by choosing the color ranges 0.0-0.9 and 0.1-1.0 - but for the network it makes no difference. The network can find, and does find, patterns different from those that a person clearly sees and finds.

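The complication just mentioned - overlapping brightness ranges - is a two-line change to the palettes defined at the very beginning; a sketch assuming the 0.0-0.9 and 0.1-1.0 ranges from the text:

img_l = np.random.sample((w_size, w_size, 3)) * 0.9        # one tone: 0.0 .. 0.9
img_h = np.random.sample((w_size, w_size, 3)) * 0.9 + 0.1  # the other tone: 0.1 .. 1.0
# the ranges now overlap, so no single pixel value gives away which class it belongs to
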
If any reader is interested, you can continue the research and tinker with the networks yourself; and if something does not work or is unclear, or a new good idea suddenly appears and amazes you with its beauty, you can always share it with us or ask the masters (and grandmasters too) for qualified help in the ODS community.