How to understand Tensorflow and not die, but even teach a machine something
3r33434. Hi, Habrazhiteli. Today's post will be about how not to get lost in the wilds of the variety of options for using TensorFlow for machine learning and achieve your goal. The article is designed to ensure that the reader knows the basics of the principles of machine learning, but has not yet tried to do it with his own hands. As a result, we get a working demo on Android, which recognizes something with fairly high accuracy. But first things first. 3r33464. 3rr3465. 3r33490. 3r33434. 3r33464. 3r311.
3rr3465. 3r33490. 3r33434. After seeing the latest materials - it was decided zayuzat 3r316. Tensorflow
which is now gaining high momentum, and the articles in English and Russian seem to be enough not to dig into it all and be able to figure out what's what. 3r33464. 3rr3465. 3r33490. 3r33434. After spending two weeks, studying the articles and numerous ekzamply at the office. site, I realized that I did not understand. TOO much information and options on how Tensorflow can be used. My head is already plump on how much they offer different solutions and what to do with them in relation to my task. 3r33464. 3rr3465. 3r33490. 3r33434. 3r33464. 3rr3465. 3r33490. 3r33434. Then I decided to try everything, from the simplest and almost ready-made options (in which I was required to register a dependency in the gradle and add a couple of lines of code) to more complex ones (in which I would have to create and train graph models myself and learn how to use them in mobile application). 3r33464. 3rr3465. 3r33490. 3r33434. In the end, I had to use a complicated version, which will be described in more detail below. In the meantime, I have compiled for you a list of simpler options that are no less effective, just everyone is suited to their goal. 3r33464. 3rr3465. 3r33490. 3r3338. 1. ML KIT
3rr3465. 3r33490. 3r33434. 3r33464. 3rr3465. 3r33490. 3r33434. The easiest solution to use is to use a couple of lines of code:
3rr3465. 3r33490. 3r3306. 3r33490. 3r33479. Text recognition (text, Latin characters)
3r33490. 3r33479. Face detection (faces, emotions)
3r33490. 3r33479. Barcode scanning (barcode, qr-code)
3r33490. 3r33479. Image labeling (limited number of object types in the image)
3r33490. 3r33479. Landmark recognition (sights)
3r33490. 3r33375. 3rr3465. 3r33490. 3r33434. With this solution, it is also slightly more difficult to use your own TensorFlow Lite model, but converting to this format has caused difficulties, so this item has not been tried. 3r33464. 3rr3465. 3r33490. 3r33434. As the creators of this creation write, a large part of the tasks can be solved using these developments. But if this does not apply to your task - you will have to use custom models. 3r33464. 3rr3465. 3r33490.
2. Custom Vision
3rr3465. 3r33490. 3r33434. 3r388. 3r33464. 3rr3465. 3r33490. 3r33434. Very handy tool for creating and training your custom models using images. 3rr3465. 3r33490. From the Pros - there is a free version that allows you to keep one project. 3rr3465. 3r33490. Of the Minuses - the free version limits the number of "incoming" images in 3000 pcs. To try and make an average network of accuracy - that's enough. For more accurate tasks, you need more. 3rr3465. 3r33490. All that is required from the user is to add images with a mark (for example, image1 is "racoon", image2 - "sun"), train and export the graph for further use. 3r33464. 3rr3465. 3r33490. 3r33434. 3r33464. 3rr3465. 3r33490. 3r33434. Caring Microsoft even offers its own sample , with which you can try out your resulting graph. 3rr3465. 3r33490. For those who are already "in the subject" - the graph is generated already in the Frozen state, i.e. with him additionally do not need to convert anything. 3rr3465. 3r33490. This solution is good when you have a large sample and (attention) a LOT of different classes when learning. Since otherwise, there will be many false definitions in practice. For example, you have trained on raccoons and suns, and if there is a person at the entrance, then he can with equal probability be defined by such a system as this or that. Although in fact - Nothing else. 3r33464. 3rr3465. 3r33490. 3r3117. 3. Creating a model manually
3rr3465. 3r33490. 3r33434. 3r3122. 3r33464. 3rr3465. 3r33490. 3r33434. When you need to fine-tune the model yourself for image recognition, more complex manipulations with the input sample of images come into play. 3rr3465. 3r33490. For example, we do not want to have restrictions on the size of the input sample (as in the preceding paragraph), or we want to train the model more precisely by adjusting the number of epoch and other learning parameters ourselves. 3rr3465. 3r33490. In this approach, there are several examples from Tensorflow that describe the order of actions and the final result. 3rr3465. 3r33490. Here are a few such examples:
3rr3465. 3r33490. 3r3306. 3r33490. 3r33479. Cool code lab Tensorflow for Poets . 3rr3465. 3r33490.
3r33490. 3r33375. 3rr3465. 3r33490. 3rr3465. 3r33490. 3r33434. It provides an example of how to create a color type categorizer based on an open ImageNet image base — prepare images, and then train the model. Also mentioned a bit is how you can work with a rather interesting tool - TensorBoard. Of its simplest functions, it clearly demonstrates the structure of your finished model, as well as the learning process in many ways. 3r33464. 3rr3465. 3r33490. 3r3306. 3r33490. 3r33479. 3r33434. Codelab Tensorflow for Poets 2 - continued work with the color classifier. Shows how if there are graph files and its labels (which were obtained in the previous Kodlab), you can run the application on android. One of the points of the Kodlab is the conversion from the "usual" graph format ".pb" into the Tensorflow lite format (which provides for some file optimizations to reduce the total size of the graph file, because mobile devices are demanding for this). 3r33464. 3rr3465. 3r33490.
3r33490. 3r33479. 3r33434. Recognition of handwritten characters MNIST . 3rr3465. 3r33490. 3r33464. 3rr3465. 3r33490. 3r3174.
3r33490. 3r33375. 3rr3465. 3r33490. 3rr3465. 3r33490. 3r33434. The turnip contains the original model (which is already prepared for this task), instructions on how to train it, convert it, and how to launch the project for Android at the end to check how it all works 3r36464. 3rr3465. 3r33490. 3r33434. Based on these examples, you can figure out how to work with custom models in Tensorflow and try to make your own, or take one of the already pre-trained models that are assembled on a githaba:
3r33490. 3r3189. Models from Tensorflow
3r33464. 3rr3465. 3r33490. 3r33434. Speaking of pre-trained models. Interesting nuances when using them:
3rr3465. 3r33490. 3r3306. 3r33490. 3r33479. Their structure is already prepared for a specific task 3r3482. 3r33490. 3r33479. They have already been trained on large sample sizes 3–3–3465. 3r33490. Therefore, if your sample is not sufficiently full, you can take a pre-trained model that is close in scope to your task. Using such a model, adding your own learning rules, you will get a better result, rather than trying to train a model from scratch.
3r33490. 3r33375. 3rr3465. 3r33490.
4. Object Detection API + creating a model manually 3r33232. 3rr3465. 3r33490. 3r33434. However, all previous points did not give the desired result. From the very beginning it was difficult to understand what needs to be done and with the help of which approach. Then a cool article was found on Object Detection API , which tells how you can find several categories in one image, as well as several copies of the same category. In the process of working on this sample, the original articles and video tutorials on recognizing custom objects turned out to be more convenient (links will be at the end). 3r33464. 3rr3465. 3r33490. 3r33434. But the work could not have been completed without 3r33222. articles about recognizing Pikachu
- because there was indicated a very important nuance, which for some reason is not mentioned anywhere in one guide or example. And without it, all the work done would be in vain. 3r33464. 3rr3465. 3r33490. 3r33434. So, now at last about what still had to do and what happened at the output. 3r33464. 3rr3465. 3r33490.
3r33490. 3r33479. First of all - flour installation Tensorflow. Who will not be able to install it, or use the standard creation scripts, model training - just have patience and google. Almost every problem has already been written to issues on a githheb or on stackoverflow. 3rr3465. 3r33490.
3rr3465. 3r33490. According to the instructions for object recognition, we need to prepare an input sample before training the model. These articles describe in detail how to do this with the help of a handy tool - labelImg. The only difficulty here is to do a very long and scrupulous work on identifying the boundaries of the objects we need. In this case, stamps on the images of documents. 3rr3465. 3r33490. 3rr3465. 3r33490. 3rr3465. 3r33490. With the help of ready-made scripts, the next step is to export the data from step 2 first to csv files, then to TFRecords - the input data format Tensorflow. There should be no difficulties. 3rr3465. 3r33490. The choice of a pre-trained model, based on which we will pre-train the graph, as well as the training itself. Here there can appear the most huge number of unknown errors, the cause of which is unidentified (or crookedly set) packages necessary for work. But you will succeed, do not despair, the result is worth it. 3rr3465. 3r33490. 3rr3465. 3r33490. 3rr3465. 3r33490. Export the resulting file after learning to the 'pb' format. Simply select the last 'ckpt' file and export it. 3rr3465. 3r33490. Run an example of work on Android. 3rr3465. 3r33490. We download the official sample for object recognition from the Tensorflow github - TF Detect . We put in there our model and the file with the labels. But. Nothing will work. 3rr3465. 3r33490. 3rr3465. 3r33490. 3rr3465. 3r33490. 3rr3465. 3r33490. 3r33434. It was here that the biggest plug-in in all the work arose, oddly enough - well, Tensorflow samples didn’t want to work at all. Everything fell. Only the mighty Pikachu with his article managed to help bring everything to work. 3rr3465. 3r33490. In the labels.txt file, the first line must be the inscription "???", because By default, in Object Detection API, object id numbers do not begin with 0 as usual, but from 1. Due to the fact that the null class is reserved - and you need to specify magic questions. Those. Your tag file will look something like this:
3r33448. ??? 3r33490. stamp
3rr3465. 3r33490. 3r33434. And then - run the sample and see the recognition of objects and the level of confidience with which it was obtained. 3r33464. 3rr3465. 3r33490. 3r33434. 3r33464. 3rr3465. 3r33490. 3r33434. Thus, the result was a simple application that, when you hover the camera, recognizes the borders of the stamp on the document and points them along with the accuracy of recognition. 3rr3465. 3r33490. And if we exclude the time that was spent on searching for the right approach and trying to launch it, then on the whole, the work turned out to be quite fast and not really difficult. You just need to know the nuances before proceeding to work. 3r33464. 3rr3465. 3r33490. 3r33434. Already as an additional section (here you can already close the article if you are tired of the information), I would like to write a couple of life hacks that helped in working with all this. 3r33464. 3rr3465. 3r33490. 3r3306. 3r33490. 3r33479. 3r33434. quite often tensorflow scripts did not work because they were run from the wrong directories. And on different PCs it was different: someone needed to run from the directory "3r33448. Tensroflowmodels /models /research ", And to someone - a level deeper - from "3r33448. Tensroflowmodels /models /research /object -detection "
3r33490. 3r33479. 3r33434. Remember that for each open terminal, you need to re-export the path with the command
3r33448. export PYTHONPATH = /your local path /tensroflowmodels /models /research /slim: $ PYTHONPATH
3r33490. 3r33479. 3r33434. if you are not using your graph and want to find out information about it (for example, "3r3343448. input_node_name ", which is required later in the work), execute two commands from the root folder: 3r3343464. 3rr3465. 3r33490.
3r33448. bazel build tensorflow /tools /graph_transforms: summarize_graph
bazel-bin /tensorflow /tools /graph_transforms /summarize_graph --in_graph = "/your local path /frozen_inference_graph.pb"
3rr3465. 3r33490. 3r33434. where "
/your local path /frozen_inference_graph.pb " is the path to the column about which you want to find out information 3rr3465. 3r33490. 3r33490. 3r33479. 3r33434. To view information about the graph, you can use Tensorboard 3rr3465. 3r33490.
3r33448. python import_pb_to_tensorboard.py --model_dir = output /frozen_inference_graph.pb --log_dir = training3rr3465. 3r33490. 3r33434. where you need to specify the path to the column (3r33448. model_dir ) and the path to the files that were obtained in the learning process (3r33448. log_dir ). Then just open localhost in the browser and watch what interests you. 3r33464. 3rr3465. 3r33490. 3r33490. 3r33375. 3rr3465. 3r33490. 3r33434. And the last part - on working with python scripts in the instructions for the Object Detection API - for you prepared a little cheat sheet below with commands and hints. 3r33464. 3rr3465. 3r33490. 3r33382. 3r33383. Cheat Sheet [/b] 3r33385. 3r33434. Export from labelimg to csv (from the object_detection of the directory) 3rr3465. 3r33490.
3r33448. python xml_to_csv.py3rr3465. 3r33490. 3r33434. Then all the steps that are listed below must be performed from the same Tensorflow folder ("3r33448. Tensroflowmodels /models /research /object-detection " Or one level up - depending on how you do it) - i.e. all images of the input sample, TFRecords and other files before startingt need to copy inside this directory. 3r33464. 3rr3465. 3r33490. 3r33434. Export from csv to tfrecord 3rr3465. 3r33490.
3r33448. python generate_tfrecord.py --csv_input = data /train_labels.csv --output_path = data /train.record3rr3465. 3r33490. 3r33434. * Do not forget to change in the file itself (generate_tfrecord.py) the lines of ‘train’ and ‘test’ in the paths, as well as 3r3343465. 3r33490. The name of the recognizable classes in function
python generate_tfrecord.py --csv_input = data /test_labels.csv --output_path = data /test.record
class_text_to_int(which should be duplicated in
pbtxtfile that you create before learning the graph). 3r33464. 3rr3465. 3r33490. 3r33434. Training 3rr3465. 3r33490.
3r33448. python legacy /train.py —logtostderr --train_dir = training /--pipeline_config_path = training /ssd_mobilenet_v1_coco.config3rr3465. 3r33490. 3r33434. ** Before training, do not forget to check the file "3r33448. Training /object-detection.pbtxt " - all recognized classes and the file "3r33434. Training /ssd_mobilenet_v1_coco.config should be listed " - there you need to change the parameter "3rrr.config " num_classes "On the number of your classes. 3r33464. 3rr3465. 3r33490. 3r33434. Export model in pb 3rr3465. 3r33490.
3r33448. python export_inference_graph.py
--input_type = image_tensor
--pipeline_config_path = training /pipeline.config
--trained_checkpoint_prefix = training /model.ckpt-110
--output_directory = output
3rr3465. 3r33490. 3r33434. Thank you for your interest in this topic! 3r33464. 3rr3465. 3r33490. 3r33434. References
3r33490. 3r33479. 3r33470. The original article on object recognition
3r33490. 3r33479. 3r33475. Cycle video to the article on the recognition of objects in English
3r33490. 3r33479. 3r38080. A set of scripts that were used in the original article
It may be interesting