Examples of calculating the "availability factor" for sets of network equipment

I described the theory and the main points of the methodology for calculating the "availability factor" in an earlier article.
 
 
In this publication, we calculate the "availability factor" of two sets of carrier-class network equipment, each installed in a telecommunications cabinet, and compare the results with the calculation for a set of equipment without duplicated elements.
 
 
Why calculate the "availability factor" for different equipment layouts?
 
 
The final results of an "availability factor" calculation may be incorrect: too ideal, overstated, or understated. Whether there is an error or everything is calculated correctly can be understood only when all the elements of the system, their use, and their placement options can be seen together.
 
 
An example of an "ideal" calculation of the "availability factor".
 
 
The main components of set No. 1 of network equipment:
 
- Cisco ASR 9010 - 2 pieces;
 
- Cisco ASR 9000v - 2 pieces;
 
- power distribution board «48V» ЩРЗ-10-2К - 2 pieces.
 
 
Configuration of the Cisco ASR 9010 equipment:
 

 
 
The layout of the cabinet with set No. 1 installed looks like this:
 

 
 
Calculation of the availability factor for the equipment of set No. 1:
 

 
 
(*) - the initial MTBF data are estimates, provided by the manufacturer for the equipment items or for their analogs.
 
The Cisco ASR 9000 Series Routers are designed to have a high Mean Time Between Failures (MTBF) and low Mean Time To Resolve (MTTR) rates, thus providing a reliable platform that minimizes outages or downtime and maximizes availability. The MTBF is calculated based on the Ground Benign condition. The values may be adjusted based on the different router usage.
 
 
Final calculation data for set No. 1:

- probability of system failure during the year: ???;

- system MTBF (years): 1246 (10918609 hours);

- mean time to repair (hours): 24;

- availability factor of the system equipment (%): ???;

- average downtime per year (hours): ??? (??? minutes).
 
 
What is wrong with this calculation?
 
 
To calculate the availability factor, you need to understand how and where the equipment is installed, what its functionality is, whether its elements can be hot-swapped and are duplicated, and how complex it is to mount and replace components without shutting down the main systems of the complex.
 
 
In an ideal calculation, all the elements are duplicated (which is rarely the case in practice), spare parts are assumed to be at hand, and we are assumed to be able to work on equipment located next to live equipment without problems.
 
 
And if the physical layout differs from the logical scheme of the system, individual parts of the system can no longer duplicate each other.
 
 
In the "ideal" case, we have a set composed of two halves that duplicate each other. But if there is no such logical duplication, we move away from the "ideal" calculation toward a more correct one and get a plausible result.
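The gap between the "ideal" and the non-duplicated case can be illustrated with the usual series/parallel combination rules. This is only a sketch, not the calculation used in this article: it assumes independent failures, and the element figures (MTBF of 100,000 hours, three elements per half) are hypothetical values chosen for illustration.

```python
from math import prod

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one element (assumed formula)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def series(avails):
    """All elements are required: the chain works only if every element works."""
    return prod(avails)

def parallel(avails):
    """Duplicated elements: the group fails only if every element fails."""
    return 1.0 - prod(1.0 - a for a in avails)

# Hypothetical element: MTBF 100,000 hours, repair time 24 hours.
element = availability(100_000, 24)

# One half of the set modeled as three such elements in a chain.
half = series([element, element, element])

ideal = parallel([half, half])  # two halves that duplicate each other
real = series([half, half])     # no logical duplication: both halves are required

print(f"one half:           {half:.6f}")
print(f"ideal (duplicated): {ideal:.9f}")
print(f"real (no backup):   {real:.6f}")
```

With duplication the unavailability of a half is squared, while without it the unavailabilities roughly add up, which is why the two results differ by orders of magnitude.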
 
 
And let's be realistic and add 60 minutes per year to the calculation for the "RestartShutdown procedure". This should be enough time to boot the new chassis, configure it, and bring it into normal operation, counting from the moment the power switch on the housing is pressed. For 60 minutes of downtime, the probability of failure for a year is ???. This will be the bottom line in the calculations below.
 
 
An example of a "real" calculation of the "availability factor".
 
 
Calculation of the availability factor for the equipment of set No. 1 without duplication:
 

 
 
Final calculation data for set No. 1 without duplication:

- probability of system failure during the year: ?5001666;

- system MTBF (years): ??? (17514 hours);

- mean time to repair (hours): 24;

- availability factor of the system equipment (%): ???;

- average downtime per year (hours): ??? (719 minutes).
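The two figures that survive in this list (17514 hours and 719 minutes) are consistent with the common steady-state formula A = MTBF / (MTBF + MTTR) and a year of 8760 hours. The sketch below assumes that formula. The function names are illustrative, and the article's own method may differ in detail.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the share of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_minutes_per_year(mtbf_hours, mttr_hours):
    """Expected downtime per year, in minutes."""
    unavailability = 1.0 - availability(mtbf_hours, mttr_hours)
    return unavailability * HOURS_PER_YEAR * 60

# Figures from the list above: system MTBF 17514 hours, repair time 24 hours.
minutes = downtime_minutes_per_year(17514, 24)
print(f"{minutes:.0f} minutes per year")  # prints "719 minutes per year"
```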
 
 
The difference between the two calculation examples above is huge. This must always be kept in mind and analyzed.
 
 
Even in the best case, when the system has duplicated elements, we should not count on using them as replacements if they contain other components. That is, we have two chassis and two power distribution boards. These components are duplicated, but they contain other elements that can stop functioning when the "parent" component fails.
 
 
For the chassis this is essential; for the power distribution board it is less of a problem, because it contains only simple electronics for testing and for displaying the current load. Even if that electronics board fails, the power distribution board will keep functioning normally.
 
 
An example of a "standard" calculation of the "availability factor".
 
 
The main components of set No. 2 of network equipment:
 
- Cisco ASR 9006 - 2 pieces;
 
- Cisco ASR 9000v - 2 pieces;
 
- power distribution board «48V» ЩРЗ-48-5 - 2 pieces.
 
 
Configuration of the Cisco ASR 9006 equipment:

 
 
The layout of the cabinet with set No. 2 installed looks like this:
 

 
 
Calculation of the availability factor for the equipment of set No. 2, taking into account that the chassis and power distribution boards are not duplicated:
 

 
 
Final calculation data for set No. 2:

- probability of system failure during the year: ???;

- system MTBF (years): 4.7 (40410 hours);

- mean time to repair (hours): 24;

- availability factor of the system equipment (%): ???;

- average downtime per year (hours): 5.2 (311 minutes).
 
 
It turns out that, when calculating the availability factor, you need to understand which is the largest element in the system that can still be replaced within 24 hours, and how much replacing it will affect the functioning of the remaining components.
 
 
For example, when replacing the chassis, we have to remove the entire set of boards and adapters from it, and this can take more than 2-3 hours. And removing elements while other equipment is running nearby in the rack carries a big risk of causing an additional incident.
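To get a feel for how much a longer replacement costs, the sketch below varies the repair time for the system MTBF of set No. 2 (40410 hours). It assumes the steady-state formula A = MTBF / (MTBF + MTTR) and assumes that the extra dismantling hours are simply added to the 24-hour repair time. Both assumptions are illustrative and are not taken from the calculation tables.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year

def annual_downtime_minutes(mtbf_hours, mttr_hours):
    """Expected downtime per year in minutes, assuming A = MTBF / (MTBF + MTTR)."""
    unavailability = mttr_hours / (mtbf_hours + mttr_hours)
    return unavailability * HOURS_PER_YEAR * 60

SYSTEM_MTBF_HOURS = 40410  # set No. 2, from the final calculation data
BASE_MTTR_HOURS = 24

# Hypothetical extra hours spent moving boards and adapters to a new chassis.
for extra_hours in (0, 2, 3, 6):
    mttr = BASE_MTTR_HOURS + extra_hours
    minutes = annual_downtime_minutes(SYSTEM_MTBF_HOURS, mttr)
    print(f"repair time {mttr} h -> about {minutes:.0f} minutes of downtime per year")
```

Since the repair time is tiny compared to the MTBF, the annual downtime grows almost linearly with it, so every extra hour of dismantling adds roughly the same amount of downtime.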
 
 
The ideal option is two cabinets with equipment, each with 2 chassis: one working and one empty, ready for quick activation by transferring the elements from the failed one. But this is too ideal a situation.