This article presents an analytical comparison between constituency parsing and dependency parsing, two types of parsing used in natural language processing (NLP). The study introduces an algorithm to enhance keyword extraction, employing the noun phrase extraction feature of the parser to filter out unsuitable phrases. The algorithm is implemented using three different parsers: spaCy, AllenNLP, and Stanza. Its effectiveness was compared with two popular methods (YAKE, RAKE) on a dataset of English texts. Experimental results show that the proposed algorithm with the spaCy parser is superior to the other keyword extraction algorithms in terms of accuracy and speed. With the AllenNLP and Stanza parsers, our algorithm is also more accurate but requires a much longer execution time. The results obtained allow us to evaluate in more detail the advantages and disadvantages of the parsers studied in this work, as well as to determine directions for further research. The running time of the spaCy parser is significantly lower than that of the other two parsers because spaCy uses a transition-based parser, which applies a deterministic or machine-learned set of actions to build the dependency tree step by step. Transition-based parsers are typically faster and require less memory than graph-based parsers, making them more efficient for parsing large amounts of text. On the other hand, AllenNLP and Stanza use graph-based parsing models that rely on millions of features, which limits their ability to generalize and slows down analysis compared to transition-based parsers. Achieving a balance between the accuracy and speed of a linguistic parser remains an open topic that requires further research, given the importance of this problem for improving the efficiency of text analysis, especially in applications that require real-time processing. To this end, the authors plan to conduct further research into possible solutions for achieving this balance.
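As a minimal sketch of the phrase-filtering idea: the abstract does not list the exact filtering criteria, so the stopword rule, length bounds, and frequency scoring below are illustrative assumptions; in practice the candidate phrases would come from the parser, e.g. spaCy's `doc.noun_chunks`.

```python
from collections import Counter

# Hypothetical filter rules (not from the article): drop candidates that are
# too short/long or that start with a determiner-like stopword.
STOPWORDS = {"the", "a", "an", "this", "that", "these", "those", "its", "our"}

def filter_phrase(phrase: str, min_words: int = 1, max_words: int = 4) -> bool:
    """Return True if a candidate noun phrase is suitable as a keyword."""
    words = phrase.lower().split()
    if not (min_words <= len(words) <= max_words):
        return False
    return words[0] not in STOPWORDS

def extract_keywords(noun_chunks, top_k: int = 5):
    """Score the filtered noun phrases by frequency.

    `noun_chunks` would come from a parser, e.g. with spaCy:
        [chunk.text for chunk in nlp(text).noun_chunks]
    """
    counts = Counter(p.lower() for p in noun_chunks if filter_phrase(p))
    return [p for p, _ in counts.most_common(top_k)]

chunks = ["the parser", "keyword extraction", "keyword extraction",
          "a dataset", "English texts", "keyword extraction"]
print(extract_keywords(chunks, top_k=2))  # ['keyword extraction', 'english texts']
```

The parser-dependent part (producing the noun chunks) is the only piece that changes when swapping spaCy for AllenNLP or Stanza, which is what makes the per-parser speed comparison possible.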
Hydrocephalus is a central nervous system disorder that most commonly affects infants and toddlers. It starts as an abnormal build-up of cerebrospinal fluid (CSF) in the ventricular system of the brain. Hence, early diagnosis becomes vital; it may be performed by Computed Tomography (CT), one of the most effective diagnostic methods for hydrocephalus, in which the enlarged ventricular system becomes apparent. However, most disease progression assessments rely on the radiologist's evaluation and physical measurements, which are subjective, time-consuming, and inaccurate. This paper develops an automatic prediction method utilizing the H-detect framework for more accurate hydrocephalus prediction. A pre-processing step normalizes the input image and removes unwanted noise, which helps extract valuable features easily. Feature extraction is done by segmenting the image based on edge detection using triangular fuzzy rules. Thereby, exact information on the nature of CSF inside the brain is highlighted. The segmented images are saved and then given to the CatBoost algorithm, whose categorical feature processing allows for quicker training. When necessary, the overfitting detector stops model training, and the model thus efficiently predicts hydrocephalus. The outcomes demonstrate that the new H-detect strategy outperforms traditional approaches.
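A minimal sketch of edge detection with a triangular fuzzy rule: the abstract does not specify the membership parameters, so the triangle endpoints and the decision threshold below are illustrative assumptions; the gradient magnitude of the image is mapped to a fuzzy "edge" membership and thresholded.

```python
import numpy as np

def triangular_membership(x, a, b, c):
    """Triangular fuzzy membership function: 0 at a and c, peak 1 at b."""
    x = np.asarray(x, dtype=float)
    left = np.clip((x - a) / (b - a), 0.0, 1.0)
    right = np.clip((c - x) / (c - b), 0.0, 1.0)
    return np.minimum(left, right)

def fuzzy_edges(img, low=0.2, peak=1.0, high=1.8, threshold=0.5):
    """Label pixels as edges when the fuzzy 'edge' membership of the
    normalized gradient magnitude exceeds a threshold.

    The triangle (low, peak, high) and the threshold are illustrative,
    not the parameters used by the H-detect framework."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    mag /= mag.max() + 1e-12          # normalize gradient magnitude to [0, 1]
    mu_edge = triangular_membership(mag, low, peak, high)
    return mu_edge >= threshold

# A step image: the boundary between the dark and bright halves is flagged.
img = np.zeros((6, 6))
img[:, 3:] = 1.0
edges = fuzzy_edges(img)
```

The resulting binary edge map would then be used to delineate the ventricular CSF region before the segmented images are passed to the classifier.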
Automatic syntactic analysis of a sentence is an important computational linguistics task. At present, there are no syntactic structure (constituency) parsers for Russian that are publicly available and suitable for practical applications. Building such parsers from the ground up requires a treebank annotated according to a given formal grammar, which is quite a cumbersome task. However, since several syntactic dependency parsers for Russian do exist, it seems reasonable to employ dependency parsing results for constituency analysis. The article introduces an algorithm that constructs the constituency tree of a Russian sentence from its syntactic dependency tree. The formal grammar used by the algorithm is based on D.E. Rosenthal's classic reference. The algorithm was evaluated on 300 Russian-language sentences: 200 of them were selected from the aforementioned reference, and 100 from OpenCorpora, an open corpus of sentences extracted from Russian news and periodicals. During the evaluation, the sentences were passed to the syntactic dependency parsers from the Stanza, spaCy, and Natasha packages, and the resulting dependency trees were then processed by the proposed algorithm. The obtained constituency trees were compared with trees manually annotated by experts in linguistics. The best performance was achieved using the Stanza parser: the constituency parsing F1-score was 0.85, and the sentence part tagging accuracy was 0.93, which would be sufficient for many practical applications, such as event extraction, information retrieval, and sentiment analysis.
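The core idea of recovering constituents from a dependency tree can be sketched as follows. This is not the article's grammar-driven algorithm: it is a simplified illustration that uses only the fact that, in a projective dependency tree, the descendants of each head form a contiguous span, which can serve as a candidate constituent.

```python
def constituent_spans(words, heads):
    """For each word that heads at least one dependent, return the contiguous
    span of its descendants as a candidate constituent.

    words: list of tokens; heads[i] is the 1-based index of the head of word
    i+1, with 0 marking the root. Projective trees only."""
    n = len(words)
    children = {i: [] for i in range(n + 1)}
    for i, h in enumerate(heads, start=1):
        children[h].append(i)

    def span(i):
        # Smallest and largest token index in the subtree rooted at i.
        lo = hi = i
        for c in children[i]:
            clo, chi = span(c)
            lo, hi = min(lo, clo), max(hi, chi)
        return lo, hi

    result = []
    for i in range(1, n + 1):
        if children[i]:
            lo, hi = span(i)
            result.append(" ".join(words[lo - 1:hi]))
    return result

words = ["the", "cat", "sat", "on", "the", "mat"]
heads = [2, 3, 0, 3, 6, 4]  # "sat" is the root
print(constituent_spans(words, heads))
# ['the cat', 'the cat sat on the mat', 'on the mat', 'the mat']
```

The actual algorithm additionally labels and restructures these spans according to the formal grammar derived from Rosenthal's reference.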
Many Digital Signal Processing (DSP) applications and electronic gadgets today require digital filtering. Different optimization algorithms have been used to obtain fast and improved results. Several researchers have used the Enhanced Slime Mould Algorithm (ESMA) for designing the 2D IIR filter. However, it is observed that ESMA did not achieve a good solution structure and had a slow convergence rate. To overcome this issue, a fused ESMA-Pelican Optimization Algorithm (FEPOA) is utilized for designing the 2D IIR filter, which incorporates the Pelican Optimization Algorithm into ESMA. First, a chaotic approach is utilized to initialize the population, which provides a high-quality population with excellent diversity; the positions of the population members are then checked, and individuals outside the boundary of the search region are corrected. Next, the pelican tactical approach examines the search space and strengthens the exploration power of FEPOA; the fitness is then calculated, and the best solution is updated and carried forward through the iterations. The FEPOA phases are repeated until execution completes. The best solution found is the optimal one, which enhances the convergence speed, convergence accuracy, and overall performance of FEPOA. FEPOA is then applied to the IIR filter to improve the overall filter design. The results provided by FEPOA achieve the necessary fitness and best solution within 200 iterations, the amplitude response reaches its maximum value for parameter values 2, 4, and 8, and the execution time of 3.0158 s is much quicker than that of the genetic algorithms often used for 2D IIR filters.
This article addresses the automation of the stage of combining wells into clusters, considered as part of the process of designing oil field development. Solving the well clustering problem means determining the best locations of well pads and the distribution of wells into clusters such that the costs of developing and maintaining an oil field are minimized and the expected flow rate is maximized. One currently used approach to this problem is the use of optimization algorithms. At the same time, the task requires taking technological constraints into account when searching for the optimal oil field development option; these constraints are justified, among other things, by the regulations in force in the industry, namely the minimum and maximum allowable number of wells in a pad and the minimum allowable distance between two well pads. Optimization algorithms do not always guarantee an optimal result in which all specified constraints are met. Within the framework of this study, an algorithm is proposed that refines the resulting design solutions in order to eliminate the constraints violated at the optimization stage. The algorithm successively solves the following problems: violation of the restrictions on too few or too many wells in a pad; a discrepancy between the resulting number of pads and the specified one; and violation of the restriction on pads placed too close together. To study the effectiveness of the developed approach, a computational experiment was conducted on three generated synthetic oil fields with different geometries. As part of the experiment, the quality of the optimization method alone and of the proposed algorithm, which is a supplement to the optimization method, were compared. The comparison was carried out for different values of the optimization budget, defined as the maximum number of objective function evaluations.
The quality of the compared approaches is evaluated by the penalty value, which indicates the degree to which the main constraints are violated. The efficiency criteria in this work are the average value, the standard deviation, the median, and the minimum and maximum values of the penalty. Owing to the proposed algorithm, the penalty value for the first and third oil fields is reduced on average to 0.04 and 0.03, respectively, and for the second oil field the algorithm made it possible to obtain design solutions without any constraint violations. Based on the results of the study, a conclusion was drawn regarding the effectiveness of the developed approach in solving the oil field development problem.
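A minimal sketch of such a constraint penalty: the abstract does not give the exact penalty formula, so the weighting and normalization below are illustrative assumptions; each pad contributes the size of its wells-per-pad violation, and each too-close pad pair contributes its normalized distance shortfall.

```python
from math import hypot

def penalty(pads, min_wells=2, max_wells=8, min_dist=100.0, w=1.0):
    """Sum of normalized constraint violations for a pad layout.

    pads: list of (x, y, n_wells) tuples. The weight w, the bounds, and the
    normalization are illustrative, not the values from the paper."""
    p = 0.0
    # Wells-per-pad bounds: penalize the shortfall or the excess.
    for _x, _y, n in pads:
        if n < min_wells:
            p += w * (min_wells - n)
        elif n > max_wells:
            p += w * (n - max_wells)
    # Minimum pairwise distance between pads.
    for i in range(len(pads)):
        for j in range(i + 1, len(pads)):
            d = hypot(pads[i][0] - pads[j][0], pads[i][1] - pads[j][1])
            if d < min_dist:
                p += w * (min_dist - d) / min_dist
    return p

# One pad with too few wells, two pads 50 m apart (closer than 100 m allowed).
print(penalty([(0.0, 0.0, 1), (50.0, 0.0, 5)]))  # 1.5
```

A zero penalty then corresponds to a design solution with no violated restrictions, as achieved for the second oil field.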
Machine learning and digital signal processing methods are used in various industries, including the analysis and classification of seismic signals from surface sources. The developed wave type analysis algorithm makes it possible to automatically identify and, accordingly, separate incoming seismic waves based on their characteristics. To distinguish the wave types, a seismic measuring complex is used that determines the characteristics of the boundary waves of surface sources using special molecular electronic sensors of angular and linear oscillations. The results of the algorithm for processing data obtained by seismic observation, using spectral analysis based on the Morlet wavelet, are presented. The paper also describes an algorithm for classifying signal sources and determining the distance and azimuth to the point of excitation of surface waves; it considers the use of statistical characteristics and MFCC (Mel-frequency cepstral coefficient) parameters, as well as their joint application. The statistical characteristics of the signal were the variance, kurtosis coefficient, entropy, and average value; gradient boosting was chosen as the machine learning method, and a gradient boosting model using statistical and MFCC parameters was used to determine the distance to the signal source. Training was conducted on test data based on the selected special parameters of signals from sources of seismic excitation of surface waves. From a practical point of view, the new methods of seismic observation and boundary wave analysis make it possible to ensure a dense arrangement of sensors in hard-to-reach places, fill the gaps in algorithms for processing data from seismic sensors of angular movements, classify and systematize sources, improve prediction accuracy, and implement algorithms for locating and tracking sources.
The aim of the work was to create algorithms for processing seismic data to classify signal sources and determine the distance and azimuth to the point of excitation of surface waves.
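The statistical characteristics named above (mean, variance, kurtosis coefficient, entropy) can be sketched as a feature-extraction step; the histogram bin count for the entropy estimate is an assumption, as the abstract does not specify how entropy was estimated.

```python
import numpy as np

def signal_features(x, n_bins=16):
    """Statistical descriptors of a seismic signal window: mean, variance,
    excess kurtosis, and histogram (Shannon) entropy in bits.

    The bin count n_bins for the entropy estimate is illustrative."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = x.var()
    std = x.std()
    # Excess kurtosis: 0 for a Gaussian, -3 floor for a constant signal.
    kurt = ((x - mean) ** 4).mean() / (std ** 4 + 1e-12) - 3.0
    hist, _ = np.histogram(x, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    return {"mean": mean, "variance": var, "kurtosis": kurt, "entropy": entropy}

window = np.sin(np.linspace(0.0, 2.0 * np.pi, 256))
print(signal_features(window))
```

These four values, concatenated with MFCC parameters, would form the input vector for the gradient boosting classifier and the distance regressor.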
This article proposes algorithms for planning and controlling the movement of a mobile robot in a two-dimensional stationary environment with obstacles. The task is to reduce the length of the planned path, take the dynamic constraints of the robot into account, and obtain a smooth trajectory. To account for the dynamic constraints of the mobile robot, virtual obstacles are added to the map to cover the infeasible sectors of movement. This way of accounting for dynamic constraints allows the use of map-oriented methods without increasing their complexity. An improved version of the rapidly exploring random tree algorithm (multi-parent node RRT, MPN-RRT) is proposed as the global planning algorithm. Multiple parent nodes decrease the length of the planned path compared with the original single-parent version of RRT. The shortest path on the constructed graph is found using the ant colony optimization algorithm. It is shown that the use of two-parent nodes can reduce the average path length for an urban environment with low building density. To solve the problem of slow convergence of algorithms based on random search and path smoothing, the RRT algorithm is supplemented with a local optimization algorithm: the RRT algorithm searches for a global path, which is then smoothed and optimized by an iterative local algorithm. The lower-level control algorithms developed in this article automatically decrease the robot's velocity when approaching obstacles or turning. The overall efficiency of the developed algorithms is demonstrated by numerical simulation over a large number of experiments.
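The local post-processing of an RRT path can be illustrated with shortcut smoothing, a common iterative local optimization; this is a stand-in sketch, not the article's specific algorithm, and the sampling-based collision check against circular obstacles is an assumption.

```python
from math import hypot

def segment_clear(a, b, obstacles, steps=20):
    """Collision check by sampling points along segment ab against circular
    obstacles given as (cx, cy, r). Sampling density is illustrative."""
    for k in range(steps + 1):
        t = k / steps
        x = a[0] + t * (b[0] - a[0])
        y = a[1] + t * (b[1] - a[1])
        for cx, cy, r in obstacles:
            if hypot(x - cx, y - cy) <= r:
                return False
    return True

def shortcut_smooth(path, obstacles):
    """Iteratively remove intermediate waypoints whose neighbors can be
    connected directly, shortening a path produced by RRT."""
    path = list(path)
    changed = True
    while changed:
        changed = False
        i = 0
        while i + 2 < len(path):
            if segment_clear(path[i], path[i + 2], obstacles):
                del path[i + 1]   # skip the redundant waypoint
                changed = True
            else:
                i += 1
    return path

# With no obstacles, a zig-zag path collapses to its endpoints.
print(shortcut_smooth([(0, 0), (0, 1), (1, 1), (2, 2)], []))  # [(0, 0), (2, 2)]
```

Virtual obstacles covering infeasible motion sectors would simply be appended to the obstacle list, which is what lets map-oriented methods handle the dynamic constraints unchanged.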
Modern methods for planning the execution of task packages in multi-stage systems are characterized by restrictions on problem dimension and by the impossibility of guaranteeing results better than fixed packages for different values of the input task parameters. This paper solves the problem of optimizing the composition of task packages executed in multi-stage systems using the branch-and-bound method. Various ways of forming the order of execution of task packages in multi-stage systems (heuristic rules for ordering task packages in the sequences of their execution on the system devices) have been studied. A method of ordering packages in the sequence of their execution (a heuristic rule) that minimizes the total time of processing them on the devices is defined. Based on the obtained rule, a method of ordering the task types, in which their packages are considered within the branch-and-bound procedure, is formulated. A mathematical model of the process of processing packages on the system devices, which provides the calculation of its parameters, has been built. A method for forming all possible decisions on the composition of task packages for a given number of packages has been constructed. Decisions on the composition of task packages of different types are interpreted in the branch-and-bound procedure in order to build the optimal combination of them. To implement the branch-and-bound method, a branching (splitting) procedure is formulated that forms subsets of solutions including packages of different compositions of tasks of the same type. Expressions are constructed for calculating the lower and upper bounds of the optimization criterion over the package compositions for the subsets formed in the branching procedure.
The pruning procedure excludes subsets whose lower bound is not less than the current record. To find optimal solutions, a breadth-first search strategy is applied, which examines all subsets of solutions, including the various packages of tasks of the same type obtained by splitting the subsets that were not excluded from consideration by the pruning procedure. The developed algorithms are implemented in software, which made it possible to obtain schedules for the execution of task packages in a multi-stage system that are on average 30% better than fixed packages.
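The breadth-first branch-and-bound scheme with the pruning (record) rule can be sketched on a toy version of the problem. The cost model below (each package of size s costs a fixed setup plus s processing units) and the parameter values are illustrative stand-ins, not the paper's multi-stage model; what the sketch does reproduce is the structure: branching over package compositions, a lower bound per node, and discarding nodes whose bound reaches the record.

```python
from collections import deque

def plan_packages(n_tasks, setup=5.0, unit=1.0, max_pkg=4):
    """Breadth-first branch and bound over package compositions of one task
    type. A node is a tuple of chosen package sizes; branching appends one
    more package; a node is pruned when its lower bound reaches the record."""
    best, record = None, float("inf")
    queue = deque([((), n_tasks)])
    while queue:
        sizes, remaining = queue.popleft()
        if remaining == 0:
            cost = sum(setup + s * unit for s in sizes)
            if cost < record:          # new record (incumbent) solution
                record, best = cost, sizes
            continue
        for s in range(1, min(max_pkg, remaining) + 1):
            child = sizes + (s,)
            done = sum(setup + sz * unit for sz in child)
            # Lower bound: at least one more package is needed for the rest.
            lb = done + (setup + (remaining - s) * unit if remaining > s else 0.0)
            if lb < record:            # pruning rule: drop when lb >= record
                queue.append((child, remaining - s))
    return best, record

print(plan_packages(8))  # ((4, 4), 18.0): two full packages minimize setups
```

In the paper the bounds come from the multi-stage device model and the heuristic ordering rule, but the enqueue/prune/record mechanics are the same.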
Based on a multi-loop target angular coordinate tracking system, the article selects and proposes an interacting multiple-model (IMM) adaptive filter algorithm to improve the quality of the target phase coordinate filter. Three models are selected to design the line-of-sight angular coordinate filter: the constant velocity (CV) model, the Singer model, and the constant acceleration (CA) model, characterizing three different levels of target maneuverability. As a result, the quality of the target phase coordinate estimates is improved, because the estimation process redistributes the probabilities of each model to match the actual maneuvering of the target. The structure of the filters is simple, the estimation error is small, and the maneuver detection delay is significantly reduced. The results are verified through simulation, which confirms that in all cases where the target maneuvers with different intensity and frequency, the line-of-sight angular coordinate filter accurately determines the target angular coordinates.
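The probability redistribution at the heart of the IMM scheme can be sketched in isolation (the per-model Kalman filters and state mixing are omitted); the switching matrix and likelihood values below are illustrative numbers, not from the article.

```python
import numpy as np

def imm_probability_update(mu, Pi, likelihoods):
    """One IMM cycle for the model probabilities only.

    mu: prior model probabilities; Pi[i, j]: probability of switching from
    model i to model j; likelihoods: measurement likelihood under each
    model's filter at the current step."""
    c = Pi.T @ mu                 # predicted (mixed) model probabilities
    mu_new = c * likelihoods      # Bayes update with each filter's likelihood
    return mu_new / mu_new.sum()  # renormalize

# Three models as in the article: CV, Singer, CA (all numbers illustrative).
Pi = np.array([[0.90, 0.05, 0.05],
               [0.05, 0.90, 0.05],
               [0.05, 0.05, 0.90]])
mu = np.array([0.8, 0.1, 0.1])          # tracker currently trusts CV
like = np.array([0.1, 0.2, 2.0])        # a maneuver: CA explains the data best
mu = imm_probability_update(mu, Pi, like)
print(mu.argmax())  # prints 2: probability mass shifts to the CA model
```

It is exactly this shift of probability mass toward the model matching the actual maneuver that reduces the maneuver detection delay compared with a single-model filter.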
The functionality of any system can be represented as a set of commands that lead to a change in the state of the system. The intrusion detection problem for signature-based intrusion detection systems is equivalent to matching the sequences of operational commands executed by the protected system against known attack signatures. Various mutations in attack vectors (including replacing commands with equivalent ones, rearranging commands and their blocks, and adding garbage and empty commands into the sequence) reduce the effectiveness and accuracy of intrusion detection. The article analyzes existing solutions in the field of bioinformatics and considers their applicability to the problem of identifying polymorphic attacks with signature-based intrusion detection systems. A new approach to the detection of polymorphic attacks is discussed, based on the suffix tree technology applied in the assembly and similarity verification of genomic sequences. The use of bioinformatics technology allows us to achieve intrusion detection accuracy at the level of modern intrusion detection systems (more than 0.90), while surpassing them in storage efficiency, speed, and readiness for changes in attack vectors. To improve the accuracy indicators, a number of modifications of the developed algorithm were carried out, as a result of which the attack detection accuracy increased to 0.95 at mutation levels in the sequence of up to 10%. The developed approach can be used for intrusion detection both in conventional computer networks and in modern reconfigurable network infrastructures with limited resources (the Internet of Things, networks of cyber-physical objects, wireless sensor networks).
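The matching idea can be illustrated with a longest-common-substring check between a command trace and a signature: a suffix tree answers this query in linear time, and the quadratic dynamic program below is a simple stand-in with the same output; the coverage threshold is an illustrative assumption, not the article's criterion.

```python
def longest_common_substring(a, b):
    """Length of the longest common contiguous run of two command sequences.
    A generalized suffix tree gives the same answer in linear time; this
    O(len(a)*len(b)) dynamic program is a compact stand-in."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def matches_signature(commands, signature, threshold=0.6):
    """Flag an attack when a long enough run of the signature survives the
    mutations; the threshold is illustrative."""
    return longest_common_substring(commands, signature) / len(signature) >= threshold

sig = ["open", "read", "write", "exec", "close"]
trace = ["noop", "open", "read", "write", "exec", "noop", "close"]
print(matches_signature(trace, sig))  # True: 4/5 of the signature is intact
```

Garbage commands inserted by the attacker break the run but leave long intact fragments, which is why substring-based similarity tolerates moderate mutation levels.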
This paper focuses on capturing the meaning of Natural Language Understanding (NLU) text features to detect duplicate reports in an unsupervised manner. The NLU features are compared with lexical approaches to identify the most suitable classification technique. A transfer-learning approach is utilized to train the feature extraction on the Semantic Textual Similarity (STS) task. All features are evaluated on two types of datasets: Bosch bug reports and Wikipedia articles. This study aims to structure recent research efforts by comparing NLU concepts for representing text semantics and applying them to information retrieval (IR). The main contribution of this paper is a comparative study of semantic similarity measurements. The experimental results demonstrate that the Term Frequency-Inverse Document Frequency (TF-IDF) feature performs well on both datasets given a reasonable vocabulary size, and that the Bidirectional Long Short-Term Memory (BiLSTM) network can learn the structure of a sentence to improve the classification.
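The lexical baseline against which the NLU features are compared can be sketched as TF-IDF vectors with cosine similarity; the smoothed-IDF variant below is one common formulation, chosen here for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF vectors (as sparse dicts) with smoothed IDF.
    docs: list of pre-tokenized documents (lists of terms)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))          # document frequency
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(d).items()} for d in docs]

def cosine(u, v):
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = [["bug", "in", "driver"],
        ["driver", "bug", "report"],
        ["wiki", "article"]]
vecs = tfidf_vectors(docs)
# The two bug-report-like documents score higher than the unrelated pair.
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

A duplicate detector would threshold this similarity; the paper's point is that such purely lexical scores remain competitive with learned semantic features when the vocabulary size is reasonable.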
In modern conditions, in the creation and use of existing and advanced space vehicles (SV), the issues of autonomy and survivability acquire particular relevance in the development and operation of small-mass spacecraft (SMS) for Earth remote sensing (ERS). The specificity of small spacecraft lies in the fact that the standard reliability assurance practice of the rocket and space industry is difficult to apply directly to their creation, since full structural redundancy of their onboard systems (OBS) cannot be provided because of mass-dimensional and other restrictions. In this case, the tasks of developing model-algorithmic methods and approaches for ensuring the required levels of structural reliability, survivability, and, in general, operational effectiveness of the SMS OBS become particularly relevant. The problem of increasing the autonomy, survivability, and operational efficiency of complex technical objects (CTO), to which SMS in particular belong, is considered in the scientific literature in conjunction with the problems of control, assessment, and technical diagnostics of the CTO state; reconfiguration (structural, functional, and structural-functional) of CTO structures; management of CTO reserves; alternative and multi-mode control; and analysis of CTO fault tolerance and disaster recovery. However, all of these studies are fragmented, both at the methodological and at the technological level. The article provides a generalized description of the combined methods and algorithms developed by the authors for the synthesis of technologies and programs for controlling OBS reconfiguration in order to increase the survivability of the SMS.
At the same time, these tasks are solved not in isolation but in a comprehensive manner within the framework of the general problem of proactive management of the structural dynamics of the SMS, with or without the use of GCC tools, which ensures the efficiency, validity, completeness, and consistency of the synthesized management decisions. The novelty of the proposed approach is that, based on the concepts of integrated (system) modeling, proactive control of the structural dynamics of the SMS OBS, and the intellectualization of these proactive control processes, the authors developed methods and algorithms for the synthesis of technologies and programs for controlling the reconfiguration of the SMS OBS. These provide, firstly, the situational choice of the optimal sequence of operations and the allocation of SMS resources with and without the use of GCC facilities and, secondly, effective parrying of both design-case and off-design emergency flight situations (EFS), as well as the operational restoration of OBS operability. The constructiveness of the proposed approach is illustrated by the example of flexibly redistributing information processing tasks between the SMS OBS and the SMS GCC.
In the context of the ongoing fourth industrial revolution and the fast development of computer science, the amount of textual information has become huge. So, prior to applying seemingly appropriate methodologies and techniques to such data, its nature and characteristics should be thoroughly analyzed and understood. Automatic text processing incorporated into existing systems may facilitate many procedures. Text classification is one of the basic applications of natural language processing, covering tasks such as emotion analysis and subject labeling. In particular, recent advances in deep learning networks demonstrate that the proposed methods fit document classification well; for instance, they have proved effective for classifying texts in English. A thorough study revealed, however, that very little research effort has been put into classifying documents in the Vietnamese language, although the development of deep learning models for document classification has demonstrated certain improvements for texts in Vietnamese. Therefore, the use of a long short-term memory (LSTM) network with Word2vec is proposed to classify text, improving both performance and accuracy. Compared with other traditional methods, the developed approach demonstrated better results at classifying texts in Vietnamese. The evaluation on Vietnamese datasets shows an accuracy of over 90%, and the proposed approach looks quite promising for real applications.
The main factors that expand the capabilities and increase the effectiveness of network reconnaissance in identifying the composition and structure of client-server computer networks, owing to the stationarity of their structural and functional characteristics, are analyzed. Based on the revealed protection features of client-server computer networks at the present stage, which rest on the principles of spatial security assurance as well as on the formalization and introduction of prohibiting regulations, the urgent problem of dynamic management of the structural and functional characteristics of client-server computer networks operating under network reconnaissance is substantiated.
A mathematical model is presented that makes it possible to find optimal modes of dynamic configuration of the structural and functional characteristics of client-server computer networks for various situations; calculation results are given. An algorithm is presented that solves the problem of dynamically configuring the structural and functional characteristics of a client-server computer network, thereby reducing the time during which the data obtained by network reconnaissance remains reliable. The results of practical tests of software developed on the basis of the dynamic configuration algorithm are shown. The obtained results show that the presented solution for the dynamic configuration of client-server computer networks increases the effectiveness of protection by changing the structural and functional characteristics of the networks within several subnets, without breaking critical connections, at time intervals that are adaptively changed depending on the operating conditions and the attacker's actions.
The novelty of the developed model lies in the application of the mathematical apparatus of the theory of Markov random processes and the solution of Kolmogorov equations to justify the choice of dynamic configuration modes for the structural and functional characteristics of client-server computer networks. The novelty of the developed algorithm lies in the use of this dynamic configuration model for the dynamic control of the structural and functional characteristics of a client-server computer network under network reconnaissance.
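The Kolmogorov-equation step can be sketched for a toy two-mode configuration process: solving the stationary equations pi Q = 0, sum(pi) = 1 for a continuous-time Markov chain yields the long-run share of time spent in each configuration mode. The generator matrix below (switching rates between modes) is an illustrative example, not the model from the paper.

```python
import numpy as np

def stationary_distribution(Q):
    """Solve the stationary Kolmogorov equations pi @ Q = 0 with sum(pi) = 1
    for a continuous-time Markov chain with generator matrix Q."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])   # balance equations + normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Two configuration modes (illustrative rates): leave mode 1 at rate 2,
# leave mode 2 at rate 1, so the network spends 1/3 of the time in mode 1.
Q = np.array([[-2.0, 2.0],
              [1.0, -1.0]])
print(stationary_distribution(Q))  # ≈ [0.333, 0.667]
```

Comparing such stationary distributions for candidate switching-rate settings is one way the choice of dynamic configuration mode can be justified quantitatively.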
The problem of reducing a linear time-invariant dynamic system is considered as a problem of approximating its initial transfer function, assumed to be rational, with a similar function of a lower order. The approximation error is defined as the standard integral deviation of the transient characteristics of the initial and reduced transfer functions in the time domain. Two main types of approximation problems are formulated: a) the traditional problem of minimizing the approximation error for a given order of the reduced model; b) the proposed problem of minimizing the order of the model for a given tolerance on the approximation error.
Algorithms for solving the approximation problems are developed based on the Gauss-Newton iterative process. At each iteration step, the current deviation of the transient characteristics is linearized with respect to the coefficients of the denominator of the reduced transfer function. The linearized deviations are used to obtain new values of the transfer function coefficients by the least-squares method in a functional space, based on Gram-Schmidt orthogonalization. The general form of expressions representing the linearized deviations of the transient characteristics is obtained.
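The least-squares step on a Gram-Schmidt-orthogonalized basis can be sketched in a finite-dimensional analogue (the paper works with functions of time rather than vectors, but the algebra is the same): orthogonalize the columns of the linearized-deviation matrix, then back-substitute for the coefficient update.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Modified Gram-Schmidt factorization A = Q R with orthonormal Q."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):
            R[i, k] = Q[:, i] @ v      # project out earlier directions
            v -= R[i, k] * Q[:, i]
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]
    return Q, R

def least_squares(A, b):
    """Least-squares coefficients via the Gram-Schmidt orthogonal basis,
    as would be applied to the linearized deviations at a Gauss-Newton step."""
    Q, R = gram_schmidt_qr(A)
    return np.linalg.solve(R, Q.T @ b)

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
print(least_squares(A, b))  # ≈ [1, 2]
```

Because the basis is built incrementally, adding one more basis function reuses all previous orthogonalization work, which is what makes the order-minimization variant described next practical.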
To solve the problem of minimizing the order of the transfer function within the least-squares algorithm, the Gram-Schmidt process is also used; the process terminates when the given error tolerance is achieved. It is shown that a sequence of process steps that alternates between the coefficients of the numerator and denominator polynomials of the transfer function yields the minimum transfer function order.
The paper presents an extension of the developed algorithms to the case of a vector transfer function with a common denominator. An algorithm is presented in which the approximation error is defined as a geometric sum of scalar errors. The use of the minimax form of the error estimate and the possibility of extending the proposed approach to the reduction of an irrational initial transfer function are discussed.
Experimental code implementing the proposed algorithms has been developed, and the results of numerical evaluations on test examples of various types are presented.
This paper proposes an algorithm for calculating approximate values of the roots of algebraic equations with a specified limiting absolute error. The mathematical basis of the algorithm is an analytical-numerical method for solving nonlinear integro-differential equations with non-stationary coefficients. The analytical-numerical method belongs to the class of one-step continuous methods of variable order with an adaptive step-selection procedure and a formalized estimate of both the error of the calculations performed at each step and the error accumulated during the calculation. The proposed algorithm consists of two stages. The first stage yields numerical intervals containing the unknown exact values of the roots of the algebraic equation. At the second stage, the approximate values of these roots are calculated with the specified limiting absolute error. As an example of the use of the proposed algorithm, the roots of a fifth-order algebraic equation are determined for three different values of the limiting absolute error.
The obtained results allow the following conclusions. The proposed algorithm makes it possible to isolate numerical intervals that contain the unknown exact values of the roots; knowledge of these intervals facilitates the calculation of the approximate root values under any specified limiting absolute error. The efficiency of the algorithm, i.e., the guarantee of achieving the goal, does not depend on the choice of initial conditions. The algorithm is not iterative, so the number of calculation steps required to isolate a numerical interval containing the unknown exact value of any root of an algebraic equation is always bounded. The determination of each individual root of the algebraic equation is computationally fully autonomous.
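The two-stage structure (isolate an interval per root, then refine to the specified limiting absolute error) can be sketched with a simple stand-in: sign-change scanning for the isolation stage and bisection for the refinement stage. This is not the analytical-numerical method of the paper; it only mirrors the interval-then-refine workflow, and the grid density is an illustrative choice.

```python
def isolate_roots(f, lo, hi, steps=1000):
    """Stage 1 (stand-in): scan [lo, hi] on a uniform grid and keep the
    subintervals where f changes sign; each brackets a root (simple roots
    assumed). A grid point that is itself a root yields a zero-width interval."""
    h = (hi - lo) / steps
    brackets = []
    for k in range(steps):
        a, b = lo + k * h, lo + (k + 1) * h
        if f(a) == 0.0:
            brackets.append((a, a))
        elif f(a) * f(b) < 0:
            brackets.append((a, b))
    return brackets

def refine(f, a, b, tol):
    """Stage 2: bisection down to the specified limiting absolute error tol."""
    while b - a > 2 * tol:
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

f = lambda x: x ** 3 - 2 * x   # roots at -sqrt(2), 0, sqrt(2)
roots = [refine(f, a, b, 1e-6) for a, b in isolate_roots(f, -2.0, 2.0)]
print(roots)  # ≈ [-1.41421, 0.0, 1.41421]
```

The number of bisection steps per interval is bounded by log2((b - a) / tol), illustrating the paper's point that the step count for a given limiting absolute error is always bounded.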
Determination of the nucleotide sequence of DNA or RNA containing from several hundred to hundreds of millions of monomer units makes it possible to obtain detailed information about the genomes of humans, animals, and plants. Deciphering the structure of nucleic acids was mastered quite a long time ago, but the early decoding methods were low-throughput, inefficient, and expensive. Methods for decoding the nucleotide sequences of nucleic acids are usually called sequencing methods, and the instruments designed to implement them are called sequencers.
Next-generation sequencing (NGS) and massively parallel sequencing are related terms that describe high-throughput DNA sequencing technology in which the entire human genome can be sequenced within a day or two. The previous technology used to decipher the human genome required more than ten years to obtain final results.
At the Institute for Analytical Instrumentation of the Russian Academy of Sciences, a hardware-software complex (HSC) is being developed to decipher the nucleic acid (NA) sequences of pathogenic microorganisms using the NGS method.
The software included in the HSC plays an essential role in solving genome deciphering problems. The purpose of this article is to show the need to create algorithms for the HSC software that process the signals obtained during genetic analysis when solving genome deciphering problems, and also to demonstrate the capabilities of these algorithms.
The paper discusses the main problems of signal processing and methods for solving them, including: automatic and semi-automatic focusing; background correction; detection of cluster images and estimation of the coordinates of their positions; creation of templates of clusters of NA molecules on the surface of the reaction cell; correction of the influence of neighboring optical channels on signal intensities; and assessment of the reliability of the genetic analysis results.
The paper studies the reliability of a combinatorial-metric algorithm for recognizing multidimensional group point objects in a hierarchically organized feature space. The nature of the change in the reliability indicator is examined, as an example, using multilevel descriptions of simulated and real objects, under the condition that recognition results obtained at one hierarchy level are used as input data at the next level.
A priori uncertainty of the viewing angle, incompleteness of composition, and coordinate noise of the objects determine the combinatorial procedures for quantitative estimation of the proximity of a multidimensional GPO, which represents the object of recognition, to a particular class.
The stability of the recognition algorithm is achieved through the possibility of changing the strategy for making a classification decision. For this purpose, we represent a group point object at the lowest level of the hierarchy in the form of a sample, a composition of sample elements, or a complex a priori indicator. In order to increase recognition accuracy, it was proposed to reuse recognition results obtained at lower levels of the hierarchy. Experimental dependences of a priori and a posteriori reliability indicators for various measurement conditions and states of the recognition objects are provided in the paper.
Path planning for autonomous mobile robots is an important task within the field of robotics. It is common to use one of two classical approaches to path planning: a global approach, when an entire map of the working environment is available to the robot, or local methods, which require the robot to detect obstacles with a variety of onboard sensors as it traverses the environment.
In our previous work, a multi-criteria spline algorithm prototype for global path construction was developed and tested in the Matlab environment. The algorithm used the Voronoi graph to compute an initial path that serves as the starting point of the iterative method. This approach allowed finding a path in all map configurations whenever a path existed. During the iterative search, a cost function with a number of different criteria and associated weights guided further path optimization. A potential field method was used to implement some of the criteria.
This paper describes an implementation of a modified spline-based algorithm that can be used with real autonomous mobile robots. The equations of the characteristic criteria of path optimality were further modified. Previously, the obstacle map was represented as intersections of a finite number of circles with various radii. However, in real-world environments, obstacle data is a dynamically changing probability map that can be based on an occupancy grid. Moreover, the robot is no longer a geometric point.
To implement the spline algorithm and further use it with real robots, the source code of the Matlab prototype was ported to the C++ programming language. The testing of the method and of the optimality of the multi-criteria cost function was carried out in the ROS/Gazebo environment, which has recently become a standard for programming and modeling robotic devices and algorithms.
The resulting spline-based path planning algorithm can be used on any real robot equipped with a laser rangefinder. The algorithm operates in real time, and the parameters of the objective-function criteria are available for dynamic tuning during robot motion.
As a method for securing messages sent via a public channel against potential coercive attacks, algorithms and protocols of deniable encryption have been proposed. They are divided into the following types: 1) public-key schemes, 2) shared-secret-key schemes, and 3) no-key schemes. Pseudo-probabilistic symmetric ciphers are introduced as a particular way of implementing deniable encryption algorithms. The application of pseudo-probabilistic encryption to constructing special information-protection mechanisms, including steganographic channels hidden in ciphertexts, is discussed. Methods are considered for designing stream and block pseudo-probabilistic encryption algorithms that simultaneously encrypt fake and secret messages so that the generated ciphertext is computationally indistinguishable from the ciphertext produced by probabilistic encryption of the fake message alone. Indistinguishability of the ciphertext from that of probabilistic encryption has been used as one of the design criteria. To implement this criterion, the construction scheme of pseudo-probabilistic ciphers includes a step that bijectively maps pairs of intermediate ciphertext blocks of the fake and secret messages into a single expanded block of the output ciphertext. Implementations of pseudo-probabilistic block ciphers in which the algorithms for recovering the fake and secret messages coincide completely are also considered. General approaches to constructing no-key encryption protocols and randomized pseudo-probabilistic block ciphers are proposed, and concrete implementations of cryptoschemes of these types are presented.
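The bijective pair-to-block mapping mentioned above can be illustrated with a toy construction based on the Chinese Remainder Theorem over two coprime moduli. The moduli, block sizes, and mixing rule here are illustrative assumptions, not the concrete parameters of the ciphers described in the abstract.

```python
# Hedged toy sketch: bijectively map a pair of intermediate ciphertext
# blocks (fake, secret) into one expanded output block via CRT.

M1 = 2**16 + 1          # coprime moduli (assumed for illustration)
M2 = 2**16 + 3

def combine(c_fake: int, c_secret: int) -> int:
    """Map (c_fake mod M1, c_secret mod M2) to a single block in
    [0, M1*M2) from which the pair is uniquely recoverable.
    Inputs are assumed to lie below their respective moduli."""
    inv = pow(M1, -1, M2)                      # M1^{-1} mod M2
    t = ((c_secret - c_fake) * inv) % M2
    return (c_fake + M1 * t) % (M1 * M2)

def split(block: int) -> tuple:
    """Inverse mapping: recover both intermediate blocks."""
    return block % M1, block % M2

pair = (0x1234, 0xBEEF)
block = combine(*pair)
```

Because the mapping is a bijection, every expanded block decodes to exactly one (fake, secret) pair, which is what allows the two recovery algorithms to coexist over a single ciphertext.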
The article deals with the applied aspects of preliminary vertex ranking for oriented weighted graphs. The authors note the widespread use of this technique in developing heuristic discrete optimization algorithms. The ranking problem is directly related to the problem of centrality in social networks and large real-world data sets; as shown in the article, ranking is used, explicitly or implicitly, as the initial stage of obtaining a solution in algorithms for applied problems. Examples of such applications of ranking are given. The examples demonstrate gains in efficiency when solving some applied optimization problems that are widely used in mathematical methods of optimization and decision making, both from the standpoint of theoretical development and of their applications. The article describes the structure of the first phase of the computational experiment, which concerns the procedure of obtaining test data sets. The obtained data are represented by weighted graphs corresponding to several groups of the social network VKontakte with the number of participants ranging from 9,000 to 24,000. It is shown that the structural characteristics of the obtained graphs differ significantly in the number of connectivity components. The centrality characteristics (degree sequences) are shown to have an exponential distribution. The main attention is given to the analysis of three approaches to ranking graph vertices. We propose an analysis and comparison of the obtained sets of ranks by the nature of their distribution. A definition of convergence for graph-vertex ranking algorithms is introduced, and the differences in their use when processing high-dimensional data and when a solution must be rebuilt in the presence of local changes are discussed.
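One simple way to pre-rank vertices of a directed weighted graph, sketched below, is by weighted degree centrality. The article compares three ranking approaches that it does not fully specify; degree-based ranking is assumed here only as an illustrative baseline, and the example graph is made up.

```python
# Hedged sketch: rank vertices of a directed weighted graph by total
# (in + out) weighted degree, a common centrality baseline.
from collections import defaultdict

def degree_rank(edges):
    """edges: iterable of (u, v, w) arcs. Returns vertices sorted by
    descending total weighted degree (ties broken by name)."""
    score = defaultdict(float)
    for u, v, w in edges:
        score[u] += w   # out-degree contribution
        score[v] += w   # in-degree contribution
    return sorted(score, key=lambda x: (-score[x], x))

edges = [("a", "b", 2.0), ("b", "c", 1.0), ("a", "c", 3.0)]
ranking = degree_rank(edges)
```

Such a preliminary ranking can then seed a heuristic optimization algorithm with a processing order for the vertices.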
Service robots are intended to help humans in non-industrial environments such as houses or offices. To accomplish their goal, service robots must have several skills, such as object recognition and manipulation, face detection and recognition, speech recognition and synthesis, task planning and, one of the most important, navigation in dynamic environments. This paper describes a fully implemented motion-planning system that spans motion- and path-planning algorithms as well as spatial representation and behavior-based active navigation. The proposed system is implemented in Justina, a domestic service robot whose design is based on the ViRBot, an architecture for operating virtual and real robots that encompasses several layers of abstraction, from low-level control to symbolic planning. We evaluated our proposal in both simulated and real environments and compared it to classical implementations. For the tests, we used maps obtained from real environments (the Biorobotics Laboratory and the RoboCup@Home arena) and maps generated from obstacles with random positions and shapes. Several parameters were used for comparison: the total traveled distance, the number of collisions, the number of reached goal points, and the average execution speed. Our proposal performed significantly better in both real and simulated tests. Finally, we show our results in the context of the RoboCup@Home competition, where the system was successfully tested.
An extremely simple and high-performance genome-wide association study (GWAS) algorithm for estimating the main and epistatic effects of markers or single nucleotide polymorphisms (SNPs) is proposed. The main idea underlying the algorithm is the comparison of the genotypes of pairs of individuals against the corresponding phenotype values. It uses the intuitive assumption that differences in the alleles of important SNPs between a pair of individuals lead to a large difference in the phenotype values of these individuals. In other words, the algorithm is based on considering pairs of individuals instead of SNPs or pairs of SNPs. The main advantage of the algorithm is that it depends only weakly on the number of SNPs in a genotype matrix; it mainly depends on the number of individuals, which is typically very small in comparison with the number of SNPs. Another important advantage of the algorithm is that it can detect the epistatic effect, viewed as gene-gene interaction, without additional computations. The algorithm can also be used when the phenotype takes only two values (the case-control study). Moreover, it can be simply extended from the analysis of binary genotype matrices to the analysis of microarray gene expression data. Numerical experiments with real data sets consisting of populations of double haploid lines of barley illustrate that the proposed algorithm outperforms standard GWAS algorithms from the computational point of view, especially for detecting gene-gene interactions. Ways of improving the proposed algorithm are discussed in the paper.
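The pairwise idea can be sketched as follows: for every pair of individuals, compare their phenotype difference with the set of SNPs at which their genotypes differ, and credit those SNPs accordingly. Scoring by the mean absolute phenotype difference is an assumption for illustration, not the paper's exact statistic, and the tiny data set is made up.

```python
# Hedged sketch of a pair-of-individuals GWAS scoring scheme.
from itertools import combinations

def pairwise_snp_scores(genotypes, phenotypes):
    """genotypes: list of equal-length allele tuples, one per individual.
    phenotypes: list of numbers. Returns per-SNP scores: the average
    |phenotype difference| over pairs whose alleles differ at that SNP."""
    n_snps = len(genotypes[0])
    total = [0.0] * n_snps
    count = [0] * n_snps
    for i, j in combinations(range(len(genotypes)), 2):
        d = abs(phenotypes[i] - phenotypes[j])
        for s in range(n_snps):
            if genotypes[i][s] != genotypes[j][s]:
                total[s] += d
                count[s] += 1
    return [t / c if c else 0.0 for t, c in zip(total, count)]

g = [(0, 0), (0, 1), (1, 1)]          # SNP 1 tracks the phenotype change
p = [1.0, 5.0, 5.0]
scores = pairwise_snp_scores(g, p)    # SNP 1 receives the higher score
```

Note how the outer loop runs over pairs of individuals, so its cost grows with the (small) number of individuals rather than with the (large) number of SNPs per comparison step.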
The article deals with the synthesis of a tracking control system for a nonlinear plant functioning under bounded external disturbances that are not available for measurement. The proposed solution is a robust modification of the backstepping approach with a similar controller design structure. The main changes are based on plant model transformations that make it possible to use only one filter in the control system together with an auxiliary-loop method for disturbance estimation and compensation. High-gain observers are used to measure the unknown signals together with their derivatives. Convergence of the tracking errors and observation errors with adjustable accuracy within a finite transient time is proved. The efficiency of the algorithm is demonstrated by computer modeling in comparison with an analogous method.
In this paper, we consider the problem of mutual reconstruction of face image pairs. We addressed this problem in our previous article, where the proposed solutions were discussed in connection with Heterogeneous Face Recognition and Cross-Modal Multimedia Retrieval problems. Those solutions are based on one-dimensional and two-dimensional Principal Component Analysis performed over two original face images followed by their projection on independent eigenspaces, estimation of a transformation matrix and mutual reconstruction of the face image by means of one-dimensional and two-dimensional Karhunen-Loève Transform.
In this article, we propose new approaches and solutions, which are based solely on the two-dimensional eigenspace projection methods, and two regression models — Multiple Linear Regression and Partial Least Squares regression.
We present experiments on the mutual reconstruction of face images in sketch/photo pairs, in pairs of face images with age-related changes, and in pairs of 2D/3D face images. To conduct the experiments, we selected two variants of the proposed approach. The first is based on two-dimensional Principal Component Analysis and Partial Least Squares regression, and the second on two-dimensional Partial Least Squares and Multiple Linear Regression. Both variants showed acceptable performance for practical applications involving the mutual reconstruction of face images. Furthermore, we consider a method to improve the quality of reconstructed face images in the case of mixed datasets. This method involves classifying the dataset by means of two-dimensional Linear Discriminant Analysis and fitting a separate regression model for each class.
In addition, we show that generally, mutual reconstruction of face images is also achievable in conditions when original images are not a part of training sets of face images.
Load balancing of tasks on virtual machines (VMs) using nature-inspired algorithms has become an area of growing research interest. Honey Bee Behavior Based Load Balancing (HBB-LB) was introduced to balance the load with maximum throughput. This approach also balances the priorities of the tasks on the VMs to minimize task waiting time. However, HBB-LB considers only the VM load for balancing, which might not be sufficiently effective. This paper proposes an Improved Honey Bee Behavior Based Load Balancing (IHBB-LB) that takes into consideration several more QoS parameters of a VM, such as service response time, availability, reliability, cost, and throughput, to enhance load balancing. Response time is vital in determining the instant activity of a VM, availability determines the available resources and the state of the VM (idle or active), and reliability determines the level of trust in a VM. Most importantly, the cost of utilizing a VM and its throughput (capability) are also essential in determining VM efficiency. However, the inclusion of multiple QoS parameters results in a multi-objective optimization problem. Since a number of QoS parameters are computed, fuzzification of the QoS values was performed through generated fuzzy rules, and the multi-objective optimization problem was thereby eliminated. The experiments were performed in terms of makespan, response time, degree of imbalance, and the number of migrated tasks, and the results indicate that IHBB-LB provides a better level of performance.
The paper considers algorithms for objective evaluation of speech quality based on the measurement of dynamic and static characteristics of speech signals at the source codec output. The functional scheme of the experimental research is substantiated. The results of analyzing the correlation between objective and subjective evaluations of speech quality are given. Modifications of the objective quality assessment are proposed on the basis of the correlation of the excitation of the MESC spectrum and a modification of the exponent based on the calculation of the sensation function of the spectral dynamics of MFOSD. An algorithm for regression curve formation is proposed, which makes it possible to transform the objective evaluation to the scale of subjective evaluation of speech quality.
Based on the most accurate modifications of the speech quality assessment indicators for reconstructed speech signals, a complex algorithm for objective hardware evaluation of speech quality is proposed for the case when broadband and low-frequency stationary and non-stationary acoustic waves are applied to the microphone. It is shown that the use of the complex algorithm makes it possible to obtain an objective evaluation of speech quality according to GOST R 50840-95 with an average error of no more than 0.35 points for signal-to-noise ratios from 30 dB to -10 dB.
The article describes an algorithm for constructing deformable face models based on the Active Shape Model method, the Shepard method of restoring landscape surfaces, and a set of particular 3D face models. As an alternative to the EER, an assessment of accuracy in the task of person recognition from a face image, based on an anchored value of FAR, is offered. The results of testing the algorithm are presented. We demonstrate the results of using the obtained models within a recognition algorithm on a large base of several thousand images (the FERET image database as of 2000), which contains photographs of people at angles of 0, 45, and 90 degrees relative to the optical axis of the camera. Analysis of the results showed that the use of deformable face models does not reduce the quality of person recognition from face images even under difficult initial conditions, and in some cases leads to improved recognition results.
It is known that the implementation of recognition problems based on classic neural networks involves a number of difficulties: the need for a large training set; the duration and complexity of learning algorithms; the difficulty of choosing such network design parameters as the number of neurons, layers, and links, as well as the ways of connecting neurons; and the possibility that learning fails, requiring the network settings to be changed and the network re-trained. In this paper we consider the possibility of creating a multi-layer perceptron with a full system of connections and a threshold activation function on the basis of metric recognition methods, in particular the nearest neighbor algorithm. It is shown that this method makes it possible to create a fully connected multilayer perceptron whose parameters, such as the number of neurons and layers as well as the values of the weights and thresholds, are determined analytically. The distribution of weight and threshold values for the second and third layers is also discussed. On this basis, we propose an algorithm for calculating the thresholds and weights of a multilayer perceptron and show an example of its implementation. Possible applications of the network to different tasks are considered.
In the paper, we have formulated the invariant description form for geometry of a spatial, kinematically redundant manipulator with the orthogonal non-coplanar axes of rotation of the joints. We have obtained the explicit equations for determining the angular coordinates from the condition that points of joints belong to the smooth parametrically given curve. Inequality constraints on the relative position of neighboring parts of the manipulator have been formulated. We have proposed an algorithm for solving equations and the method of planning changes for hinge coordinates for the movement of joints points along the spatial curve that is formed by incremental addition of target points for the head link positions of the manipulator. The method has been applied for planning movements of a hyper-redundant manipulator with a fixed root link and a snakelike robot when moving along the path built on the basis of current and forecasted positions of joints in the Cartesian space.
An algorithm for the classification of samples of multidimensional group pointwise objects is presented. The search is carried out on the basis of a combinatorial search for proportionate fragments of matrices of pairwise relations on a set of templates. The decision to assign a sample to a particular template is made according to the criterion of minimum Euclidean distance. The presented approach to recognition allows one to synthesize invariant (with respect to rotation, scaling, or offset of the coordinate system) descriptions of secondary features and to use the quite powerful toolkit of the theory of multidimensional and metric scaling to compensate for distortions of the recognized group pointwise object images. The algorithm implements a Monte-Carlo statistical testing procedure, within which each point, allocated at random in a prospective neighborhood of the required coordinates, is checked against the condition of minimizing the quadratic similarity measure. The paper gives an example and the results of using the algorithm for identification and recovery of distorted radio images exposed to coordinate noise and represented by a sampling of templates of "brilliant" points.
Recovery of a dynamic system from its functioning is a problem of current interest in the theory of control systems. As a behavior model of gene network regulatory circuit, a discrete dynamic system has been proposed, where coordinates correspond to the concentration of substances, while special functions, which depend on the system value in the previous moment, account for their increase or decrease. Pseudo-polynomial discrete dynamic system recovery algorithms with additive and multiplicative functions have been obtained earlier. The generalized case of arbitrary threshold functions is considered in this article. Algorithms for significant variables recovery and threshold functions weight regulation, having pseudo-polynomial testing complexity, are given. These algorithms allow one either to recover the system completely, or to lower the threshold function dimension.
The paper presents a classification algorithm for group point objects (GPO) based on the comparative analysis of fragments of distorted images and GPO templates. Sequences of GPO elements of different lengths are used as fragments. Pairwise and angular inter-point distances are used as classification features. A probabilistic measure of closeness, set by an expert by means of a membership function and the probability distribution law of the discrete values of the classified objects' features, is used in solving the classification task.
The algorithm includes the following stages: search and comparison of fragments composition of distorted images and the GPO templates; formation of a probable assessment of closeness of GPO distorted image and each template in space of the considered signs according to the analysis of each fragment; accumulation of the received probabilities on the basis of analysis results of all distorted image fragments; ranging of the received probabilities of classifying the distorted image as the GPO template; determination of the most probable template. The algorithm provides the possibility of specifying a GPO distorted image class using logical rules and analytical expressions of the considered data domain. The example and results of the algorithm application for solving a classification task of real GPO on the basis of the analysis of their fragments in the form of sequences from two and three elements are given.
The problem of estimating the vulnerability of confidential speech information is currently topical. However, when means of acoustic protection are used, i.e., under conditions of strong noise, the existing instrumental and computational methods give greater accuracy compared with the extremely labor-intensive articulation methods.
In the paper we study a method of estimating the security of voice data based on the Pearson correlation coefficient. This coefficient has poor sensitivity to the spectral properties of acoustic signals. Therefore, the author suggests an approach to defining the security indicator of voice data based on the mathematical apparatus of the coherence function of the source and noisy signals.
We propose to split the entire speech frequency range of the coherence function into separate octaves. We also propose to calculate the expectation of the coherence function components within each octave and, on the basis of a convolution function, to obtain an expression for calculating the index of speech vulnerability.
The proposed algorithm for determining the vulnerability index of voice data makes it possible to improve the assessment accuracy.
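The octave-band step described above can be sketched as follows: given an already estimated coherence function between the source and the noisy signal, average it within octave bands and fold the band averages with weights into a single index. The band edges, weights, and sample values are illustrative assumptions; the paper's coherence estimator and convolution expression are not reproduced.

```python
# Hedged sketch: per-octave averaging of a coherence function and a
# weighted fold into one vulnerability index.

def octave_bands(f_low, f_high):
    """Octave band edges [f, 2f) covering [f_low, f_high)."""
    bands, f = [], f_low
    while f < f_high:
        bands.append((f, min(2 * f, f_high)))
        f *= 2
    return bands

def vulnerability_index(freqs, coherence, weights,
                        f_low=125.0, f_high=4000.0):
    """freqs, coherence: sampled coherence function gamma^2(f).
    weights: one weight per octave band."""
    bands = octave_bands(f_low, f_high)
    means = []
    for lo, hi in bands:
        vals = [c for f, c in zip(freqs, coherence) if lo <= f < hi]
        means.append(sum(vals) / len(vals) if vals else 0.0)
    # Weighted fold of the band averages into one scalar index.
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)

freqs = [150.0, 300.0, 600.0, 1200.0, 2400.0]
index = vulnerability_index(freqs, [0.9, 0.8, 0.7, 0.6, 0.5], [1.0] * 5)
```

High coherence between the source and the intercepted signal indicates intelligible leakage, so a larger index corresponds to more vulnerable speech.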
Currently, the tasks of speeding up computations and/or optimizing them are topical. Among the ways to solve these tasks is the method of parallelization and asynchronization of a sorting algorithm, which is considered in the article. We offer a sorting method based on the principle of dividing an array into a set of independent pairs of numbers and comparing them in parallel and asynchronously, which distinguishes the offered method from traditional sorting algorithms (such as quick sort, merge sort, insertion sort, and others). The algorithm is realized with the use of Petri nets as the most suitable tool for describing asynchronous systems. Examples of its work are given. The performance of the algorithm is evaluated for the best and the worst cases. In the best case, the algorithm is executed in 2 or 3 conditional steps, depending on the partition of the array into pairs of adjacent elements; in the worst case, in n or 3n/2 steps, where n is the number of elements. The principles of parallelization and asynchronization used in constructing the algorithm can also be applied to other algorithms.
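The pair-based scheme described above resembles odd-even transposition sorting, where on each conditional step the array is split into disjoint adjacent pairs whose comparisons are independent of each other and could therefore run in parallel. The Petri-net implementation is not reproduced here; this sequential model, an assumption of the present editor, only illustrates the pairing principle.

```python
# Hedged sketch: odd-even transposition sort. Each pass compares
# disjoint adjacent pairs, which is what makes the step parallelizable.

def pairwise_sort(a):
    a = list(a)
    n = len(a)
    for step in range(n):                    # worst case: n steps
        start = step % 2                     # alternate the pair partition
        for i in range(start, n - 1, 2):     # disjoint pairs of neighbors
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

result = pairwise_sort([5, 1, 4, 2, 3])
```

An already sorted array passes every comparison on the first two partitions, matching the best-case figure of a small constant number of steps.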
An improvement of existing navigation algorithms for a generic polygonal linkage is presented. Our algorithm constructs a path between two arbitrary configurations of a polygonal linkage. This path contains at most eight steps.
In the paper, an adaptive algorithm for time series forecasting based on the selection of an analogue period is proposed. A distinctive feature of the algorithm is the use of a training sample of forecasts for the automatic selection of the optimal parameters of its operation. The algorithm was employed for predicting the hydrological time series of inflow to the Novosibirsk Reservoir (the Ob River). The efficiency of its use (an increase in the accuracy of forecasts) is demonstrated in comparison with the basic algorithm.
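The analogue-period idea can be sketched as follows: find the historical window most similar to the most recent observations and use the values that followed it as the forecast. The window length, the squared-error distance, and the toy series are illustrative assumptions; the paper's adaptive parameter selection is not reproduced.

```python
# Hedged sketch of analogue-period forecasting.

def analogue_forecast(series, window, horizon):
    """Predict `horizon` future values by locating the past window most
    similar (in squared error) to the last `window` observations."""
    recent = series[-window:]
    best_i, best_d = None, float("inf")
    # Candidate windows must leave room for `horizon` values after them.
    for i in range(len(series) - window - horizon + 1):
        cand = series[i:i + window]
        d = sum((a - b) ** 2 for a, b in zip(cand, recent))
        if d < best_d:
            best_i, best_d = i, d
    return series[best_i + window: best_i + window + horizon]

history = [1, 2, 3, 9, 1, 2, 3, 7, 1, 2, 3]
forecast = analogue_forecast(history, window=3, horizon=1)
```

In the adaptive variant described in the abstract, parameters such as the window length would be tuned automatically against a training sample of past forecasts rather than fixed as here.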
In the paper, an algorithm for calculating the position of an underwater autonomous vehicle, built on the triangulation method and post-triangulation correction, is proposed. A distinctive feature of the algorithm is that it uses as input multiple sets of vehicle-beacon distances calculated with different values of the speed of sound in water. The study of the developed algorithm has shown that the accuracy of the proposed solution is twice as high as that of the triangulation method.
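The underlying triangulation step can be sketched as recovering a 2D position from distances to three beacons with known coordinates by linearizing the range equations. The beacon layout is an illustrative assumption, and the paper's post-triangulation correction and multiple sound-speed variants are not reproduced here.

```python
# Hedged sketch: 2D trilateration from three beacon ranges.
import math

def trilaterate(beacons, dists):
    """beacons: [(x1,y1),(x2,y2),(x3,y3)]; dists: ranges to each beacon."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    r1, r2, r3 = dists
    # Subtracting the range equations pairwise cancels the quadratic
    # terms and yields a 2x2 linear system in (x, y).
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos = (3.0, 4.0)
dists = [math.dist(true_pos, b) for b in beacons]
x, y = trilaterate(beacons, dists)
```

An incorrect assumed speed of sound scales all ranges by the same factor, shifting this solution; comparing solutions across several assumed speeds is what the described correction exploits.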
The paper discusses the task of managing the consumption of sources in the process of deploying information support systems for complicated technical complexes (CTC). The application of a CTC, as well as the process of its information support, is usually limited by exactly prescribed terms, so delays are not allowable. More often, a delay can be eliminated only by involving additional information sources at later stages. The developed algorithm is based on the Bellman principle of optimality, which allows one to define not a final correction schedule but a flexible program of control actions that depends on the concrete result at every stage whose duration exceeds the defined norm. This program can be used in appropriate decision support systems and can be included in simulation models of CTC deployment and application. The paper describes a detailed algorithm for optimal correction corresponding to a normal distribution of stage durations.
We propose an approach to the automatic categorization of text documents based on the joint application of the method of latent semantic analysis (LSA) and the Mamdani fuzzy inference algorithm. The LSA method is used for the semantic analysis of information in electronic document management systems by identifying semantic relationships between document terms and obtaining a similarity measure for the compared vectors. A rule base is proposed for the Mamdani fuzzy inference algorithm that implements automatic rubrication of documents over a variety of given topics, enabling automated monitoring of documents that are not relevant to the specified topics, or that show similarity to several thematic categories, on the basis of the results of latent semantic analysis.
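The matching step that feeds the rubrication can be sketched with term-frequency vectors compared by cosine similarity. This is an assumption-laden simplification: full LSA would apply an SVD-based dimensionality reduction first, the Mamdani stage that maps similarity scores to rubric decisions is not reproduced, and the vocabulary and documents are made up.

```python
# Hedged sketch: cosine similarity between a document and rubric vectors,
# the kind of compliance rate that fuzzy rules could then act on.
import math
from collections import Counter

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[t] * v[t] for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

doc = Counter("contract payment invoice payment".split())
rubrics = {
    "finance": Counter("payment invoice budget".split()),
    "hr": Counter("vacancy interview hiring".split()),
}
best = max(rubrics, key=lambda r: cosine(doc, rubrics[r]))
```

In the full scheme, the per-rubric similarity scores would be fuzzified and run through the Mamdani rule base instead of taking a hard argmax, which is what allows a document to be flagged as close to several categories at once.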
In the paper, we consider the task of uniting graphs with a common part. The graphs were obtained as the result of a series of simulations of a Petri net (starting from different vertices and with different initial conditions) using the Colored Petri Nets Tools package, where the process address space is restricted to 2^32 bytes. To solve this task, it is necessary to determine the common part of the graphs, to cut the graphs in such a way that their common part remains in only one of the initial graphs, and to compose a table of accordance (transitions) between the graph vertices for making transitions between them. We assume that the graphs are initially represented in the form of adjacency lists. During the algorithm's work, they are converted into hash tables for fast determination of the common part of the graphs, which is implemented by traversing one of the graphs and testing for the presence of its nodes in the second graph. A transition table is created by traversing the graph over "parent-child" vertex pairs, checking whether one of the nodes of a pair can be added to the table. An algorithm for solving the problem of uniting the parts of a directed graph is offered, and an example of its use is given.
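The common-part step can be sketched with adjacency dicts (hash tables): shared vertices are found by membership tests, the common part is removed from one graph, and a transition table records where traversal can jump between the graphs. The graph contents and the exact shape of the table are illustrative assumptions.

```python
# Hedged sketch: determine the common part of two directed graphs via
# hash-table lookups, cut it from the second graph, and build a
# transition table from common vertices into the remainder of g2.

def split_common(g1, g2):
    """g1, g2: dicts vertex -> list of successors. Returns the shared
    vertex set, g2 without it, and a transition table mapping each
    common vertex to its successors that survive only in g2."""
    common = {v for v in g1 if v in g2}            # O(1) lookups per vertex
    g2_cut = {v: succ for v, succ in g2.items() if v not in common}
    table = {v: [s for s in g2[v] if s not in common] for v in common}
    return common, g2_cut, table

g1 = {"a": ["b"], "b": ["c"], "c": []}
g2 = {"b": ["c", "d"], "c": ["d"], "d": []}
common, g2_cut, table = split_common(g1, g2)
```

The common part ("b" and "c") stays only in g1; when a traversal of g1 reaches a common vertex, the table tells it which vertices of the cut-down g2 it may continue into.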
Non-contact methods for measuring angular and linear geometric parameters in textile structures are considered. In this paper we have developed the following algorithms: a diffraction pattern modeling algorithm based on the fast Fourier transform; an algorithm for determining the yarn twist angle by a digital photography of its structure; an algorithm for determining the skewness of the weft thread in fabric; an algorithm for measuring the distance between adjacent elements of the structure using the method of the Double Fourier transform.
In the article we consider the problem of vulnerability detection in machine code. The paper highlights the disadvantages of current solutions with respect to detecting vulnerabilities that threaten the confidential information processed by the vulnerable software. To solve this problem, we propose an original model of vulnerability detection in a program trace, along with its algorithmic support and software implementation. The model provides formal criteria to distinguish a bug from a vulnerability, taking into account the distribution of protected information in the memory of the software under test. We use the tainted data analysis technique to highlight such memory regions. In addition, we conduct an experimental evaluation of the developed system's efficiency, which demonstrates that our solution detects five more types of Windows software vulnerabilities and four more types of Linux software vulnerabilities than existing analogs.
The rapid development of information technology in recent decades has led to a substantial increase in the amount of software source code, as well as in its complexity. This fact points to the high complexity of software analysis carried out with the aim of understanding the logic of its functioning. Such analysis is important when conducting computer forensics. The article deals with an approach to automating the identification of standardized data-conversion algorithms in executable software modules in the absence of source code, by taking into account their internal data connections, in order to facilitate understanding of the programs.
An algorithm for constructing a flexible program for diagnosing a technical object by the criterion of the value of received information is presented. Diagnostic signs that have a continuous form of representation are considered. A numerical example of the algorithm implementation is given.
The article investigates controlled substitution-permutation networks based on the managed elements F 4/2 as a primitive of block encryption algorithms. The relevance of the research stems from its focus on the design of high-speed hardware ciphers. The scientific and practical significance of the results lies in improving the efficiency of high-speed hardware implementations of encryption algorithms designed to protect information in information and telecommunication systems and networks.
The paper describes an original algorithm for heterogeneous data clustering based on the combined application of a set of distance measures, several clustering methods, and multi-stage clustering. The algorithm ranks the attributes of an object by their importance for grouping, chooses an optimal attribute set, and uses an ensemble approach to obtain the final clustering solution. The algorithm is implemented in the MixDC (Mixed Data Clustering) software system. The technique and the results of solving a real medical data clustering problem in the software system are described.
This paper considers how the parameters of a multi-iterative hashing algorithm with several modifiers influence its cryptographic strength. The relevance of applying such an algorithm and the need to study its parameters are justified, and the algorithm is described. The strength of a hash function against attacks that do not depend on the algorithm is determined by its bit length, i.e., by the number of unique hash values the function can generate. To estimate the algorithm's resistance to dictionary attacks and to brute-force and birthday attacks, the multi-iterative hashing algorithm with several modifiers is treated as an independent hash function. Its strength for a given number of iterations is estimated by calculating the average bit length of an equivalently strong hash function. A method for this estimation is described. The experiments were performed using a truncated cryptographically strong hash function. Their results make it possible to compare the algorithm's strength under different parameter values, and to understand how individual parameter values and their combinations affect resistance to dictionary, brute-force, and birthday attacks. On this basis, conclusions can be drawn about the parameter values recommended for practical application of the algorithm. In conclusion, the paper summarizes the main results. The authors believe that the algorithm can find application in authentication subsystems of information systems, as well as in systems where long-term strength is the most important requirement.
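A minimal sketch of the general idea of multi-iterative hashing with modifiers. The mixing rule, the use of SHA-256, and the round-counter injection below are illustrative assumptions, not the specific algorithm studied in the paper:

```python
import hashlib

def multi_iterative_hash(message: bytes, modifiers: list, iterations: int) -> str:
    """Repeatedly rehash the running digest, mixing in one of the
    modifiers and the round index on every iteration."""
    digest = hashlib.sha256(message).digest()
    for i in range(iterations):
        modifier = modifiers[i % len(modifiers)]
        digest = hashlib.sha256(digest + modifier + i.to_bytes(4, "big")).digest()
    return digest.hex()
```

Increasing `iterations` raises the cost of dictionary and brute-force attacks roughly linearly, while the output bit length (256 here) bounds resistance to birthday attacks, mirroring the trade-offs the paper analyzes.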
Creation of a language model is one of the stages of training a continuous speech recognition system. The paper describes the developed software for creating a syntactic-statistical Russian language model from a text corpus. The main stages of the algorithm are preliminary processing of the text material, creation of a statistical n-gram language model, and extension of the statistical model with n-grams obtained by syntactic analysis. Syntactic analysis makes it possible to increase the number of distinct bigrams created during text processing and to improve the quality of the language model by extracting grammatically connected word pairs. The results of testing the language models created with the software module are presented.
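As a hedged sketch of the statistical stage, a bigram model can be built from raw adjacency counts and then extended with extra grammatically connected pairs. The whitespace tokenizer and the form of the syntactic pairs are illustrative assumptions, not the described software:

```python
from collections import Counter

def bigram_counts(sentences):
    """Count adjacent word pairs over a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts

def extend_with_syntactic_pairs(counts, pairs):
    """Add grammatically connected word pairs (e.g. head-dependent
    pairs from a syntactic parser) to the bigram statistics."""
    extended = counts.copy()
    extended.update(pairs)
    return extended
```

The second function captures the paper's key step: bigrams that never occur adjacently in the corpus can still enter the model if a parser links the two words grammatically.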
The paper proposes an original algorithm for solving an applied graph-theory problem: finding k maximum flows between two given vertices of a graph. The described approach combines the Ford-Fulkerson algorithm (in the Edmonds-Karp or Dinitz variant) with an algorithm for breadth-first construction of a truncated state tree within a single optimizing cycle.
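A minimal sketch of the underlying building block, the Edmonds-Karp variant of Ford-Fulkerson. The adjacency-matrix representation is an illustrative choice; the paper's truncated state tree construction is not shown:

```python
from collections import deque

def edmonds_karp(capacity, source, sink):
    """Maximum flow via shortest augmenting paths found by BFS."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:  # no augmenting path remains
            return total
        # bottleneck residual capacity along the found path
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v] - flow[u][v])
            v = u
        # augment flow along the path
        v = sink
        while v != source:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
        total += bottleneck
```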
The article considers new approaches to assessing people's health based on an integrated indicator. A mathematical model for calculating the integrated indicator is described. The proposed model and indicator were tested on patients of medical health care institutions.
The merits and demerits of direct and iterative methods for solving large systems of linear algebraic equations (BD LAES) are shown. The article offers a new "direct" method (algorithm) for solving BD LAES with varied parameters. It effectively uses a basic solution of the LAES and information about matrix sparseness; in tasks where BD LAES need to be solved repeatedly, it significantly increases the speed of computational algorithms by reducing the number of computing operations, and it lowers the requirements for the random access memory of computers.
The paper considers the task of controlling the process of information interaction in a heterogeneous virtual network of cyber-objects. We propose an infrastructure model that allows using various OSI transport layer technologies, including multi-protocol wireless data exchange tools. The simulation results for access to telematics services confirm the possibility of creating sustainable delay-tolerant virtual channels.
Optimizing work with multimedia resources in order to reduce the amount of data transmitted between users is one of the problems of videoconferencing applications. The paper describes algorithms and software that allow optimizing a cross-platform videoconferencing application. The main stages of a videoconferencing application are: creation and deletion of audio and video data streams, their transmission from the server to the client and back, creation of chains of streams, and their search on the server. These stages are present in any videoconferencing application, and they have to be optimized because they contain key processes and because of the complexity of the application architecture. Therefore, in the course of the work, the client part of the application was simplified and the structure of the server side was reorganized. In transmit-receive mode, the developed application after optimization consumed almost 10 times less RAM and 2 times less CPU than the program "Skype".
During automatic speech processing a number of problems appear, among them speech variation and various kinds of speech disfluencies. This article presents different types of speech disfluencies and their causes, as well as an algorithm for their automatic detection based on the analysis of acoustic parameters. A cross-correlation method was used to detect voiced hesitation phenomena, and a band-filtering method was used to detect unvoiced hesitation phenomena and artefacts. The experiments were performed on a specially collected corpus of spontaneous Russian map-task and appointment-task dialogs. They showed that voiced hesitation phenomena are detected with 80% accuracy, and unvoiced hesitation phenomena and artefacts with 66% accuracy.
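A hedged sketch of the cross-correlation idea: voiced hesitations (e.g. filled pauses) are nearly stationary, so consecutive short signal frames correlate strongly with each other. The frame handling and threshold below are illustrative assumptions, not the paper's settings:

```python
def norm_corr(x, y):
    """Normalized cross-correlation of two equal-length frames at lag 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def is_voiced_hesitation(frames, threshold=0.9):
    """Flag a frame sequence whose consecutive frames are all highly
    self-similar: a rough cue for a stationary voiced hesitation."""
    return all(norm_corr(frames[i], frames[i + 1]) > threshold
               for i in range(len(frames) - 1))
```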
The main goal of this paper is to create an algorithm for generating a thesaurus of synonyms. Modern search engines use such thesauri for query expansion. This approach allows returning not only documents containing words from the query, but also documents containing their synonyms or semantically similar terms. A semi-automatic method for training a named entity recognizer was developed as part of this work. A semi-automatic method for validating the extracted entities is also given.
The problem of correcting the operation plan of a ground-based space monitoring information system is considered in this paper. A generalized algorithm for positional control construction is proposed and illustrated with a numerical example.
One approach to detecting network anomalies is analyzing the parameters of network functioning. Characteristics calculated from wavelet coefficients are indeed more sensitive to changes in a series than characteristics calculated directly on the series, but they require more computation, so spectral-time algorithms must be optimized for application in real-time systems. In addition, there are different approaches to implementing wavelet expansions, each with its own trade-offs in informativeness (the number of detail coefficients), reliability of values, and computational complexity of the transformations. The article offers a reasoned approach to implementing these algorithms for use in real-time anomaly detection systems.
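As a hedged illustration (the Haar wavelet and the detail-energy score are illustrative choices, not the article's specific algorithm), one level of a wavelet decomposition splits a series into a smoothed approximation and detail coefficients whose energy reacts sharply to abrupt changes:

```python
def haar_level(series):
    """One level of the Haar wavelet transform (even-length input)."""
    approx = [(series[2 * i] + series[2 * i + 1]) / 2
              for i in range(len(series) // 2)]
    detail = [(series[2 * i] - series[2 * i + 1]) / 2
              for i in range(len(series) // 2)]
    return approx, detail

def detail_energy(series):
    """Sum of squared detail coefficients: a simple anomaly score."""
    _, detail = haar_level(series)
    return sum(d * d for d in detail)
```

A smooth traffic series yields near-zero detail energy, while a sudden spike produces a large score, which is the sensitivity the article attributes to wavelet-based characteristics.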
A two-stage scheme for synthesizing minimal join graph subsets is proposed; it involves the construction of three sets (stereoseparators, their possessions, and necessary edges) over a given subalphabet set, with the subsequent construction of a set of sinews of a certain kind for each stereoseparator. Algorithms implementing both stages are systematized and their complexity is estimated.
In the theory of algebraic Bayesian networks (logical-probabilistic graphical models representing knowledge with uncertainty using interval estimates of the probabilistic truth values of propositional formulas), the notion of consistency of the knowledge contained in the knowledge system has been formalized. This paper analyzes the algorithm for propagating received evidence from the point of view of preserving consistency during its execution. An improvement of an existing algorithm that provides consistency preservation is proposed.
The problem of constructing the secondary structure of an algebraic Bayesian network (ABN) from a known primary structure was outlined in ABN theory. Logic-probabilistic inference can be performed only when the secondary structure is a join graph. An algorithm for randomized synthesis of a minimal join graph is formulated in the paper. A theorem is proved stating that every possible minimal join graph for the ABN's primary structure is selected with positive probability.
The ability of software to analyze current operating conditions, including the current state of a user, the physical environment, and computing resources, as well as to dynamically adapt the scenario of interaction with the user, is one of the main requirements in the development of smart space prototypes. Controlling the set of software and hardware modules involved in a smart space becomes difficult as the number of tasks and users increases, so mathematical and software tools that implement control of distributed modules are required. The paper describes the structure of the model for distributed module control in a smart meeting room, as well as the multimodal interfaces used for natural human-computer interaction.
A condition for the performance of global logic-probabilistic inference algorithms in algebraic Bayesian networks (ABN) is the absence of cycles in the secondary structure. A primary structure on which an acyclic secondary structure can be synthesized is called acyclic. The goal of the work is to propose an algorithm that detects primary structure acyclicity based on estimates of the number of edges in its secondary structure, without directly constructing the secondary structure, and to estimate the algorithm's complexity. An algorithm for detecting ABN primary structure acyclicity, based on estimating the number of minimal join graph edges via brute force, is formulated; its correctness is proven, its complexity is estimated, an improvement in its speed is proposed, the improved algorithm's correctness is proven, and its running time is estimated. The possibility of further improving performance through the use of algorithms for synthesizing ABN tertiary polystructure elements is also discussed.
The generalized dynamic model of the control of corporate information system is considered. The scheme and the generalized algorithm of the positional control construction using the position optimization method are proposed.
The goal of this paper is to obtain a wavelet decomposition (wavelet refinement) of the chain of embedded spaces of splines for an arbitrary refinement of a nonuniform grid, to derive the corresponding decomposition and reconstruction formulas, and to construct wavelet decompositions and decomposition and reconstruction algorithms in the case of an infinite flow for a grid on an open interval and a finite flow for a grid on a segment.
The paper is devoted to proving upper bounds on the number of steps of logic-objective algorithms for recognizing a complicated image situated on a display screen. It is proved that the problem of separating and recognizing an etalon object in a complicated scene has a polynomial algorithm. The problem of separating and recognizing an object from a class whose description contains only the distinctive attributes of that class is also considered. To decrease the number of algorithm steps, the notion of a "fuzzy image" is introduced. The problem of image recognition invariant under rescaling is also regarded.
Algebraic Bayesian networks (ABN) belong to a class of logical and probabilistic graphical models of systems of knowledge with uncertainty. ABN allows to use interval probability estimates to represent uncertainty in knowledge. One of the most important conditions for ABN performance capability is the absence of cycles in its secondary structure. The primary structure, on which an acyclic ABN can be synthesized, is called acyclic primary structure. The goal of the work is to propose an algorithm for detection of the primary structure acyclicity on the basis of analysis of the quaternary structure of the ABN, as well as evaluation of the algorithm complexity. The algorithm for acyclicity detection is formulated, its correctness is proven, its complexity is estimated and a number of improvements for the acceleration of this algorithm are proposed.
The role of the algebraic Bayesian network (ABN) polystructure has increased significantly. It was originally introduced as an auxiliary object for secondary structure synthesis, but the tertiary polystructure has since found use in analyzing secondary structure cyclicity without direct synthesis of that structure. Now the tertiary polystructure is expected to be used for global inference in ABN. The goal of the work is the selection (with subsequent systematization and complexity estimation) of the existing algorithms for synthesizing tertiary polystructure elements. The existing algorithms are overviewed and their complexity is estimated in the paper. Four algorithms for synthesizing the empty graph over useful clique subsets and two algorithms for synthesizing the parent graph over a stereoclique set are presented.
The paper presents algorithmic complexity estimates for local a posteriori inference in algebraic Bayesian networks. We consider ways of implementing the inference for three types of evidence (deterministic, stochastic, and imprecise). When linear programming tasks must be solved for inference, the complexity estimates are given in the number of such tasks and the numbers of variables and constraints in each task. In other cases, complexity estimates are given in the number of arithmetic operations.
The algebraic Bayesian network (ABN) tertiary structure, represented as a clique graph, is important for synthesizing and analyzing the ABN secondary structure, as well as for analyzing the ABN primary structure. Two tertiary structure synthesis algorithms are proposed: a descendants-based clique tree synthesis algorithm and a bottom-up clique tree synthesis algorithm; their validity is proved and their computational complexity is estimated in the article. For a given maximal knowledge pattern set, both algorithms synthesize two ordered sets containing the set of vertices and the set of sons of each clique. Examples of ABN primary structures are given for which the first algorithm works faster than the second one, and vice versa. The existence and uniqueness of the ABN tertiary structure for every ABN primary structure are also stated.
The article is scientific-methodical in character and is devoted to a formalized statement of the problem of situation analysis in trade markets in the interest of carrying out effective trading operations. The major factors that determine the highest level of uncertainty when forming estimates of financial market states are presented. The basic efficiency criteria of situation analysis from the point of view of systems theory are considered, as is the traditional approach to market state analysis based on statistical system synthesis. Basic methods for constructing statistically robust algorithms for estimating trading situations are presented.
An effective algorithm for synthesizing the set of minimal join graphs (of self-managing cliques) from a given maximal knowledge pattern set exists, along with two improvements, each realized in a separate algorithm; however, no algorithm implements both improvements. The goal of the work is to design a new algorithm that realizes all known improvements of the basic algorithm and is thus the most effective one. Such an algorithm is designed and the correctness of its work is proven.
The paper describes algorithms for algebraic Bayesian network reconciliation. We prove the correctness of these algorithms and provide computational complexity estimates.
An effective algorithm for synthesizing the set of minimal join graphs (of self-managing cliques) from a given maximal knowledge pattern set exists, but it can be improved by engaging results from the developing theory of algebraic Bayesian network global structure. The goal of the work is to improve the known algorithm by refining the construction of the sets of vertices that the cliques include: instead of an exhaustive search over all pairs of a vertex and a clique, a targeted search for the descendants of each clique is suggested. A new algorithm of self-managed possession cliques implementing the suggested improvement is designed.
After designing testing scripts, it is sometimes necessary to analyze their properties. To make script specifications formal, one can use left-context terminal grammars. The article proves the equivalence of left-context terminal grammars and context-free grammars and develops algorithms for analyzing properties of such grammars that are useful in analyzing testing script properties.
An effective algorithm for synthesizing the set of minimal join graphs (of self-managing cliques) from a given maximal knowledge pattern set is known, but it can be improved by engaging the developed theory of algebraic Bayesian network global structure. The goal of the work is to improve the known algorithm by refining the construction of possessions (connection components of strong narrowings), the key objects for the set synthesis: to design them not by straight search but by analyzing the intersections of the corresponding cliques' children. A new algorithm implementing the suggested improvements is designed.
A multiple-model description of the interaction between a ground-based control complex (GCC) and an orbital system (OrS) of navigation spacecraft (NS) is presented. A dynamic interpretation of operations and control processes is implemented. The proposed approach makes it possible to apply fundamental results of modern control theory to new applied problems. In particular, a scheduling problem for GCC ground-based technical facilities was reduced to a boundary problem with the help of the local section method. Scheduling problems of the considered class are usually solved via discrete programming methods, but when the dimensionality is high, the optimal solution is not guaranteed and heuristic algorithms are needed. This paper introduces an original approach to high-dimensional scheduling problems based on models and methods of optimal control theory.
A scheme of an algorithm that designs the set of minimal join graphs from a given maximal knowledge pattern set is known, but it can be improved by engaging the developed theory of algebraic Bayesian network global structure. The goal of the research is to improve the known algorithm. Three improvements are suggested and justified: 1) exclusion of insignificant narrowings, 2) exclusion of cliques with a single possession, and 3) prior processing of childless cliques with a single edge. A new algorithm implementing the suggested improvements is designed.
The paper describes an algorithm for synthesizing the secondary structure of an algebraic Bayesian network (ABN) from its primary structure. The secondary structure of an ABN is a join graph with a minimal number of edges. A proof of the algorithm's correctness is given.
Image registration is one of the basic problems of computer vision. It arises in optical flow estimation, stereo vision, and tracking problems. One of the classical approaches, proposed by B. Lucas and T. Kanade, is based on optimization of a cost function. In this article an image registration algorithm based on the Lucas-Kanade approach (a random sampling algorithm) is proposed. It shows high performance results.
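A hedged one-dimensional sketch of the Lucas-Kanade idea (the 1-D signal, the single Gauss-Newton step, and the central-difference gradient are simplifying assumptions; the article's random sampling algorithm is not shown): linearizing the shifted signal as `moved(x) ≈ ref(x) - d·ref'(x)` gives a closed-form least-squares estimate of the translation `d`.

```python
def lucas_kanade_shift(ref, moved):
    """One Gauss-Newton step estimating the translation d such that
    moved(x) ≈ ref(x - d), from spatial and temporal differences."""
    # central-difference spatial gradient of the reference signal
    ix = [(ref[i + 1] - ref[i - 1]) / 2 for i in range(1, len(ref) - 1)]
    # temporal difference between the two signals
    it = [moved[i] - ref[i] for i in range(1, len(ref) - 1)]
    num = sum(g * t for g, t in zip(ix, it))
    den = sum(g * g for g in ix)
    return -num / den
```

In two dimensions the same derivation yields a 2x2 linear system per window, and iterating the step handles larger displacements.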
A priori inference is a key feature in intellectual decision support systems based on imperfect probabilistic knowledge. The paper describes a set of local a priori inference algorithms in algebraic Bayesian networks.
A formal algorithm for recognizing chain and cyclic rules in context-free grammars is presented and substantiated.
The article is devoted to Data Mining (DM) algorithms, which form the basis of a new type of automatic control of multivariate dynamic processes. DM methods combine immediate control decisions with deep numerical analysis of retrospective data. New conceptual principles of analytic control allow DM to be a separate part of information technology.
The article is devoted to APC (Advanced Process Control) applications in automated process control. The developed program complex "Matrix" is offered for elaborating forecasting models and, in turn, for optimizing multiple process control according to a chosen efficiency criterion and available restrictions.
Important problems in structure dynamics control (SDC) theory are now the evaluation and estimation of the stability, goal-oriented, and informational-technological abilities of modern information systems (IS). Our investigation shows that this problem can be solved by constructing attainability sets for SDC models. Preliminary evaluation of attainability sets for the SDC models makes it possible to reduce the time required to solve IS SDC tasks, so in practice various approximations of attainability sets are used. Methods and algorithms for evaluating attainability sets for SDC models are proposed.
The article considers the implementation, in a graphical editor, of the effluence and confluence operators widely used in practice when building algorithmic network models. In the previous version of the graphical editor these operators were realized only in a restricted form. Difficulties of implementing the full form of the operators in the program, and possible ways of solving them, are considered.
A formal algorithm for recognizing infinite rules in context-free grammars is presented and substantiated.
Collective recognition is a task in which multiple classifiers are used to make decisions concerning the class of the same entity (object, situation, etc.), with subsequent combining and agreement of the individual decisions based on some special algorithm. Currently this research direction in pattern recognition and classification is of topmost interest within the information technology community. The reason is that methods and techniques of collective recognition demonstrate new capabilities with regard to pattern recognition and classification accuracy, on the one hand, and, on the other, are increasingly used in complex large-scale applications. This paper surveys the state of the art of research in this scope, covering the period from the 1950s to the present. — Bibl. 70 items.
Numerical algorithms for the optimal control of nonlinear non-stationary dynamic objects with restrictions on phase coordinates are investigated in the state space (SS). The control actions are limited to a class of step functions, positive and negative impulses that generate binary trees in the SS. The dynamic process is interpreted as the growth of a binary tree whose nodes fall into various areas of the SS (clusters). The purpose of control is to get one or several nodes into a specific cluster (the target set) during the growth of the tree. The present work considers a new numerical method for searching for optimal control with adaptation to the restrictions of an external environment, called the method of binary trees.
An enhancement of algorithmic networks is offered through comparison with classical programming languages.
The problems that are solved during the development of a software product project using mathematical models are defined. Diagrams of the technological and information interaction between the models are shown. The basic characteristics of the project are defined. A description of the models included in the complex (evaluating and tracking models) is given.
An algorithm for calculating the area of a polygon with non-intersecting edges from the coordinates of its vertices is presented in this paper.
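A standard way to compute this is the shoelace formula; the sketch below is an illustrative implementation, not necessarily the paper's algorithm, and assumes the vertices are listed in order around a simple (non-self-intersecting) polygon:

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon given ordered vertices."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2  # abs() makes vertex orientation irrelevant
```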
New program solutions that use real possibilities for creating self-instructing and analytical systems are offered. The possibilities of creating intellectual support for decision making on the basis of predicate constructions are discussed.
Research on the multiphase method and a robust fuzzy-logic control algorithm based on the shape memory model for assembly robots is presented, with reference to the task of coordinate measurement for autonomous assembly robots and space manipulators. Also presented are the software and instrumentation of the optical television system used with a 6-coordinate assigning glove for controlling the robot in real time, the results of experimental research on dynamic parameters and accuracy estimates of the optical television system, and experiments on using the teaching-by-showing method for assembly manipulators. This paper is supported by Project INTAS № 96-049 "Assembly robotics, robust force-torque control, optimal planning based on the fuzzy logic".
The article discusses the possibility of applying the formalism of algorithmic nets to automate the creation of a concurrent program from a sequential one.
Methods of structural synthesis and common principles of invariant analysis of complicated nonlinear mathematical models are considered. The analytic form of the models is that of multiparameter differential equations or dynamic systems with control. The concepts of formal models of polynomial type, formal integral manifolds, and differential complexes are introduced, and examples of algorithms are given. Models of polynomial and singular type, converted and controlled models, and models of ecological-minimum type are considered. Examples from the mathematical handbook are given.
On the Internet, "fake news" is a common phenomenon that frequently disturbs society because it contains intentionally false information. The issue has been actively researched using supervised learning for automatic fake news detection. Although accuracy is increasing, it is still limited to identifying fake information through channels on social platforms. This study aims to improve the reliability of fake news detection on social networking platforms by examining news from unknown domains. In particular, information on social networks in Vietnam is difficult to detect and prevent because everyone has equal rights to use the Internet for different purposes. These individuals have access to several social media platforms, and any user can post or spread news through online platforms. These platforms do not attempt to verify users or the content of their posts. As a result, some users try to spread fake news through these platforms to propagate against an individual, a society, an organization, or a political party. In this paper, we propose analyzing and designing a model for fake news recognition using deep learning (called AAFNDL). The method is as follows: 1) first, we analyze existing techniques such as Bidirectional Encoder Representations from Transformers (BERT); 2) we proceed to build the model for evaluation; and finally, 3) we apply some modern techniques to the model, such as deep learning and classifier techniques, to classify fake information. Experiments show that our method can improve on other methods by up to 8.72%.
There is a constant need to create methods for improving the quality indicators of information processing. In most practical cases, the ranges of target variables and predictors are formed under the influence of external and internal factors. Phenomena such as concept drift cause the model to lose its completeness and accuracy over time. The purpose of the work is to improve the processing data samples quality based on multi-level models for classification and regression problems. A two-level data processing architecture is proposed. At the lower level, the analysis of incoming information flows and sequences takes place, and the classification or regression tasks are solved. At the upper level, the samples are divided into segments, the current data properties in the subsamples are determined, and the most suitable lower-level models are assigned according to the achieved qualitative indicators. A formal description of the two-level architecture is given. In order to improve the quality indicators for classification and regression solving problems, a data sample preliminary processing is carried out, the model’s qualitative indicators are calculated, and classifiers with the best results are determined. The proposed solution makes it possible to implement constantly learning data processing systems. It is aimed at reducing the time spent on retraining models in case of data properties transformation. Experimental studies were carried out on several datasets. Numerical experiments have shown that the proposed solution makes it possible to improve the quality processing indicators. The model can be considered as an improvement of ensemble methods for processing information flows. Training a single classifier, rather than a group of complex classification models, makes it possible to reduce computational costs.
We apply a machine learning model to determine the optimal strategy in an online auction for renting computing resources using the best-choice model. The best-choice model allows clients to minimize the expected cost of renting a computing resource based on the spot price distribution function. The spot price dynamics of the platform are investigated. The most suitable price distributions in an auction are the normal distribution and its mixtures. In this case, the problems of determining the number of components in the mixture and estimating its parameters arise. One well-known method for determining the number of components in a mixture of normal distributions is the BIC criterion. The EM algorithm is the basic tool for estimating the parameters of a mixture of distributions when the number of components is known. However, parameter estimation by this method takes more time as both the sample size and the number of mixture components increase. To automate and expedite the process of determining the number of components of a mixture of normal distributions and estimating its parameters, a classification machine learning model based on a convolutional neural network is developed. The results of model training and validation are presented. The suggested model is compared with other algorithms that do not use neural networks. The results show that the suggested model performs well in determining the most appropriate number of components for a mixture of normal distributions and in reducing the time spent on applying the EM algorithm to estimate its parameters. This model can be used in different areas, for example in finance, or for determining the optimal strategy in an online auction for renting computing resources.
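A minimal sketch of the baseline the paper builds on: EM for a one-dimensional mixture of normals plus BIC-based selection of the component count. The 1-D restriction, the quantile initialization, and the fixed iteration count are simplifying assumptions; the paper's convolutional model is not shown.

```python
import math

def gauss_pdf(x, mu, sigma):
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def em_gmm_1d(data, k, iters=60):
    """EM for a 1-D Gaussian mixture with k components; means are
    initialized at evenly spaced quantiles of the data."""
    xs = sorted(data)
    n = len(xs)
    mus = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    sigmas = [max(xs[-1] - xs[0], 1e-6) / (2 * k)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            probs = [w * gauss_pdf(x, m, s)
                     for w, m, s in zip(weights, mus, sigmas)]
            z = sum(probs) or 1e-300
            resp.append([p / z for p in probs])
        # M-step: re-estimate weights, means, and standard deviations
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / len(data)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-6)
    return weights, mus, sigmas

def bic(data, weights, mus, sigmas):
    """Bayesian information criterion; lower is better."""
    ll = sum(math.log(sum(w * gauss_pdf(x, m, s)
                          for w, m, s in zip(weights, mus, sigmas)) or 1e-300)
             for x in data)
    n_params = 3 * len(mus) - 1  # k means, k sigmas, k-1 free weights
    return n_params * math.log(len(data)) - 2 * ll
```

Fitting models for k = 1, 2, ... and choosing the k with the lowest BIC is the selection loop whose cost the paper's neural model is meant to reduce.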
The use of various types of heuristic algorithms based on soft computing technologies for distributing tasks in groups of mobile robots performing same-type operations in a single workspace is considered: genetic algorithms, ant algorithms, and artificial neural networks. It is shown that this problem is NP-hard and cannot be solved by direct enumeration for a large number of tasks. The initial problem is reduced to typical NP-complete problems: the generalized problem of finding the optimal group of closed routes from one depot and the traveling salesman problem. A description of each of the selected algorithms and a comparison of their characteristics are presented. A step-by-step algorithm of operation is given, taking into account the selected genetic operators and their parameters for a given population size. The general structure of the developed algorithm is presented, which makes it possible to solve a multi-criteria optimization problem efficiently enough, taking into account time costs and an integral criterion of robot efficiency that accounts for energy costs, the functional saturation of each agent of the group, etc. The possibility of solving the initial problem using an ant algorithm and a generalized search for the optimal group of closed routes is shown. For multi-criteria optimization, the possibility of linear convolution of the obtained vector optimality criterion is shown by introducing additional parameters characterizing group control: the overall efficiency of the functioning of all robots, the energy costs for the functioning of the support group, and the energy for placing one robot on the work field. To solve the task distribution problem using the Hopfield neural network, the problem is represented in the form of a graph obtained during the transition from the generalized task of finding the optimal group of closed routes from one depot to the traveling salesman problem.
The quality indicator is the total path traveled by each of the robots in the group.
The accumulation of data on project management processes and standard solutions has made relevant the research on using knowledge engineering methods for a multi-criteria search for options that set optimal settings for project environment parameters. Purpose: Development of a method for searching and visualizing groups of projects that can be evaluated based on the concept of dominance and interpreted in terms of project variables and performance indicators. Methods: The enrichment of the sample while maintaining an implicit link between the project variables and performance indicators is carried out using a predictive neural network model. A set of genetic algorithms is used to detect the Pareto front in the multidimensional criterion space. The ontology of projects is determined after clustering options in the solution space and transforming the cluster structure into the criterion space. Automation of the search, in the multidimensional space, for the zone of greatest curvature of the Pareto front, which determines the equilibrium design solutions, as well as their visualization and interpretation, is carried out using a tree map. Results: A tree map is constructed at any dimension of the criterion space and has a structure that topologically corresponds to projections of shared cluster images from the multidimensional space onto a plane. For various types of transformations and correlations between performance indicators and project variables, it is shown that the areas of greatest curvature of the Pareto front are determined either by the contents of a whole cluster or by part of the variants representing the "best" cluster. If an undivided rectangle of a cluster is adjacent to the upper right corner of a tree map, then its representatives in the criterion space are well separated from the rest of the clusters and, when maximizing performance indicators, are closest to the ideal point. All representatives of such a cluster are effective solutions.
If the winning cluster contains dominant options inside the decision tree, then the "best" cluster is represented by the remaining options that set the optimal settings for the project variables. Practical relevance: The proposed methods of searching for and visualizing groups of projects can be used when choosing the conditions of resource, organizational, and economic modeling of the project environment, ensuring the optimization of risk, cost, functional, and time criteria.
The modeling of behavioral functions of animals (in particular, the modeling and implementation of the conditioned reflex) is considered. The current state of neural networks with the possibility of structural reconfiguration is analyzed. The modeling is carried out by means of neural networks built on the basis of a compartmental spiking neuron model with the possibility of structural adaptation to the input pulse pattern. The compartmental spiking neuron model is able to change its structure (the size of the cell body, the number and length of dendrites, the number of synapses) depending on the pulse pattern arriving at its inputs. A brief description of the compartmental spiking neuron model is given, and its main features are noted in terms of the possibility of its structural reconfiguration. The method of structural adaptation of the compartmental spiking neuron model to the input pulse pattern is described. To study the operation of the proposed neuron model in a network, the conditioned reflex, as a special case of the formation of associative connections, is chosen as an example. The structural scheme and the algorithm of formation of a conditioned reflex with both positive and negative reinforcement are described. The article presents a step-by-step description of experiments on the formation of associative connections in general and of the conditioned reflex (with both positive and negative reinforcement) in particular. A conclusion is made about the prospects of using spiking compartmental neuron models to improve the efficiency of implementing behavioral functions in neuromorphic control systems. Further promising directions for the development of neuromorphic systems based on spiking compartmental neuron models are considered.
Modern information technologies provide text manipulation processes with high efficiency. First of all, this means storing, editing, and formatting texts and their components. Having achieved significant success in developing tools for content-independent computer text processing, researchers then faced the problems of processing their content. Therefore, further steps in this direction are associated with the creation, among other things, of methods for automated purposeful manipulation of texts that takes their content into account. An analysis of works devoted to the problems of formal representation of texts and their subsequent use is carried out. Despite a number of successful projects, the challenges of the relationship between the content of a text and its meaning remain relevant. It seems that formalization of a general-purpose text while preserving its semantics is not feasible at this stage. However, there are types of texts that can be formalized while preserving their semantics. One of them is the regulatory text type, which is essentially a verbally expressed algorithm for a sequence of targeted actions. It is distinguished by logic and accuracy (lack of allegories), coherence and integrity, clarity and understandability (due to the lack of emotional coloring and figurative means), and accessibility (due to the use of specific terminology). In other words, when developing regulatory texts, the authors usually try to display the mechanisms of the described actions as clearly as possible. Purpose: development of a method for formalizing a regulatory text while preserving its semantics. Methods: structural linguistics, representation of objects in the form of an ontology, constructive algorithms. The use of this method is demonstrated by describing the solution of a system of algebraic equations. Results: a method for constructing a mathematical model of a regulatory text.
Practical relevance: the application of the developed method makes it possible to create software systems for building libraries of individual subject areas, to develop tools for evaluating regulatory texts for their certainty, completeness, coherence, and other characteristics, as well as simulators and self-learning tools.
A phase enlargement of semi-Markov systems that does not require determining the stationary distribution of the embedded Markov chain is considered. Phase enlargement is an equivalent replacement of a semi-Markov system with a common phase state space by a system with a discrete state space. Finding the stationary distribution of an embedded Markov chain for a system with a continuous phase state space is one of the most time-consuming and not always solvable stages, since in some cases it leads to integral equations whose kernels contain the sum and difference of variables.
For such equations only particular solutions are known, and no general solution exists to date. For this purpose, a lemma on the type of the distribution function of the difference of two random variables, provided that the first variable is greater than the second, is used.
It is shown that the type of the distribution function of the difference of two random variables under the indicated condition depends on one constant, which is determined by numerically solving the equation presented in the lemma.
Based on the lemma, a theorem on the difference of a random variable and a complicated renewal flow is formulated. The use of this method is demonstrated by the example of modeling a technical system consisting of two series-connected process cells, provided that both cells cannot fail simultaneously. The distribution functions of the system residence times in the enlarged states, as well as in the subsets of working and non-working states, are determined. The simulation results obtained by the considered method and by the classical method proposed by V. Korolyuk were compared and showed complete coincidence of the sought quantities.
The analysis revealed that social networks (VKontakte, Facebook), thematic communities in microblogging networks (Twitter), resources for travelers (TripAdvisor), and transport portals (Autostrada) are sources of up-to-date and timely information about the traffic situation, the quality of transport services, and passenger satisfaction with that quality. However, the existing transport monitoring systems do not contain software tools capable of collecting and analyzing traffic information located in the Internet environment. This paper discusses the task of building a system for automatically retrieving and classifying road traffic information from transport Internet portals, and tests the developed system by analyzing the transport networks of Crimea and the city of Sevastopol. To solve this problem, an analysis of open-source libraries for thematic data collection and analysis was carried out. An algorithm for extracting and analyzing texts was developed. A crawler was developed using the Scrapy package in Python 3, and user feedback on the state of the transport system of Crimea and the city of Sevastopol was collected from the portal http://autostrada.info/ru. For text lemmatization and vector text representation, the TF, IDF, and TF-IDF methods and their implementations in the Scikit-Learn library were considered: CountVectorizer and TfidfVectorizer. For word processing, the bag-of-words and n-gram methods were considered. During the development of the classifier model, the naive Bayes algorithm (MultinomialNB) and the linear classifier model with stochastic gradient descent optimization (SGDClassifier) were used. As a training sample, a corpus of 225,000 labeled texts from the Twitter resource was used. The classifier was trained using a cross-validation strategy and the ShuffleSplit method. Testing and comparison of the classification results were carried out.
According to the validation results, the linear model with the (1, 3) n-gram scheme and the TF-IDF vectorizer turned out to be the best. During trial operation of the developed system, reviews related to the quality of the transport networks of the Republic of Crimea and the city of Sevastopol were collected and analyzed. Conclusions are drawn, and prospects for further functional development of the developed tools are defined.
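The winning configuration (a linear SGD classifier over TF-IDF features with the (1, 3) n-gram scheme) can be sketched in scikit-learn; the toy texts and labels below are illustrative, not the Twitter corpus:

```python
# Minimal sketch of the best-performing setup reported in the abstract:
# TF-IDF features with unigrams through trigrams, linear SGD classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical tiny training set standing in for the 225,000-text corpus.
texts = ["heavy traffic jam on the bridge",
         "road repairs block the highway",
         "smooth ride, no delays today",
         "traffic is light and fast"]
labels = ["negative", "negative", "positive", "positive"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3)),  # the (1, 3) n-gram scheme
    SGDClassifier(random_state=0),        # linear model, SGD optimization
)
model.fit(texts, labels)
print(model.predict(["huge traffic jam near the bridge"]))
```

In the real system the pipeline would be trained under a cross-validation strategy (e.g. ShuffleSplit) rather than on the full sample at once.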
Currently, the coordinated use of groups of autonomous underwater vehicles seems to be the most promising and ambitious technology for solving the whole range of oceanographic problems. Complex and large-scale underwater operations usually involve long-duration activities of robotic groups under limited vehicle battery capacity. In this context, a charging station available within the operational area is required for long-term mission implementation. To ensure a high level of group performance, the following two problems have to be handled simultaneously and accurately: allocating all tasks between the vehicles in the group, and determining the recharging order over an extended period of time. In doing so, it should be taken into account that real-world underwater vehicle systems are only partially autonomous and may be subject to malfunctions and unforeseen events.
The article presents the suggested two-level dynamic mission planner based on a rendezvous point selection scheme. The idea is to divide a mission into a series of time-limited operating periods with a whole-group rendezvous at the end of each period. The high-level planner's objective is to construct the recharging schedule for all vehicles in the group, ensuring well-timed energy replenishment while preventing the simultaneous charging of too many robots. Based on this schedule, the mission is decomposed to assign a group rendezvous to each regrouping event (a robot leaving the group for recharging or joining the group after recharging). This scheme of periodic rendezvous allows the group to update its status regularly and to re-plan the current strategy, if needed, almost on the fly. The low-level planner, in turn, performs detailed group routing on the graph-like terrain for each operating period under the vehicles' technical restrictions and the tasks' spatiotemporal requirements. In this paper, we propose an evolutionary approach to the decentralized implementation of both path planners using specialized heuristics, solution improvement techniques, and an original chromosome-coding scheme. Both algorithm options for the group mission planner are analyzed in the paper, and the results of computational experiments are given.
The paper considers the optimization problem of tone approximation for monochrome (for example, grayscale) images. The tone approximation procedure reduces the number of tones used in displaying the approximated image compared to the number of tones in the original image. Optimizing the procedure consists in minimizing the visual quality losses, estimated by the total or mean deviation between corresponding pixels of the original image and the approximated one. A hybrid algorithm developed and investigated by the authors is used as the optimization tool. The hybrid algorithm combines heuristic and deterministic algorithms for searching for the best structure of the approximating palette according to the criterion of deviation minimization. The heuristic algorithm is based on the evolutionary-genetic paradigm. The main goal of the heuristic stage is to narrow the search area to the approximating palette structures closest to the optimum; this role was assigned to the heuristic stage because of its fast computation time. The goal of the deterministic algorithm of directed exhaustive search is to find the extremum nearest to the result obtained by the previous algorithm. The developed hybrid algorithm provides dual optimization of tone approximation, meaning that it yields a result in which two different criteria become optimal relative to each other. The current investigation considers the possibility of increasing the effectiveness of the hybrid algorithm at the heuristic stage. The possibility of implementing a parallel model of the evolutionary-genetic algorithm with different settings is considered. The results of initial experiments are discussed and compared with a known tone approximation algorithm.
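As a rough illustration of the optimization target (not the authors' hybrid genetic plus directed-search procedure), a one-dimensional Lloyd/k-means pass can approximate a grayscale image with k tones while measuring the total deviation:

```python
# Tone approximation sketch: pick k gray levels minimizing pixel deviation.
# A simple 1-D Lloyd iteration stands in for the hybrid algorithm.
import numpy as np

def approximate_tones(pixels, k, iters=20):
    """Return (palette, total_deviation) for a k-tone approximation."""
    levels = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assign each pixel to its nearest tone, then recenter each tone.
        nearest = np.abs(pixels[:, None] - levels[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(nearest == j):
                levels[j] = pixels[nearest == j].mean()
    nearest = np.abs(pixels[:, None] - levels[None, :]).argmin(axis=1)
    deviation = np.abs(pixels - levels[nearest]).sum()
    return levels, deviation

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=1000).astype(float)  # toy "image"
palette, dev = approximate_tones(img, k=8)
print(len(palette))
```

The hybrid algorithm in the paper searches the space of such palette structures with an evolutionary-genetic stage before refining with directed exhaustive search; this sketch shows only the deviation criterion being minimized.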
Modern video coding standards have high coding efficiency, but encoding performance has to be improved to meet the demands of growing multimedia applications. The paper deals with the entropy encoding methods and algorithms in the video coding standards H.264/AVC and H.265/HEVC. Context-based Adaptive Variable Length Coding (CAVLC) for the H.264/AVC standard was originally designed for lossy video coding and, as such, does not yield adequate performance for lossless video coding. Context-Adaptive Binary Arithmetic Coding (CABAC) is a method of entropy coding first introduced in H.264/AVC and now used in the H.265/HEVC standard. While it provides high coding efficiency, the data dependencies in H.264/AVC CABAC make it challenging to parallelize and thus limit its throughput. Accordingly, during the standardization of entropy coding for HEVC, both coding efficiency and throughput were considered. Based on an analysis of their advantages and disadvantages, an entropy coding algorithm using enumerative coding with a hierarchical approach is proposed. The proposed algorithm combines the Context-Adaptive Binary Arithmetic Coding algorithm and the enumerative coding algorithm with a hierarchical approach. The proposed algorithm was tested in the Visual C++ development environment on various test video sequences. The experimental results showed greater coding efficiency for multimedia data (the proposed method reduces the storage volume by up to 15% on average compared to the traditional CABAC method), while requiring approximately twice as long a coding time. The proposed method can be recommended for use in telecommunication systems for the storage, transmission, and processing of multimedia data where a high degree of compression is the primary requirement.
We propose an algebraic approach to constructing a multiple control system for nonlinear multidimensional objects with chaotic regimes. The purpose of control is to drive an object to an analytically prescribed stable state. Two variants of the description of control objects, continuous and discrete, are considered. The controlled objects are presented as systems of ordinary nonlinear differential or difference equations with chaotic behavior for certain combinations of parameters. Part of the object description may be unknown. The suggested control algorithm is realized on the basis of methods of nonlinear control on manifolds, Lyapunov functions, and the algebraic approach to the synthesis of correct algorithms. Two applied problems of economic orientation, with continuous and discrete description models, are considered. Numerical modeling was carried out on real data of small enterprises. The results of the paper are expected to be used in economic decision support systems.
This paper is devoted to feature selection and evaluation in an automatic text-independent speaker verification task. In order to solve this problem a speaker verification system based on the Gaussian mixture model and the universal background model (GMM-UBM system) was used.
The application areas and challenges of modern automatic speaker identification systems are considered. An overview of modern speaker recognition methods and the main speech features used in speaker identification is provided. The feature extraction process used in this article is examined. The reviewed speech features used for speaker verification include mel-frequency cepstral coefficients (MFCC), line spectral pairs (LSP), perceptual linear prediction cepstral coefficients (PLP), short-term energy, formant frequencies, fundamental frequency, voicing probability, zero crossing rate (ZCR), jitter, and shimmer.
The experimental evaluation of the GMM-UBM system using different speech features was conducted on a 50-speaker set, and the results are presented. Feature selection was done using the genetic algorithm and the greedy adding-and-deleting algorithm.
The equal error rate (EER) equals 0.579% when using a 256-component Gaussian mixture model and the obtained feature vector. Compared to the standard 14-dimensional MFCC vector, a 42.1% EER improvement was achieved.
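The GMM-UBM decision statistic can be sketched as an average log-likelihood ratio; the data and model sizes below are toy-scale stand-ins (the paper uses 256-component mixtures over the selected feature vectors), and in a full system the speaker model is MAP-adapted from the UBM rather than trained independently:

```python
# Sketch of GMM-UBM verification: score a trial by the average
# log-likelihood ratio of speaker model vs. universal background model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
ubm_feats = rng.normal(0, 2, size=(2000, 5))         # pooled "background" features
speaker_feats = rng.normal(1.5, 0.5, size=(300, 5))  # one target speaker

ubm = GaussianMixture(n_components=8, random_state=0).fit(ubm_feats)
speaker = GaussianMixture(n_components=8, random_state=0).fit(speaker_feats)

def llr_score(test_feats):
    """Average log-likelihood ratio: positive values favor the target speaker."""
    return speaker.score(test_feats) - ubm.score(test_feats)

target_trial = rng.normal(1.5, 0.5, size=(100, 5))
impostor_trial = rng.normal(-1.0, 2.0, size=(100, 5))
print(llr_score(target_trial) > llr_score(impostor_trial))
```

Thresholding this score yields the false-accept/false-reject trade-off from which the EER is read off.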
The purpose of this paper is to develop an algorithm for the analytical design of a consecutive compensator for a control system with delay, based on a modification of typical polynomial dynamic models. A formula relating the characteristic frequency and the cutoff frequency of the open-loop transfer function of the desired polynomial dynamic model is derived. Using this formula, the polynomial models are modified taking into account the value of the plant's delay element.
Control of a plant with delay using the consecutive compensator has several advantages: it requires a minimum amount of measurement data and eliminates the need to introduce an observer; there is no problem of non-zero initial conditions, which may arise during a short-term disruption of the normal functioning of the system; and the consecutive compensator is constructed by simple procedures for both SISO and MIMO systems.
A conventional Smith predictor exhibits poor stability when controlling systems with time-varying delay. In this paper, an improved adaptive PID-Smith predictor is proposed. It uses a PID controller as the primary controller, together with an estimator for the unknown time delay. The goal is to ensure system stability and resistance to modeling errors.
This article discusses two structures of the estimator unit: one based on a neural network and one based on a fuzzy controller. In the first variant, a genetic algorithm is used to find the optimal parameters of the estimator in offline mode. In the second variant, a fuzzy controller of the Takagi-Sugeno type uses a set of models with different delay times. At each time point, the output error is calculated for all models, and the output signal of the estimator is formed by a defuzzification rule. Simulation results show the effectiveness of the proposed modification of the Smith predictor.
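The multiple-model idea behind the second (fuzzy) estimator variant can be sketched as follows; the plant, candidate delays, and weighting rule are simplified illustrations, not the authors' controller:

```python
# Multiple-model delay estimation: run candidate models with different
# delays, weight each by its output error, defuzzify to a delay estimate.
import numpy as np

def estimate_delay(u, y, candidate_delays):
    """Weighted (Takagi-Sugeno style) delay estimate for y[t] = u[t - d]."""
    errors = []
    for d in candidate_delays:
        pred = np.roll(u, d)   # candidate model: pure delay of d samples
        pred[:d] = 0.0
        errors.append(np.mean((y - pred) ** 2))
    weights = 1.0 / (np.array(errors) + 1e-9)  # smaller error -> larger weight
    weights /= weights.sum()
    return float(np.dot(weights, candidate_delays))  # defuzzified estimate

rng = np.random.default_rng(0)
u = rng.normal(size=500)
true_delay = 7
y = np.roll(u, true_delay)
y[:true_delay] = 0.0
print(estimate_delay(u, y, [3, 5, 7, 9]))  # should be very close to 7
```

In the predictor, this estimate would update the delay term of the internal model at each time step, which is what restores stability under time-varying delay.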
The objective of the paper is to show the relationship between numbers belonging to known sequences of prime numbers and the quasi-orthogonal matrices existing for orders equal to these numbers, as well as the relationships among such matrices based on calculating algorithms. Methods: analysis of the sequences of quasi-orthogonal matrices with absolute and local maxima of the determinant, detection of structural invariants in the matrices, and matching of algorithms for calculating these matrices. Results: known sequences of natural numbers are considered, and the definition of a matrix associated with a natural number is formulated. Sequences of numbers for which the existence of associated quasi-orthogonal matrices has been proved are presented. It is conjectured that associated matrices exist for all natural numbers. Properties of these types of matrices and their relationships based on calculating algorithms are considered. Modified algorithms and general key chains of Euler and Mersenne matrices are presented, the sequence of whose orders is systemically important. Practical value: quasi-orthogonal matrices with absolute and local maxima of the determinant have immediate practical value for error-correcting coding, video compression, and masking tasks. Their diversity greatly facilitates the selection by developers of technical systems of a matrix optimal for a particular task.
To reduce the complexity of the structural synthesis task, it is divided into stages, during each of which the researcher conducts (with the help of decision support systems) the synthesis and analysis of model systems for the given input requirements and restrictions. Structural optimization, in this context, is reduced to finding the extremum of a certain objective function, whose value is controlled by specified design parameters depending on the type of task.
To demonstrate how the simulation model works, the functional synthesis of the structure of an enterprise information management system is considered, where the functional elements are the automated business processes, and the structural elements are the automation facilities. A test case is constructed in which typical processes of budgeting, marketing, purchasing and sales, production, and human resources are described as the functional elements.
The use of the developed model of functional synthesis is exemplified by the task of choosing software for the design of corporate information systems. On the basis of a series of experiments, we have determined the set of possible solutions with the greatest value of the fitness function. It is established that the function value is affected by the number of redundant functions contained in the selected structural elements.
This paper discusses the problems of applying and choosing cryptographic standards, taking into account user requirements and preferences. User profiles are created by means of the ontology apparatus. On the basis of user profiles and document features, an appropriate set of documents is formed, the elements of which are then ranked according to their degree of compliance with user requirements. Various filtering methods are used, such as collaborative filtering, content analysis and filtering, as well as hybrid methods combining both approaches. Thus, a recommender system for choosing cryptographic standards and algorithms is built. If there are several user selection criteria, it is reasonable to apply an integral index of an object's relevance to user preferences. This index is defined as the weighted sum of the particular indices.
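The integral index can be illustrated in a few lines; the indices and weights below are hypothetical:

```python
# Integral relevance index as a weighted sum of particular indices.
def integral_index(partial_indices, weights):
    """Weighted sum of particular relevance indices; weights sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(p * w for p, w in zip(partial_indices, weights))

# Hypothetical scores in [0, 1]: security strength, performance, compliance.
algorithms = {
    "alg_a": [0.9, 0.4, 1.0],
    "alg_b": [0.7, 0.9, 0.8],
}
weights = [0.5, 0.2, 0.3]  # user preference weights
ranked = sorted(algorithms,
                key=lambda a: integral_index(algorithms[a], weights),
                reverse=True)
print(ranked)
```

Ranking the candidate standards by this index is the final step of the recommender pipeline described above.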
The problem of forming movement trajectories for a group of vehicle robots operating in a two-dimensional environment with motionless obstacles is considered. Graph-analytical methods based on Dijkstra's, Bellman-Ford, and A* algorithms can be used to solve this task. An experiment including 100 iterations of computer modeling was carried out. The modeling results are data on the movement time of the vehicle robot group along trajectories developed by means of these algorithms. Based on the modeling results, the methods were compared, which revealed the most efficient one.
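One of the graph-analytical methods mentioned, Dijkstra's algorithm, can be sketched on a toy grid with motionless obstacles (the map is illustrative, not the authors' simulation environment):

```python
# Dijkstra's shortest path on a 4-connected grid; 1 marks an obstacle.
import heapq

def dijkstra(grid, start, goal):
    """Return the shortest path length from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > dist.get((r, c), float("inf")):
            continue  # stale heap entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(dijkstra(grid, (0, 0), (2, 0)))  # the obstacles force a 6-step detour
```

A* differs only in adding a heuristic term to the priority, and Bellman-Ford relaxes all edges repeatedly instead of using a priority queue, which is the trade-off the modeling experiment compares.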
The article considers the model and implementation features of a parallelizing program shell. Results are given of a comparative performance evaluation of solving the task of restoring access to data on various hardware, using both a sequential computing algorithm and an implementation based on the parallelizing program shell.
The goal of this work is to propose an algorithm that automatically determines whether a word is used ironically or literally, using semantic similarity. Previous approaches to this problem are reviewed. The term "irony" is defined. Two sets of statements are collected: the first contains statements with words used ironically, and the second contains statements with the same words used literally. Different methods of measuring semantic similarity are studied. A semantic-similarity-based algorithm that automatically determines whether a word is used ironically or literally is proposed.
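The semantic-similarity building block can be sketched as cosine similarity between a word vector and the averaged context vector; the toy vectors are hypothetical stand-ins for trained embeddings:

```python
# Word-context fit via cosine similarity: a low fit can signal that a
# word is being used non-literally (e.g. ironically) in its context.
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def context_fit(word_vec, context_vecs):
    """Similarity of a word vector to the average of its context vectors."""
    avg = [sum(v[i] for v in context_vecs) / len(context_vecs)
           for i in range(len(word_vec))]
    return cosine(word_vec, avg)

# Hypothetical 3-dimensional embeddings.
great = [0.9, 0.1, 0.0]
literal_ctx = [[0.8, 0.2, 0.1], [0.7, 0.1, 0.0]]
ironic_ctx = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]]
print(context_fit(great, literal_ctx) > context_fit(great, ironic_ctx))
```

Thresholding the fit (tuned on the two collected statement sets) would then yield the ironic/literal decision.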
OCR results for archival documents have to be corrected in order to improve accuracy. An algorithm is described that takes into account the peculiarities of the Russian language and allows handling large text corpora in fully automatic mode. The correction process is divided into the stages of analyzing the entire corpus of texts, preparing data structures, selecting candidate words, and their final ranking. Using a rank-rating model for generating text corrections makes it possible to handle texts containing specific terminology from different subject areas.
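The candidate-selection stage can be illustrated with similarity-based matching; the dictionary and token are illustrative, and `difflib` stands in for the rank-rating model described in the abstract:

```python
# Candidate selection for OCR correction: propose dictionary words
# similar to a possibly garbled token, ranked by similarity.
import difflib

# A corpus-derived dictionary would be built in the analysis stage;
# this tiny list is a hypothetical stand-in.
dictionary = ["document", "documents", "monument", "moment"]

def candidates(token, cutoff=0.75):
    """Ranked correction candidates for a possibly garbled OCR token."""
    return difflib.get_close_matches(token, dictionary, n=3, cutoff=cutoff)

print(candidates("docunent"))  # "document" should rank first
```

The final ranking stage would then re-order such candidates using corpus frequencies and language-specific features rather than raw string similarity alone.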
Methods and techniques of software design, as one of the important stages of software development, are described in the paper. A method of software design using UML together with Petri nets for analyzing the dynamic properties of a set of UML diagrams is described. The authors offer an improved method of integrating UML diagrams and Petri nets. The offered method was used for designing the software of an automated process control system (APCS) of a pumping station: use case, class, and object diagrams and a sequence diagram were designed, the latter being transformed into a Petri net with the help of formal rules. Analysis of the Petri net identified some incorrect states that occurred after pumps were enabled or disabled by the operator. The reachability tree of the system, containing about 10^6 nodes, was obtained by analyzing the Petri net. Testing of the offered approach was demonstrated on the example of the APCS of a pumping station.
The problem of constructing a level description of classes whose objects are characterized by the properties of their elements and the relations between them is under consideration in the paper. The problems of recognition and analysis of such objects are NP-hard, but if the descriptions of classes contain sufficiently short and frequently occurring sub-formulas, then it is possible to build a level description of classes that substantially decreases the exponent in the upper bounds on the number of steps of an algorithm solving the problem. Usually the extraction of these sub-formulas is left to the investigator's will. An approach to their automatic extraction is proposed in the paper.
The need for efficient string processing algorithms arises in many practical problems. One of the most universal approaches is the use of suffix trees. However, this data structure has high memory requirements, which limits its area of application. In this article we consider a way to partially eliminate this disadvantage and give an example of solving the problem of the longest symmetric substring. The described method can also be used for other problems.
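For the longest symmetric (palindromic) substring specifically, a memory-light alternative to a suffix tree is the expand-around-center scan, which needs only constant extra memory; a sketch (not necessarily the article's method):

```python
# Longest palindromic substring via expand-around-center: O(n^2) time,
# O(1) extra memory, versus the heavy memory footprint of suffix trees.
def longest_palindrome(s):
    """Return the longest palindromic substring of s."""
    best = ""
    # 2n - 1 centers: each character and each gap between characters.
    for center in range(2 * len(s) - 1):
        left, right = center // 2, center // 2 + center % 2
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left, right = left - 1, right + 1
        if right - left - 1 > len(best):
            best = s[left + 1:right]
    return best

print(longest_palindrome("bananas"))  # → "anana"
```

A suffix-tree (or suffix-automaton) solution achieves O(n) time for the same problem, so the choice is a time/memory trade-off of the kind the article discusses.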
Fundamental properties of the angular-spatial symmetry of radiation fields in a uniform slab of finite optical thickness are used to improve the numerical methods and algorithms of the classical radiative transfer theory. A new notion of so-called photometrical invariants is introduced. The basic boundary-value problem of radiative transfer theory is reformulated in the new terms for the subsequent simplification of the algorithms of numerical modeling methods such as the spherical harmonics, discrete ordinates, Gauss-Seidel, Case, and Hunt-Grant methods. This simplification leads to a two-fold decrease in the rank of the systems of linear algebraic equations, with a simultaneous reduction of the numerical modeling intervals connected with the angular and spatial variables.
A method for accelerating hierarchical image segmentation algorithms is proposed. The method is applicable when the functional of the decision rule does not require recalculating segment features on each iteration.
Rapid climate change is currently observed in the Arctic region, reflected in significant changes in the environment: the atmosphere, the ocean, and the ice cover. The research institutes of Rosgidromet, territorial administrations, and territorial centers for data reception and processing, together with monitoring systems for various types of situations, including the ice situation, regularly observe the state of the environment in the Arctic region. The received data require effective processing and analysis in order to study, monitor, and predict the states of the environment. The paper discusses methods of adaptive processing and analysis of meteorological and oceanographic data for two stages: data verification and regularization. A description of the processing algorithms for these stages is provided.
The article discusses the paradigm shift from traditional mathematical models of control theory to A. N. Kolmogorov's algorithmic theory of computer science. A comparison is made between the information of an identifiable object and Shannon's ensemble (entropy) information. The proposed algorithmic models are based on approximations of space-filling curves, regarded as self-similar recursive structures (a fractal approach).
The meaning of the terms "bioinformatics", "medical informatics", and "biomedical informatics" is discussed as applied to their goals, problems, and methods. Justification is given for the definition of biomedical informatics that is, in our opinion, the most complete. The milestones in the history of Russian biomedical informatics are listed, as well as the main scientific schools within this line of investigation headed by Russia's leading scientists. The article examines the activity of the laboratory of biomedical informatics and the characteristics of solving the problems of biomedical informatics at SPIIRAS.
The paper is devoted to investigating the possibilities of the Prolog language family for solving display image recognition problems. Difficulties that appeared in the Prolog implementation of the approach are marked out. It is shown how limiting the number of deduction algorithm steps allows these difficulties to be overcome. Examples of Prolog programs for separating an etalon image from a complex one are given. Peculiarities of using different recognizable image formats are analyzed.
The paper describes key points in the implementation of an algebraic Bayesian network knowledge pattern in the C++ programming language. The knowledge pattern is implemented as a class that handles and stores estimates for knowledge pattern elements. It also provides several methods for processing the knowledge pattern, such as consistency update and a posteriori inference.
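A minimal C++ sketch of such a class is shown below. The class name, the interval-estimate representation, and the simplified consistency rule are illustrative assumptions, not the paper's actual implementation: a real consistency update for algebraic Bayesian networks involves solving linear programs over all element estimates, which is elided here.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical knowledge pattern: stores interval probability estimates
// for named elements and supports a (simplified) consistency update.
class KnowledgePattern {
public:
    struct Estimate { double lower; double upper; };

    void setEstimate(const std::string& element, double lower, double upper) {
        estimates_[element] = {lower, upper};
    }

    // Simplified consistency update: clamp each interval to [0, 1] and
    // enforce lower <= upper. (The real update solves linear programs
    // coupling all element estimates.)
    void updateConsistency() {
        for (auto& [name, e] : estimates_) {
            e.lower = std::clamp(e.lower, 0.0, 1.0);
            e.upper = std::clamp(e.upper, 0.0, 1.0);
            if (e.lower > e.upper) std::swap(e.lower, e.upper);
        }
    }

    bool isConsistent() const {
        for (const auto& [name, e] : estimates_)
            if (e.lower < 0.0 || e.upper > 1.0 || e.lower > e.upper)
                return false;
        return true;
    }

    Estimate estimate(const std::string& element) const {
        return estimates_.at(element);
    }

private:
    std::map<std::string, Estimate> estimates_;
};
```

The design choice of storing intervals rather than point probabilities follows the general algebraic Bayesian network setting, where element estimates are interval-valued.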
This paper proposes a method for constructing a modal regulator from the transfer function of a closed-loop system in the presence of setting and disturbance influences. The method is simple and expresses the coefficients of the regulator's transfer function in terms of the coefficients of the desired closed-loop system polynomial. On the basis of this feature, an algorithm for optimizing these coefficients by the criterion of maximum robustness is presented.
The paper addresses the problem of grouping vector objects around potential centers subject to restrictions imposed on the group structure. A method for transforming the vector restrictions into the restrictions of an equivalent integer programming problem is proposed. Polynomial algorithms for some special cases are suggested.
Some artificial intelligence problems, including pattern recognition, medical diagnostics, and market analysis, are reduced to proving the satisfiability of predicate calculus formulas with a simple structure. Algorithms solving such problems are considered, and upper bounds on their numbers of steps are proved.