Received: 9 October, 2017 Accepted: 13 March, 2018 Abstract: The use of cloud computing to implement business processes is becoming increasingly important because users can benefit from the economic and technical advantages of this technology. The concept of Business Process as a Service (BPaaS) is a new solution that uses specific business processes as a medium for aligning information technology and business. However, managing and deploying business processes on heterogeneous cloud providers is still a challenge for organizations due to interoperability concerns. This paper suggests an algorithm for optimizing the resource allocation of business processes in the Extended Federated BPaaS model in accordance with the requirements of the user’s policy. The developed model has been compared with other popular models supporting the service/business process policy, and the comparison shows that the proposed model can effectively execute business processes with regard to infrastructure and data tr...
Identifying topics and concepts associated with a set of documents is a critical task for information retrieval systems. One approach is to associate a query with a set of topics selected from a fixed ontology or vocabulary of terms. The core idea of this research is to use Wikipedia articles and associated pages to build a topic ontology for this purpose. The benefit of this method is that Wikipedia is an online free-content encyclopedia which is developed through a social process and kept current by the Wikipedia community. In this paper, the Persian Wikipedia has been analyzed with respect to its articles and category link graphs to extract a Persian pseudo-ontology. The created ontology has then been applied through a query expansion algorithm to improve the performance of an information retrieval system. Our experiments show that it is possible to improve the precision of the information retrieval system by query expansion based on Wikipedia.
Many automatic classification methods and algorithms have been proposed that use content-based or context-based features of web pages. In this paper we analyze these features and try to exploit a combination of them to improve the categorization accuracy of Persian web page classification. We suggest a linear combination of different features and adjust the optimum weighting during application. To show the outcome of this approach, we have conducted various experiments on a dataset consisting of all pages belonging to the Persian Wikipedia in the field of computers. These experiments demonstrate the usefulness of using content-based and context-based web page features in a linear weighted combination.
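A linear weighted combination of per-feature scores can be sketched as follows. This is only an illustration under stated assumptions, not the method from the paper: the feature names (`content`, `context`), the categories, the scores and the weights are all hypothetical, and the abstract does not say how the weights were tuned.

```python
def combine_scores(feature_scores, weights):
    """Sum each category's scores across features, scaled by the feature weight.

    feature_scores: {feature_name: {category: score}}  (hypothetical layout)
    weights:        {feature_name: weight}
    """
    combined = {}
    for name, scores in feature_scores.items():
        for category, score in scores.items():
            combined[category] = combined.get(category, 0.0) + weights[name] * score
    return combined


def classify(feature_scores, weights):
    """Return the category with the highest combined score."""
    combined = combine_scores(feature_scores, weights)
    return max(combined, key=combined.get)


# Made-up scores for one page from two hypothetical feature-specific classifiers.
page_scores = {
    "content": {"computer": 0.6, "sport": 0.4},
    "context": {"computer": 0.2, "sport": 0.8},
}
label_a = classify(page_scores, {"content": 0.7, "context": 0.3})
label_b = classify(page_scores, {"content": 0.9, "context": 0.1})
```

With these made-up numbers the predicted category changes when the weights change, which is why the weighting has to be tuned rather than fixed in advance.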
The 7th International Conference on Digital Content, Multimedia Technology and its Applications, 2011
Statistical n-gram language modeling is applied in many domains such as speech recognition, language identification, machine translation, character recognition and topic classification. Most language modeling approaches work on n-grams of words. In this paper, we employ a language-model classifier based on word-level n-grams for Persian text classification. The presented approach computes the occurrence probability of word sequences in the training data. Then, by extracting the word sequences in the test data, it predicts the class with the highest probability for a given news text. We show that statistical language modeling can yield high classification performance. The experimental results on the Hamshahri corpus are satisfactory and show that n-grams of length 3 are the most useful for Persian text classification.
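The idea of classifying text with one word-level n-gram model per class can be sketched as below. This is a minimal illustration, not the paper's implementation: the add-one smoothing, the `<s>` padding token, the class name `NgramClassifier` and the toy English tokens are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return word n-grams as tuples, padded on the left with start markers."""
    padded = ["<s>"] * (n - 1) + list(tokens)
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


class NgramClassifier:
    """One n-gram count table per class; add-one smoothing is an assumption."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = defaultdict(Counter)    # class -> n-gram counts
        self.context_counts = defaultdict(Counter)  # class -> (n-1)-gram counts
        self.vocab = set()

    def fit(self, documents, labels):
        for tokens, label in zip(documents, labels):
            self.vocab.update(tokens)
            for gram in ngrams(tokens, self.n):
                self.ngram_counts[label][gram] += 1
                self.context_counts[label][gram[:-1]] += 1

    def log_prob(self, tokens, label):
        """Smoothed log-probability of the token sequence under one class."""
        v = len(self.vocab) + 1  # one extra slot for unseen words
        total = 0.0
        for gram in ngrams(tokens, self.n):
            num = self.ngram_counts[label][gram] + 1
            den = self.context_counts[label][gram[:-1]] + v
            total += math.log(num / den)
        return total

    def predict(self, tokens):
        """Return the class whose model gives the sequence the highest probability."""
        return max(self.ngram_counts, key=lambda c: self.log_prob(tokens, c))


# Toy, made-up training data (the paper uses the Hamshahri corpus).
docs = [
    ["team", "wins", "match"],
    ["team", "loses", "match"],
    ["market", "prices", "rise"],
    ["market", "prices", "fall"],
]
labels = ["sport", "sport", "economy", "economy"]

clf = NgramClassifier(n=3)
clf.fit(docs, labels)
predicted = clf.predict(["team", "wins", "match"])
```

Setting `n=3` mirrors the trigram length the abstract reports as most useful; other smoothing schemes could be substituted without changing the overall structure.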
2010 5th International Symposium on Telecommunications, 2010
Spam detection is a crucial task in web information retrieval systems. The dynamic nature of information resources, together with continuous changes in the information demands of users, makes web spam detection a challenging topic. So far, researchers with different backgrounds have proposed many different methods to tackle the problem of spam web pages. In this research, we study the feature space of web spam detection to identify the most effective and discriminative features. We then design a spam detection system that employs a minimum set of features while its performance is the same as or very close to that of a system with the complete feature set. The experimental results show that the number of features can be reduced while the accuracy of the system remains intact or even improves.
Critical infrastructure systems are complex networks of adaptive socio-technical systems that provide the most fundamental requirements of society. Their importance in the smooth functioning of society has made their role more and more prominent. A failure in any of these important components of today's industrial society can affect the lives of millions of people. It is not only their individual breakdown that raises serious concerns; their mutual reliance (interdependency) is even more threatening. Although interdependency in these infrastructure systems provides many benefits for their operation, a failure in one can ripple down to the others and cause a catastrophic, irrecoverable event. In this paper, we introduce a simulation suite for analysing the behaviour of interdependent critical infrastructure systems. The simulation suite focuses on the types of services that are provided by infrastructure components. Each infrastructure system component is model...
Since the World Wide Web contains a large set of data in different languages, retrieving language-specific information creates a new challenge in information retrieval called language-specific crawling. In this paper, a new approach is proposed for language-specific crawling in which a combination of selected content and context features of web documents is applied. This approach has been implemented for the Persian language and evaluated in the Iranian web domain. The evaluation results show how this approach can improve the performance of crawling in terms of speed and coverage.
Today, organizations' daily operations rely on automatic business processes running on IT infrastructures. At the same time, fast-changing business environments, along with the costly process of in-house development of business process management systems, drive businesses to outsource their business processes. The cloud computing paradigm has become increasingly important for executing business processes, as users can exploit its economic and technical advantages. The Business Process as a Service (BPaaS) paradigm is a new concept that uses specific business processes as an intermediary to align business and information technology. The management and deployment of business processes on the existing heterogeneous cloud providers, however, is still a challenge for organizations due to interoperability concerns. This paper suggests a Federated BPaaS model, a service aggregation concept characterized by interoperability specifications, to address the integration ...
In recent years, researchers have introduced many different mechanisms to improve resource allocation in the cloud. One of these methods is market-based resource allocation, which exploits different models used in exchanging goods and services. In this research, a two-way auction model is used for allocating cloud resources based on the market model. In federated clouds, providers may face a shortage of resources during their operation; therefore, the continuous double auction model is suggested to create a cloud federation environment that supports suitable resource allocation among different providers. In our first experiment, fixed pricing with the ReputationAware Continuous Double Auction, Continuous Double Auction, and Market-Driven Continuous Double Auction models is executed for resource allocation. It shows that both the resource efficiency and the income of the providers are improved in the federated clouds using these models. In experiment...
Phishing reduces trust among users of business networks based on the e-commerce framework. Therefore, in this research, we tried to detect phishing websites using data mining. Identifying the outstanding features of phishing is regarded as one of the important prerequisites for designing an accurate detection system. To this end, a list of 30 features suggested for phishing websites was first prepared. Then, a two-stage feature reduction method based on feature selection and extraction was proposed to enhance the efficiency of phishing detection systems; it was able to reduce the number of features significantly. Finally, the performance of the J48 decision tree, random forest, and naive Bayes methods was evaluated on the reduced features. The results indicated that accuracy of the model created to determine the phishing websites by using the t...
Twitter has provided a convenient platform to express feelings and opinions in different areas. Opinion mining on Twitter can be considered as studying the overall sentiment of a tweet. There are two general categories of sentiment analysis methods in the Persian language: link-based methods and content-based methods. In this study, we implement a new link-based method for improving opinion classification in the Persian language. For comparison, we also implement a content-based method using Naive Bayes with two different weighting methods: TF/IDF and Chi-Square. The TF/IDF method has given good results in previous Persian language studies. The Chi-Square method has not been used in Persian language research, but its accuracy is fairly good in English. The results show that the improvement in the language-independent methods is remarkable; according to this research, the precision of the proposed algorithm for positive and negative comments ...
Social networks and media such as Twitter, Facebook, LinkedIn, and various weblogs have expanded, and information sharing and commenting have increased greatly. These comments are typically text data, large enough in volume to be recognized as big data, and they are important for analyzing customers’ priorities, needs and attitudes toward different products. Finding and extracting data from such comments is therefore the primary goal of this research. To serve this purpose, this research uses a deep learning approach and multilayer neural network methods to extract the polarity of customers’ opinions and comments in two product/service domains, restaurants and laptops. The findings of this study indicate that the proposed model, using the capabilities of long short-term memory networks, is able to determine the comments’ polarity with 85 % and 84.62 % precision for restaurant and laptop domains respectively, in such a way ...
With the rapid development of the Internet of Things, Internet of Things platforms offered as a service have entered the world arena with different structures and characteristics. This makes it necessary to weigh their advantages and disadvantages so that an appropriate platform can be chosen for a given purpose. Data management, data monitoring, no data loss, speed, low latency and other criteria play a very important role in selecting a good platform. Therefore, in this research, while investigating the quantitative and qualitative criteria of different platforms, we have tried to find a framework for evaluating them. To achieve this goal, platforms such as ThingSpeak, Xively and AWS IoT are introduced; according to the mentioned criteria, we have implemented and tested different scenarios in the same environment and conditions, and evaluated how the platforms behave with respect to these criteria.
2010 6th International Conference on Advanced Information Management and Service (IMS), 2010
Automatic document classification, due to its various applications in data mining and information technology, is one of the important topics in computer science. Classification plays a vital role in many information management and retrieval tasks. Document classification, also known as document categorization, is the process of assigning a document to one or more predefined category labels. Classification is often posed as a supervised learning problem in which a set of labeled data is used to train a classifier which can be applied to label future examples [1]. Document classification includes different parts such as text processing, feature extraction, feature vector construction and final classification. Thus, improvement in each part should lead to better results in document classification. In this paper, we apply machine learning methods for automatic Persian news classification. In this regard, we first apply some language preprocessing to the Hamshahri dataset [2], and then we e...
The volume of Farsi information on the Internet has been increasing in recent years. However, most of this information is in the form of unstructured or semi-structured free text. For quick and accurate access to the vast knowledge contained in these texts, information extraction methods are essential for generating knowledge bases. In recent years, relation extraction, as a sub-task of information extraction, has received much attention. While many such systems have been developed for English and other well-known languages, systems for information extraction in Farsi have received less attention from researchers. In this systematic research on semi-automatic relation extraction, Persian Wikipedia articles were used as reliable, semi-structured sources. In this system, relation extraction is performed with the assistance of patterns that are automatically obtained with an approach based on distant supervision. In order to apply distant supervision, the vast knowledge...
International Journal of Pervasive Computing and Communications
Purpose: The concept of business process (BP) as a service is a new solution in enterprises for the purpose of using specific BPs. BPs represent combinations of software services that must be properly executed by the resources provided by a company’s information technology infrastructure. As the policy requirements are different in each enterprise, processes are constantly evolving and demanding new resources in terms of computation and storage. To support more agility and flexibility, it is common today for enterprises to outsource their processes to clouds and, more recently, to cloud federation environments. Ensuring the optimal allocation of cloud resources to process services during the execution of workflows in accordance with user policy requirements is a major concern. Given the diversity of resources available in a cloud federation environment and the ongoing process changes required based on policies, reallocating cloud resources for service processing may lead to high comput...
Papers by Alireza Yari