Theses
(2016) Berkan DEMİREL (MS)
Attribute Based Classifiers for Image Understanding
Advisor: DOÇ. DR. NAZLI İKİZLER CİNBİŞ
[Abstract]
Attributes are mid-level semantic concepts which describe visual appearance, functional affordance or other human-understandable aspects of objects and scenes. In recent years, several works have investigated the use of attributes to solve various computer vision problems. Examples include attribute-based image retrieval, zero-shot learning of unseen object categories, part localization and face recognition. This thesis proposes two novel attribute-based approaches towards solving (i) the top-down visual saliency estimation problem and (ii) the unsupervised zero-shot object classification problem. For top-down saliency estimation, we propose a simple yet efficient approach based on Conditional Random Fields (CRFs), in which we use attribute classifier outputs as visual features. For zero-shot learning, we propose a novel approach that solves the unsupervised zero-shot object classification problem via attribute-class relationships. Unlike other attribute-based approaches, we require attribute definitions only at training time, and require only the names of the novel classes of interest at test time. Our detailed experimental results show that our methods perform on par with or better than the state-of-the-art.
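As a rough illustration of the attribute-based zero-shot idea above, the sketch below scores unseen classes by combining attribute classifier outputs with class-attribute associations; the attribute names, association vectors and dot-product scoring rule are illustrative assumptions, not the thesis' actual model.

    # Illustrative attribute-based zero-shot classification sketch.
    import numpy as np

    attributes = ["striped", "four_legged", "domestic"]
    # Class-attribute associations for unseen classes (values are assumed).
    class_attribute = {
        "zebra": np.array([1.0, 1.0, 0.0]),
        "cat":   np.array([0.0, 1.0, 1.0]),
    }

    def classify_zero_shot(attribute_scores):
        """attribute_scores: per-attribute classifier outputs for a test image."""
        scores = {c: float(np.dot(v, attribute_scores)) for c, v in class_attribute.items()}
        return max(scores, key=scores.get), scores

    # Example: attribute classifiers fire strongly on "striped" and "four_legged".
    pred, scores = classify_zero_shot(np.array([0.9, 0.8, 0.1]))
    print(pred, scores)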
(2017) Özlem YAVANOĞLU MİLLETSEVER (MS)
PERFORMANCE ASSESSMENT OF ARTIFICIAL NEURAL NETWORKS FOR AUTHOR ATTRIBUTION BY USING STYLISTIC FEATURES: TURKISH ARTICLES
Advisor: PROF. DR. EBRU AKÇAPINAR SEZER
[Abstract]
One of the main opportunities the internet provides today is the speed, anonymity and ubiquitous accessibility of media resources. Since individuals do not have to use their real identities on websites, forums and e-mail, it becomes necessary to distinguish between genuine and false identity use, whatever the intention behind it. This need arises sometimes for solving crimes, sometimes for protecting intellectual rights, and sometimes simply because of name similarities. By examining a text that contains an element of crime and analyzing people's writing habits or styles, author identification efforts help reveal the true authors of such messages. In the literature, author recognition is defined as the process of determining the author of a text whose authorship is unknown or disputed. Various studies have been carried out in this field over time. Author recognition is treated as a classification problem and is expressed as the process of identifying the most likely author from a group of potential suspects. Within the scope of this thesis, author identification models are developed in order to respond to different needs. Different tests have been carried out to assess the success of the obtained models; the accuracy values obtained from these tests vary between 74% and 99%. In addition, the success of the author features used in the author recognition study in determining text type is evaluated, and a separate model for text-type recognition is proposed. The proposed model determines whether a text belongs to the 'Life', 'Politics' or 'Economy' domain. The accuracy of the proposed ANN (Artificial Neural Network) models is between 70% and 88%. In this thesis, we also propose a hybrid ANN model that recognizes both the writer and the writing type in order to answer different needs, and we demonstrate the performance of the recommended author recognition and writing-type recognition models.
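As a loose illustration of stylistic-feature-based author attribution with a neural network, the sketch below computes a few simple style features and trains a small scikit-learn MLP; the feature set, the toy Turkish sentences and the network size are illustrative assumptions, not the thesis' actual features or corpus.

    # Illustrative stylistic features + small neural network for author attribution.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def stylistic_features(text):
        words = text.split()
        sentences = [s for s in text.split(".") if s.strip()]
        return [
            np.mean([len(w) for w in words]),          # average word length
            len(words) / max(len(sentences), 1),       # average sentence length
            text.count(",") / max(len(words), 1),      # comma rate
            len(set(words)) / max(len(words), 1),      # type-token ratio
        ]

    texts = ["Ekonomi bu yıl hızlı büyüdü, enflasyon ise düştü.",
             "Siyasette yeni bir dönem başlıyor. Taraflar uzlaşma arıyor."]
    authors = ["author_a", "author_b"]

    X = np.array([stylistic_features(t) for t in texts])
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, authors)
    print(clf.predict([stylistic_features("Enflasyon düştü, büyüme sürüyor.")]))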
(2017) Rima Al Washahi (MS)
Topic Model Based Recommendation Systems for Retailers
Advisor: YRD. DOÇ. DR. Gönenç ERCAN
[Abstract]
Nowadays, sellers need a very good strategy to keep their customers' loyalty and to attract new customers to their shops. One of the important ways to accomplish this is to present new and interesting items to their customers. In this thesis, we propose a new recommender system (RS) which recommends to sellers new items that they have not previously sold in their shop. Most RSs recommend items to customers; unlike these traditional RSs, the proposed model is designed to suggest new items to sellers. In order to build the model, we adopted generative models used in the text mining domain. Specifically, the probabilistic latent semantic analysis (pLSA) technique is extended to build the proposed RS. Several experiments are conducted using a real-world dataset to validate the model. Furthermore, the Collaborative Filtering (CF) method is used as a baseline algorithm to compare the performance of the proposed algorithm to the state-of-the-art. Our experiments suggest that the proposed recommender system is more efficient than the pure CF algorithm for this task.
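The sketch below shows, under simplifying assumptions, how a pLSA-style generative model can be fit by EM on a seller-item count matrix and then used to score unsold items for a seller; the toy matrix, the number of topics and the scoring step are illustrative, not the thesis' extended model.

    # Illustrative pLSA fit by EM on seller-item counts, then item scoring.
    import numpy as np

    rng = np.random.default_rng(0)
    counts = np.array([[5, 3, 0, 0],      # sellers x items sales counts (toy data)
                       [4, 0, 2, 0],
                       [0, 1, 4, 3]], dtype=float)
    n_sellers, n_items, K = counts.shape[0], counts.shape[1], 2

    p_z_given_s = rng.dirichlet(np.ones(K), size=n_sellers)   # P(topic | seller)
    p_i_given_z = rng.dirichlet(np.ones(n_items), size=K)     # P(item | topic)

    for _ in range(50):                                        # EM iterations
        # E-step: responsibilities P(z | seller, item)
        joint = p_z_given_s[:, :, None] * p_i_given_z[None, :, :]   # (s, z, i)
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate the two distributions from expected counts
        expected = counts[:, None, :] * resp                        # (s, z, i)
        p_z_given_s = expected.sum(axis=2)
        p_z_given_s /= p_z_given_s.sum(axis=1, keepdims=True)
        p_i_given_z = expected.sum(axis=0)
        p_i_given_z /= p_i_given_z.sum(axis=1, keepdims=True)

    # Recommend to seller 0 the unsold item with the highest P(item | seller).
    p_i_given_s = p_z_given_s @ p_i_given_z
    unsold = np.where(counts[0] == 0)[0]
    print("recommend item", unsold[np.argmax(p_i_given_s[0, unsold])])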
(2017) Tohid TAGHİZAD GOGJEH YARAN (MS)
Reliability Oriented Embedded System Design Method
Advisor: DOÇ. DR. SÜLEYMAN TOSUN
[Abstract]
Combinational circuits have become more vulnerable to soft errors (SEs) in each CMOS technology generation. Most of the prior studies use hardware redundancy in an attempt to harden the circuits against errors. However, redundancy increases the area and power consumption. Furthermore, the design constraints may not allow adding redundant resources to the final circuit. In this thesis, we present a genetic algorithm (GA)-based design method to increase the reliability of combinational circuits. In this method, we use different versions of the same resources, each having different area, latency, and reliability values. The goal of the GA-based optimizer is to allocate the best available resources to the application nodes to maximize the reliability of the design under tight area and latency constraints. Our experimental results show that we achieve up to 19.90% (14.50% on average) reliability improvement over a heuristic method with no additional area overhead.
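A minimal sketch of the kind of GA-based allocation described above is given below: each gene picks one implementation version per task node, and the fitness rewards high reliability while penalizing area-budget violations. The task set, resource library, single (area-only) constraint and GA parameters are illustrative assumptions, not the thesis' setup.

    # Illustrative GA that selects resource versions to maximize reliability.
    import random

    random.seed(0)
    # For each task node: candidate (area, reliability) implementations.
    LIBRARY = [[(2, 0.95), (4, 0.99)],
               [(1, 0.90), (3, 0.97)],
               [(2, 0.93), (5, 0.995)]]
    AREA_BUDGET = 9

    def fitness(chrom):
        area = sum(LIBRARY[i][g][0] for i, g in enumerate(chrom))
        rel = 1.0
        for i, g in enumerate(chrom):
            rel *= LIBRARY[i][g][1]
        return rel if area <= AREA_BUDGET else 0.0   # penalize constraint violations

    def ga(pop_size=20, generations=50):
        pop = [[random.randrange(len(r)) for r in LIBRARY] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            parents = pop[:pop_size // 2]
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, len(LIBRARY))
                child = a[:cut] + b[cut:]                      # one-point crossover
                if random.random() < 0.2:                      # mutation
                    i = random.randrange(len(LIBRARY))
                    child[i] = random.randrange(len(LIBRARY[i]))
                children.append(child)
            pop = parents + children
        best = max(pop, key=fitness)
        return best, fitness(best)

    print(ga())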
(2017) Arash Barzinmehr (MS)
Energy-Aware Application-Specific 3D Network-on-Chip Design
Advisor: DOÇ. DR. SÜLEYMAN TOSUN
[Abstract]
Network-on-Chip (NoC) is a promising approach for supporting the heavy communication demand between all parts of high-performance modern nanoscale System-on-Chips (SoCs). Three-dimensional (3D) IC integration has become popular by reducing latency and energy consumption, since long global interconnects are replaced with short vertical through-silicon-via (TSV) interconnects between different dies. Combining NoCs with 3D technology therefore seems a good choice for achieving better performance than 2D designs. Although there exist good synthesis methods for designing energy- and communication-aware 2D NoCs, there is still a need for 3D alternatives. In this thesis, an energy-aware application-specific topology generation method for 3D NoCs is proposed. The method is based on a heuristic optimization algorithm that partitions the application nodes among the layers of the NoC architecture in an attempt to minimize the dynamic energy consumption. The proposed 3D method is tested against a 2D alternative on several NoC benchmarks. Simulation results show that our approach outperforms its 2D counterpart in terms of energy and area.
(2017) Efsun Sefa SEZER (MS)
ANOMALY DETECTION IN CROWDED SURVEILLANCE ENVIRONMENTS
Advisor: DOÇ. DR. AHMET BURAK CAN
[Abstract]
Camera surveillance systems are effective security tools with a wide range of uses. Videos obtained from these systems are examined by security personnel in order to determine dangerous situations and take the necessary precautions. Technological developments in recent years have led to reductions in the cost of cameras and to an increase in the use of surveillance systems and in the amount of video data being acquired. Processing these data manually is very hard and time consuming. The visual attention capacity of the human brain is limited, and human attention declines greatly after a certain period of time; this is a serious problem in the manual analysis of large amounts of data. Intelligent video surveillance systems reduce the need for human effort and make it possible to obtain meaningful information from large amounts of video data. One of the important purposes of intelligent video surveillance systems is to analyse videos effectively, to distinguish between normal and abnormal conditions, and to alert the relevant operator about abnormal events. Although various methods are used to design intelligent surveillance systems, the general approach is to model normal events and to identify abnormal situations as those that do not fit the model. The reasons for this approach are that the definition of an anomaly varies with the content (situations considered abnormal for a particular scene may be considered normal in another scene) and that abnormal training samples are difficult to find. In this study, multi-scale histogram of optical flow (MHOF) features and log-Euclidean covariance matrices are used for automatic anomaly detection with one-class classification methods. Log-Euclidean covariance matrices are used for the first time to detect anomalies. Unlike traditional methods, which utilize gradient-based or optical flow-based features for motion representation, two important types of features that encode motion and appearance cues are combined with the help of the covariance matrix. Covariance matrices are symmetric positive definite (SPD) matrices which form a special type of Riemannian manifold and are not suitable for traditional Euclidean operations. Most computer vision algorithms are developed for data points located in Euclidean space; for this reason, covariance matrices are mapped to Euclidean space by utilizing the log-Euclidean framework. The model building step, which is the first step in the detection of abnormal situations, is performed by applying one-class classification methods (Support Vector Machines, Support Vector Data Description) to features obtained from normal events. In the detection step, events that do not fit the model are marked as abnormal. Experiments carried out on an anomaly detection benchmark dataset and comparisons made with previous studies show that successful results are obtained in detecting abnormal situations. Keywords: Anomaly detection, multi-scale histogram of optical flow, log-Euclidean covariance matrices, crowd motion analysis, one-class classification.
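The sketch below illustrates, under simplifying assumptions, the log-Euclidean step and one-class classification: region covariance matrices are mapped to Euclidean space with the matrix logarithm and fed to a one-class SVM. Random vectors stand in for the real motion and appearance features (e.g., MHOF and gradients) used in the thesis.

    # Illustrative log-Euclidean covariance descriptor + one-class SVM.
    import numpy as np
    from scipy.linalg import logm
    from sklearn.svm import OneClassSVM

    rng = np.random.default_rng(0)

    def log_euclidean_descriptor(features):
        """features: (n_samples, d) low-level features from one spatio-temporal region."""
        cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])  # SPD
        log_cov = logm(cov).real                     # map the SPD matrix to tangent space
        iu = np.triu_indices_from(log_cov)           # vectorize the upper triangle
        return log_cov[iu]

    normal_regions = [rng.normal(size=(200, 5)) for _ in range(50)]   # placeholder features
    X_train = np.array([log_euclidean_descriptor(f) for f in normal_regions])

    model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X_train)
    test_region = rng.normal(loc=3.0, size=(200, 5))                  # unusual statistics
    print(model.predict([log_euclidean_descriptor(test_region)]))     # -1 means abnormal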
(2017) Necva BÖLÜCÜ (MS)
UNSUPERVISED JOINT PART-OF-SPEECH TAGGING AND STEMMING FOR AGGLUTINATIVE LANGUAGES
Advisor: YRD. DOÇ. DR. BURCU CAN
[Abstract]
Part of Speech (PoS) tagging is the task of assigning each word in a given sentence an appropriate part of speech tag, such as verb, noun or adjective, according to its syntactic role. Various approaches have already been proposed for this task. However, the number of word forms in morphologically rich and productive agglutinative languages is theoretically infinite. This variety in word forms causes a sparsity problem in the tagging task for agglutinative languages. In this thesis, we aim to deal with this problem in agglutinative languages by performing PoS tagging and stemming simultaneously. Stemming is the process of finding the stem of a word by removing its suffixes. Joint PoS tagging and stemming reduces sparsity by using stems and suffixes instead of words. Furthermore, we incorporate semantic features to capture the similarity between stems and their derived forms by using neural word embeddings. In this thesis, we present a fully unsupervised Bayesian model using a Hidden Markov Model (HMM) for joint PoS tagging and stemming for agglutinative languages. The results indicate that using stems and suffixes rather than full words outperforms a simple word-based Bayesian HMM model, especially for agglutinative languages. Combining semantic features yields a significant improvement in stemming.
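As an illustration of the kind of factorization such a joint model relies on (the thesis' actual Bayesian formulation, priors and embedding features may differ), each word is split into a stem and a suffix, both emitted from the hidden tag, with first-order tag transitions:

    % Illustrative first-order HMM factorization for joint tagging and stemming.
    P(w_{1:n}, t_{1:n}) \approx \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \,
        P(\mathrm{stem}_i \mid t_i) \, P(\mathrm{suffix}_i \mid t_i),
    \qquad w_i = \mathrm{stem}_i + \mathrm{suffix}_i .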
(2017) Metehan ÜNAL (MS)
A Distant Augmented Reality System for Cultural Heritage Sites
Advisor: DOÇ. DR. SÜLEYMAN TOSUN
[Abstract]
The concept of Augmented Reality can be described as the combination of real-world imagery with sound, text, images, or 3D models produced on a computer. This concept emerged in the 1990s, and while it addressed only the academic field in those years, it appealed to all segments of society after the spread of mobile devices. Augmented Reality studies for cultural heritage are important for enhancing interest in historical sites. It is a troublesome and costly task to physically reconstruct historical buildings that carry traces of ancient civilizations, have been ruined by the destructive effect of time, and of which few remain. With Augmented Reality, it is possible to virtually overlay the historical buildings in situ, without requiring physical reconstruction. In this thesis, a mobile location-based augmented reality application prototype was developed and a distant augmented reality system was presented for the Roman Baths in Ankara. Within the scope of this project, all the work was done on the Ankara University Gölbaşı Campus because the required permissions could not be obtained to work in the area of the Ankara Roman Baths. Firstly, location operations were performed on mobile devices and the location data were tested; it was decided to filter the location data with a sliding window model. Secondly, the 3D model was obtained and restorations were made on the model. Later, augmented reality software development kits were tested for marker-based applications. Applications were developed with software development kits that support location-based systems, but it was decided to build the project using only the Unity 3D game engine. After the location-based application was developed with Unity, the distant augmented reality system was introduced. Finally, an Android application was developed for capturing the drone's video stream. With the help of this application, the video from the drone's camera is captured and recorded. The recorded video was transferred to Unity and the 3D model of the bath was overlaid on top of the video.
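A minimal sketch of sliding-window filtering of location readings, as mentioned above, is given below; the window size and the sample coordinates are illustrative placeholders, not the application's actual parameters.

    # Illustrative sliding-window (moving-average) filter for GPS readings.
    from collections import deque

    class SlidingWindowFilter:
        def __init__(self, window_size=5):
            self.window = deque(maxlen=window_size)

        def update(self, lat, lon):
            self.window.append((lat, lon))
            n = len(self.window)
            return (sum(p[0] for p in self.window) / n,
                    sum(p[1] for p in self.window) / n)   # smoothed position

    f = SlidingWindowFilter(window_size=5)
    for lat, lon in [(39.7885, 32.8046), (39.7889, 32.8050), (39.7991, 32.8149)]:
        print(f.update(lat, lon))   # the jumpy last reading is damped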
(2017) Nebi YILMAZ (MS)
A TWO-DIMENSIONAL EVALUATION METHOD FOR MEASURING MAINTAINABILITY AND RELIABILITY OF OPEN SOURCE SOFTWARE
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
The increased popularity of open source software (OSS) has led to a considerable proliferation of alternative software products. However, an evident lack of studies that would help organizations evaluate OSS has turned the process of selecting the most suitable product into an appealing research problem. In this study, a method to evaluate the reliability and maintainability of OSS products by using both code-based and community-based aspects has been obtained from a synthesis of existing studies in the literature together with our contribution. In order to perform code-based evaluation, some internal attributes of the most recent quality model, ISO/IEC 25010, have been selected, and object-oriented C&K metrics have been employed in an attempt to measure these attributes. To perform community-based evaluation, metrics derived from historical data such as e-mailing lists, problem reports, frequently asked questions, etc. have been utilized to identify and satisfy information needs in conformance with the ISO/IEC 15939 standard for the software measurement process. The proposed method has been used to evaluate the maintainability and reliability of three build tools written in Java, and the results of the evaluation have been presented and discussed.
(2018) Cumhur Yiğit ÖZCAN (PhD)
Investigation On Usage Types And Contributions Of Global Navigation Information Of The Crowd In The Local Simulation Models
Advisor: PROF. DR. EBRU AKÇAPINAR SEZER
[Abstract]
Crowd navigation is one of the most challenging problems in crowd simulation. Local navigation methods, which consider each agent in the crowd individually and plan its short-term movements, were initially proposed for the navigation problem. Later, global navigation methods that approach the crowd as a whole, and hybrid navigation methods that aim to combine the successful elements of these two types, were developed. Although hybrid navigation methods are more costly as they need to do more calculations, provided that the cost can be kept at a reasonable level they can produce much better results than local and global methods. The performance of navigation methods depends on the agents in the simulation reaching their destinations with the least possible effort. For this purpose, congestions caused by agents trying to move in different directions should be minimized and the crowd should move in a flow. To achieve this goal, the global path plans of agents should be made by considering not only the static obstacles in the simulation environment but also the other agents' instantaneous positions and future plans. Within this thesis study, two new navigation methods have been proposed in which global navigation information is extracted and stored on the simulation environment in a way that includes the agents' movement directions, and is then used by other agents in the global path planning phase. In the Global Path Planning Using Potential Information Method, the first of these methods, global navigation information is represented by potential values. Certain heuristic decisions are made while global path plans are transformed into potential information and while this information is used in global path planning. The global path plans that have been made are used in combination with a local navigation method. Comparative tests against a system that uses only a local navigation method have shown that utilizing global navigation information as a guide for the local navigation method improves navigation performance remarkably. Besides, this method considerably reduces the operation cost of the local navigation method, as it minimizes the number of possible collisions by creating a flow in the simulation environment. The second method developed is the Time-Based Global Path Planning Method. Since global path plans are made on a time basis in this method, both the extraction of the crowd's global navigation information and the usage of this information are carried out in a much more deterministic way than in the potential-based method. Machine learning methods are utilized to make global path plans on a time basis. The learning data used by the machine learning methods are collected from a micro-scale simulation environment, and the models trained on all of these data are then tested on simulation scenarios at medium and macro scale. Therefore, a new application schema that differs from the traditional machine learning approach in the way learning-based methods are used in global path planning is also proposed within this thesis study. The performance of the time-based method is evaluated by comparative tests against the potential-based method. The results show that the time-based method is a better navigation system, although it requires slightly more processing power.
(2017) Güler KOÇ (MS)
Analysis Of Software Development Models In Terms Of Security And A Model Proposal For Secure Software Development
Advisor: YRD. DOÇ. DR. MURAT AYDOS
[Abstract]
Software development process models focus on the ordering and combination of phases to develop the intended software product within time and cost estimates. However, the commonness of software vulnerabilities in fielded systems shows that there is a need for a more stringent software development process that focuses on improved security demands. Meanwhile, there are reports that demonstrate the efficiency of existing security-enhanced conventional processes and the success of agile projects over conventional waterfall projects. Based on these findings and the demand for secure software, we propose a security-enhanced Scrum model (Trustworthy Scrum) that takes advantage of both security activities and the Scrum framework, which has fast adaptation and an iterative cycle. While enhancing Scrum with security activities, we try to retain both the agile and security disciplines, considering that the conventional security approach conflicts with agile methodologies. It is shown through statistical tests that the proposed model increases the applicability of security activities with agile methods.
(2018) Öner BARUT (PhD)
Investigation of Agent Number and Movement Variations in Convincing Crowd Simulation
Advisor: PROF. DR. EBRU AKÇAPINAR SEZER
[Abstract]
Nowadays, computer games and movies, which constitute a significant part of the entertainment industry, are among the main application areas that use crowd simulations extensively. In the majority of such entertainment applications, crowds are used to create a background within the existing scene. In these simulations, called ambient crowds, there is no need to visualize the individuals forming the crowd with the highest possible quality or to perform costly complex operations to navigate these individuals. Within the scope of this thesis study, a new approach has been proposed to create crowd simulations having maximum density and minimum movement variety on 2D simulation areas without compromising plausibility, for the purpose of creating non-interactive ambient crowds in real time and simulating these crowds with the lowest possible navigation cost. Accordingly, instead of navigating crowd members with navigation models based on collision detection and avoidance maneuvers, which require high computational power and operation time, individuals are navigated on trajectories that are defined from their initial positions to their goal positions and are guaranteed to be collision-free. Three different steering-free navigation methods have been developed for creating the trajectories assigned to the individuals who will participate in the simulation. These methods have been compared with one of the state-of-the-art agent-based navigation techniques in the literature, and each of them has been shown to have a much lower navigation cost than the technique it was compared with. In addition, it has been shown that the developed steering-free navigation approach can create individuals that move on more consistent and smooth trajectories, and crowds that exhibit navigation behaviors with a higher level of perceived realism.
(2018) Ahmed NESSAR (PhD)
MULTILEVEL SENTIMENT ANALYSIS IN ARABIC
Advisor: PROF. DR. EBRU AKÇAPINAR SEZER
[Abstract]
Sentiment analysis is greatly needed to classify texts such as reviews, news and blogs in order to capture the overall sentiment (i.e. negative, positive or neutral) embedded in them. The vast majority of studies have focused on sentiment analysis for English texts, while only a small number of studies have focused on other languages such as Arabic, Turkish, Spanish and Dutch. In this study, we aimed at improving the performance of Arabic sentiment analysis at the document level by, firstly, investigating the most successful Machine Learning (ML) methods for classifying sentiments, while rules were implemented to create new vector formats for representing inputs in the ML-based modeling process; and secondly, applying the Lexicon Based (LB) approach at both the term and document levels by using different formulae based on aggregating functions such as maximum, average and subtraction, with the rules again applied in the experiments. The performance results of the LB approach have been used to identify the best formulae for term-level and document-level lexicon-based SA of Arabic, and the effectiveness of using rules at both levels has been illustrated. As a final point, the methods of the two different approaches (i.e. ML and LB) have been combined into a single method that also considers the rules. The OCA corpus was used in the experiments, and a sentiment lexicon for Arabic (ArSenL) was used to address the challenges of the Arabic language. Several experiments have been performed as follows. Firstly, features have been selected for the term and document levels of the OCA corpus independently. Secondly, different linear ML methods such as Decision Tree (D-Tree), Support Vector Machine (SVM) and Artificial Neural Network (ANN) have been applied at both OCA corpus levels, both with and without applying the rules. Thirdly, the LB approach has been applied at the document level, with the rules applied to each term in a document. Finally, comparisons between the results have been made to identify the best way to classify the sentiment of Arabic documents. The most notable results of the study are as follows: (i) In the ML approach, the ANN classifier has been nominated as the best classifier at both the term level and the document level of Arabic SA. The average F-score achieved at the term level is 0.92 for positive testing classes and 0.92 for negative classes, while at the document level the average F-score is 0.94 for positive testing classes and 0.93 for negative classes. (ii) In the LB approach, it is concluded that the best results have been achieved by applying the rules to each term, then computing each sentence score with the DMax_Sub formula, and finally using the first-sentence score formula for computing the document score. In general, the results of the ML approach are better than the results of the LB approach.
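The sketch below illustrates lexicon-based scoring with different aggregation functions (average, maximum, subtraction of extremes), in the spirit of the formulae discussed above; the tiny lexicon and the 'max_sub' formula are illustrative stand-ins, not ArSenL entries or the exact DMax_Sub definition.

    # Illustrative lexicon-based sentence scoring with different aggregations.
    LEXICON = {"رائع": 0.9, "جيد": 0.6, "سيء": -0.7, "ممل": -0.5}   # term -> polarity

    def term_scores(tokens):
        return [LEXICON[t] for t in tokens if t in LEXICON]

    def sentence_score(tokens, method="max_sub"):
        scores = term_scores(tokens)
        if not scores:
            return 0.0
        pos = [s for s in scores if s > 0]
        neg = [s for s in scores if s < 0]
        if method == "average":
            return sum(scores) / len(scores)
        if method == "max":
            return max(scores, key=abs)              # strongest term wins
        if method == "max_sub":                      # max positive minus |strongest negative|
            return (max(pos) if pos else 0.0) - (abs(min(neg)) if neg else 0.0)
        raise ValueError(method)

    tokens = ["الفيلم", "رائع", "لكن", "ممل"]
    print({m: sentence_score(tokens, m) for m in ("average", "max", "max_sub")})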
(2018) Yasin KAVAK (PhD)
LEARNING VISUAL SALIENCY FOR STATIC AND DYNAMIC SCENES
Advisor: DOÇ. DR. İBRAHİM AYKUT ERDEM
[Abstract]
The ultimate aim of visual saliency estimation is to mimic the human visual system in predicting image regions which grab our attention. In the literature, many different features and models have been proposed, but one of the key questions is still how different features contribute to saliency. In this study, we try to get a better understanding of the integration of visual features in order to build more effective saliency models. Towards this goal, we investigated several machine learning techniques and analyzed their saliency estimation performance in static and dynamic scenes. First, multiple kernel learning is employed in static saliency estimation, which provides an intermediate-level fusion of features. Second, a thorough analysis is carried out for saliency estimation in dynamic scenes. Lastly, we proposed a fully unsupervised adaptive feature integration scheme for dynamic saliency estimation, which gives superior results compared to approaches that use a fixed set of parameters in the fusion stage. Since the existing methods in the literature are still far from accomplishing human-level saliency estimation, we believe that our approaches provide new insights into this challenging problem.
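The sketch below illustrates intermediate-level fusion via a combined kernel: one kernel per feature type, summed with weights and fed to an SVM with a precomputed kernel. In actual multiple kernel learning the weights are learned; here they are fixed for illustration, and random vectors stand in for real saliency features such as color or texture cues.

    # Illustrative fixed-weight kernel combination for feature fusion.
    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n = 200
    color_feats = rng.normal(size=(n, 8))       # stand-ins for per-region features
    texture_feats = rng.normal(size=(n, 12))
    y = rng.integers(0, 2, size=n)              # salient (1) vs non-salient (0) regions

    weights = (0.6, 0.4)                         # fixed fusion weights (MKL would learn these)
    K = weights[0] * rbf_kernel(color_feats) + weights[1] * rbf_kernel(texture_feats)

    clf = SVC(kernel="precomputed").fit(K, y)
    print("train accuracy:", clf.score(K, y))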
(2018) Gizem KAHVECİ (PhD)
LARGE-SCALE ARABIC SENTIMENT CORPUS AND LEXICON BUILDING FOR CONCEPT-BASED SENTIMENT ANALYSIS SYSTEMS
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Within computer-based technologies, the amount of collected data and its usage are continuously on the rise. This continuously growing big data and its processing and computational requirements introduce new challenges, especially for Natural Language Processing (NLP) applications. One of these challenges is maintaining massive, information-rich linguistic resources, such as large-scale text corpora, that fit the requirements of big data handling, processing and analysis for NLP applications. In this work, a large-scale sentiment corpus for the Arabic language called GLASC is presented; it is built using online news articles and metadata shared by the big data resource GDELT. The GLASC corpus consists of a total of 620,082 news articles organized into categories (Positive, Negative and Neutral), and each news article has a sentiment rating score between -1 and 1. Several types of experiments were carried out on the generated corpus using a variety of machine learning algorithms to build a document-level Arabic sentiment analysis system. For training the sentiment analysis models, different datasets were generated from the GLASC corpus using different feature extraction and feature weighting methods. A comparative study was performed that tested a wide range of classifiers and regression methods commonly used for the sentiment analysis task; in addition, several types of ensemble learning methods were investigated through comprehensive empirical experiments to verify their effect on improving classification performance. In this work, a concept-based sentiment analysis system for Arabic at the sentence level, using machine learning approaches and a concept-based sentiment lexicon, is also presented. An approach for generating an Arabic concept-based sentiment lexicon is proposed, carried out by translating the recently released English SenticNet_v4 into Arabic, which resulted in Ar-SenticNet, containing a total of 48k Arabic concepts. For extracting the concepts from an Arabic sentence, a rule-based concept extraction algorithm called the semantic parser is proposed, which generates the candidate concept list for an Arabic sentence. Different types of feature extraction and representation techniques were also presented and used for building the concept-based sentence-level Arabic sentiment analysis system. For building its decision model, comprehensive and comparative experiments were carried out using a variety of classification methods and classifier fusion models, together with different combinations of the proposed feature sets. The obtained experimental results show that, for the proposed machine-learning-based document-level Arabic sentiment analysis system, the best performance is achieved by the SVM-HMM classifier fusion model with an F-score of 92.35% and by the SVR regression model with an RMSE of 0.183. On the other hand, for the proposed concept-based sentence-level Arabic sentiment analysis system, the best performance is achieved by the SVM-LR classifier fusion model with an F-score of 93.92% and by the SVM regression model with an RMSE of 0.078.
(2017) Gizem KAHVECİ (MS)
A Proxy Method For Estimating Personal Software Test Effort In Banking Domain And Its Case Study
Advisor: PROF. DR. MEHMET ÖNDER EFE
[Abstract]
In this study, a new method was developed to support the software project managers of a private local bank in estimating the required functional test effort/duration of developed software units. This method allows the project manager to predict the required test run effort (hence duration) of a test analyst for an assigned software unit at the initial test stages. To this end, the test analyst models and measures her personal test process (PTP), as described in Humphrey's Personal Software Process (PSP), and generates a personal estimation database by analyzing these measurements. For the development and testing of this method, real data from the bank's software development projects were used. The feasibility and sensitivity of the developed method were evaluated by comparing the estimations made at the earlier stages of the test process with the actual values of test effort. It has been found that if certain parameters related to the environment and the application remain stable, prediction errors do not exceed a 12% band, and in most cases are much smaller.
(2018) AHMET ŞENOL (PhD)
Resilient Image Watermarking: Block-Based Image Watermarking Analysis, Using Vector Image As Watermark And Improving Authentication Purpose Watermarking
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
As we live in a digital world, protecting our digital property and being sure that the data we receive is the same as the original have become more important. Digital watermarking emerged as a discipline to ensure copyright ownership and to authenticate digital data. In most copyright protection and authentication watermarking algorithms, the image is transformed into another domain, watermarked in this new domain, and transformed back into the pixel domain by applying the inverse transform. In the scope of this thesis, it is investigated whether it makes a difference to transform the image into the new domain as a whole or to divide the image into blocks and transform each block separately. It is examined whether using a block-based approach affects watermarking performance for different block sizes in DWT-based watermarking. This study reveals that dividing the image into blocks beforehand, transforming each block into the new domain separately, and then watermarking the blocks improves robustness drastically. It also reveals that as the block size decreases, robustness increases at the cost of extra CPU time. In most previous image watermarking studies, an image digest, a pseudo-random number sequence, a binary image logo, etc. is inserted as the watermark. To the best of our knowledge, a vector image has not been used as a watermark before. A vector image differs from a binary image in that it does not consist of pixels but of points, circles, polygons, lines, Bézier curves, etc. All these items have their own attributes; for example, a circle has center point coordinates (x, y), a diameter, a line color, a line width, etc. The quality of vector images does not suffer from scaling operations. In this thesis, a vector image is embedded as a watermark in a robust way in the DWT domain; the watermark survives JPEG compression, histogram equalization and 3x3 low-pass filtering, but not cropping and rotation attacks. The type of watermarking that aims to prove an image's genuineness is authentication watermarking. Fragile authentication watermarking is sensitive to every type and amount of change and does not discriminate between ill-purposed and innocent changes. In the scope of this thesis, a new fragile DWT-based authentication image watermarking algorithm is introduced. The method successfully detects the changed region of the image and is easy to implement. The ideal authentication watermarking is expected to be robust against innocent changes applied to the image and fragile against ill-purposed changes. Lossy image compression, scaling, sharpening, blurring and histogram equalization, which affect the whole image, can be given as examples of innocent operations performed on an image. Removing an existing person from an image, changing one's face, changing a car's licence plate, or performing a man-in-the-middle attack can be given as examples of ill-purposed operations. A semi-fragile authentication watermarking method approaches this ideal as the variety of ill-purposed attacks it detects increases, while it still authenticates images that are subjected to innocent operations. In the scope of this thesis, a new semi-fragile authentication image watermarking method is built that uses the DCT and DWT domains and embeds two watermarks into the image for copyright protection and authentication purposes. The method authenticates 75%-quality JPEG-compressed images and, in addition to the existing methods, authenticates images that are subjected to histogram equalization, intensity adjustment and gamma correction. The new method is also immune to collage attacks.
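A minimal sketch of block-based DWT watermark embedding in the spirit of the analysis above is given below, assuming a grayscale NumPy image and a binary watermark; the block size, the Haar wavelet and the quantization-style embedding rule are illustrative choices, not the thesis' actual algorithm.

    # Illustrative block-based DWT embedding: one watermark bit per block.
    import numpy as np
    import pywt

    def embed_block_dwt(image, watermark_bits, block_size=32, alpha=8.0):
        """Embed one bit per block by quantizing the mean of the block's LL coefficients."""
        image = image.astype(float)
        h, w = image.shape
        out = image.copy()
        bit_idx = 0
        for y in range(0, h - block_size + 1, block_size):
            for x in range(0, w - block_size + 1, block_size):
                if bit_idx >= len(watermark_bits):
                    return out
                block = image[y:y + block_size, x:x + block_size]
                LL, (LH, HL, HH) = pywt.dwt2(block, 'haar')
                # Quantization-index-modulation style embedding on the LL mean.
                bit = watermark_bits[bit_idx]
                q = np.round(LL.mean() / alpha)
                if int(q) % 2 != bit:
                    q += 1
                LL += (q * alpha - LL.mean())
                out[y:y + block_size, x:x + block_size] = pywt.idwt2((LL, (LH, HL, HH)), 'haar')
                bit_idx += 1
        return out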
(2018) BEHNAM ASEFISARAY (PhD)
End-to-End Speech Recognition Model: Experiments in Turkish
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
(2018) Orhun DALABASMAZ (MS)
Dynamic Key Grouping: A Load Balancing Algorithm for Distributed Stream Processing Engines
Advisor: DOÇ. DR. AHMET BURAK CAN
[Abstract]
In the technological age we live in, technological devices are an indispensable part of our lives. Every day, the variety of devices and applications and the number of users keep increasing. All these increases cause a huge growth in the amount of data produced and in the variety of that data. The volume and variety of the produced data have increased so much that it is no longer possible for a single machine to handle them alone. At the same time, requirements force us to process data in real time. Therefore, clusters of machines are used to build highly efficient, fault-tolerant and robust systems. By using a cluster, we aim to process all data as soon as possible by distributing the data to all nodes in the cluster. In order to achieve this, the data, or the load, should be distributed to the machines as equally as possible. An unbalanced distribution of the load means that some machines will work more intensively than others, and thus each machine will not be used efficiently. Reduced efficiency leads to increased latency and decreased throughput, and therefore the system cannot produce real-time results. In such systems, balancing the load across the machines is directly connected to the content of the data: the more homogeneous the data, the more balanced the load; the more skewed the data, the more unevenly the load is distributed. Shuffle Grouping (SG) is the best option when there is no relation between the incoming data items. On the other hand, when there is a relation between the incoming data, the best option is Key Grouping (KG), which assigns the incoming data to the target machines by examining the data content. While Shuffle Grouping distributes the data to machines randomly using Round-Robin, Key Grouping distributes the data by calculating a hash value for each data item. Therefore, related data can be gathered on the same machine and there is no need to aggregate results from several machines. However, both grouping methods may become useless and inefficient depending on the data. When there is a relation between the incoming data, Shuffle Grouping cannot be used efficiently, because it does not care about the content of the incoming data. On the other hand, when the data is skewed, Key Grouping routes the skewed data to only one machine and most of the load is concentrated on a single machine. In other words, the more skewed the data, the more inefficient the load balancing. This again leads to inefficiency, higher latency, lower throughput and non-real-time results. Partial Key Grouping (PKG), on the other hand, specifies two target machines by calculating two different hash values and chooses the less loaded one for efficient load balancing. With Partial Key Grouping, each data item can be distributed to one of two different machines. Even if the data is slightly skewed, the system may achieve better performance and better load balancing than with Key Grouping. However, if the data is highly skewed and some items recur very many times, even Partial Key Grouping may show poor performance: the highly skewed load would be distributed to only two machines, so the other machines in the cluster would have far less load to process, which leads to inefficiency and performance issues. Moreover, Partial Key Grouping does not guarantee that the two calculated hash values for one data item will be different. Under all these conditions, the efficiency and performance of the system must remain consistently high regardless of the content of the data.
In this study, the Dynamic Key Grouping (DKG) method is proposed to distribute the load across the machines at all times, regardless of the data content. With this method, skewed data are detected and can be distributed to more servers. Improvements were observed in the throughput and latency of the system, especially when the data is highly skewed, and very successful results were obtained.
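The sketch below contrasts Key Grouping with a Partial-Key-Grouping-style "power of two choices" routing decision; the hash seeds and load-tracking scheme are illustrative and do not reproduce the thesis' DKG method.

    # Illustrative key-based routing: single-hash KG vs. two-choice PKG-style routing.
    import hashlib

    def _hash(key, seed, n_workers):
        return int(hashlib.md5(f"{seed}:{key}".encode()).hexdigest(), 16) % n_workers

    class TwoChoiceGrouper:
        def __init__(self, n_workers):
            self.n_workers = n_workers
            self.load = [0] * n_workers  # tuples routed to each worker so far

        def key_grouping(self, key):
            return _hash(key, 0, self.n_workers)          # always the same worker

        def partial_key_grouping(self, key):
            a = _hash(key, 0, self.n_workers)
            b = _hash(key, 1, self.n_workers)
            target = a if self.load[a] <= self.load[b] else b  # pick the less loaded
            self.load[target] += 1
            return target

    g = TwoChoiceGrouper(n_workers=8)
    for key in ["a", "a", "a", "b", "c", "a"]:   # "a" is a skewed (hot) key
        print(key, "->", g.partial_key_grouping(key))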
(2018) Osman Alper ÖCAL (MS)
A CONTENT-AWARE APPROACH FOR USABILITY TESTING OF WEB APPLICATIONS USING EYE-TRACKING HARDWARE
Advisor: DOÇ. DR. KAYHAN İMRE
[Abstract]
Eye tracking technology provides eye movement details by tracking the movements of the pupil of the eye. Eye tracking devices use infrared sensors and specialized cameras to track the movements of the pupil. This technology has advanced with the development of lens and camera technologies and has become quite popular in many areas because of its ease of use. Currently, eye tracking technologies are actively used in advertising, sales, marketing, etc. With the help of this technology, human behaviours are determined and the required actions are taken according to the research area. Web application usability testing is another area in which eye tracking technology can be used. The purpose of this testing is to determine how efficiently and effectively the web application is used and how satisfied the users are. The variety and number of web applications are increasing rapidly, which has also increased the importance of usability testing and its results. According to the results of such tests, improvements may be applied to the application to increase its usability. Web usability testing using eye tracking technology can be a demanding process. First of all, a suitable laboratory environment is required. Then participants need to attend this laboratory for the measurement sessions. After the measurement sessions, the researcher needs to evaluate the results in person. This process may cause a serious loss of time and of focus on the research. Besides these possible effects, most of the time the researcher does not have any information about the content of the application; this lack of content information may result in generic evaluations. Openness to human error is another gap in web usability testing using eye tracking technologies. These possible problems and deficiencies form the basis of this research. In this research, a content-aware approach for web application usability testing is provided, and experimental studies are carried out to verify the approach. The results of the studies showed that with this approach the whole process takes less time than before, the measurement process is easier to apply, the results are easier to evaluate, and the evaluation process is less prone to human error. Besides all of these gains, the workload of the researcher is reduced significantly, and the researcher has the chance to focus on his/her research.
(2018) Yiğitcan NALCI (MS)
Mapping Methods For Three Dimensional Network-On-Chip (3D-NoC) Architectures
Advisor: DOÇ. DR. SÜLEYMAN TOSUN
[Abstract]
The number of cores in a chip has shown a rapid increase with the advancement of technology and the increased needs of applications. This has led designers to invent new communication technologies such as the Network-on-Chip (NoC) paradigm. Advances in integrated circuit fabrication have even allowed three-dimensional NoC (3D-NoC) implementations. 3D-NoC architectures have more advantages than their 2D counterparts: lower area, higher efficiency and performance, and lower energy consumption. However, they lack design automation algorithms. An important design problem is mapping a given application onto a 3D-NoC topology. In this thesis, we propose a heuristic mapping algorithm, called CastNet3D, for mesh-based 3D-NoCs. The algorithm tries to utilize vertical links for communicating nodes as much as possible, since they are faster and less energy consuming than horizontal ones. A simulated annealing-based algorithm (SA3D) for the mapping problem is also proposed to compare the heuristic method with a metaheuristic method. CastNet3D has been compared against SA3D and two 2D-NoC algorithms on several benchmarks. The results show that CastNet3D obtains better mappings in terms of energy consumption most of the time, within a very short runtime.
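The sketch below evaluates a core-to-tile mapping on a 3D mesh by estimating communication energy from hop counts, with vertical (TSV) hops weighted lower than horizontal ones; the energy coefficients and the example task graph are illustrative assumptions, not the thesis' actual cost model.

    # Illustrative communication-energy estimate for a 3D mesh NoC mapping.
    def comm_energy(mapping, traffic, e_hor=1.0, e_ver=0.3):
        """mapping: task -> (x, y, layer); traffic: {(src, dst): bits}."""
        total = 0.0
        for (src, dst), bits in traffic.items():
            xs, ys, zs = mapping[src]
            xd, yd, zd = mapping[dst]
            hops_h = abs(xs - xd) + abs(ys - yd)   # horizontal mesh hops
            hops_v = abs(zs - zd)                  # vertical TSV hops (cheaper)
            total += bits * (hops_h * e_hor + hops_v * e_ver)
        return total

    traffic = {("t0", "t1"): 120, ("t1", "t2"): 80, ("t0", "t2"): 40}
    m1 = {"t0": (0, 0, 0), "t1": (0, 0, 1), "t2": (1, 0, 0)}  # uses a vertical link
    m2 = {"t0": (0, 0, 0), "t1": (1, 0, 0), "t2": (2, 0, 0)}  # purely horizontal
    print(comm_energy(m1, traffic), comm_energy(m2, traffic))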
(2018) Ferhat KURT (PhD)
Analysis Of The Effects Of Hyperparameters In Convolutional Neural Networks
Advisor: PROF. DR. MEHMET ÖNDER EFE
[Abstract]
In this study, a literature review and an experimental study were carried out on the hyperparameters that constitute the structure and working mechanism of convolutional neural networks, which gained popularity in image recognition with the 2012 ImageNet (ILSVRC) contest. In the study, an ILSVRC2012 training dataset consisting of 50 classes and 600 samples was used, different option values for the convolutional neural network hyperparameters were determined, and trainings were conducted on a deep learning setup consisting of client, parameter and evaluation servers hosted on supercomputers. Following these trainings, the model performances were evaluated through diagrams and charts, new hyperparameter values were created, and additional trainings were carried out. As a result of 410 separate trainings in total, it has been determined that preprocessing of the datasets, selecting the learning rate in accordance with the optimizer, batch normalization, and the use of dropout increase model performance.
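The sketch below shows the kind of hyperparameters discussed (learning rate, optimizer, batch normalization, dropout) in a small PyTorch CNN; the layer sizes and values are illustrative, not the settings examined in the thesis.

    # Illustrative CNN with the hyperparameters under discussion exposed.
    import torch
    import torch.nn as nn

    def make_cnn(num_classes=50, dropout=0.5):
        return nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),          # batch normalization
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Dropout(dropout),         # dropout before the classifier
            nn.Linear(64, num_classes),
        )

    model = make_cnn()
    # The learning rate and optimizer choice are themselves hyperparameters under study.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)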
(2018) Mustafa KARLI (PhD)
Air combat implementation using adaptive neuro fuzzy inference system
Advisor: PROF. DR. MEHMET ÖNDER EFE
[Abstract]
Autonomous control of an aerial vehicle is a non-linear problem that requires low-level robust control of many parameters. There are solutions for controlling an unmanned aircraft to follow a flight path, take off and land. In close-range air combat there are additional objectives, such as preserving aircraft energy, getting to an advantageous position over the opponent, controlling relative variables, and handling instantly changing trajectory requirements. This makes the control problem domain-specific and very complex. With fully autonomous aerial vehicles, mankind can also take advantage of using UAVs for air combat. While air combat is a dangerous, expensive and difficult activity, computers can be trained on the experience of human fighter pilots with machine learning techniques. Computers can also be used to support the pilot training process. In this thesis, a step-by-step methodology is proposed to train an unmanned aircraft to fight on behalf of a human. An abstraction stack is defined to isolate low-level robust control of the aircraft from flight intelligence. This approach allows us to focus on the air combat problem independently of flight control techniques. A technique is defined to decompose complicated and hard-to-process flight information into a machine- and human-readable, easily understandable format. This technique eases the processing of huge amounts of flight data, decreases the number of control parameters, and brings a common understanding of aircraft maneuvering. The technique also includes an indexing and search mechanism over the flight language. It is shown how to compose air combat maneuvers using the flight language. A machine learning corpus is composed from real F-16 flight information, including relative geometry and a maneuver identification methodology. The sample data is not a complete solution for training widely used combat maneuvers, but as a proof of concept it is shown that the technique works well for sample scenarios. A comparison of and brief information on machine learning techniques specific to the close-range air combat problem are given, and an ANFIS design is applied as an example.
(2018) Hüseyin TEMUÇİN (PhD)
Design Of Real-Time Scheduling And Communication Management Algorithms On Multicore Architecture
Advisor: DOÇ. DR. KAYHAN İMRE
[Abstract]
As processor architectures reach their physical boundaries, parallel systems have become mandatory in all computer systems used in commercial and academic processes. Parallel systems are computer systems in which a large number of processors are brought together in a topology to serve one purpose; that purpose may be solving an identified problem or distributing the tasks that a system provides. The processors in these systems can use distributed or shared resources, and in distributed-memory systems the processors share data among themselves over a system area network. In the general approach, the most basic performance measures of computer systems can be defined as accuracy and effectiveness. For a generalized system task, accuracy is measured only by the accuracy of the outputs, and it is acceptable to delay the completion of tasks according to the instantaneous load of the system. Real-time systems are specialized computer systems that operate on a timely basis, and all tasks on these systems are expected to be completed before their deadlines. Failure to complete a real-time task on time causes the system to operate improperly and may result in catastrophic consequences, depending on the system's domain. Real-time systems are used in many critical processes, primarily defense and health, and today changing trends are increasing the needs of such systems. However, the complexity of real-time systems with increasing and changing needs also increases the processing power they require. In this thesis study, a chip architecture and network-on-chip structure with multiple processors and distributed memory, compatible with real-time systems, is proposed, and a deterministic and predictable set of communication patterns to be operated on the proposed network structure is defined. In the study, communication processes are also considered real-time tasks, and a time-based task scheduling algorithm that manages both communication and processing tasks has been put forward by taking advantage of the predictable communication patterns. In the communication layer of the proposed system, new and promising technologies such as photonic networks-on-chip have been utilized. In order to measure the performance of the system, theoretical and simulation studies have been carried out on problems selected from real-time systems. The results of the study show that the proposed system architecture and the communication and management algorithms described above provide a suitable and high-performance approach for real-time systems.
(2018) Sinan GÖKER (MS)
Neural Text Normalization for Turkish Social Media
Advisor: YRD. DOÇ. DR. BURCU CAN
[Abstract]
Social media has become a rich data source for natural language processing tasks with its worldwide use; however, it is hard to process social media data directly in language studies due to its unformatted nature. Text normalization is the task of transforming noisy text into its canonical form. It generally serves as a preprocessing step for other NLP tasks that are applied to noisy text, and their success rates get higher when they are performed on canonical text. In this study, two neural approaches are applied to the Turkish text normalization task: a Contextual Normalization approach using distributed representations of words, and a Sequence-to-Sequence Normalization approach using encoder-decoder neural networks. As the conventional approaches applied to Turkish and other languages are mostly domain-specific, rule-based or cascaded, they are becoming less efficient and less successful due to the changing use of language in social media. The proposed methods therefore provide a more comprehensive solution that is not sensitive to language change in social media.
(2018) Esra ŞAHİN (MS)
Spam-ham e-mail classification using machine learning methods based on bag of words (BoW) technique
Advisor: YRD. DOÇ. DR. MURAT AYDOS
[Abstract]
Nowadays, we frequently use e-mail, one of the main communication channels in the electronic environment. It plays an important role in our lives for many reasons, such as personal communication, business-focused activities, marketing, advertising and education. E-mails make life easier by meeting many different types of communication needs; on the other hand, they can make life difficult when they are used outside their intended purposes. Spam e-mails are not only annoying to receivers but can also be dangerous for the receiver's information security. Detecting and preventing spam e-mails has become an issue in its own right. In this thesis, spam e-mails have been studied comprehensively and studies related to classifying spam e-mails have been investigated. Unlike the studies in the literature, in this study the texts of the links placed in the e-mail body are extracted and classified by machine learning methods using the Bag of Words technique. We analyzed the effect of different N-grams on classification performance and the success of different machine learning techniques in classifying spam e-mail by using accuracy, F1 score and classification error metrics. In addition, the effect of different N-grams is examined for machine learning methods with a success rate of over 95%. As a result of the study, it has been seen that Decision Tree algorithms show low success in spam classification, while Bayes, Support Vector Machines, Neural Networks and Nearest Neighbor algorithms show high success. Furthermore, 5-grams were found to provide the best contribution to performance.
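A minimal sketch of bag-of-words n-gram classification with scikit-learn along the lines described above is given below; the tiny inline dataset and the character 5-gram choice are illustrative assumptions, not the thesis' corpus or exact feature setup.

    # Illustrative BoW n-gram spam/ham classification of link texts.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    link_texts = ["click here to claim your prize", "monthly project status report",
                  "cheap meds no prescription", "meeting agenda for thursday"]
    labels = ["spam", "ham", "spam", "ham"]

    # Character 5-grams over the link text as a bag-of-words representation.
    model = make_pipeline(
        CountVectorizer(analyzer="char_wb", ngram_range=(5, 5)),
        MultinomialNB(),
    )
    model.fit(link_texts, labels)
    print(model.predict(["claim your free prize now"]))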
(2018) Feyza Nur KILIÇASLAN (MS)
A Bayesian Q-learning based approach to improve believability of FPS game agents
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
Many organizations have started to change their working process from plan-driven to agile in order to leverage the benefits of agile transformation, so how adopting agile methodologies affects software organizations is frequently discussed. Measuring the effects of agile transformation is important for evaluating and understanding to what extent agile methods contribute to software organizations. In this study, a set of information needs and metrics designed to measure the effects of agile transformation in a medium-sized software organization that has been going through such a transformation is described and applied in the organization, and the measurement results are shared. While defining the basis of the measurement, related studies in the literature are first examined, and metrics derived from these studies are grouped by the measured entities of business, process, product, and resource. The information needs for measuring the effects of agile transformation are determined under the guidance of the ISO 15939 standard for the software measurement process and are then aligned with the metrics compiled from the literature. Information indicators and associated measurement constructs (derived and base metrics, measurement functions, etc.) that enable measuring the impacts of agile transformation are described from the business, process, product and resource perspectives. As a result, the effects of agile transformation in the software organization are measured using these information needs, and the measurement results are evaluated. The evaluation results show that the agile transformation affected the software organization positively in general.
(2018) Osman YILMAZ (MS)
A bayesian q-learning based approach to improve believability of fps game agents
Advisor: YRD. DOÇ. DR. UFUK ÇELİKCAN
[Abstract]
One of the goals of modern game programming is adapting life-like characteristics and concepts into games. This approach is adopted to offer game agents that exhibit more engaging behavior. Methods that prioritize reward maximization cause the game agent to fall into the same patterns and lead to a repetitive gaming experience, as well as reduced playability. In order to prevent such repetitive patterns, we explore a behavior algorithm based on Q-learning with a Naïve Bayes approach. The algorithm is validated in a formal user study against a benchmark. The results of the study demonstrate that the algorithm outperforms the benchmark and that the game agent becomes more engaging as the amount of gameplay data, from which the algorithm learns, increases.
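The sketch below shows plain tabular Q-learning with an ε-greedy action choice, the building block on which the Bayesian variant described above is built. It is a minimal sketch: the state and action names, learning rate and exploration settings are assumptions for illustration, and the Naïve Bayes component of the thesis is not reproduced here.

```python
# Minimal tabular Q-learning sketch (assumed parameters; the thesis combines
# this kind of update with a Naive Bayes model to pick less repetitive actions).
import random
from collections import defaultdict

ACTIONS = ["attack", "take_cover", "flank", "retreat"]   # hypothetical FPS actions
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

Q = defaultdict(float)  # (state, action) -> value

def choose_action(state):
    # epsilon-greedy: explore sometimes, otherwise exploit the best known action
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # standard Q-learning backup
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# toy interaction: one step of experience
s = "low_health_enemy_near"
a = choose_action(s)
update(s, a, reward=-1.0, next_state="low_health_enemy_far")
```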
(2018) Bahar GEZİCİ (MS)
Quality in the evolution of mobile applications
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
Mobile applications are becoming complex software systems as they rapidly evolve and grow constantly to meet user requirements. However, satisfying these requirements may lead to poor design choices known as 'antipatterns' that can degrade software quality and performance. Therefore, perceiving and monitoring the characteristics of mobile applications are important activities to facilitate maintenance and development, so that developers are directed to restructure their practices and upgrade their qualifications. The quality of mobile applications is of great importance for developers, users and application stores. Although there are a number of mobile-application-specific techniques for analysing the quality of a mobile software product, there is no accepted and valid way to predict the potential success of a mobile application in a real app store. Thus, monitoring the quality of ever-evolving mobile software throughout its evolution has become an attractive problem to investigate. In this thesis, we aim to monitor quality in the evolution of open source mobile applications in the light of existing studies in the literature. For this purpose, the development of internal (code-based) and external (community-based) quality features of open source mobile applications has been evaluated with an exploratory approach. For the code-based evaluation, object-based design metrics are measured based on the most recent quality model, ISO/IEC 25010. For the community-based evaluation, a number of community-based metrics extracted from data repositories such as GitHub and SourceForge have been analyzed based on the DeLone and McLean model, which is a guideline for measuring the success of open source software. As a result of the analysis, it was observed that the internal quality generally increased across releases, while the external quality decreased. In addition, the relationship between the internal and external qualities of the applications was analyzed by Spearman correlation analysis and no significant relation was observed between them.
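The following sketch illustrates the final correlation step mentioned above: checking whether internal and external quality scores move together across releases using Spearman's rank correlation. It is a minimal sketch assuming SciPy; the per-release scores are made-up illustrative numbers, not the thesis data.

```python
# Minimal sketch of the correlation step: do internal (code-based) and
# external (community-based) quality scores move together across releases?
from scipy.stats import spearmanr

internal_quality = [0.61, 0.64, 0.66, 0.70, 0.72, 0.75]   # e.g. from design metrics
external_quality = [0.80, 0.78, 0.74, 0.75, 0.71, 0.69]   # e.g. from repository data

rho, p_value = spearmanr(internal_quality, external_quality)
print(f"Spearman rho={rho:.2f}, p={p_value:.3f}")
# A high p-value would indicate no statistically significant relation,
# in line with the finding reported in the abstract.
```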
(2018) Burcu YALÇINER (MS)
Mapping scrum method building elements according to CMMI level 2 requirements within the scope of a case study
Advisor: YRD. DOÇ. DR. ADNAN ÖZSOY
[Abstract]
The rapid rate of change in information technology is causing increasing frustration with the heavyweight plans, specifications and other documentation imposed by contractual inertia and maturity model compatibility criteria. In order to solve these problems, most software organizations have begun to adopt Scrum, which is one of the most used agile software development methods. The use of Scrum as a software development process is beneficial for software organizations developing a new software development process based on both CMMI and Scrum. This thesis focuses on a software group which meets the definition of a small organization, working in a technology company whose product portfolio covers a variety of products including embedded components and remote monitoring systems for heavy machinery exports. A case study is presented wherein the organization has embarked on a software improvement process for one of its projects to conform to CMMI Level 2 requirements while simultaneously transitioning the same team to Scrum. The purpose of this thesis is to draw on a real-world use case to model a practical mapping between the building elements of Scrum and the goals and practices of CMMI Level 2, in order to demonstrate that the software development processes as defined by the Scrum team fulfill the requirements of a CMMI Level 2 project, and to be a good reference for practitioners in this area.
(2017) Burçak ASAL (MS)
A robust method to identify overlapping crowd motion patterns
Advisor: DOÇ. DR. İBRAHİM AYKUT ERDEM
[Abstract]
Due to recent advances in new camera technologies and the Internet, millions of videos can be easily accessed from any place at any time. A significant amount of these videos are for surveillance and include actors such as humans and vehicles performing different actions in dynamic scenes. The goal of this study is to analyze human crowd motions in videos. More specifically, moving humans are tracked throughout a video sequence, and the collective crowd motions are then clustered using path similarities via the Dominant Sets method. While obtaining path similarities, the paths are also subjected to several intermediate steps, such as extracting a sparse subset, obtaining the transitive closure between motions and selecting popular paths with respect to specific parameters. Hence, we ensure that we obtain pure and accurate structure information for each cluster. Moreover, we calculate a scalar value which represents coherency information for crowds in different scenes. We also compare our method with another state-of-the-art method and present the quantitative and qualitative results obtained from the comparison.
(2016) ABDURRAHMAN BAYRAK (MS)
Embedded point cloud processing for UAV applications
Advisor: PROF. DR. MEHMET ÖNDER EFE
[Abstract]
This study presents an FPGA-based synthesizable offline UAV local path planner implementation using Evolutionary Algorithms for 3D unknown environments. A Genetic Algorithm is selected as the path planning algorithm and all of its units are executed on a single FPGA board. In this study, the Nexys 4 Artix-7 FPGA board is selected as the target device and Xilinx Vivado 2015.4 software is used for synthesis and analysis of the HDL design. The local path planner is designed so that it has two flight modes: a free elevation flight mode and a fixed elevation flight mode. The designed FPGA-based local path planner, which exhibits 62% logic slice utilization, is tested in two different unknown environments generated by a LIDAR sensor. Results show that the genetic algorithm is an efficient path planning algorithm for UAV applications and that FPGAs are very suitable platforms for the flight planning periphery.
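To make the genetic-algorithm planning idea concrete, the host-side sketch below evolves waypoint sequences on a 2D occupancy grid. It is only an illustration of the GA loop: the thesis implements the algorithm in HDL on an FPGA for 3D LIDAR-generated maps, and the grid, GA parameters and fitness terms here are assumptions chosen for brevity.

```python
# Minimal host-side sketch of a GA path planner on a 2D occupancy grid
# (illustrative assumptions; not the FPGA/HDL implementation of the thesis).
import random

W, H = 20, 20
OBSTACLES = {(10, y) for y in range(3, 17)}          # a wall with gaps at both ends
START, GOAL = (1, 10), (18, 10)
N_WAYPOINTS, POP, GENS = 4, 40, 60

def random_path():
    return [(random.randrange(W), random.randrange(H)) for _ in range(N_WAYPOINTS)]

def fitness(path):
    pts = [START] + path + [GOAL]
    length, hits = 0.0, 0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        length += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
        # sample the segment and penalize cells that fall on obstacles
        for t in range(11):
            x = round(x1 + (x2 - x1) * t / 10)
            y = round(y1 + (y2 - y1) * t / 10)
            hits += (x, y) in OBSTACLES
    return length + 100.0 * hits        # shorter, collision-free paths score lower

population = [random_path() for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness)
    parents = population[:POP // 2]
    children = []
    while len(children) < POP - len(parents):
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, N_WAYPOINTS)
        child = a[:cut] + b[cut:]                     # one-point crossover
        if random.random() < 0.3:                     # mutation: jitter one waypoint
            i = random.randrange(N_WAYPOINTS)
            child[i] = (random.randrange(W), random.randrange(H))
        children.append(child)
    population = parents + children

best = min(population, key=fitness)
print("best waypoints:", best, "cost:", round(fitness(best), 2))
```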
(2016) EZGİ ERTÜRK GÜLER (PhD)
Design and implementation of software fault prediction plugin by using soft computing methods: comprehensive metric assessment
Advisor: DOÇ. DR. EBRU SEZER
[Abstract]
Software Fault Prediction (SFP) means determining the faulty modules in a software project by considering some software features. The detection of fault-proneness of software modules is quite important because the modules which require testing and refactoring activities can be determined earlier. The major objective of this study is to provide an SFP implementation methodology that makes SFP more beneficial in the software development lifecycle. To achieve this, Fuzzy Inference Systems (FIS) are proposed as a first step as a novel solution approach for the SFP problem, and the FIS approach is applied to SFP for the first time here. Since the FIS method is a rule-based approach and does not require previously labeled data, it can easily be turned into a tool. In order to validate FIS on SFP, experiments are performed on different metric and data sets. According to the achieved results, using FIS proves to be a reliable solution for the SFP problem. Later, another novel solution method for the SFP problem is proposed: the Adaptive Neuro Fuzzy Inference System (ANFIS). Again, the first application of ANFIS to the SFP problem is presented in this thesis. The ANFIS method combines the advantages of Artificial Neural Networks (ANN) and FIS methods. Some comparative experiments are also performed to discuss the performance of ANFIS. According to the experimental results, while ANN has the best performance and SVM the worst, ANFIS is much more successful than SVM and is capable of competing with ANN. After these new solution suggestions for the SFP problem are presented, an iterative SFP methodology is proposed in order to integrate the SFP task into the software development phase of projects that are developed using agile approaches. According to the methodology, rule-based approaches like FIS are used for SFP when no labeled data are available; after labeled data have accumulated, data-driven approaches trained with previous versions are preferred to detect faults of the versions being developed. The preferred data-driven approaches for the proposed iterative methodology are the ANN and ANFIS methods. The methodology is tested on selected datasets from the PROMISE repository. The obtained results show that the iterative SFP methodology is a successful approach. In other words, if the SFP task is performed with this approach, it can run in parallel with the development phase and help the project team during development. In order to show that the proposed methodology can be used in practice, it is implemented as a plug-in and integrated into the Eclipse development environment.
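The sketch below illustrates why a rule-based fuzzy inference system needs no labeled training data: fault-proneness is inferred from hand-written rules over fuzzified metrics. It is a minimal Mamdani-style sketch; the metric names, membership shapes and rules are illustrative assumptions, not the rule base used in the thesis.

```python
# Minimal rule-based fuzzy inference sketch for fault-proneness
# (illustrative inputs and rules, not the thesis rule base).
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function."""
    return max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

def fault_proneness(loc, complexity):
    # fuzzify the inputs
    loc_high = tri(loc, 300, 1000, 1000)        # large modules
    loc_low = tri(loc, 0, 0, 400)
    cx_high = tri(complexity, 10, 30, 30)       # high cyclomatic complexity
    cx_low = tri(complexity, 0, 0, 15)

    # Mamdani rules: min for AND, rule strength clips the consequent set
    fire_high = min(loc_high, cx_high)          # IF loc high AND cx high THEN risk high
    fire_low = max(loc_low, cx_low)             # IF loc low OR cx low THEN risk low

    # aggregate and defuzzify by centroid over a discretized risk universe [0, 1]
    risk = np.linspace(0, 1, 101)
    high_set = np.minimum(fire_high, np.clip((risk - 0.5) / 0.5, 0, 1))
    low_set = np.minimum(fire_low, np.clip((0.5 - risk) / 0.5, 0, 1))
    agg = np.maximum(high_set, low_set)
    return float((risk * agg).sum() / (agg.sum() + 1e-9))

print(round(fault_proneness(loc=850, complexity=25), 2))   # higher risk score
print(round(fault_proneness(loc=120, complexity=4), 2))    # lower risk score
```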
(2016) GULSHAT KESSİKBAYEVA (PhD)
Example based machine translation system between kazakh and turkish supported by statistical language model
Advisor: PROF. DR. İLYAS ÇİÇEKLİ
[Abstract]
Example-Based Machine Translation (EBMT) is an analogy-based type of Machine Translation (MT), where translation is made according to an aligned bilingual corpus. Moreover, there are many different methodologies in MT, and hybridization between these methods is also possible; it focuses on combining the strongest sides of more than one MT approach to provide better translation quality. Hybrid Machine Translation (HMT) has two parts: a guided part and an information part. Our work is guided by EBMT, and a hybrid example-based machine translation system between the Kazakh and Turkish languages is presented here. Analyzing both languages at the morphological level and then constructing morphological processors is one of the most important parts of the system. The morphological processors are used to obtain the lexical forms of surface-level words and the surface-level forms of translation results produced at the lexical level. Translation templates are kept at the lexical level, and they translate a given source language sentence at the lexical level to a target language sentence at the lexical level. Our bilingual corpora hold translation examples at the surface level, and their words are morphologically analyzed by the appropriate morphological analyzer before they are fed into the learning module. Thus, translation templates are learned at the morphological level from a bilingual parallel corpus between Turkish and Kazakh. Translations can be performed in both directions using these learned translation templates. The system is supported by a statistical language model for the target language. Therefore, translation results are sorted according to both their confidence factors, which are computed using the confidence factors of the translation templates used in those translations, and the statistical language model probabilities of the translation results. Thus, the statistical language model of the target language is used in the ordering of translation results in addition to the translation template confidence factors, in order to obtain more precise translation results. Our main aim with this hybrid example-based machine translation system is to obtain more accurate translation results by using pre-gained knowledge from target language resources. One of the reasons we propose this hybrid approach is that monolingual language resources are more widely available than bilingual language resources. In this thesis, experiments show that we can rely on the combination of the EBMT and SMT approaches, because it produces satisfying results.
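A tiny sketch of the re-ranking idea described above: candidate translations are ordered by combining the confidence of the translation templates that produced them with a target-language model score. The simple product combination and the toy candidates are illustrative assumptions, not the thesis formula or data.

```python
# Minimal re-ranking sketch: template confidence combined with an LM probability
# (the product below is an illustrative combination, not the thesis formula).
def rerank(candidates):
    """candidates: list of (translation, template_confidence, lm_probability)."""
    scored = [(conf * lm_prob, text) for text, conf, lm_prob in candidates]
    return [text for _, text in sorted(scored, reverse=True)]

candidates = [
    ("ben okula gittim", 0.70, 0.020),   # fluent target sentence, good template
    ("ben okul gittim",  0.70, 0.004),   # same template, less fluent -> lower LM score
    ("okula ben gittim", 0.55, 0.015),
]
print(rerank(candidates))
```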
(2016) HANDAN GÜRSOY (MS)
Development of FPGA based automatic control systems
Advisor: PROF. DR. MEHMET ÖNDER EFE
[Abstract]
In this thesis, controller design is studied on a Field Programmable Gate Array (FPGA), which has the capabilities of reprogrammability and parallel processing. PID and SMC have been chosen as the controllers in this study, because many control studies have been based on PID for years and SMC is known as a robust control method. The PID and SMC controllers are applied to a cylindrical robot manipulator (RPP) whose nominal dynamic equation is known. To accomplish this, the Matlab Xilinx System Generator toolbox plays an important role in control design on an FPGA device. In this thesis, FPGA-based PD and SMC controllers are designed by using the Matlab Xilinx System Generator tool for the chosen robot system, and the results are compared with those generated in Matlab/Simulink. The Xilinx Artix-7 XC7A100T FPGA is selected as the target device and Vivado 2014.4 software is utilized for synthesis. The tracking performances of the presented control schemes, implemented in Matlab/Simulink and implemented on the FPGA, are compared. Robustness and good trajectory tracking performance of the system on the FPGA are demonstrated.
(2016) ÖMER MİNTEMUR (MS)
Analysis of attacks in vehicular ad hoc networks
Advisor: DOÇ. DR. SEVİL ŞEN AKAGÜNDÜZ
[Abstract]
Vehicular ad hoc networks are a subbranch of mobile ad hoc networks and an emerging area. A vehicular ad hoc network is a network type that enables cars to communicate with each other. Cars can send information about traffic and road conditions, which is critical for traffic safety. Despite having such advantages, one of the biggest disadvantages of this type of network is security. These networks, in which cars travel at high speeds and the network topology changes very dynamically, are exposed to attacks. In this thesis, the AODV routing protocol, which is widely used in mobile ad hoc networks, and the GPSR routing protocol, which uses the geographic locations of nodes for packet transmission, are used and their performances are analyzed under 4 different attacks. 35 cars are used in the simulations and every simulation takes 200 seconds. In every simulation, different connection patterns are used to obtain more reliable results. The blackhole attack, flooding attack, packet dropping attack and bogus information attack are implemented for both protocols. The number of attackers is increased in every simulation. Two maps with different densities are used, namely Munich city center (high density) and İstanbul Road (low density). Packet delivery ratio, throughput, end-to-end delay and overhead metrics are analyzed for both protocols. The simulation results show that, in a network under no attack, the AODV routing protocol performs better than GPSR in terms of packet delivery ratio in both maps. The results also show that, in a network with attackers, AODV is affected most by the blackhole attack, while GPSR is affected almost equally by each attack type. As a result, it is shown that both protocols are open to attacks and should be equipped with advanced detection systems against attacks.
(2016) MURAT ORUÇ (MS)
A graph mining approach for detecting design patterns in object-oriented design models
Advisor: YRD. DOÇ. DR. FUAT AKAL
[Abstract]
Object-oriented design patterns are frequently used in real-world applications. As design patterns are common solutions to the recurring problems which software developers are confronted with, they help developers to implement designs easily. Design patterns also promote code reusability and strengthen the quality of the source code. Therefore, the detection of design patterns is essential for comprehending the intent and design of a software project. This thesis presents a graph-mining approach for detecting design patterns. The detection process is based on searching for sub-graphs of the input design patterns in the model graph of the source code with an isomorphic sub-graph search method. Within the scope of this thesis, the 'DesPaD' (Design Pattern Detector) tool is developed for detecting design patterns. To implement the isomorphic search, the open-source sub-graph mining tool Subdue is used. The examples of the 23 GoF design patterns in the book "Applied Java Patterns" are detected, and some promising results are obtained in the JUnit 3.8, JUnit 4.1 and Java AWT open-source packages.
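To illustrate the isomorphic sub-graph search idea on which the detection rests, the sketch below matches a tiny pattern-like structure against a toy class/relationship graph. It is a minimal sketch assuming NetworkX; the thesis itself uses the Subdue tool, and the two graphs here are illustrative stand-ins for the source-code model graph and a design-pattern template.

```python
# Minimal isomorphic sub-graph search sketch, assuming NetworkX
# (the thesis uses the Subdue sub-graph mining tool instead).
import networkx as nx
from networkx.algorithms import isomorphism

# model graph of the code: nodes are classes, edges are typed relationships
code = nx.DiGraph()
code.add_edge("Button", "Command", kind="association")
code.add_edge("PasteCommand", "Command", kind="inherits")
code.add_edge("CopyCommand", "Command", kind="inherits")

# pattern graph: an invoker holding an abstract command with one concrete subclass
pattern = nx.DiGraph()
pattern.add_edge("Invoker", "AbstractCommand", kind="association")
pattern.add_edge("ConcreteCommand", "AbstractCommand", kind="inherits")

matcher = isomorphism.DiGraphMatcher(
    code, pattern,
    edge_match=lambda e1, e2: e1["kind"] == e2["kind"])

print(matcher.subgraph_is_isomorphic())            # True if the pattern occurs
for mapping in matcher.subgraph_isomorphisms_iter():
    print(mapping)                                  # code node -> pattern node mapping
```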
(2015) FARHAD SOLEİMANİAN GHAREHCHOPOGH (PhD)
Open domain factoid question answering system
Advisor: PROF. DR. İLYAS ÇİÇEKLİ
[Abstract]
Question Answering (QA) is a field at the intersection of Artificial Intelligence (AI), Information Retrieval (IR) and Natural Language Processing (NLP), and it leads to systems that automatically answer natural language questions in open and closed domains. Question Answering Systems (QASs) have to deal with different types of user questions. While the answers to some simple questions can be short phrases, the answers to more complex questions can be short texts. A question with a single short answer is known as a factoid question, and a question answering system that deals with factoid questions is called a factoid QAS. In this thesis, we present a factoid QAS that consists of three phases: question processing, document/passage retrieval, and answer processing. In the question processing phase, we consider a new two-level category structure and use machine learning techniques to generate search engine queries from user questions. Our factoid QAS uses the World Wide Web (WWW) as its corpus of texts and its knowledge base in the document/passage retrieval phase. It is also a pattern-based QAS that uses an answer pattern matching technique in the answer processing phase. We also present a classification of existing QASs. The classification contains early QASs, rule-based QASs, pattern-based QASs, NLP-based QASs and machine-learning-based QASs. Our factoid QAS uses a two-level category structure that includes 17 coarse-grained and 57 fine-grained categories. The system utilizes this category structure in order to extract answers to questions; the question set consists of 570 questions originating from TREC-8 and TREC-9 as the training dataset, and 570 other questions together with TREC-8, TREC-9, and TREC-10 questions as the testing datasets. In our QAS, the query expansion step is very important and affects the overall performance. When the original user question is given as a query, the number of retrieved relevant documents may not be sufficient. We present an automatic query expansion approach based on query templates and question types. New queries are generated from the query templates of question categories, and the category of a user question is found by a Naïve Bayes classification algorithm. New expanded queries are generated by filling gaps in the query templates with two appropriate phrases. The first phrase is the question type phrase, found directly by the classification algorithm. The second phrase is the question phrase, detected from possible question templates by a Levenshtein distance algorithm. Query templates for question types are created by analyzing possible questions of those question types. We evaluated our query expansion approach, with the two-level category structure, on the factoid question types included in the TREC-8, TREC-9 and TREC-10 conference datasets. The results of our automatic query expansion approach outperform the results of the manual query expansion approach. After automatically learning answer patterns by querying the web, we use answer pattern sets for each question type. Answer patterns extract answers from retrieved related text segments, and answer patterns can be generalized with Named Entity Recognition (NER). NER is a sub-task of Information Extraction (IE) in the answer processing phase and classifies terms in the textual documents into predefined categories of interest such as location names, person names, dates of events, etc. The ranking of answers is based on frequency counting and the Confidence Factor (CF) values of the answer patterns.
The results of the system show that our approach is effective for question answering: it achieves a Mean Reciprocal Rank (MRR) value of 0.58 for our corpus with the fine-grained category class, 0.62 MRR for the coarse-grained category structure, and 0.55 MRR when evaluated with the testing datasets on TREC-10. The results of the system have been compared with other QASs using standard measurements on TREC datasets.
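For reference, the sketch below computes the Mean Reciprocal Rank metric quoted above from the rank of the first correct answer per question; the example ranks are illustrative.

```python
# Minimal sketch of the Mean Reciprocal Rank (MRR) metric.
# Each entry is the 1-based rank of the first correct answer for one question,
# or None if no correct answer is returned.
def mean_reciprocal_rank(first_correct_ranks):
    total = 0.0
    for rank in first_correct_ranks:
        if rank is not None:
            total += 1.0 / rank
    return total / len(first_correct_ranks)

# illustrative ranks for five questions
print(mean_reciprocal_rank([1, 2, None, 1, 3]))   # (1 + 0.5 + 0 + 1 + 1/3) / 5 ≈ 0.57
```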
(2015) BEHZAD NADERALVOJOUD (MS)
Investigation of imbalance problem effects on text categorization
Advisor: DOÇ. DR. EBRU AKÇAPINAR SEZER
[Abstract]
Text classification is the task of assigning a document to one or more predefined categories based on an inductive model. In general, machine learning algorithms assume that datasets have an almost homogeneous class distribution. However, learning methods tend to produce classifiers that perform poorly on the minor categories when imbalanced datasets are used. In multiclass classification, major categories correspond to the classes with the largest number of documents and minor ones correspond to the classes with the lowest number of documents. As a result, text classification is a process that can be highly affected by the class imbalance problem. In this study, we tackle this problem using a category-based term weighting approach in combination with an adaptive framework and machine learning algorithms. The study first investigates two different types of feature selection metrics (one-sided and two-sided) as the global component of a term weighting scheme (called tffs) in scenarios with different complexities and imbalance ratios. tfidf, as a traditional term weighting scheme, is employed to evaluate the effects of the tffs term weighting approach. In fact, the goal is to determine which kinds of weighting schemes are appropriate for which machine learning algorithms in different imbalanced cases. Hence, four popular classification algorithms (SVM, kNN, MultiNB and C4.5) are used in the experiments. According to our results, leaving tfidf aside, term weighting methods based on one-sided feature selection metrics are more suitable for the SVM and kNN algorithms, while two-sided term weighting schemes are the best choice for the MultiNB and C4.5 algorithms on imbalanced texts. Moreover, the tfidf weighting method is more recommendable for the kNN algorithm in imbalanced text classification. Furthermore, two category-based functions, named PNF and PNF2, are proposed as the global component of the term weighting scheme. To better evaluate the proposed approaches against the existing methods, an adaptive learning process is proposed. This algorithm learns a model which strongly depends on the term weighting schemes and can clearly show the performance of different weighting methods in the classification of imbalanced texts. According to the experiments carried out on two benchmarks (Reuters-21578 and WebKB), the proposed methods yield the best results.
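The sketch below shows the general shape of a tf*fs weighting scheme: term frequency multiplied by a global factor derived from how the term is distributed over the categories. The simple odds-ratio-like factor is an illustrative stand-in, not the PNF/PNF2 functions proposed in the thesis.

```python
# Minimal sketch of a category-based tf*fs weighting scheme
# (the global factor below is an illustrative stand-in, not PNF/PNF2).
import math

def global_factor(tp, fp, pos_docs, neg_docs):
    """tp: positive docs containing the term, fp: negative docs containing it."""
    p_pos = (tp + 1) / (pos_docs + 2)      # smoothed P(term | positive class)
    p_neg = (fp + 1) / (neg_docs + 2)      # smoothed P(term | negative class)
    return math.log2(p_pos / p_neg)        # > 0 favors the positive (minor) class

def weight(tf, tp, fp, pos_docs, neg_docs):
    return tf * abs(global_factor(tp, fp, pos_docs, neg_docs))

# a term concentrated in the small positive class gets boosted
print(round(weight(tf=3, tp=40, fp=5, pos_docs=50, neg_docs=950), 2))
# a term spread evenly over both classes gets a weight near zero
print(round(weight(tf=3, tp=10, fp=190, pos_docs=50, neg_docs=950), 2))
```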
(2015) SERVET TAŞCI (MS)
Content based media tracking and news recommendation system
Advisor: PROF. DR. İLYAS ÇİÇEKLİ
[Abstract]
With the increasing use of the Internet in our lives, the amount of unstructured data, and particularly the amount of textual data, has increased dramatically. Considering that users access this data through the Internet, the reliability and accuracy of these resources stand out as a concern. Besides the multitude of resources, most resources have similar content, and it is quite challenging to read only the needed news among them in a short time. It is also necessary that the accessed resource really includes the required information and that this is confirmed by the user. Recommender systems assess different characteristics of the users, correlate the accessed content with the user, evaluate the content according to specific criteria, and recommend it to the user. The first recommender systems used simple content filtering features, but current systems use much more complicated calculations and algorithms and try to correlate many characteristics of the users and the data. These improvements have allowed recommender systems to be used as decision support systems. This thesis aims at collecting data from textual news resources, classifying and summarizing the data, and recommending the news by correlating it with the characteristics of the users. Recommender systems mainly use three methods: content-based filtering, collaborative filtering, and hybrid filtering. In our system, content-based filtering is used.
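As a small illustration of content-based filtering, the sketch below represents news items and a user's reading history as TF-IDF vectors and ranks unread items by cosine similarity to the averaged profile. It is a minimal sketch assuming scikit-learn; the toy headlines and the mean-profile choice are assumptions for illustration, not the thesis pipeline.

```python
# Minimal content-based news recommendation sketch, assuming scikit-learn.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

news = ["central bank raises interest rates",
        "football team wins championship final",
        "new tax regulation for small businesses",
        "star striker transfers to rival club"]
read_by_user = [0, 2]                      # indices of articles the user has read

vec = TfidfVectorizer()
X = vec.fit_transform(news)

# user profile = mean of the vectors of previously read articles
profile = np.asarray(X[read_by_user].mean(axis=0))
scores = cosine_similarity(profile, X).ravel()

unread = [i for i in range(len(news)) if i not in read_by_user]
ranked = sorted(unread, key=lambda i: scores[i], reverse=True)
print([news[i] for i in ranked])           # economy-related items should rank first
```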
(2015) EMİNE GÜL DANACI (MS)
Analyzing the effects of low-level features for visual attribute recognition
Advisor: YRD. DOÇ. DR. NAZLI İKİZLER CİNBİŞ
[Abstract]
In recent years, visual attributes have become a popular topic of computer vision research. Visual attributes are being used in various tasks including object recognition, people search, scene recognition, and so on. In order to encode visual attributes, a commonly applied procedure for supervised learning of attributes is to first extract low-level visual features from the images. Then, an attribute learning algorithm is applied and visual attribute models are formed. In this thesis, we explore the effects of using different low-level features on learning visual attributes. For this purpose, we use various low-level features which aim to capture different visual characteristics, such as shape, color and texture. In addition, we also evaluate the effect of the recently evolving deep features on the attribute learning problem. Experiments have been carried out on four different datasets, which were collected for different visual recognition tasks, and extensive evaluations are reported. Our results show that, while using supervised deep features is effective, using them in combination with low-level features is more effective for visual attribute learning.
(2015) AYDIN KAYA (PhD)
Characterization of lung nodules with computer aided diagnosis system
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
Lung cancer is one of the leading causes of cancer-related deaths worldwide, especially in industrially developed countries. Major problems in diagnosis are caused by the high volume of visual data and the difficulty of detecting solitary/small pulmonary nodules. Computer-aided diagnosis systems are expert systems that assist radiologists on these issues. In this thesis, classification approaches for predicting the malignancy of solitary pulmonary nodules are presented. The publicly available Lung Image Database Consortium (LIDC) database, provided by the USA National Cancer Institute, is used in the study. LIDC contains malignancy and nodule characteristic evaluations by radiologists from four different institutions. The goal of this thesis is to examine the usefulness of radiographic descriptors in malignancy prediction. Dataset balancing approaches are used to address the unbalanced class distribution of the LIDC database. The classification methods basically consist of two phases. In the first phase, radiographic descriptors are determined from low-level image features; in the second phase, malignancy is predicted from these descriptors. Single classifiers, ensemble classifiers, fuzzy logic based and rule based methods are used in the classification steps. The results are compared with prominent studies in the literature and with single classifiers trained on image features, in terms of classification accuracy, specificity and sensitivity measures. The obtained results indicate that radiographic descriptors contribute to malignancy prediction. Moreover, the majority of the presented methods' results are successful and comparable with the methods in the literature.
(2015) RANA GÖNÜLTAŞ (MS)
Run-time measuring of cosmic functional size via measurement code instrumentation into java business applications
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
With the rapid development of information technologies in the world, measuring the functional size of software is an important issue that must be considered in the management of software projects. Functional size measurement provides a solid ground throughout software projects to estimate planning parameters and track progress. However, when functional size measurement is made manually, it is time-consuming and costly. Moreover, providing measurement from early development to the end and measuring accurately become difficult for complicated projects, and measurement results may differ from person to person. For these reasons, automating the measurement process has gained importance. In this study, the aim is to measure the COSMIC functional size of three-tier Java business applications automatically. For this purpose, measurement is done at run-time according to user scenarios by automatically instrumenting the created Measurement Library methods into the application source code. The proposed procedure is tested by measuring the functional size of a system that is actively used by a government organization in our country. To assess the accuracy of the automatic measurement, manual measurement is also performed for the application. We report that the functional sizes measured manually and automatically were 96% convergent and that the automatic measurement took 10 minutes, which was 1/27 of the manual measurement effort. Because the developed model is a rare study in its field in terms of the method used for automatic functional size measurement of software, it is expected to provide a basis for further studies. In future studies, the proposed model may be extended and used for different types of systems in a more extensive scope.
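The sketch below illustrates the counting idea behind COSMIC measurement: each distinct data movement (Entry, eXit, Read, Write) of a data group within a functional process contributes one COSMIC Function Point. The event-trace format and the de-duplication rule are simplifying assumptions for illustration; the thesis collects such events at run-time by instrumenting measurement-library calls into the Java application.

```python
# Minimal sketch of COSMIC-style counting over a run-time trace
# (simplified illustration; trace format is an assumption).
from collections import Counter

COSMIC_TYPES = {"E", "X", "R", "W"}        # Entry, eXit, Read, Write

def functional_size(trace):
    """trace: list of (functional_process, movement_type, data_group) events."""
    seen = set()
    for process, movement, data_group in trace:
        if movement in COSMIC_TYPES:
            # each distinct (movement type, data group) counts once per process
            seen.add((process, movement, data_group))
    return Counter(process for process, _, _ in seen)

trace = [("CreateOrder", "E", "order"), ("CreateOrder", "R", "customer"),
         ("CreateOrder", "W", "order"), ("CreateOrder", "X", "confirmation"),
         ("CreateOrder", "R", "customer"),         # duplicate read, counted once
         ("ListOrders", "E", "filter"), ("ListOrders", "R", "order"),
         ("ListOrders", "X", "order_list")]

print(functional_size(trace))   # CreateOrder: 4 CFP, ListOrders: 3 CFP
```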
(2015) HASAN TUĞRUL ERDOĞAN (MS)
Sparsity-based discriminative tracking with adaptive cue integration
Advisor: YRD. DOÇ. MEHMET ERKUT ERDEM; YRD. DOÇ. İBRAHİM AYKUT ERDEM
[Abstract]
In this thesis, we present a novel tracking method which does not need a target model for tracking. The proposed tracker combines a sparsity-based discriminative classifier with an adaptive scheme for multiple cue integration. In particular, our model combines visual cues by using reliability scores, which are calculated dynamically at each frame during tracking with respect to the current temporal and visual context. These reliability scores are used to determine the contribution of each cue within the sparsity-based framework in the estimation of the joint tracking result. As a consequence, our method performs better in overcoming occlusions, pose changes and appearance changes. To show the effectiveness and performance of our algorithm, we report quantitative and qualitative results on video sequences with challenging conditions.
(2015) FADİME İLİSULU (MS)
Development of a self-assessment tool for business process maturity
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
Business processes play an important role in achieving the business goals of an organization and delivering quality products. Business process maturity assessment is not widely known yet, but studies published in recent years indicate that the issue has drawn attention. For increasing awareness and implementation, as well as information on maturity models and assessment methods, the existence of tools to support this information sharing and evaluation is also important. The models referenced for business process maturity define the required knowledge areas and practices, but they do not identify the current situation or provide sufficient guidance for organizations to mature their business processes. The assessment methods used to evaluate the maturity of processes remain abstract for a self-assessment that can be implemented easily by an organization. At this point, a self-assessment tool that can be used practically by an organization is required to support process maturity. In this thesis, the aim is to develop a self-assessment tool considering the problems encountered in the management of business processes and the current business process maturity models. Models proposed for business process maturity and assessment methods were examined in detail. Based on the obtained data, the necessary characteristics for business process assessment were defined. In the next step, the assessment tools in the scientific literature were analyzed according to criteria we set, and the tools were compared on the basis of these criteria. The strengths and weaknesses of the assessment tools in the literature were identified, and then the main features of the tool to be developed were determined. According to the results, a self-assessment tool that can be used practically by assessors has been developed. The developed tool was tested in case studies in two organizations, and necessary updates were made. The results of the pilot studies are an incentive for the use of the tool to evaluate business processes. Due to the lack of adequate studies in the literature, this tool developed for the assessment of business process maturity is expected to form a basis for future studies in this area. The development of a Turkish version of the tool, the repetition of pilot studies in various business areas, improvements where required, and improving the functionality are among the works planned for the future.
(2015) DILMUROD VAHABDJANOV (PhD)
Use of semantic model for access control and management in context-oriented role-based authorization: A healthcare case study
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
In today's Information Technology (IT) world, security has become a very important and complicated problem. The possibility of unauthorized access to protected knowledge results in unexpected damage and outcomes. In this context, developing secure, flexible and easy-to-implement rules and policies has become a high priority among security researchers and engineers. A literature review on IT security shows that there have been research activities involving various access control models, each with certain superiorities based on different approaches. Among the proposed models, the most important one for secure access control operations is the role-based access control model. In addition, different new approaches have been revealed for role-based data processing and sharing mechanisms over distributed structures, the purpose being to speed up the processes of securely accessing knowledge. At the beginning of the 90's, the pervasive computing approach was suggested. The fundamental purpose of this approach is a secure knowledge sharing capability enabling time- and space-independent service utilization from every point through a framework of new-generation intelligent IT applications. One of the important aspects of pervasive computing is context-based systems. The main concept of context-based systems is to adapt behavioral processes by detecting environmental conditions. With today's modern technologies, access to and sharing of knowledge from every place and at every time by individual users necessitates new security requirements, and this has become very important. In this context, access to knowledge through adaptive services and intelligent systems must be secured and controlled by an effective access control system. In this thesis, context-based access control models developed through policy-based approaches are compared. Additionally, one of the aims of the thesis research is the design of a conceptual authorization model for medical processes and clinical applications. The proposed model uses context-based knowledge sharing through the web and "context-based security" integrated with web-based knowledge sharing. Consequently, instead of a fixed authorization model produced through defined roles and rules, a more flexible and dynamic structure based on situational evaluation is considered. This consideration produces a new authorization model reflecting the issues stipulated above.
(2015) EZGİ EKİZ (MS)
A multi-instance based learning system for scene recognition
Advisor: YRD. DOÇ. DR. NAZLI İKİZLER CİNBİŞ
[Abstract]
Scene recognition is a frequently-studied topic of computer vision. The aim in scene recognition is to predict the general environment label of a given image. Various visual elements contribute to the characterization of a scene, such as its spatial layout, the associated object instances and their positions. In addition, due to the variations in photographic arrangements, similar scenes can be photographed from quite different angles. In order to capture such intrinsic characteristics, in this thesis, we introduce a multi-region classification approach for scene recognition. For this purpose, we first introduce a novel way of extracting large image regions, which are expected to be representative and possibly shared among the images of a scene. We utilize these candidate image regions within a multiple instance learning framework. In this way, we aim to capture the global structure of a given scene. This global representation is then combined with a local representation, where local structures are encoded using a discriminative parts approach. Furthermore, we use recently popular deep network structures to represent our large regions and encode these via both multiple instance learning and VLAD representation. In order to merge information from both global and local characteristics and also from different encodings, a supervised late fusion method is performed and shown to capture complementary information in the experiments performed on commonly used scene recognition datasets MIT-Indoor, 15-Scenes and UIUC-Sports.
(2015) ETHEM ARKIN (PhD)
Model-driven software development for mapping of parallel algorithms to parallel computing platforms
Advisor: YRD. DOÇ. DR. KAYHAN MUSTAFA İMRE; PROF. DR. BEDİR TEKİNERDOĞAN
[Abstract]
The current trend shows that the number of processors used in computer systems is dramatically increasing. By the year 2020, it is planned that supercomputers will have hundreds of thousands of processing units to compute at the exascale level. The need for high performance computing, together with this trend from single processor to parallel computer architectures, has leveraged the adoption of parallel computing. To benefit from parallel computing power, parallel algorithms are usually defined that can be mapped onto and executed on these parallel computing platforms. For small computing platforms with a limited number of processing units, the mapping process can be carried out manually. However, for large-scale parallel computing platforms such as exascale systems, the number of possible mapping alternatives increases dramatically and the mapping process becomes intractable. Therefore, an automated approach to derive feasible mappings and generate target code must be defined. In this thesis, a model-driven software development approach for mapping parallel algorithms to parallel computing platforms is provided. The approach includes several activities for modeling the algorithm decomposition and the parallel computing platform, defining the reusable assets for modeling the mapping of the algorithm to the parallel computing platform, generating feasible mappings, and generating and deploying the final code. For modeling to be possible, a metamodel for parallel computing is defined and architecture viewpoints are adopted from the metamodel. The approach is evaluated using well-known parallel algorithms.
(2015) NİCAT SÜLEYMANOV (MS)
Developing a platform that supplies processed information from internet resources and services
Advisor: PROF. DR. İLYAS ÇİÇEKLİ
[Abstract]
Every day, increasing information resources make it harder to reach the needed piece of information; users do not want 10 billion results from search engines, they prefer the 10 best matched answers, and when it exists they prefer the single right answer. In this research, we present a Turkish question answering system that extracts the most suitable answer from internet services and resources. During question analysis, the question class is determined, certain expressions are predicted from the lexical and morphological properties of the words in the question, and our two-stage solution approach tries to obtain the answer. Furthermore, to increase the success rate of the system, the WordNet platform is used. In the information retrieval process, the system works over documents using semantic web information instead of documents retrieved by a classic search engine. In order to reach the needed information easily among ever-increasing resources, Tim Berners-Lee's idea of the semantic web is used in this research. DBpedia extracts structured information from Wikipedia articles, and this structured information is accessible on the web. In our research, the subject-predicate-object triples matched with the asked question are formulated to obtain the answer in Turkish; for searching and obtaining the Turkish equivalent of the information, the Wikipedia Search API and the Bing Translate API are used.
(2014) Alaettin UÇAN (MS)
Automatic Sentiment Dictionary Translation and Using In Sentiment Analysis
Advisor: PROF. DR. HAYRİ SEVER; DOÇ. DR. EBRU AKÇAPINAR SEZER
[Abstract]
People want to make decisions in their daily and seasonal activities by referring to other people's emotions, experiences or opinions. This tendency also means transfer of experience and expansion of communal memory. For instance, making a decision about a product or service to buy, a movie to watch or a candidate to elect can be counted in this manner. Other people's opinions on a particular subject are usually important, and learning them normally requires having a conversation with that person or reading his/her writings. On the other hand, accessing writings is relatively easier. Moreover, accessing the massive content produced by crowds has become much easier via today's web applications and social media. However, as the volume of this content rises, extraction and assessment of the focused content becomes impossible by human effort in a plausible time. This point constitutes the main motivation of sentiment analysis, besides content retrieval systems, text mining and natural language processing. Sentiment analysis is the automatic detection of emotion in textual contents (e.g. documents, comments, e-mails) by means of a computer. It is sometimes named opinion mining or sentiment classification. The emotion being detected can contain the author's mood and his/her ideas about the subject; furthermore, it can involve the emphasis points or the effect he wants to create. The sentiment analysis studies of the last 15 years can be classified into two main groups: (1) "dictionary and collection" based and (2) "statistical or machine learning" based. For Turkish, in contrast to well-known machine learning methods, it has been observed that there exists no study using a sentiment dictionary. Thus, the target is to perform sentiment analysis by directly utilizing a dictionary. As there exists no Turkish dictionary designed for constituting inter-relations between meanings, and starting with the idea that "sentimental expressions are universal", an English sentiment dictionary has been automatically translated to Turkish. The main contribution of this study is to provide sentiment analysis with no prior training data and no field dependence; furthermore, it requires low complexity. The completeness of the dictionary is kept out of the scope of this study. The success of the translations and the feasibility of the created dictionary were evaluated with experiments. The same experiments were also conducted with machine learning methods in order to compare results. The results generated by use of the sentiment dictionary were compared with both machine learning based methods and the results of dictionary studies in other languages. As a result, the proposed method is assessed as successful.
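The sketch below shows the basic mechanics of dictionary-based sentiment scoring: summing the polarity values of words found in a sentiment lexicon and labeling the text by the sign of the total. The tiny Turkish lexicon, its scores and the simple negation rule are illustrative assumptions, not the automatically translated dictionary of the thesis.

```python
# Minimal dictionary-based sentiment scoring sketch (illustrative lexicon).
SENTIMENT_DICT = {
    "harika": 0.9, "güzel": 0.7, "iyi": 0.5,        # positive entries
    "kötü": -0.6, "berbat": -0.9, "sıkıcı": -0.5,   # negative entries
}
NEGATORS = {"değil"}                                 # simple negation handling

def sentiment(text):
    tokens = text.lower().split()
    score = 0.0
    for i, tok in enumerate(tokens):
        value = SENTIMENT_DICT.get(tok, 0.0)
        # flip polarity if the next token is a negator ("güzel değil" = not nice)
        if i + 1 < len(tokens) and tokens[i + 1] in NEGATORS:
            value = -value
        score += value
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("film harika ve oyunculuk güzel"))        # positive
print(sentiment("senaryo güzel değil ve final berbat"))   # negative
```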
(2014) RAHEM ABRİ ZANGABAD (MS)
A new metric for adaptive routing in mobile ad hoc networks
Advisor: YRD. DOÇ. SEVİL ŞEN AKAGÜNDÜZ
[Abstract]
Mobile Ad-Hoc Networks (MANETs) have become very popular for military applications and disaster recovery operations, in which the fixed network infrastructure might not be available due to wars, natural disasters, and the like. One of the main research challenges in mobile ad hoc networks is designing adaptive, scalable and low-cost routing protocols for these highly dynamic environments. In this thesis, we propose a new metric called the hop change metric in order to represent the changes in the network topology due to mobility. The hop change metric represents the changes in the number of hops in the routing table. It is believed that the change in hop count is a good representative of mobility: a large change in hop counts can be a sign of high mobility. This metric is implemented in two popular routing protocols. The hop change metric is first employed in the most popular reactive protocol, AODV (Ad hoc On-Demand Distance Vector Routing). This approach is called LA-AODV (Lightweight Adaptive AODV). The main goal of LA-AODV is selecting a route with a low degree of mobility. LA-AODV uses the hop change metric to select better routes among valid route reply packets. By reflecting the change in the network, the hop change metric helps to select a stable route to the destination. The results show that LA-AODV enhances performance on all performance metrics; there are significant improvements over the original AODV in terms of packet delivery ratio, end-to-end delay, network overhead and dropping rate. Secondly, we focus on proactive protocols, especially the DSDV (Destination-Sequenced Distance Vector Routing) protocol, and aim to adapt the periodic update time in this protocol. We determine a threshold value based on this metric in order to decide the full update time dynamically and cost-effectively. The proposed approach, called LA-DSDV (Lightweight Adaptive DSDV), is compared with the original DSDV and ns-DSDV. Simulation results show that our threshold-based approach improves the packet delivery ratio and the packet drop rate significantly with a reasonable increase in the end-to-end delay. The hop change metric shows clear potential for representing changes in both proactive and reactive routing protocols.
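To make the idea of a hop-change-style metric concrete, the sketch below compares two snapshots of a routing table and counts how much the hop counts changed, as a rough proxy for topology change caused by mobility. The table format and the way appearing/disappearing routes are counted are illustrative assumptions, not the exact definition used in the thesis.

```python
# Minimal sketch of a hop-change-style metric between routing table snapshots
# (illustrative simplification of the thesis metric).
def hop_change(previous_table, current_table):
    """Tables map destination -> hop count."""
    destinations = set(previous_table) | set(current_table)
    change = 0
    for dst in destinations:
        old = previous_table.get(dst)
        new = current_table.get(dst)
        if old is None or new is None:
            change += 1                      # route appeared or disappeared
        else:
            change += abs(new - old)         # hop count of an existing route changed
    return change

before = {"A": 2, "B": 3, "C": 1}
after = {"A": 4, "B": 3, "D": 2}             # C lost, D gained, A grew by 2 hops
print(hop_change(before, after))             # 4 -> a sign of noticeable mobility
```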
(2014) FIRAT AKBA (MS)
Assessment of feature selection metrics for sentiment analysis: Turkish movie reviews
Advisor: DOÇ. DR. EBRU AKCAPINAR SEZER
[Abstract]
Achievements in Internet services and the increase in the number of Internet users have moved our daily Internet activities to an upper level. People can easily access what they need via a simple search on the Internet; these achievements even enable users to query information on the Internet. Most of the information presented on the Internet is open for feedback. User feedback has been captured from polls and forum web sites to be analyzed and to produce new ideas. In fact, it is hard to analyze such feedback manually in a short time, due to the huge amount of Internet users' reviews and the processing time and effort required to evaluate them. The concept of sentiment analysis emerged to solve the problems that occur while classifying opinions, by separating positive and negative reviews; in other words, it was introduced for assessing these reviews by classifying them into "positive" and "negative". In this thesis, sentiment analysis methods are investigated by considering their success rates. Based on several experiment results, we try to develop a system that answers in a short time and needs less human effort. The data used in the thesis were commented and rated by users of a Turkish movie review web site; the reviews were rated on a scale from '0.5' to '5.0' points. Feature selection metrics have been used frequently in the field of statistics. Upon gathering the data, it was investigated how using the discriminative capability of SVM together with feature selection metrics over comments of various categories contributes to SVM's success. In the proposed system design, an F1 score of 83.9% is obtained when classifying only positive and negative reviews. When classifying positive, negative and neutral reviews, an F1 score of 63.3% is reached. Based on the findings in the literature review, the proposed system design proves that feature selection metrics can be used successfully in sentiment analysis. We believe this proposed system design will bring a new perspective to the field of sentiment analysis.
(2014) BEGÜM MUTLU (MS)
A method suggestion for transition of fuzziness between sublayers in hierarchical fuzzy systems
Advisor: DOÇ. DR. EBRU SEZER; YRD. DOÇ. DR. HAKAN AHMET NEFESLİOĞLU
[Abstract]
Hierarchical Fuzzy Systems are commonly used where performing the fuzzy logic approach with only one fuzzy system is inapplicable for complex problems with a great number of input variables. This complexity is related to both the computational cost and the challenging creation of fuzzy rules. In order to overcome these concerns, the high-dimensional single fuzzy system is separated into lower-dimensional sub-systems, and these sub-systems are linked by utilizing different design strategies. During conventional Mamdani-style hierarchical inference, the inference stages are applied to each sub-system and the resulting crisp output is transferred to the higher layer. The crisp value in question is fuzzified again in the subsequent sub-system. Nevertheless, the redundantly repeated defuzzification and fuzzification stages cause data loss, since each defuzzification-fuzzification pair degrades the fuzziness level of the transferred information. This situation prevents obtaining the same outputs as a single fuzzy system. In addition, it is not resistant to revisions of the hierarchical design strategy: any alteration in the hierarchical structure causes the system to provide different outputs even though the values of the input variables are not changed. Therefore, while utilizing a hierarchical system, the system's accuracy and stability are compromised. In this study, it is emphasized that the data transmission during the conventional hierarchical inference flow is inaccurate, and a new hierarchical flow, namely the Defuzzification-Free Hierarchical Fuzzy Inference System, is proposed. In this approach, the defuzzification stages are eliminated from the inference flow in the inner layers and the aggregation result is directly transferred to the upper layer. Since the input of the subsequent sub-system is already fuzzy, the fuzzification stage is also removed due to its redundancy. Thus, the fuzzy information is propagated accurately from the first layer to the topmost layer. The experiments are carried out using different scenarios: logical cases containing 'AND' and 'XOR', and Rock Mass Rating calculation. In these experiments, the single fuzzy system and three types of hierarchical fuzzy systems are implemented in order to bring the most accurate solution to these scenarios. The comparisons between the hierarchical flows are made by using the outputs of the single fuzzy system as reference points, because the most significant requirement of a hierarchical system is to provide outputs as close as possible to those of the related single fuzzy system. Results show that the most accurate data transmission is obtained independently of the hierarchical design strategy by using the proposed method, since the behavior closest to the single fuzzy system is achieved by this type of hierarchical inference.
(2014) NAEEM YOUSİR (PhD)
Virtual private multimedia network published as saas (software as a service) in cloud computing environment
Advisor: DOÇ. DR. EBRU SEZER
[Abstract]
This work is dedicated to the design and implementation of private and interactive multimedia control over the cloud. The design and implementation are accomplished using concepts and tools of the latest software technologies (i.e., HTML5 facilities and WebSocket), where WebSocket is considered the fastest software technology that can be deployed to transfer streams over the internet due to the synchronization of data transfer adopted by this technology. The ultimate objective of this work is to build a private stream player without the need to install additional components in the internet browser (e.g., a flash player). This is done by implementing the Real Time Streaming Protocol (RTSP) on facilities provided only by the browser (i.e., HTML5 technologies such as WebWorkers, primitive variables and the Media Source API). Implementing RTSP requires two transport protocols: UDP (i.e., asynchronous transport), where AJAX is used to implement the Real Time Control Protocol (RTCP), and TCP (i.e., synchronous transport), where WebSocket is used to implement the Real Time Protocol (RTP). Results are collected and an evaluation is conducted to reveal the differences from the flash player embeddable component. The implemented streaming system has been tested in two environments. The first is the laboratory environment, where cloud latency is not involved in the testing infrastructure; the resulting values for the streaming metrics showed high performance in delivering streams in a real-time manner, and interactivity was maintained for each individual user. The test in this phase was done for 4, 8, and 16 simultaneous users, and every user managed to interact with the streaming system independently. In the second environment, which is the implementation over the cloud, the system maintained its performance, but network latency became an important parameter in determining the overall performance; here 4 users did not face cloud latency, but 8 and 16 users were affected. The cloud environment showed great potential to host a real-time interactive multimedia interaction system, supported by encryption methodologies to preserve the security of the stream. The performance of the presented secure gateway for multimedia streaming showed a strong dependence on the encryption algorithm, especially for greater encryption and decryption key lengths.
(2014) UĞUR ERAY TAHTA (MS)
Trust management in peer-to-peer networks using genetic programming
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN; YRD. DOÇ. DR. SEVİL ŞEN
[Abstract]
Peer-to-peer systems are commonly used by virtue of enabling easy resource sharing and open access for every user. Peer-to-peer systems with a large number of peers may also contain peers that use the system maliciously. This situation makes trust management necessary in peer-to-peer systems. Many methods have been applied to remove existing malicious peers from the system. The main purpose of these methods is to identify the malicious peers and prevent interaction with them. Within the context of this thesis, trust management in peer-to-peer systems is provided with a model which trains and improves itself according to the attackers. With the help of genetic programming, a model which evolves and removes malicious peers by detecting their characteristics is developed. Using the model, which is based on peers' direct interactions with each other and recommendations from neighbors, the attacks of malicious peers in the system are aimed to be prevented. The model, trained for different situations and attack types, is tested in various configurations and successful results are obtained.
(2014) LEVENT KARACAN (MS)
Image smoothing by using first and second order region statistics
Advisor: YRD. DOÇ. DR. İBRAHİM AYKUT ERDEM; DR. MEHMET ERKUT ERDEM
[Abstract]
Recent years have witnessed the emergence of new image smoothing techniques which have provided new insights and raised new questions about the nature of this well-studied problem. Specifically, these models separate a given image into its structure and texture layers by utilizing non-gradient based definitions for edges or special measures that distinguish edges from oscillations. In this thesis, we propose an alternative yet simple image smoothing approach which depends on the first and second order feature statistics of image regions. The use of these region statistics as a patch descriptor allows us to implicitly capture local structure and texture information and makes our approach particularly effective for structure extraction from texture. Our experimental results show that the proposed approach leads to better image decomposition compared to the state-of-the-art methods and preserves prominent edges and shading well. Moreover, we also demonstrate the applicability of our approach on image editing and manipulation tasks such as edge detection, image abstraction, texture and detail enhancement, image composition, inverse halftoning and seam carving.
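The sketch below illustrates what "first and second order region statistics as a patch descriptor" can look like: the mean vector and covariance matrix of simple per-pixel features inside a patch. The particular feature choice (intensity plus gradient magnitudes) is an illustrative assumption about such a descriptor, not the exact feature set of the thesis.

```python
# Minimal sketch of a patch descriptor built from 1st-order (mean) and
# 2nd-order (covariance) statistics of per-pixel features.
import numpy as np

def patch_descriptor(gray_patch):
    """gray_patch: 2D float array; returns (mean vector, covariance matrix)."""
    gy, gx = np.gradient(gray_patch)
    # one feature vector per pixel: [intensity, |dx|, |dy|]
    feats = np.stack([gray_patch.ravel(),
                      np.abs(gx).ravel(),
                      np.abs(gy).ravel()], axis=1)
    mean = feats.mean(axis=0)                      # 1st-order statistics
    cov = np.cov(feats, rowvar=False)              # 2nd-order statistics
    return mean, cov

# a flat patch and a textured patch give clearly different statistics
flat = np.full((7, 7), 0.5)
textured = np.random.default_rng(0).random((7, 7))
for name, p in [("flat", flat), ("textured", textured)]:
    mean, cov = patch_descriptor(p)
    print(name, "mean:", np.round(mean, 2),
          "trace(cov):", round(float(np.trace(cov)), 3))
```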
(2014) OĞUZ ASLANTÜRK (PhD)
Turkish authorship analysis with an incremental and adaptive model
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Authorship Analysis is the analysis of a text to get information about the author of that text. It has a long history of about 130 years with a wide range of studies, and is an important research topic for criminal, literary, commercial, and academic disciplines. Authorship Attribution is one of the distinct problems of Authorship Analysis and it deals with the identification of the author of a disputed text within a predefined set of candidate authors. Since it is basically a classification problem, machine learning techniques are widely employed for Authorship Attribution studies. However, although approximately 1000 stylistic features have been studied in different researches, there is still no consensus on which are the best and most distinctive. Stylistic features are very important for high prediction accuracies, as well as for the resources needed to train the classifiers, because classification models become more complex as the size of the input increases. On the other hand, changes in authors' writing styles over time may require retraining the classifiers or changing the feature sets used. In this thesis, lexical and syntactical stylistic features were analyzed for Authorship Attribution in Turkish. As well as finding the most distinctive features for author detection, the smallest but still distinctive sets of these features were investigated. Rough Set-based classifiers were constructed for this purpose, and all combinations of 6 feature groups defined from 37 features were analyzed with experiments performed using Time Dependent or Time Independent models for various periods of texts. By means of these models and periods, the effects of a possible temporal change on classifiers' performances were analyzed, as well as the distinctiveness of the features. Results of 1134 experiments performed on more than 12,000 articles indicated that the most distinctive feature sets for Authorship Attribution in Turkish are some of the punctuation marks (hyphen, underscore, slash, backslash, parentheses, ampersand). Additionally, independently of the features selected to train them, classifiers should be used for at most 1 year before they are retrained.
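As an illustration of the punctuation-frequency features that the results single out, the sketch below computes per-1000-character punctuation rates and trains a decision tree as a stand-in for the Rough Set-based classifiers actually used; the texts and labels are toy data.

```python
# Sketch: punctuation-frequency features (per 1000 characters) for authorship
# attribution, with a decision tree standing in for the Rough Set classifiers
# used in the thesis. Texts and labels are toy data.
from sklearn.tree import DecisionTreeClassifier

PUNCT = list("-_/\\()&")   # hyphen, underscore, slash, backslash, parentheses, ampersand

def features(text):
    n = max(len(text), 1)
    return [1000.0 * text.count(ch) / n for ch in PUNCT]

train_texts = [
    ("A", "Bu konu (bence) onemli - cok onemli & acil."),
    ("A", "Sonuc (kisaca) su - veri & yontem birlikte ele alinmali."),
    ("B", "Dosya yollari c:/is/rapor ve c:\\yedek_2014 olarak saklandi."),
    ("B", "Girdi/cikti oranlari klasor_adi altinda c:\\veri icinde tutulur."),
]
X = [features(t) for _, t in train_texts]
y = [a for a, _ in train_texts]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
test = "Yeni surum c:/proje klasorune ve c:\\arsiv dizinine kopyalandi."
print("predicted author:", clf.predict([features(test)])[0])
```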
(2014) ANIL AYDIN (MS)
Investigating defect prediction models for iterative software development: A case study
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
One of the biggest problems that software organizations encounter is specifying the resources required and the duration of projects. Organizations that record the number of defects and the effort spent on fixing these defects are able to correctly predict the latent defects in the product and the effort required to remove these latent defects. The use of reliability models reported in the literature is typical to achieve this prediction, but the number of studies that report defect prediction models for iterative software development is scarce. In this thesis, we present a case study which aimed to predict the defectiveness of new releases in an iterative, civil project where defect arrival phase data is not recorded. With this purpose, we investigated the Linear Regression Model and the Rayleigh Model, each having their specific statistical distributions, to predict the module-level and project-level defectiveness of the new releases of an iterative project. The models were created based on defect density data by using 29 successive releases for the project level and 15 successive releases for the module level. Both the distributions of the defect densities and the comparison of actual and predicted defect density values show that the Rayleigh Model produces more reliable results at the module level and the Linear Regression Model at the project level. This thesis explains the procedures that were applied to generate the defectiveness models and to estimate the results predicted by these models. By sharing the lessons learned from the studies, it provides a guideline for practitioners who will study the prediction process for iterative development.
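A small sketch of the two statistical models compared in the study, fitted with SciPy to synthetic release-level defect densities: a Rayleigh-shaped curve and an ordinary linear regression. Data and parameters are illustrative only.

```python
# Sketch: fit a Rayleigh-shaped curve and a linear regression to defect
# densities observed over successive releases, then predict the next release.
# Synthetic data; the thesis applies these models to 29 (project-level) and
# 15 (module-level) real releases.
import numpy as np
from scipy.optimize import curve_fit

def rayleigh_curve(t, K, sigma):
    # K scales the total defect volume; sigma controls where the peak occurs.
    return K * (t / sigma**2) * np.exp(-t**2 / (2 * sigma**2))

releases = np.arange(1, 16, dtype=float)
density = rayleigh_curve(releases, K=40.0, sigma=5.0) \
          + np.random.default_rng(1).normal(0, 0.3, releases.size)

(K_hat, sigma_hat), _ = curve_fit(rayleigh_curve, releases, density, p0=(30.0, 4.0))
slope, intercept = np.polyfit(releases, density, 1)

next_release = 16.0
print("Rayleigh prediction:", round(rayleigh_curve(next_release, K_hat, sigma_hat), 2))
print("Linear   prediction:", round(slope * next_release + intercept, 2))
```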
(2014) ABDULKADİR YAŞAR (MS)
Multi-scheduling technique for real-time systems on embedded multi-core processors
Advisor: YRD. DOÇ. DR. KAYHAN M. İMRE
[Abstract]
Recent studies have shown that today's embedded systems require not only real-time ability but also general functionality. In order to provide these two functionalities on the same system, many approaches, techniques and frameworks have been developed. Integrating multiple operating systems on a multi-core processor is one of the most popular approaches for system designers. However, in this heterogeneous approach, a failure in one of the operating systems can cause the whole system to come down. Moreover, in recent years many scheduling techniques, such as external and partition-based scheduling, have been developed to provide real-time ability for general purpose systems in a single operating system without using the heterogeneous approach. This thesis introduces a multi-scheduling method for multi-core hardware platforms that does not run heterogeneous operating systems concurrently. In this technique, there are two schedulers in a single operating system: one for real-time applications and the other for general (non-real-time) applications. In the heterogeneous operating systems approach, a real-time operating system provides real-time functionality such as low interrupt latency while a versatile operating system processes IT applications. Unfortunately, real-time and IT applications are then isolated and run in different operating system environments. This may cause problems in system design and Inter-Process Communication (IPC). In the multi-scheduling approach, real-time and IT applications run in the same operating system environment, so the implementation and maintenance of the system become easier. We implemented our work on Linux, a widely used general purpose operating system for embedded and industrial systems. By modifying the Symmetric Multiprocessing (SMP) mechanism in Linux, two schedulers are enabled to run on the same kernel, each on different CPU cores. Our proposed technique is tested with widely accepted de facto real-time test tools and programs. The most important characteristics of a real-time application, such as low interrupt latency and responsiveness, were benchmarked. The results show that the multi-scheduling technique can be beneficial in bringing real-time functionality to a general operating system, as in the heterogeneous approach.
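The thesis achieves the separation at kernel level with two schedulers; the sketch below only illustrates the core-partitioning idea from user space on Linux, pinning a latency-sensitive worker and ordinary workers to disjoint CPU sets with os.sched_setaffinity. It is not the kernel-level mechanism described in the abstract.

```python
# User-space illustration of core partitioning on Linux: the "real-time"
# worker is pinned to core 0 and ordinary workers to the remaining cores,
# so they do not compete for the same CPUs. (Raising the policy to SCHED_FIFO
# with os.sched_setscheduler would additionally require root privileges.)
import os
import time
from multiprocessing import Process

RT_CORES = {0}
GENERAL_CORES = set(range(1, os.cpu_count() or 2)) or {0}

def realtime_worker():
    os.sched_setaffinity(0, RT_CORES)            # pin this process to core 0
    for _ in range(5):
        start = time.perf_counter_ns()
        time.sleep(0.001)                        # stand-in for a periodic RT task
        print("wake-up overshoot ns:", time.perf_counter_ns() - start - 1_000_000)

def general_worker(n):
    os.sched_setaffinity(0, GENERAL_CORES)       # keep "IT" load off the RT core
    sum(i * i for i in range(n))                 # CPU-bound general-purpose work

if __name__ == "__main__":
    procs = [Process(target=general_worker, args=(2_000_000,)) for _ in range(3)]
    procs.append(Process(target=realtime_worker))
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```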
(2014) EMRE AYDOĞAN (MS)
Automatic generation of mobile malwares using genetic programming
Advisor: YRD. DOÇ. DR. SEVİL ŞEN AKAGÜNDÜZ
[Abstract]
The number of mobile devices has increased dramatically in the past few years. These smart devices provide many useful functionalities accessible from anywhere at any time, such as reading and writing e-mails, surfing the Internet, showing nearby facilities, and the like. Hence, they have become an inevitable part of our daily lives. However, the popularity and adoption of mobile devices also attract malware writers who aim to harm our devices. Many security companies have already proposed solutions to protect our mobile devices from such malicious attempts. However, developing methodologies that detect unknown malware is a research challenge, especially on devices with limited resources. This study presents a method that automatically evolves variants of malware from the ones in the wild by using genetic programming. We aim to evaluate existing security solutions based on static analysis techniques against these evolved, unknown malware variants. The experimental results show the weaknesses of the static analysis tools available in the market, and the need for new detection techniques suitable for mobile devices.
(2014) PELİN CANBAY (MS)
Anonymity in healthcare systems: An ideal data sharing model for distributed structures
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Data obtained or recorded by healthcare institutions present extraordinary opportunities to produce forward-looking solutions in many fields. Sharing accurate (real, consistent) data is necessary to produce useful results in the healthcare field. Because personal health records held by health systems include sensitive attributes about individuals, sharing these records after merely removing information such as name, surname and identity number, without any further editing, causes privacy disclosure. In the literature, many privacy-preserving data sharing approaches have been developed that aim to keep the benefit that can be obtained from the existing data at the maximum level. Especially in recent years, Privacy-Preserving Data Mining and Privacy-Preserving Data Publishing approaches have been studied comprehensively to protect personal or institutional privacy. In this thesis, Privacy-Preserving Data Mining and Privacy-Preserving Data Publishing approaches were summarized and evaluated within the framework of health records, and a data distribution model that facilitates both protecting privacy and sharing data was proposed. In this work, an ideal system model was proposed that partitions the data collected from distributed health institutions according to the needs of recipient institutions, applies the necessary anonymization criteria, and then presents this anonymous information to the recipient institutions. The proposed model is a central data distribution system model that facilitates the sharing of distributed data sets. In the implementation of this model, horizontal and vertical partitioning techniques were used to decompose the data, and the decomposed data were then evaluated by applying k-anonymity and ℓ-diversity. The implementation processes were applied to two different models and the results were compared. In the end, it was observed that the proposed model gave the ideal result in terms of both data loss and data privacy in comparison with similar models. The aim of the proposed model is to keep the balance between protecting privacy and data utility at an ideal level.
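A compact sketch of the anonymization criteria the model applies: quasi-identifiers are generalized (age to bands, ZIP code to a prefix) and each equivalence class is checked for k-anonymity and ℓ-diversity with pandas. The records are fabricated.

```python
# Sketch: generalize quasi-identifiers and verify k-anonymity / l-diversity
# on a toy health data set (fabricated records).
import pandas as pd

records = pd.DataFrame({
    "age":       [23, 27, 25, 41, 45, 43, 44, 22],
    "zip":       ["06800", "06810", "06820", "34720", "34730", "34740", "34750", "06830"],
    "diagnosis": ["flu", "asthma", "flu", "diabetes", "flu", "cancer", "diabetes", "asthma"],
})

K, L = 3, 2

# Generalization step: age bands of width 10, 3-digit ZIP prefix.
records["age_band"] = (records["age"] // 10 * 10).astype(str) + "-" + \
                      (records["age"] // 10 * 10 + 9).astype(str)
records["zip_pref"] = records["zip"].str[:3] + "**"

groups = records.groupby(["age_band", "zip_pref"])
k_anonymous = (groups.size() >= K).all()
l_diverse = (groups["diagnosis"].nunique() >= L).all()

print(groups.size())
print("k-anonymous (k=%d):" % K, bool(k_anonymous))
print("l-diverse   (l=%d):" % L, bool(l_diverse))
```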
(2014) NEVZAT SEVİM (MS)
Modeling of data communication in high level architecture-HLA for photonic network connected multi-core processors
Advisor: YRD. DOÇ. DR. KAYHAN İMRE
[Abstract]
Simulations that require complicated computational capability or massive calculations cannot be performed without the aid of multi-core processors. Increasing the number of cores in processors is the essential method to overcome such problems. Multi-core processors distribute the incoming processes to individual cores and perform parallel computing in order to complete tasks more rapidly. Multi-core processors can be used effectively in the solution of various problems. Numerous parallel processing algorithms have been developed for the solution of these problems, and many problems, such as large-scale simulations, advanced mathematical problems and weather computations, can be solved much faster by parallel processing. HLA (High Level Architecture), which is recommended by IEEE for the development of distributed simulations, is a standard that can operate much faster by using parallel processing. High Level Architecture (HLA) is a standard used for distributed simulations. It facilitates the administration of complicated simulations. By courtesy of HLA, simulations running on different platforms can communicate with each other without compatibility problems. The RTI mechanism that works on HLA provides all kinds of communication among federates; any application in the simulation is called a federate. In this study, methods are proposed for the declaration management service, which manages the RTI's subscription mechanism among federates, and the object management service, which handles the data communication. In addition, the object management service is implemented and tested. Furthermore, the method that we suggest is compared with the mesh method, which is considered to be one of the effective methods for data distribution. Photonic networks are networks in which data is transmitted between cores by light signals instead of electrical ones. Although still at the laboratory stage, photonic networks can be used for data communication between cores seated on the chip. These networks are regarded as a future technology due to their high performance and low energy consumption. The routing problem of data between cores can be solved efficiently with photonic networks. Within the scope of this thesis, photonic networks have been used in the proposed methods. In this study, we have examined the advantages and disadvantages of the pattern that we recommend for data communication in RTI. To reduce the disadvantages of the pattern, we explain how we use the mesh method. By considering the system requirements, we have evaluated which method is more reasonable.
(2014) MUHAMMET ALİ SAĞ (MS)
Measuring functional size of java business applications from execution traces
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
During all phases of software development, software size measurement has great importance for project management processes, which are gradually becoming more complex because of the growing size of software. Due to the individual influence of the measurement specialist, manually performed size measurements might lead to different results in the measured values. On the other hand, considering the cost and time problems and the lack of documentation to serve as input to size measurement, the issue of automating the measurement process came to the fore. In this study, the aim is to automatically compute software functional size within the scope of the COSMIC functional size measurement methodology, by detecting functional processes and measuring the size of Java business applications. The method measures functional size using UML sequence diagrams that are produced from the software by dynamic analysis at runtime. For dynamic analysis, AspectJ, an implementation of aspect-oriented programming on the Java platform, is used. The method can be applied without any changes to the software code. Functional processes are detected from the interactions produced by using the graphical user interfaces of the software. With the help of AspectJ, execution traces are converted to sequence diagrams in text format. Then, using these sequence diagrams, the COSMIC functional size is measured from the data movements. In order to support the proposed method, a prototype tool called 'COSMIC Solver' was developed. Considering the goals, the usefulness of the proposed method was tested with an example application and a case study. In the case study, in addition to manual measurement, the proposed method and tool were used to measure the size of a Java application chosen from a public library as conformant to the prototype, and the results were evaluated. It is observed that the measurement results converge and that the accuracy of the value calculated by the prototype exceeds the value targeted in this study (80%+). Moreover, it is observed that a natural result of automation is a great time saving, of about a factor of 10.
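A toy sketch of the final counting step: once the execution traces have been reduced to data movements per functional process, COSMIC size is one CFP per Entry, eXit, Read or Write. The trace format below is hypothetical and data groups are ignored for brevity; the thesis derives the movements from AspectJ-generated sequence diagrams.

```python
# Sketch of the COSMIC counting rule: 1 CFP per data movement (Entry, eXit,
# Read, Write) per functional process. The trace lines are a made-up,
# simplified format; distinguishing data groups is omitted for brevity.
from collections import defaultdict

trace = [
    "RegisterStudent;E",   # Entry: data crosses the boundary into the process
    "RegisterStudent;R",   # Read from persistent storage
    "RegisterStudent;W",   # Write to persistent storage
    "RegisterStudent;X",   # eXit: confirmation back to the user
    "ListCourses;E",
    "ListCourses;R",
    "ListCourses;X",
]

cfp = defaultdict(set)
for line in trace:
    process, movement = line.split(";")
    cfp[process].add(movement)            # identical movements are counted once

sizes = {p: len(m) for p, m in cfp.items()}
print("per-process CFP:", sizes)
print("total CFP      :", sum(sizes.values()))
```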
(2014) ALİ SEYDİ KEÇELİ (PhD)
Recognition of human actions using depth information
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
Human action recognition using depth sensors is an emerging technology, especially in the game console industry. Depth information provides robust 3D features about environments and increases the accuracy of action recognition at short ranges. This thesis presents various approaches to recognize human actions using depth information obtained from the Microsoft Kinect RGBD sensor. In the first approach, information about the angles and displacements of joints is obtained from a skeleton joint model to recognize actions. Actions are then considered as temporal patterns and studied with Hidden Markov Models and time series. In the Hidden Markov Model based approach, actions are converted into observation series by utilizing a vocabulary constructed from the features. Actions are also treated as time series and classified after applying dimensionality reduction to features extracted from the series. Then, in addition to features from the skeletal model, features are obtained from raw depth data to increase the classification accuracy. Finally, combining the experience from all studied methods, a low-latency action recognition method is proposed. The constructed models are tested on our own HUN-3D dataset and on the MSRC-12 and MSR-Action3D datasets, which are widely used in the literature. The proposed approaches produce robust results independent of the dataset, with simple and computationally cheap features.
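A short sketch of the first kind of feature mentioned above, joint angles and inter-frame displacements computed from 3D skeleton joint positions with NumPy; the coordinates are fabricated, whereas the thesis obtains them from the Kinect skeleton stream.

```python
# Sketch: angle and displacement features from a 3D skeleton sequence.
# Coordinates are fabricated; a Kinect RGBD sensor would provide real ones.
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at joint b formed by segments b->a and b->c."""
    v1, v2 = a - b, c - b
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

# Two consecutive frames with three joints: shoulder, elbow, wrist.
frame_t  = {"shoulder": np.array([0.0, 1.4, 2.0]),
            "elbow":    np.array([0.2, 1.1, 2.0]),
            "wrist":    np.array([0.4, 1.3, 2.0])}
frame_t1 = {"shoulder": np.array([0.0, 1.4, 2.0]),
            "elbow":    np.array([0.2, 1.1, 2.0]),
            "wrist":    np.array([0.5, 1.0, 2.0])}

elbow_angle_t  = joint_angle(frame_t["shoulder"],  frame_t["elbow"],  frame_t["wrist"])
elbow_angle_t1 = joint_angle(frame_t1["shoulder"], frame_t1["elbow"], frame_t1["wrist"])
wrist_disp = np.linalg.norm(frame_t1["wrist"] - frame_t["wrist"])

feature_vector = [elbow_angle_t, elbow_angle_t1 - elbow_angle_t, wrist_disp]
print("features (angle, angle delta, displacement):", np.round(feature_vector, 3))
```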
(2014) SERKAN ÇAKMAK (MS)
Trust based incentive model in peer-to-peer networks
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
Peer-to-peer networks rely on the involvement of peers in tasks like resource sharing, routing, and querying of resources. The power of peer-to-peer systems comes from resource sharing. If some peers do not contribute to the system, the efficiency and effectiveness of the system degrade. This situation is known as the free-riding problem. To cope with free-riding and encourage all peers to contribute, incentive models have been developed. Incentive models basically aim to encourage all peers to contribute; their main purpose is to prevent peers which do not contribute to the system from getting services and to allow only contributing peers to do so. Within the context of this thesis, a trust-based incentive model is developed. This model aims to address free-riding, whitewashing, and Sybil attacks using metrics gathered as part of a trust model. The proposed model has shown that trust models can be used to provide incentives. The model, trained for different situations and attack types, is tested in various configurations and successful results are obtained.
(2014) HAMİD AHMADLOUEİ (MS)
The impact of named entities on the performance of the story link detection task using a Turkish corpus of news items
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
This thesis aims to test the performance of the Story Link Detection (SLD) task, part of the Topic Detection and Tracking (TDT) program, using different similarity functions, to test the performance of their combinations on named entities, and to find the one that provides the optimum precision/recall values. To do this, we used the Vector Space Model (VSM), whose performance has been proven in TDT studies, as the main method, and evaluated the impact of named entities on the VSM performance. In order to test the performance of the methods, we used the BilCOL-2005 corpus after tagging named entities, which answer questions such as who, where and when, with eight different labels (who, where, when, organization, money, percentage, date, unknown).
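A minimal sketch of the vector space model comparison behind story link detection: TF-IDF vectors and cosine similarity with scikit-learn, plus a crude extra weight for tokens tagged as named entities. The stories, the entity list and the boost factor are illustrative.

```python
# Sketch: decide whether two news stories discuss the same event by cosine
# similarity of TF-IDF vectors, optionally boosting named-entity tokens.
# Stories, entity list and boost are illustrative, not from BilCOL-2005.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

story_a = "Ankara hosted the energy summit where ministers discussed pipelines"
story_b = "Ministers discussed new pipelines at the energy summit in Ankara"
named_entities = {"ankara"}          # tokens a NE tagger marked as WHERE, WHO, etc.
BOOST = 3                            # repeat NE tokens to give them extra TF weight

def boost_entities(text):
    out = []
    for tok in text.lower().split():
        out.extend([tok] * (BOOST if tok in named_entities else 1))
    return " ".join(out)

vec = TfidfVectorizer()
tfidf = vec.fit_transform([boost_entities(story_a), boost_entities(story_b)])
sim = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print("cosine similarity:", round(sim, 3), "-> linked" if sim > 0.5 else "-> not linked")
```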
(2013) SİNAN ONUR ALTINUÇ (MS)
Semi-automated shoreline extraction in satellite imagery and usage of fractals as performance evaluator
Advisor: DOÇ. DR. EBRU SEZER
[Abstract]
Shoreline extraction is important for geographic and geologic studies. It is an important objective in many subjects including land-sea segmentation and the observation of erosion and land movements. In this thesis, a method for shoreline extraction in color images is presented. It is based on the statistical properties of the image and proposes the use of fractals as a performance evaluation method for shoreline detection. Filtering is applied to reduce the noise in the image. Then a grayscale image is generated using the saturation channel in HSV color space. Afterwards, by calculating a variance map and thresholding, shorelines are roughly detected. Morphological binary image processing is applied. The shoreline extraction is completed with human interaction by selecting a point in the sea region. Performance evaluation is performed by comparing the fractal dimension of the hand-drawn shoreline with that of the shoreline generated by the method. The difference between the fractal dimensions represents the error in the shore extraction process. Experiments are conducted on color satellite images taken from Google Maps and the QuickBird satellite. As a result, it is observed that the shorelines are extracted successfully. In the fractal dimension evaluation, very close fractal dimensions are achieved in most cases and close ones in some cases. It is concluded that shorelines can be extracted successfully by variance mapping, depending on the resolution of the image, and that the fractal dimension is a good measure for finding detail-level errors.
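A short sketch of the evaluation idea: estimate the box-counting fractal dimension of a binary shoreline image and measure the error as the difference between the dimensions of the extracted and reference lines. The two "shorelines" below are synthetic stand-ins for the hand-drawn and automatically extracted ones.

```python
# Sketch: box-counting fractal dimension of a binary (shoreline) image,
# used to compare an extracted shoreline with a hand-drawn reference.
# The two curves below are synthetic stand-ins.
import numpy as np

def box_count_dimension(binary, sizes=(2, 4, 8, 16, 32)):
    counts = []
    for s in sizes:
        count = 0
        for i in range(0, binary.shape[0], s):
            for j in range(0, binary.shape[1], s):
                if binary[i:i + s, j:j + s].any():
                    count += 1
        counts.append(count)
    # slope of log(count) vs log(1/size) estimates the fractal dimension
    coeffs = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return coeffs[0]

size = 128
xs = np.arange(size)
reference = np.zeros((size, size), dtype=bool)
extracted = np.zeros((size, size), dtype=bool)
reference[(64 + 20 * np.sin(xs / 7.0)).astype(int), xs] = True
noise = np.random.default_rng(0).integers(-2, 3, size)
extracted[(64 + 20 * np.sin(xs / 7.0) + noise).astype(int), xs] = True

d_ref, d_ext = box_count_dimension(reference), box_count_dimension(extracted)
print("reference D=%.3f  extracted D=%.3f  error=%.3f" % (d_ref, d_ext, abs(d_ref - d_ext)))
```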
(2013) SEYFULLAH DEMİR (MS)
Enhancing the cluster content discovery and the cluster label induction phases of the lingo algorithm
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Search Results Clustering (SRC) algorithms are developed so that users can reach the results they search for more easily. A good SRC algorithm is expected to correctly cluster the search results and also to generate representative, understandable and meaningful labels for the produced clusters. The Lingo algorithm is a popular SRC algorithm that addresses both criteria. It is able to generate successful cluster labels as expected; however, it has some shortcomings in determining the cluster contents. As a consequence of its cluster content assignment strategy, semantically relevant documents that do not contain the terms of the cluster labels cannot be assigned to the related clusters. Moreover, the method used to select the final cluster labels results in clusters containing a small number of relevant results. These shortcomings cause low recall values. In this thesis, two modification proposals that aim to overcome the shortcomings of the Lingo algorithm are presented. The first modification is for the cluster content discovery phase, and the other is for the cluster label induction phase. The experimental results show that the proposed modifications considerably improve the low recall values.
(2013) TURGAY ÇELİK (PhD)
Deriving feasible deployment alternatives for parallel and distributed simulation systems
Advisor: YRD. DOÇ. DR. KAYHAN M. İMRE
[Abstract]
Parallel and distributed simulations (PADS) realize the distributed execution of a simulation system over multiple physical resources. To realize the execution of PADS, different simulation infrastructures such as HLA, DIS and TENA have been defined. Recently, the Distributed Simulation Engineering and Execution Process (DSEEP) that supports the mapping of the simulations on the infrastructures has been proposed. An important recommended task in DSEEP is the evaluation of the performance of the simulation systems at the design phase. In general, the performance of a simulation is largely influenced by the allocation of member applications to the resources. Usually, the deployment of the applications to the resources can be done in many different ways. DSEEP does not provide a concrete approach for evaluating the deployment alternatives. Moreover, current approaches that can be used for realizing various DSEEP activities do not yet provide adequate support for this purpose. We provide a concrete approach for deriving feasible deployment alternatives based on the simulation system and the available resources. In the approach, first the simulation components and the resources are designed. The design is used to define alternative execution configurations, and based on the design and the execution configuration a feasible deployment alternative can be algorithmically derived. Tool support is developed for the simulation design, the execution configuration definition and the automatic generation of feasible deployment alternatives. The approach has been applied within two different large scale industrial simulation case studies.
(2013) FATİH MEHMET GÜLEÇ (PhD)
Applying rough set theory to literature based discovery
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Science is the body of academic studies that aim to clarify what is wondered about. The results of academic studies are published in a scientific writing style in order to be shared with other researchers, and these publications are considered the most important output of academic studies. When other scientists examine these publications with their present knowledge, new research ideas may occur to them. Academic studies which examine publications with information systems, by simulating how new research ideas occur in the human brain, are referred to as Literature Based Discovery. In Literature Based Discovery studies, pieces of information are created by associating terms according to their common publications. By applying a chain rule to these information pieces, the establishment of new ideas is aimed at. The ABC model focuses on the subject the user is interested in and finds terms which are not directly related to the subject but are indirectly related through some common terms. These disjoint but indirectly related terms are presented as new and novel ideas. Many studies conducted to date have focused on publications in the field of medicine. In a similar manner, in this thesis, published articles in the field of medicine have been used as the main source of information. Literature Based Discovery studies have some common problems; the high noise level in the result set is the biggest one. There are a great number of terms in medical terminology, and many of these terms also have synonymous or similar terms. Because of the chain rule of the ABC model, similar terms combinatorially amplify the result set in a noisy way. In this thesis, concepts which represent more than one term are used to reduce the noisy data. Terms that are neither very general nor very specific are marked as concepts, and all analyses were carried out using these concepts. In all studies conducted to date, the ABC model has been seen as the core function of Literature Based Discovery. In this thesis, the relationship between A and C has been established by using clusters of academic publications; the linking role of the B terms is taken over by the paper clusters. For this purpose, 16 million articles in the PubMed database were clustered and a total of 50,000 different clusters were established, with an average cluster size of 320 publications. This new approach was tested with 80 randomly selected terms. The average precision is calculated as 0.16 and the average recall as 0.52. According to both precision and recall, a twofold improvement is achieved compared with similar studies. Obtaining noticeable success by improving the core function of Literature Based Discovery will lead other researchers to creative ideas. Literature Based Discovery dreams of opening the way to new discoveries by collating the knowledge of all humanity contained in scientific publications. The studies and progress achieved in this field indicate that a success everyone will appreciate can be achieved.
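A compact sketch of the classical ABC chaining that the thesis replaces with publication clusters: starting from a term A, collect the B terms that co-occur with it, then propose C terms that co-occur with some B but never directly with A. The tiny co-occurrence lists are invented (they mimic the well-known fish oil / Raynaud example).

```python
# Sketch of the classical ABC model of literature-based discovery:
# C terms are linked to A only indirectly, through shared B terms.
# (The thesis replaces the B-term link with clusters of publications.)
# Co-occurrence data below are invented for illustration.
cooccurs_with = {
    "fish_oil":  {"blood_viscosity", "platelet_aggregation", "vascular_reactivity"},
    "blood_viscosity":      {"fish_oil", "raynaud_syndrome"},
    "platelet_aggregation": {"fish_oil", "raynaud_syndrome", "aspirin"},
    "vascular_reactivity":  {"fish_oil", "raynaud_syndrome"},
    "raynaud_syndrome":     {"blood_viscosity", "platelet_aggregation", "vascular_reactivity"},
    "aspirin":              {"platelet_aggregation"},
}

def abc_candidates(a):
    b_terms = cooccurs_with.get(a, set())
    candidates = {}
    for b in b_terms:
        for c in cooccurs_with.get(b, set()):
            if c != a and c not in b_terms and a not in cooccurs_with.get(c, set()):
                candidates.setdefault(c, set()).add(b)      # remember linking B terms
    return candidates

for c, via in abc_candidates("fish_oil").items():
    print("candidate C term:", c, " linked via B terms:", sorted(via))
```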
(2013) YASİN ŞAHİN (MS)
Design and implementation of a software tool for comparing methods in outbreak detection based on time series
Advisor: PROF. DR. ALİ SAATÇİ
[Abstract]
After the 9/11 terrorist attacks, early outbreak detection systems against bio-terrorism have gained a lot of importance. In this context, EARS (Early Aberration Reporting System) was developed in order to automatically detect bio-terror attacks which can be observed as anomalies in public health data. Because the methods used in EARS have shown weaknesses in detecting slowly propagating epidemics, more elaborate statistical methods such as CUSUM, EWMA and NBC have been developed to replace the old methods. Studies have shown that the performance of the method used is closely related to the nature of the data set scanned. In this thesis, a Web-based software tool is designed and implemented to compare methods in order to find the adequate one for a given set of data.
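A small sketch of one of the methods named above, a one-sided CUSUM over daily syndromic counts: the statistic accumulates standardized excesses over a reference value k and raises an alarm when it exceeds a threshold h. Counts and parameters are illustrative.

```python
# Sketch: one-sided CUSUM on a daily count series, as used for outbreak
# detection; an alarm is raised when the cumulative standardized excess
# over the baseline exceeds the decision threshold h.
# Counts, k and h are illustrative.
baseline_mean, baseline_std = 10.0, 3.0
k, h = 0.5, 4.0            # reference value and decision threshold (in std units)

daily_counts = [9, 11, 10, 12, 9, 10, 13, 15, 17, 19, 22, 25]

cusum = 0.0
for day, count in enumerate(daily_counts, start=1):
    z = (count - baseline_mean) / baseline_std
    cusum = max(0.0, cusum + z - k)
    flag = "ALARM" if cusum > h else ""
    print("day %2d  count %2d  CUSUM %.2f  %s" % (day, count, cusum, flag))
```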
(2013) AHMET ATA AKÇA (MS)
Run-time measurement of cosmic functional size for java business applications
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
The issue of "Functional Size Measurement" is increasing in importance for the software project management. For the project management process, which is gradually complexifying because of the growing sizes of software, it is critical that the size of software should be measurable in every phase ranging from the first phases of development to the end of it.. Considering the fact that it is considerably time-consuming and costly when functional size measurement is made manually; automating the process of measurement came to the fore. In this study, it is aimed that the runtime measurement of COSMIC functional size shall be carried out automatically by ensuring functional processes, which are triggered via user interface of a three tier Java business application, to be discovered by using the Measurement Library that we developed and by monitoring the data movements which occurs during these functional processes. It is necessary to perform the code addition process that is specified in the "Measurement Library Manual" which is prepared to assist the user with the operation of the Library. The Library has been imported from a simple student registration system, and the application size has been measured automatically following the triggering of all functions which are operated on GUI at least once. In order to verify the results of the automatic measurement, the measurement made manually once more. It has been found that the COSMIC functional sizes measured automatically and calculated manually are 92% approximate. Results of the measurement show us that the "Measurement Library" is well designed and applicable. Subsequently, the costs of automatic and manual measurements are compared and it is focused on whether automatic measurement is cost effective or not. In order to decide on this, cost effectiveness analysis of the automatic measurement is carried out with three different case studies. Although the Measurement Library is not cost effective when library is integrated after the development phase and/or user is not familiar with the library, it can decrease costs up to %500 compared to the manual measurement processes when it is integrated during early development phases. Because of the fact that the "Measurement Library" is the first in its field in terms of the method used for the functional size measurement of software automatically, it is expected that it may provide a basis for further studies in this field. In future studies, scope of the Library may be improved. Also, the costs due to the integration of the Library after the development phase is planned to be decreased by automated code additions.
(2012) KAZIM SARIKAYA (MS)
Password-based client authentication for SSL/TLS protocol using ElGamal and Chebyshev polynomials
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
The security of information transmitted in Internet communications is a very important problem. Because of this problem, the SSL/TLS protocol was defined as a cryptography-based standard to provide a secure channel over the Internet. In an SSL/TLS connection, the server can request a client certificate to authenticate the client. However, obtaining certificates for all clients is a costly approach. Hence, in most applications, a username-password pair is sent to the server for client authentication after the secure channel is established. Since username-password based authentication is needed in most applications, this technique has been proposed as a part of the SSL/TLS protocol. As an SSL/TLS protocol extension, TLS-SRP can authenticate clients based on username-password information. In this thesis, as an alternative to the TLS-SRP protocol, five TLS extensions are developed which use Chebyshev polynomials and the ElGamal encryption algorithm and which perform client authentication based on username and password in the SSL/TLS protocol. The security of the developed extensions is studied and compared with TLS-SRP. Furthermore, the extensions are implemented in the OpenSSL library and their performance is compared with TLS-SRP.
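A toy sketch of the Chebyshev-polynomial commutativity such constructions rely on, T_r(T_s(x)) = T_s(T_r(x)) = T_(r*s)(x), used in a Diffie-Hellman-like exchange over floating-point numbers; real proposals, including the thesis extensions, add ElGamal-style encryption and work over suitable algebraic structures, so this shows only the mathematical idea.

```python
# Toy illustration of the semigroup property of Chebyshev polynomials,
#   T_r(T_s(x)) = T_s(T_r(x)) = T_(r*s)(x),
# which enables a Diffie-Hellman-like key agreement. Floating-point and
# small exponents only; real schemes (and the thesis extensions) add
# ElGamal-style encryption and work over suitable algebraic structures.
import math

def chebyshev(n, x):
    """T_n(x) = cos(n * arccos(x)) for x in [-1, 1]."""
    return math.cos(n * math.acos(x))

x = 0.53                      # public parameter
r, s = 37, 91                 # Alice's and Bob's secret integers

alice_public = chebyshev(r, x)            # sent to Bob
bob_public   = chebyshev(s, x)            # sent to Alice

alice_shared = chebyshev(r, bob_public)   # T_r(T_s(x))
bob_shared   = chebyshev(s, alice_public) # T_s(T_r(x))

print("Alice's shared value:", round(alice_shared, 10))
print("Bob's   shared value:", round(bob_shared, 10))
assert abs(alice_shared - bob_shared) < 1e-9
```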
(2012) TUĞBA GÜRGEN (MS)
An integrated infrastructure for software process verification
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
In this thesis, it is aimed to develop an integrated infrastructure to support process mining methods for software process verification. With this infrastructure, a specific software package providing the utilization of the algorithms used in the process mining area is developed as a plugin to the open-source EPF tool, which supports the management of software and system engineering processes. The aim of the plugin is to verify an event- or state-based software process whose flow is supported by a tool, by using process mining methods on real event logs and data, based on the process model of the same process defined in EPF. In order to achieve this aim, a Conformance Analysis component for checking process consistency, a Process Variants Miner component for detecting different applications of a process, and a Statistical Analyzer component for assessing process performance and stability are integrated into the plugin. The integrated infrastructure is considered to provide a basis for supporting the software process mining methods which can be the subject of other studies in this area.
(2012) SEDA TANKIZ (MS)
Content based video copy detection
Advisors: PROF. DR. HAYRİ SEVER; YRD. DOÇ. DR. NAZLI İKİZLER CİNBİŞ
[Abstract]
The availability of digital media has grown tremendously with multimedia and communication technologies. This brings requirements for managing massive numbers of videos, analyzing their content and controlling the copyright of this huge number of videos. Content-based copy detection (CBCD), an alternative to the watermarking approach, is a hot issue for both academia and industry. In this thesis, we propose a content-based copy detection method. The proposed method mainly includes three stages: video segmentation, feature extraction and matching. First, videos are segmented. Then global and local features are extracted from each keyframe. Finally, copies are detected by a voting-based matching process. The proposed method is tested on the TRECVID 2009 CBCD dataset. The results of the features are compared with each other, and the proposed method is promising.
(2012) KEREM ERZURUMLU (PhD)
A new block based method for embedded operating systems with performance objective and implementation of this method on thin clients
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Operating systems are one of the fundamental building blocks of computer science today. This building block is divided into two subclasses: end-user operating systems and embedded operating systems. Embedded operating systems generally perform a specific job on limited hardware components. In most cases, the performance of these systems is critical to their operation. Although embedded operating systems have a wide area of usage, within the scope of this thesis, thin clients and embedded operating systems are the main subjects. In this way, the proposed new method could be tested with the widest possible field of applications and hardware support. Within the scope of the thesis, a new operating system compression method is proposed, designed especially for the thin-client architecture but also able to work with all types of computer architecture. Considering the negative performance effect of file-system based compression methods, the compression method is designed and developed at the operating system level. In addition, within this thesis project, the developed compression method is compared with the common methods in use today with respect to required storage space and performance.
(2012) MEHMET SÖYLEMEZ (MS)
Investigating defects root causes by using orthogonal defect classification and software development process context
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
In this thesis, a method is explained that was proposed for investigating the root causes of software defects and preventing their recurrence, for the purpose of improving the quality and reliability of the software development process and products of the TÜBİTAK - BİLGEM - UEKAE / Software and Data Engineering Department (G222); these defects are recorded in the software development process and have a critical role in the quality and reliability of the software product. With this method, it is aimed to investigate the root causes of software defects and to develop precautions to prevent software defects in the following software development phases. It is proposed to handle the ODC ('Orthogonal Defect Classification') method together with the context information of the software development process producing the software defects. In order to develop a qualified and reliable software product by implementing this method, software defects are investigated in terms of both software development process and technical aspects. The achievability of this method was shown in a case study, which demonstrated that the root causes of software defects could be found by this method. Besides, the precautions needed to prevent software defects from recurring in the following software development phases were determined. After implementing this method, the measurement results were compared to the state of the software defects before implementing it. The comparison results show that the proposed method can improve the development process and product quality.
(2011) SEDA GÜNEŞ (MS)
Quantitative comparison of agile and incremental process performances in software development: A case study
Advisor: YRD. DOÇ. DR. AYÇA TARHAN
[Abstract]
Because of the low success rates of software projects, the existing software development models were questioned in the 1990s, and in the following years the Agile models were suggested as an alternative to the Traditional models. Although the Agile models have been used in projects since the 1990s, the number of studies that quantitatively reveal the effect of using these models on development performance is scarce. Among the important reasons for this are the difficulties in measuring the Agile methods and the fact that the performance of the development models previously used in the organisation is not measured to form a baseline for comparison. In this thesis, the performance of the Incremental and Agile processes applied in projects in a medium-sized organisation is investigated from the Design, Implementation, System Test and Customer Use phase points of view and compared quantitatively. Before the comparison, a systematic method is defined for the quantitative analysis of the processes and applied to understand the performance and the product quality of the processes. The Goal-Question-Metric (GQM) framework is applied to determine the analysis goals and related metrics. A predefined assessment approach is used to capture process context and understand measure characteristics prior to quantitative analysis. The analysis results of the Incremental and Agile processes have been evaluated and compared in accordance with the goals defined by the GQM framework. By following the defined method, different projects could be included in the analysis in the long term. It is thought that this kind of effort will be beneficial if the analysis results are to be used throughout the organisation.
(2011) HÜSEYİN TEMUÇİN (MS)
Implementation of torus network simulation for multiprocessor architectures
Advisor: YRD. DOÇ. DR. KAYHAN M. İMRE
[Abstract]
Processor speeds reaching their natural limits and increasing processing needs have made parallel systems necessary in processor architectures. At the present time, computer systems of large and small scale alike have transformed into multi-processor systems. The spread of architectures with multiple processors has increased the importance of both these architectures and the studies on them. Fields like computer systems, in which physical tests are costly and long-term, increase the need for simulations which model the systems. Simulations are used massively in all scientific and commercial areas. Within the scope of the thesis, the simulation of a multi-computer network has been implemented. The simulated network has a torus topology and uses the wormhole switching model. The simulated network supports deterministic and source-based routing algorithms. The simulation has been implemented as a parallel, discrete event simulation and developed with the Java programming language. The details of the developed simulation design are explained and the results of the tests are discussed.
(2010) İBRAHİM TANRIVERDİ (MS)
A design and implementation of MPI (message passing interface) library for a graphical processing unit (GPU)
Advisor: YRD. DOÇ. DR. KAYHAN İMRE
[Abstract]
MPI is a message-passing interface specification designed for parallel programming. It has been defined to develop portable, practical, efficient and flexible parallel applications. MPI, an important standard, is supported by parallel computing vendors, computer scientists and application developers. Building parallel computer systems for application development, and maintaining them, is expensive. However, a graphics card, which is a peripheral device, has many graphical processing units (GPUs) which are specialized for intensive and highly parallel graphics computation. These parallel processors can be used to develop general-purpose applications. Besides, graphics cards have better compute and memory access performance than many personal computers. In this thesis, the most used MPI library functions are implemented for a graphical processing unit, and the performance of the implemented library is tested with some parallel applications and MPI algorithms.
(2010) TAHİR BIÇAKCI (MS)
Using biomedical concepts for drug discovery
Advisor: PROF. DR. HAYRİ SEVER
[Abstract]
Researchers are required to follow digital libraries containing scientific publications in order to capture emerging advancements in their fields. The large volume of scientific publications in the biomedical field makes it difficult to keep track of new studies. Tools that provide easy access to the sought information clustered in these large volumes have already been presented. Besides these tools, the demand for additional tools, specialized according to the field of expertise and able to make intelligent decisions, is increasing with every passing day. Literature Based Discovery (LBD) tools aim to produce new hypotheses over large text-based information sources using previously acquired domain knowledge. LBD provides the establishment of relations between concepts through papers prepared in the field of biomedical research and the production of new hypotheses by examining those relations. In this study, by assembling the important common aspects of previous studies, a new LBD tool has been produced. Medline, generally accepted as the most important academic paper database in the biomedical field, is used as the basic scientific knowledge source. Biomedical terms have been determined from the textual data in this knowledge source, and relationships between these terms are established by considering their co-occurrence in this textual data. The performance of LBD tools is measured by using the information retrieval parameters precision and recall. The tool built in this study differs from similar tools with its high precision and recall values.
(2010) YİĞİTCAN AKSARI (MS)
Groundwater analysis and modelling on graphical processing unit
Advisor: YRD. DOÇ. DR. HARUN ARTUNER
[Abstract]
Ground-water flow modelling is a subject which deals with partial differential equation systems. Many applications have been developed to solve these equations; however, these applications cannot model large ground-water systems with satisfactory performance. Algorithms can be implemented on parallel architectures to speed up the modelling process. Execution time can be reduced by parallelization of the applications; however, such implementations on computer clusters reach only a small fraction of end-users. By using relatively low cost Graphical Processing Units (GPUs), which have vector computer-like architectures, it is possible to deliver speed-ups to the end-user. In this thesis, a widely used ground-water flow simulation software package, MODFLOW, is examined. The preconditioned conjugate gradient method and two preconditioning algorithms, which are the bottlenecks of the application, are implemented on a GPU. The GPU implementation's performance is compared with that of the current application.
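A compact NumPy sketch of the preconditioned conjugate gradient method mentioned above, with a Jacobi (diagonal) preconditioner on a small symmetric positive-definite system; the thesis ports this kind of kernel, with more elaborate preconditioners, to the GPU.

```python
# Sketch: Jacobi-preconditioned conjugate gradient for A x = b with A
# symmetric positive definite (the kind of solver kernel the thesis moves
# to the GPU, where it is MODFLOW's bottleneck).
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=1000):
    M_inv = 1.0 / np.diag(A)                 # Jacobi preconditioner: M = diag(A)
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

rng = np.random.default_rng(0)
n = 200
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)                  # well-conditioned SPD test matrix
b = rng.standard_normal(n)
x = pcg(A, b)
print("residual norm:", np.linalg.norm(A @ x - b))
```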
(2010) HAMDİ YALIN YALIÇ (MS)
Automatic recognition of traffic signs in Turkey
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
Intelligent vehicles and driver support systems will take a larger place in our daily life with the development of information technology. These kinds of vehicles can obtain a lot of information about roads from traffic signs. Such information can be used to warn drivers, regulate traffic or control the movements of the vehicle; as a result, more comfortable driving can be provided. Moreover, recognition of traffic signs at the right time and place is an important factor in ensuring a safe journey. Automatic recognition of traffic signs in real time will help to reduce accidents. This thesis introduces an approach to detect and match traffic signs on Turkish highways automatically. In the detection phase, color information is used to find a sign in the scene, and the region of the sign is extracted by using image processing techniques. The features extracted from the sign regions are matched with the features of the traffic signs in the database to determine the type of sign. With the proposed solution, signs were recognized in various scenes, at different scales, and from different viewing angles. The speed of the approach is sufficient to implement a real-time system in the future.
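A brief sketch of the colour-based detection step, assuming OpenCV (cv2): threshold the two red hue bands in HSV and keep sufficiently large connected regions as candidate sign locations. The thresholds and minimum area are illustrative, and the feature matching against the sign database is not shown.

```python
# Sketch: colour-based candidate detection for red-rimmed traffic signs.
# Red wraps around the hue axis in HSV, so two ranges are combined.
# Requires OpenCV (pip install opencv-python); thresholds are illustrative.
import cv2
import numpy as np

def red_sign_candidates(bgr, min_area=400):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    lower = cv2.inRange(hsv, (0, 80, 60), (10, 255, 255))     # low red hues
    upper = cv2.inRange(hsv, (170, 80, 60), (180, 255, 255))  # high red hues
    mask = cv2.morphologyEx(lower | upper, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
    return boxes                 # (x, y, w, h) regions to pass to feature matching

if __name__ == "__main__":
    frame = cv2.imread("road_scene.jpg")      # any road photo
    if frame is not None:
        for x, y, w, h in red_sign_candidates(frame):
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imwrite("candidates.jpg", frame)
```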
(2010) ÖNDER KESKİN (MS)
Design and implementation of a tool for ontology based source code querying
Advisor: DOÇ. DR. EBRU SEZER
[Abstract]
In software engineering, code comprehension is highly important for achieving primary operations such as maintenance and analysis of the source code of software. Source code querying tools enable us to obtain information at an advanced level and to comprehend source code in a quick and efficient way by providing a querying facility that takes advantage of the relations between code elements. In this thesis, an ontology based source code querying tool is developed for the Eclipse development environment as a plugin. In the implementation of the tool, an ontology represented in OWL-DL (Web Ontology Language - Description Logics) is used as the knowledge base, SPARQL (SPARQL Protocol and RDF Query Language) is used as the query language, and an inference engine is used as the semantic reasoner. During the development of the tool, first an ontology is designed for source code written in Java, and a parser is developed that can automatically build an ontology instance for the Java project of interest. Finally, a query view and a result view, in which the results of the processed query are listed, are designed for querying the project. It is observed that code querying can be achieved in an efficient way and the desired results are obtained by using the developed tool.
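A small sketch of the ontology-plus-SPARQL idea using rdflib in Python, rather than the OWL-DL/Eclipse stack of the thesis: a few triples describe the classes and methods of a toy Java project, and a SPARQL query asks which classes declare a method named "save". The namespace and property names are invented.

```python
# Sketch: represent a toy Java project's code structure as RDF triples and
# query it with SPARQL, using rdflib instead of the OWL-DL / Eclipse stack
# of the thesis. The namespace and property names are invented.
from rdflib import Graph, Namespace, Literal, RDF

CODE = Namespace("http://example.org/code#")
g = Graph()

def add_class(name, methods):
    cls = CODE[name]
    g.add((cls, RDF.type, CODE.JavaClass))
    for m in methods:
        meth = CODE[name + "_" + m]
        g.add((meth, RDF.type, CODE.Method))
        g.add((meth, CODE.name, Literal(m)))
        g.add((cls, CODE.declaresMethod, meth))

add_class("StudentDao", ["save", "findById"])
add_class("ReportService", ["render"])
add_class("CourseDao", ["save", "delete"])

query = """
PREFIX code: <http://example.org/code#>
SELECT ?cls WHERE {
    ?cls code:declaresMethod ?m .
    ?m code:name "save" .
}
"""
for (cls,) in g.query(query):
    print("class declaring save():", cls.split("#")[-1])
```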
(2009) AHMET SELMAN BOZKIR (MS)
Implementation of a web based decision support system utilizing OLAP and data mining technologies
Advisor: DOÇ. DR. EBRU SEZER
[Abstract]
Today, companies collect much more data than they did in the past. This situation brings forth the need to analyze that data and extract meaningful information from it. Therefore, interest in and the requirement for decision support systems are thought to be increasing over time. Data mining, a methodology based on discovering relations and hidden patterns in huge amounts of data, has a crucial role with respect to decision support systems. Data mining, regarded by MIT as one of the top ten technologies that will change the future, has been used within decision support systems progressively. From the point of view of multi-division corporations, it is seen that the Internet, the most important communication tool of this era, is used frequently in decision processes. In this study, a web based online decision support and reporting tool focused on the data mining methodology is developed. With the help of this tool, users are enabled to analyze, query and report upon three types of data mining techniques, namely decision trees, clustering and association, in a web environment.
(2009) AYDIN KAYA (MS)
Developing a pattern tracking and registration method for laser eye surgery
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
Eye trackers are primarily used in laser eye surgery and human-computer interaction, and can also be used in psychology, neuroscience, the defence industry, etc. In laser eye surgery, the accuracy of ablation depends on coherent eye tracking and registration techniques. Photoelectric, laser or image processing based eye trackers are used in laser eye surgery. The main approach used in image processing based eye trackers is the extraction and tracking of the pupil and limbus regions. For eye registration, image processing techniques are used because of precision concerns. Generally, iris region features extracted from infrared images are used for eye registration. In this thesis, two methods are presented for pattern tracking and registration for ablation. These methods differ from other techniques by using scleral blood vessels for feature extraction. During the surgery, the scleral blood vessel structure does not change. This area is a good feature extraction region due to its texturedness and its resistance to the pupil center shift problem. Although our methods are proposed for laser eye surgery, they can also be used for human-computer interaction. Additionally, scleral blood vessels can be used as a feature region for biometric measurement.
(2009) ALİ SEYDİ KEÇELİ (MS)
Design and implementation of a software for automatic detection of white matter lesions in brain
Advisor: YRD. DOÇ. DR. AHMET BURAK CAN
[Abstract]
MR (magnetic resonance) images are frequently used for the clinical diagnosis of a large group of diseases, especially cancer and MS (Multiple Sclerosis). In order to reduce the observer inconsistency, manual effort and analysis time caused by visual examination of MR images, computational methods are needed. The rapid increase in the number of patients and the insufficient labor force in medical image analysis increase the need for these types of systems. Hence, there is an increasing interest in and requirement for the use of image processing techniques in medical image analysis. This thesis introduces a system to detect white matter lesions in the brain automatically. Our method tries to determine the lesion regions in an unsupervised manner on MR images of a group of patients by using image processing techniques. As a first step, the skull is removed from the brain tissue; four different methods are used for skull stripping. After skull stripping, the pixels in the brain image are classified. Finally, the binary mask containing the lesion regions is generated by region growing. The system also enables volumetric calculation and 3D visualization of lesions. Additionally, a performance improvement is achieved by GPU based parallel image processing.
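A short sketch of the region-growing step used to build the binary lesion mask: starting from a seed pixel, neighbouring pixels are added while their intensity stays close to the seed value. The image, seed and tolerance are toy values.

```python
# Sketch: simple region growing on a 2D intensity array, the final step the
# abstract describes for building the binary lesion mask. The image, seed
# and threshold are toy values.
from collections import deque
import numpy as np

def region_grow(img, seed, tol=0.15):
    mask = np.zeros(img.shape, dtype=bool)
    seed_val = img[seed]
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]
                    and not mask[ny, nx] and abs(img[ny, nx] - seed_val) <= tol):
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask

img = np.full((20, 20), 0.2)
img[5:12, 6:14] = 0.9                      # bright "lesion-like" blob
mask = region_grow(img, seed=(8, 9))
print("region size:", int(mask.sum()))     # 7*8 = 56 pixels
```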
(2009) SEMA GAZEL (MS)
An ontology for 'CMMI-DEV' and supporting CMMI based process assessment with an ontology based tool
Advisor: DOÇ. DR. EBRU SEZER
[Abstract]
The premise that 'the quality of a software product is highly influenced by the quality of the processes used to develop and maintain it' is widely accepted, and from that point of view, several reference models and standards have been developed for process improvement and assessment. The processes used in an organization are important assets and knowledge for that organization. Process activities, such as monitoring the processes in accordance with reference models and standards, are important. To meet that need, within the scope of this thesis, an ontology has been developed for CMMI-Dev, and an ontology-based tool using the CMMI-Dev ontology has been developed by extending EPF, an existing process management tool. With this new tool, named OCMQ-E (Ontology-based CMMI Mapping and Querying - EPF), the aim is to monitor the accordance between an organization's processes and CMMI-Dev, and to support data collection activities in a process assessment. To reach that aim, creating the ontologies of an organization's processes, mapping the process ontologies to the CMMI ontology, and saving the mapping information as an ontology are provided. Thus, CMMI domain knowledge, process knowledge, and information about the CMMI-process mapping can be queried by using ontology queries.
Hacettepe Üniversitesi Mühendislik Fakültesi
06800 Beytepe Ankara