Defective Ramsey Numbers in Graph Classes (MS/PhD Students in
Industrial Engineering and/or Mathematics - with TÜBİTAK fellowship)
Ramsey theory is about some structures that cannot be avoided as the number of vertices in a graph increases. This research branch, introduced in 1930, has been the focus of many famous mathematicians including Turan and Erdös.
We will consider defective Ramsey numbers which are obtained by generalizing the notion of independent sets and cliques in the classical Ramsey numbers by k-sparse and k-dense sets. A k-sparse i-set is a set S of i vertices of a graph G such that each vertex in S has degree at most k in G. A k-dense j-set is a set D of j vertices of a graph G such that each vertex in D has degree at most k in the complement of G; in other words, each vertex in D misses at most k other vertices in its neighborhood. The term k-defective set is used to denote a k-sparse or k-dense set. The defective Ramsey number R^G_k(i,j) is the smallest n such that all graphs on n vertices in the class G have either a k-dense i-set or a k-sparse j-set.
Defective Ramsey numbers form a relatively new and unexplored area of the
In this project, the student will consider the hard-to-compute defective Ramsey numbers in special graph classes. Such a systematic study has been recently initiated by T. Ekim, J. Gimbel and O. Şeker (Small 1-Defective Ramsey Numbers in Perfect Graphs, manuscript) where they started to study defective Ramsey numbers in the class of perfect graphs and pointed out several related research questions. In this thesis, formulas giving all defective Ramsey numbers in some graph classes such as P4-free graphs and threshold graphs will be developed, or if the case fails to raise, negative results certifying that it is in some sense impossible to find such formulas will be shown. The student will also find good lower bounds on the size of the largest k-sparse or k-dense sets and explore the use of these bounds in the computation of various defective Ramsey numbers. Such an example can be found in Ekim et al. (2018) where it is show that every C4-free cactus on n vertices contains a 1-sparse set of size at least (n+1)/2. This result is subsequently used in the computation of a specific 1-defective Ramsey Number in the class of perfect graphs.
In this study, the student will use the structural properties of graph
classes and direct proof techniques in Ramsey theory
The Selective Tree Problem (to be co-supervised with Z.Caner Taşkın)
Problems of selective nature, such as the selective coloring problem, have already been studied in the literature. In general, they bring more flexibility to the application areas of the problem under consideration.
Given a graph and a partition of its vertex set into clusters, the Selective Tree Problem is the following decision problem: is there a selection of exactly one vertex per cluster such that the graph induced by the selection is a tree? Noting that a tree is a minimally connected graph, this can be seen as the most economical way to connect all clusters.
If each cluster has one vertex then the problem boils down into deciding
whether the given
graph is a tree, thus a trivial question. However, as soon as clusters are allowed to contain 2 vertices, the problem becomes intractable even in restricted cases. More precisely, it is NP-complete to decide whether a given planar bipartite graph with clusters containing either 1 or 2 vertices admits a selection inducing a tree or not.
One can also think about the optimization version of the Selective Tree Problem where the objective is to find a selection (that is, exactly one vertex per cluster) such that the order or the total weight of an induced tree in this selection is maximized.
The aim of this project is to develop an Integer Programming Formulations and efficient solution algorithms for the Selective Tree Problem (both for the decision and optimization versions) and test its efficiency in a systematic way. Two defining properties of trees will be used in such a formulation: connectivity and the absence of cycles. A similar, yet different problem will also be considered from the same aspect: The Selective Forest Problem where the selection is still required to induce an acyclic graph, however the connectivity requirement is relaxed. It is not clear at first sight whether this relaxation makes the problem computationally easier or harder. This project is also expected to shed light on this question.
Stable Matchings and University Admission Problem (to be co-supervised
with Prof. Ahmet Alkan from Sabancı University)
(MS Student in Industrial Engineering and/or Computer Science)
Among several applications of the stable matchings, one of the most studied is the university admission problem. This can be seen as a one-to-one matching problem where institutions and applicants have preference lists on each other. The objective is to find a matching between institutions and applicants which is stable; that is, a matching where there is no pair of institution-applicant which are not matched to each other but which would be both better off if matched to each other. Gale and Shapley describe an algorithm to determine a stable matching in such a context. Gale-Shapley Algorithm generates two extreme stable matchings, one which is best for the applicants (thus worst for the institutions) and one which is best for the institutions (thus, worst for the applicants).
Consider all the stable matchings. For any agent, order all the possible matches (with possible multiplicities) in these stable matchings and take the median one. Then the median stable matching is such that all agents are matched to their median matches. Teo and Sethuraman established the existence of a median stable matching and proposed an algorithm to find it. One objective of this thesis is to suggest new algorithms to find median stable matchings and compare them with the ones in the literature both from theoretical and practical point of views. It is known that the number of all stable matchings may increase exponentially with the number of agents. A particular aim will be to find an algorithm that is polynomially bounded over an interestingly large domain.
The student will also be focused on an extension of the university admission problem where we seek a many-to-many matching (instead of one-to-one) upon which to schedule interviews: The students matched with a university in this matching (on the basis of student preferences and exam scores) will be those whom the university will interview for additional information to update its preferences. The interview lists here, per student and per university, should naturally be of "short" length since each interview has an associated cost. On the other hand, each interview brings an informational return that can be seen as a profit. The objective is to generate the most valuable list of interviews. Algorithms to find most valuable list of interviews will be proposed and empirically tested.
• Alkan and D. Gale, "Stable schedule matching under
revealed preference", Journal of Economic Theory, 2003
• T. Ekim, Eşleme Kuramı ve 2012 Nobel Ekonomi Ödülü, Matematik Dünyası, 2019.
• D. Gale and L.S. Shapley, "College admissions and the stability of marriage", Amer. Math. Monthly, 69:9–14, 1962.
• C.P. Teo and J. Sethuraman (1998): “The Geometry of Fractional Stable Matchings and Its Applications,” Mathematics of Operations Research, 23(4), 874–891.
Money Laundering Detection (to be co-supervised with Tolga Kurt from H3M.IO)
(MS/PhD Student in Industrial Engineering and/or Computer Science)
Problem description: Money laundering activities play a key role
in financing all illegal activities such as terrorism, drug trade, human
trafficking. We think that nearly 2 trillion USD is laundered every year
and only 2% of it can be detected. Anti-Money-Laundering (AML) software
used in financial institutions are mainly rule based systems. Two main
problems of these systems are the extra workload created for experts due
to a high percentage of false alerts, and their inefficiency in detecting
frauds because most rules are also known by illegal organizations.
A network representing money transactions can be constructed by representing each entity with a node and each transaction with an arc from the sender to the receiver. The goal of this project is to detect money laundering activities in this money transaction network. The main difficulty in achieving this goal is the huge size of such a network which typically contains 2 to 5 million nodes and 300 million arcs when transactions of 365 days are considered.
In the money transaction network, some nodes are already designated as suspicious. One of the tasks is to determine subgraphs containing suspicious nodes as well as the nodes they are related to in this network. Another objective is to detect subgraphs similar to those which are already designated as subgraphs representing money laundering activities, or subgraphs representing some anomalies.
Method: Graph theory techniques will be used in finding subgraphs corresponding to suspicious money transactions and transactions which are not expected to occur in usual financial activities. Among these approaches, minimum cut and node/arc reduction methods will be considered within this study. To cope with the capacity/complexity issues due to the size of a money transaction network, we aim at reducing/simplifying the network while limiting the loss of information. The student will develop and compare methods to determine the model yielding the most useful network with minimum loss of information.
Student profile: In this thesis, the student will collaborate with H3M.IO which develops solutions for AML applying big data and artificial intelligence techniques using Hadoop and Spark. He/she will also use applications of graph theory on big data. The candidate is therefore expected to conduct a study on the intersection of data science, big data and graph theory.