Publications

Results 1–25 of 45
Skip to search filters

MCS+: An Efficient Algorithm for Crawling the Community Structure in Multiplex Networks

ACM Transactions on Knowledge Discovery from Data

Laishram, Ricky; Wendt, Jeremy D.; Soundarajan, Sucheta

In this article, we consider the problem of crawling a multiplex network to identify the community structure of a layer-of-interest. A multiplex network is one where there are multiple types of relationships between the nodes. In many multiplex networks, some layers might be easier to explore (in terms of time, money etc.). We propose MCS+, an algorithm that can use the information from the easier to explore layers to help in the exploration of a layer-of-interest that is expensive to explore. We consider the goal of exploration to be generating a sample that is representative of the communities in the complete layer-of-interest. This work has practical applications in areas such as exploration of dark (e.g., criminal) networks, online social networks, biological networks, and so on. For example, in a terrorist network, relationships such as phone records, e-mail records, and so on are easier to collect; in contrast, data on the face-To-face communications are much harder to collect, but also potentially more valuable. We perform extensive experimental evaluations on real-world networks, and we observe that MCS+ consistently outperforms the best baseline-the similarity of the sample that MCS+ generates to the real network is up to three times that of the best baseline in some networks. We also perform theoretical and experimental evaluations on the scalability of MCS+ to network properties, and find that it scales well with the budget, number of layers in the multiplex network, and the average degree in the original network.

More Details

NetProtect: Network Perturbations to Protect Nodes against Entry-Point Attack

ACM International Conference Proceeding Series

Laishram, Ricky; Hozhabrierdi, Pegah; Wendt, Jeremy D.; Soundarajan, Sucheta

In many network applications, it may be desirable to conceal certain target nodes from detection by a data collector, who is using a crawling algorithm to explore a network. For example, in a computer network, the network administrator may wish to protect those computers (target nodes) with sensitive information from discovery by a hacker who has exploited vulnerable machines and entered the network. These networks are often protected by hiding the machines (nodes) from external access, and allow only fixed entry points into the system (protection against external attacks). However, in this protection scheme, once one of the entry points is breached, the safety of all internal machines is jeopardized (i.e., the external attack turns into an internal attack). In this paper, we view this problem from the perspective of the data protector. We propose the Node Protection Problem: given a network with known entry points, which edges should be removed/added so as to protect as many target nodes from the data collector as possible? A trivial way to solve this problem would be to simply disconnect either the entry points or the target nodes - but that would make the network non-functional. Accordingly, we impose certain constraints: for each node, only (1 - r) fraction of its edges can be removed, and the resulting network must not be disconnected. We propose two novel scoring mechanisms - the Frequent Path Score and the Shortest Path Score. Using these scores, we propose NetProtect, an algorithm that selects edges to be removed or added so as to best impede the progress of the data collector. We show experimentally that NetProtect outperforms baseline node protection algorithms across several real-world networks. In some datasets, With 1% of the edges removed by NetProtect, we found that the data collector requires up to 6 (4) times the budget compared to the next best baseline in order to discover 5 (50) nodes.

More Details

Crawling the community structure of multiplex networks

33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019

Laishram, Ricky; Wendt, Jeremy D.; Soundarajan, Sucheta

We examine the problem of crawling the community structure of a multiplex network containing multiple layers of edge relationships. While there has been a great deal of work examining community structure in general, and some work on the problem of sampling a network to preserve its community structure, to the best of our knowledge, this is the first work to consider this problem on multiplex networks. We consider the specific case in which the layers of a multiplex network have different query (collection) costs and reliabilities; and a data collector is interested in identifying the community structure of the most expensive layer. We propose MultiComSample (MCS), a novel algorithm for crawling a multiplex network. MCS uses multiple levels of multi-armed bandits to determine the best layers, communities and node roles for selecting nodes to query. We test MCS against six baseline algorithms on real-world multiplex networks, and achieved large gains in performance. For example, after consuming a budget equivalent to sampling 20% of the nodes in the expensive layer, we observe that MCS outperforms the best baseline by up to 49%.

More Details

Efficient transfer learning for neural network language models

Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018

Skryzalin, Jacek S.; Link, Hamilton E.; Wendt, Jeremy D.; Field, Richard V.; Richter, Samuel N.

We apply transfer learning techniques to create topically and/or stylistically biased natural language models from small data samples, given generic long short-term memory (LSTM) language models trained on larger data sets. Although LSTM language models are powerful tools with wide-ranging applications, they require enormous amounts of data and time to train. Thus, we build general purpose language models that take advantage of large standing corpora and computational resources proactively, allowing us to build more specialized analytical tools from smaller data sets on demand. We show that it is possible to construct a language model from a small, focused corpus by first training an LSTM language model on a large corpus (e.g., the text from English Wikipedia) and then retraining only the internal transition model parameters on the smaller corpus. We also show that a single general language model can be reused through transfer learning to create many distinct special purpose language models quickly with modest amounts of data.

More Details

An Example of Counter-Adversarial Community Detection Analysis

Kegelmeyer, William P.; Wendt, Jeremy D.; Pinar, Ali P.

Community detection is often used to understand the nature of a network. However, there may exist an adversarial member of the network who wishes to evade that understanding. We analyze one such specific situation, quantifying the efficacy of certain attacks against a particular analytic use of community detection and providing a preliminary assessment of a possible defense.

More Details

A dynamic model for social networks

Field, Richard V.; Link, Hamilton E.; Skryzalin, Jacek S.; Wendt, Jeremy D.

Social network graph models are data structures representing entities (often people, corpora- tions, or accounts) as "vertices" and their interactions as "edges" between pairs of vertices. These graphs are most often total-graph models -- the overall structure of edges and vertices in a bidirectional or directional graph are described in global terms and the network is gen- erated algorithmically. We are interested in "egocentrie or "agent-based" models of social networks where the behavior of the individual participants are described and the graph itself is an emergent phenomenon. Our hope is that such graph models will allow us to ultimately reason from observations back to estimated properties of the individuals and populations, and result in not only more accurate algorithms for link prediction and friend recommen- dation, but also a more intuitive understanding of human behavior in such systems than is revealed by previous approaches. This report documents our preliminary work in this area; we describe several past graph models, two egocentric models of our own design, and our thoughts about the future direction of this research.

More Details

Estimating users’ mode transition functions and activity levels from social media

Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017

Link, Hamilton E.; Wendt, Jeremy D.; Field, Richard V.; Marthe, Jocelyn

We present a temporal model of individual-scale social media user behavior, comprising modal activity levels and mode switching patterns. We show that this model can be effectively and easily learned from available social media data, and that our model is sufficiently flexible to capture diverse users’ daily activity patterns. In applications such as electric power load prediction, computer network traffic analysis, disease spread modeling, and disease outbreak forecasting, it is useful to have a model of individual-scale patterns of human behavior. Our user model is intended to be suitable for integration into such population models, for future applications of prediction, change detection, or agent-based simulation.

More Details
Results 1–25 of 45
Results 1–25 of 45