Behind every complex system, be it physical, social, biological, or man-made, there is an intricate network encoding the interactions between its components. Learning over large-scale networks is challenging: practical methods must combine computational scalability, to cope with millions of nodes that may carry large amounts of meta-information, with sufficient versatility to capture the elaborate structure and dynamics of the complex phenomena under study. There is also a need for modeling expressiveness to ensure accurate learning, along with transparency and interpretability that shed light on the overall system and provide valuable insights into its function. Approaches to learning over networks must also defend against adversarial behavior, remaining operational even under severely adverse circumstances. The contribution of the present thesis lies in addressing the aforementioned challenges by developing simple yet versatile algorithmic solutions focused on core graph-learning tasks. To tackle active sampling for semi-supervised node classification, a novel framework is proposed to guide the sampling of informative nodes. The proposed framework is tailored to Gaussian Markov random fields, and relies on the notion of maximum expected change to select the most informative node to be labeled. Interestingly, several existing methods for active learning are subsumed by the proposed approach. Focusing on the node classification task, a generalized yet highly scalable diffusion-based classifier is developed, where each class diffusion is adaptive to the graph structure and the underlying label distribution. Adaptability is further leveraged for the node embedding task.
Since node embedding can naturally be viewed as a low-rank factorization of a node-to-node similarity matrix, a versatile approach is introduced to learn the similarity matrix of a given graph with minimal computational overhead and in a fully unsupervised manner. Extensive experimentation on both synthetic graphs and numerous real networks demonstrates the effectiveness, interpretability, and scalability of the proposed methods. More importantly, the process of design and experimentation sheds light on the behavior of different methods and the peculiarities of real-world data, while at the same time generating new ideas and directions to be explored.