The advent of high-speed Internet, modern devices, and global connectivity has exposed the world to massive amounts of data that are generated, communicated, and processed daily. Extracting meaningful information from this enormous volume of data is becoming increasingly challenging, even for high-performance and cloud computing platforms. While critically important in a gamut of applications, clustering is computationally expensive when tasked with high-volume, high-dimensional data. To render this critical task affordable in data-intensive settings, this thesis introduces a clustering framework named random sketching and validation (SkeVa). The framework builds upon, and markedly broadens the scope of, random sample consensus (RANSAC) ideas that have been used successfully for robust regression. Four main algorithms are introduced, enabling clustering of high-dimensional data, subspace clustering of data generated by unions of subspaces, and clustering of large-scale networks. Extensive numerical tests compare the SkeVa algorithms with state-of-the-art alternatives and showcase the potential of the SkeVa framework.
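The sketch-and-validate idea outlined above can be sketched in miniature: repeatedly cluster a small random subset of the data ("sketch") and score the result on a separate random subset ("validate"), keeping the best candidate. The code below is a minimal illustration under assumed choices, not the thesis's actual algorithms; k-means is used as a stand-in inner clustering step, and the names `skeva_like`, `sketch_size`, and `n_draws` are illustrative.

```python
import numpy as np

def kmeans(X, K, iters=20, rng=None):
    """Plain Lloyd's k-means on a (small) sketch matrix X (n x d)."""
    rng = rng if rng is not None else np.random.default_rng()
    centroids = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # recompute centroids, keeping old ones for empty clusters
        for k in range(K):
            members = X[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids

def skeva_like(X, K, n_draws=10, sketch_size=50, val_size=100, seed=0):
    """Sketch-and-validate clustering (illustrative stand-in, not the
    thesis's SkeVa algorithms): cluster random sketches, validate each
    candidate on an independent random subset, keep the best."""
    rng = np.random.default_rng(seed)
    best_centroids, best_cost = None, np.inf
    for _ in range(n_draws):
        # sketch: cluster a small random subset instead of all of X
        sketch = X[rng.choice(len(X), sketch_size, replace=False)]
        centroids = kmeans(sketch, K, rng=rng)
        # validate: mean distance to nearest centroid on a fresh subset
        val = X[rng.choice(len(X), val_size, replace=False)]
        cost = np.linalg.norm(val[:, None] - centroids[None],
                              axis=2).min(axis=1).mean()
        if cost < best_cost:
            best_centroids, best_cost = centroids, cost
    return best_centroids
```

Each draw costs only what clustering the sketch costs, so the total work is governed by `sketch_size` and `n_draws` rather than by the full data volume, which is what makes the approach attractive for high-volume data.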