Very large data sets with large numbers of attributes can currently be processed only on expensive and complex computing clusters. The present technology is a new method that takes advantage of the large (1 GB or more) onboard device memory of modern GPUs, as well as large onboard CPU RAM, to map the building of a single randomized decision tree to a GPU, multiple GPUs, or multiple CPUs. The technology can be applied to very large training data sets on commodity hardware for a fraction of the startup and runtime costs required to perform the same learning method on a cluster.
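For orientation, the following is a minimal CPU-only sketch of what building a single randomized decision tree can look like when the whole training set is held in memory. It is an illustration under stated assumptions, not the method of the present technology: the function names (`gini`, `build_random_tree`, `predict`), the Gini impurity criterion, the number of random candidate splits per node, and the depth limit are all hypothetical choices, and no GPU mapping is shown.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_random_tree(rows, labels, rng, n_candidates=8, max_depth=8):
    """Grow one randomized tree over non-empty in-memory training data.

    At each node a few (attribute, threshold) pairs are drawn at random
    and the pair with the lowest weighted impurity is kept.
    """
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) <= 1:
        return {"leaf": majority}
    best = None
    for _ in range(n_candidates):
        attr = rng.randrange(len(rows[0]))      # random attribute
        threshold = rng.choice(rows)[attr]      # random threshold taken from the data
        left = [i for i, r in enumerate(rows) if r[attr] <= threshold]
        right = [i for i, r in enumerate(rows) if r[attr] > threshold]
        if not left or not right:
            continue                            # degenerate split, skip it
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, attr, threshold, left, right)
    if best is None:
        return {"leaf": majority}               # no usable candidate was drawn
    _, attr, threshold, left, right = best
    return {
        "attr": attr,
        "threshold": threshold,
        "left": build_random_tree([rows[i] for i in left],
                                  [labels[i] for i in left],
                                  rng, n_candidates, max_depth - 1),
        "right": build_random_tree([rows[i] for i in right],
                                   [labels[i] for i in right],
                                   rng, n_candidates, max_depth - 1),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while "leaf" not in tree:
        branch = "left" if row[tree["attr"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]
```

In this sketch, scoring each candidate split is a pass over every training row at the node, and the passes for different candidates are independent of one another. That data-parallel work is the kind of step that benefits from keeping the full training set resident in device memory or RAM, although how the present technology actually distributes it across GPUs or CPUs is not specified here.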