Rearranging objects on a tabletop surface by means of nonprehensile manipulation is a task that requires skillful interaction with the physical world. Usually, this is achieved by precisely modeling the physical properties of the objects, robot, and environment and then planning explicitly. However, such explicit physical modeling is not always feasible and is subject to various uncertainties. We therefore learn a nonprehensile rearrangement strategy with deep reinforcement learning using only visual feedback. To do this, we model the task with rewards and train a deep Q-network. Our potential-field-based heuristic exploration strategy reduces the number of collisions that lead to suboptimal outcomes, and we actively balance the training set to avoid bias towards poor examples. This training process leads to faster learning and better task performance than uniform exploration with standard experience replay. We present empirical evidence from simulation that our method achieves a success rate of 85%. We also show that our system can cope with sudden changes in the environment, and we compare its performance with human-level performance.