3D shape reconstruction from a single image is a highly ill-posed problem. Modern deep-learning-based systems address it by learning an end-to-end mapping from image to shape with a deep network. In this paper, we instead solve the problem with an online optimization framework inspired by traditional methods. Our framework employs a deep autoencoder to learn latent codes of 3D object shapes, and fits a Gaussian Mixture Model (GMM) to these codes to obtain a probabilistic shape prior. At inference, shape and pose are jointly optimized, guided by both image cues and the deep shape prior, without relying on initialization from any trained deep network. Surprisingly, our method achieves performance comparable to state-of-the-art methods even without training an end-to-end network, which marks a promising step in this direction.
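
To make the role of the GMM shape prior concrete, the following is a minimal sketch of optimizing a latent code under a GMM prior, not the implementation used in this work. Several simplifications are assumptions made for illustration: the GMM has diagonal covariances with hand-set parameters (rather than being fitted to autoencoder codes), the image cues are replaced by a toy quadratic data term pulling the code toward a hypothetical `target`, pose is omitted, and plain gradient descent is used. All function and variable names are hypothetical.

```python
import numpy as np

def _log_components(z, weights, means, variances):
    # Per-component log(w_k * N(z | mu_k, diag(var_k))), shape (K,).
    diff = z[None, :] - means
    return (np.log(weights)
            - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
            - 0.5 * np.sum(diff ** 2 / variances, axis=1))

def gmm_nll(z, weights, means, variances):
    # Negative log-likelihood of z under the GMM (log-sum-exp for stability).
    lc = _log_components(z, weights, means, variances)
    m = lc.max()
    return -(m + np.log(np.exp(lc - m).sum()))

def gmm_nll_grad(z, weights, means, variances):
    # Gradient of the NLL: sum_k r_k * (z - mu_k) / var_k,
    # where r_k are the component responsibilities for z.
    lc = _log_components(z, weights, means, variances)
    resp = np.exp(lc - lc.max())
    resp /= resp.sum()
    return np.sum(resp[:, None] * (z[None, :] - means) / variances, axis=0)

def energy(z, target, gmm, prior_weight=0.1):
    # Toy data term (assumed stand-in for image cues) plus weighted prior term.
    return 0.5 * np.sum((z - target) ** 2) + prior_weight * gmm_nll(z, *gmm)

def optimize_code(z0, target, gmm, prior_weight=0.1, step=0.05, iters=200):
    # Plain gradient descent on the energy; no learned initialization is used.
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(iters):
        grad = (z - target) + prior_weight * gmm_nll_grad(z, *gmm)
        z -= step * grad
    return z

# Hypothetical two-component prior over a 2-D latent space.
gmm = (np.array([0.5, 0.5]),                  # mixture weights
       np.array([[-2.0, 0.0], [2.0, 0.0]]),   # component means
       np.ones((2, 2)))                       # diagonal variances
z_init = np.zeros(2)
target = np.array([1.5, 1.0])
z_opt = optimize_code(z_init, target, gmm)
```

In this sketch, `prior_weight` controls how strongly the code is pulled toward high-density regions of the prior relative to the data term; in a full system the data term would be replaced by image-based cues and the pose parameters would be optimized alongside the code.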