Pose variation is one of the major challenges in face alignment. In this paper, we showed how a framework based on convolutional neural networks (CNNs) and 3D morphable models (3DMMs) can explicitly handle pose variation for robust facial landmark localization. Since human faces are approximately left-right symmetric, a left-looking face (from the viewer's perspective) is equivalent to a right-looking face after a horizontal flip. Exploiting this symmetry, we focused on frontal and right-looking faces only. We divided the landmarks into two categories, stable landmarks (SLs) and unstable landmarks (ULs), according to their visibility across poses. A CNN was trained to estimate the SLs directly, and a subsequent 3DMM generated the remaining ULs. Experiments were conducted on the widely used 300-W, COFW, and AFLW datasets. The results showed that the proposed method reduced errors on large-pose samples without degrading performance on semi-frontal faces, demonstrating the superiority and robustness of our method.
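The flip-based symmetry argument can be illustrated with a minimal sketch. It assumes landmarks are stored as (x, y) pixel coordinates and that the annotation scheme provides left/right counterpart indices. The function name `mirror_face`, the `swap_pairs` argument, and the toy three-point layout are hypothetical and are not taken from the paper or from any of the datasets above.

```python
def mirror_face(image, landmarks, swap_pairs):
    """Mirror an image and its landmarks about the vertical axis.

    image:      list of pixel rows (each row is a list), all of equal width
    landmarks:  list of (x, y) pixel coordinates
    swap_pairs: (i, j) index pairs of left/right counterpart landmarks
                (hypothetical; depends on the annotation scheme)
    """
    width = len(image[0])

    # Reverse every row so a left-looking face becomes right-looking.
    flipped_image = [row[::-1] for row in image]

    # Pixel column x maps to column (width - 1 - x); y is unchanged.
    flipped = [(width - 1 - x, y) for (x, y) in landmarks]

    # After mirroring, the point that was the "left eye" is now on the
    # right side, so counterpart indices are exchanged to keep the
    # semantic meaning of each index intact.
    for i, j in swap_pairs:
        flipped[i], flipped[j] = flipped[j], flipped[i]

    return flipped_image, flipped
```

Under these assumptions, applying the function twice returns the original image and landmarks, so predictions made on a flipped left-looking face could be mapped back to the original image with the same routine.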