Animal tracking and pose estimation are core topics in neuroscience. However, for monkeys, current deep-learning-based algorithms often perform poorly on segmentation and dense keypoint estimation due to the lack of annotated training data. In this thesis, we address this challenge by developing transfer-learning-based algorithms that require no fully annotated monkey data. We develop a bootstrapping strategy to refine a pretrained segmentation model on monkey data annotated with sparse 2D landmarks. In addition, we implement a voxel-based visual hull reconstruction approach to recover the 3D monkey pose from silhouettes. For dense keypoint estimation, we follow a similar bootstrapping strategy to refine a pretrained HRNet, which is then used to train a dense keypoint detector by leveraging multiview consistency. Our methods outperform the baselines on both in-the-cage and in-the-wild monkey data.
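The voxel-based visual hull step can be sketched as classic space carving: a voxel is kept only if it projects inside the silhouette in every view. The sketch below is illustrative and not the thesis implementation; the function name, the grid resolution, and the use of 3×4 projection matrices are assumptions made for the example.

```python
import numpy as np

def carve_visual_hull(silhouettes, proj_mats, bounds, res=32):
    """Illustrative visual hull carving (not the thesis code).

    silhouettes: list of (H, W) boolean foreground masks, one per view.
    proj_mats:   list of 3x4 camera projection matrices (assumed convention).
    bounds:      ((x0, x1), (y0, y1), (z0, z1)) working volume in world units.
    Returns a (res, res, res) boolean occupancy grid.
    """
    (x0, x1), (y0, y1), (z0, z1) = bounds
    xs = np.linspace(x0, x1, res)
    ys = np.linspace(y0, y1, res)
    zs = np.linspace(z0, z1, res)
    X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
    # Homogeneous coordinates of all voxel centers, shape (res^3, 4).
    pts = np.stack([X, Y, Z, np.ones_like(X)], axis=-1).reshape(-1, 4)
    occupied = np.ones(len(pts), dtype=bool)
    for mask, P in zip(silhouettes, proj_mats):
        uvw = pts @ P.T                              # project to image plane
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        h, w = mask.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(pts), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]     # inside this silhouette?
        occupied &= hit                              # must hold in every view
    return occupied.reshape(res, res, res)
```

With enough well-spread views, the surviving voxels approximate the monkey's visual hull, from which a 3D pose can be recovered.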