Traffic volume estimation at the city scale is an important problem for many transportation operations and urban applications. This paper proposes a hybrid framework that integrates state-of-the-art machine learning techniques with well-established traffic flow theory to estimate citywide traffic volume. In addition to typical urban context features extracted from multiple sources, we extract a special set of features from GPS trajectories based on the implications of traffic flow theory, which provide extra information on the speed-flow relationship. Using network-wide speed information estimated by a travel speed estimation model, a volume-related high-level feature is first learned with an unsupervised graphical model. A volume re-interpretation model is then introduced to map this volume-related high-level feature to the predicted volume, using a small amount of ground-truth data for training. The framework is evaluated using a GPS trajectory dataset from 33,000 Beijing taxis and volume ground-truth data obtained from 4,980 video clips. The results demonstrate the effectiveness and potential of the proposed framework for citywide traffic volume estimation.