Deep learning-based analysis of chest Computed Tomography (CT) has proven effective and efficient for COVID-19 diagnosis. Existing deep learning approaches rely heavily on large labeled datasets, which are difficult to acquire during a pandemic. Weakly-supervised approaches are therefore in demand. In this paper, we propose ResNext+, an end-to-end weakly-supervised COVID-19 detection approach that requires only volume-level labels and can provide slice-level predictions. The proposed approach incorporates a lung segmentation mask together with spatial and channel attention to extract spatial features. In addition, a Long Short-Term Memory (LSTM) network is used to capture the axial dependency between slices. A slice attention module is applied before the final fully connected layer to generate slice-level predictions without additional supervision. An ablation study is conducted to show the contribution of the attention blocks and the segmentation mask block. Experimental results obtained on publicly available datasets show a precision of 81.9% and an F1 score of 81.4%, whereas the closest state-of-the-art method achieves 76.7% precision and a 78.8% F1 score. These gains of about 5 percentage points in precision and about 3 percentage points in F1 score demonstrate the effectiveness of the proposed method. It is worth noting that applying image enhancement approaches does not improve the performance of the proposed method, and sometimes even harms the scores, although the enhanced images have better perceptual quality.
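
To illustrate how slice-level scores can arise when only a volume-level label is available, the following minimal sketch shows attention-based pooling over per-slice feature vectors. It is not the ResNext+ implementation: the function name `slice_attention_pool`, the weight vectors `w_att` and `w_cls`, the feature dimensions, and the random input features are all hypothetical and stand in for the learned backbone and LSTM outputs described above.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def slice_attention_pool(slice_features, w_att, w_cls, b_cls=0.0):
    """Pool per-slice features into one volume-level probability.

    slice_features: array of shape (num_slices, feat_dim), assumed to come
        from some per-slice feature extractor (hypothetical here).
    w_att: attention weight vector of shape (feat_dim,), hypothetical.
    w_cls: classifier weight vector of shape (feat_dim,), hypothetical.
    Returns the volume probability and the per-slice attention weights.
    """
    # One scalar attention score per slice.
    scores = slice_features @ w_att                  # (num_slices,)
    # Normalize across slices so the weights sum to 1.
    weights = softmax(scores, axis=0)                # (num_slices,)
    # Attention-weighted average of slice features.
    volume_feature = weights @ slice_features        # (feat_dim,)
    # Volume-level logistic classifier; only this output needs a label.
    logit = volume_feature @ w_cls + b_cls
    volume_prob = 1.0 / (1.0 + np.exp(-logit))
    return volume_prob, weights


# Toy example: 16 slices with 8-dimensional features (random placeholders).
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 8))
w_att = rng.normal(size=8)
w_cls = rng.normal(size=8)

volume_prob, slice_weights = slice_attention_pool(features, w_att, w_cls)
```

In a sketch of this kind, the training loss would be computed on `volume_prob` alone, while `slice_weights` can be read as a per-slice relevance score, which is one common way to obtain slice-level output without slice-level annotation.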