Risk prediction using electronic health records (EHR) is a challenging data mining task due to the two-level hierarchical structure of EHR data. EHR data consist of a set of time-ordered visits, and within each visit, there is a set of unordered diagnosis codes. Existing approaches focus on modeling temporal visits with deep neural network (DNN) techniques. However, they ignore the importance of modeling diagnosis codes within visits, and a lot of task-unrelated information within visits usually leads to unsatisfactory performance of existing approaches. To minimize the effect caused by noise information of EHR data, in this paper, we propose a novel DNN for risk prediction termed as LSAN, which consists of a Hierarchical Attention Module (HAM) and a Temporal Aggregation Module (TAM). Particularly, LSAN applies HAM to model the hierarchical structure of EHR data. Using the attention mechanism in the hierarchy of diagnosis code, HAM is able to retain diagnosis details and assign flexible attention weights to different diagnosis codes by their relevance to corresponding diseases. Moreover, the attention mechanism in the hierarchy of visit learns a comprehensive feature throughout the visit history by paying greater attention to visits with higher relevance. Based on the foundation laying by HAM, TAM uses a two-pathway structure to learn a robust temporal aggregation mechanism among all visits for LSAN. It extracts long-term dependencies by a Transformer encoder and short-term correlations by a parallel convolutional layer among different visits. With the construction of HAM and TAM, LSAN achieves the state-of-the-art performance on three real-world datasets with larger AUCs, recalls and F1 scores. Furthermore, the model analysis results demonstrate the effectiveness of the network construction with good interpretability and robustness of decision making by LSAN.