Abstract: The accumulation of protein sequence and structure data gives researchers access to a large amount of descriptive information; at the same time, it creates an urgent need to extract information from existing data efficiently and apply it to downstream tasks. Protein design enables the development of novel proteins that are no longer restricted by experimental conditions, which is of great significance for drug target prediction, drug discovery, and material design. As an efficient method for feature extraction, deep learning can be used to model protein data and, with the addition of a priori information, to design novel proteins. Protein design based on deep learning has therefore become a promising approach, despite many challenges. This review summarizes deep learning-based modeling and design methods for protein sequence and structure data, highlighting their strategies, principles, scope of application, and case studies, with the aim of providing a valuable reference for researchers in the field.