Summary of Programme:
In the Lianjia Rental Data Analysis Project:
· By studying source code, mastered the use of the Requests library to fetch API data and HTML pages, and learned to extract key information from the responses with regular expressions.
· Learned non-relational data storage and management methods based on MongoDB, and mastered data deduplication techniques by setting unique indexes.
· Learned the complete workflow for data cleaning and outlier handling using Pandas and NumPy.
· Learned to create statistical charts with Matplotlib and Seaborn, and mastered PyEcharts to build interactive visualizations.
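As an illustration of the regular-expression extraction step mentioned in the list above, here is a minimal sketch using only the Python standard library. The HTML fragment, the class names (`title`, `price`, `area`) and the function `extract_listings` are all hypothetical; the real Lianjia page markup is not given in this summary and will differ.

```python
import re

# Hypothetical HTML fragment standing in for a downloaded listing page.
# The real page structure is an assumption and is not taken from the project.
SAMPLE_HTML = """
<div class="listing"><span class="title">Two-bedroom near metro</span><span class="price">5200</span> RMB/month <span class="area">68.5</span> m2</div>
<div class="listing"><span class="title">Studio, city centre</span><span class="price">3100</span> RMB/month <span class="area">32</span> m2</div>
"""

# Named groups keep the extraction readable; "." does not match newlines,
# so each match stays within a single listing line.
LISTING_RE = re.compile(
    r'<span class="title">(?P<title>[^<]+)</span>'
    r'<span class="price">(?P<price>\d+)</span>'
    r'.*?<span class="area">(?P<area>\d+(?:\.\d+)?)</span>'
)


def extract_listings(html):
    """Return one dict per listing found in the HTML text."""
    return [
        {
            "title": m.group("title").strip(),
            "price": int(m.group("price")),    # monthly rent in RMB
            "area": float(m.group("area")),    # floor area in square metres
        }
        for m in LISTING_RE.finditer(html)
    ]


listings = extract_listings(SAMPLE_HTML)
for item in listings:
    print(item)
```

In a real crawler the text passed to `extract_listings` would come from an HTTP response body rather than a hard-coded string, and a pattern this strict would need adjusting whenever the page layout changes.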
In the Used Car Price Prediction Project:
· Managed a complex workflow; adapted based on feedback; delivered high-quality results under deadline.
· Applied data processing methods learned from the previous project to perform Box-Cox transformation and systematic cleaning on a training set of 150,000 records.
· Identified highly correlated variables using VIF and correlation analysis, and adopted a grouped modeling strategy to build a multiple linear regression model.
· Generated predictions for a test set of 50,000 vehicle records, keeping the model’s Mean Absolute Error (MAE) within 5,000 RMB.
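Two of the techniques named in the list above, the Box-Cox transformation and the MAE metric, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's code: the prices are made-up values, the transformation is applied to the price with a fixed λ of 0.5 chosen arbitrarily, and the function names are hypothetical. In practice λ is usually estimated from the data, for example with `scipy.stats.boxcox`.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a single positive value for a given lambda."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1) ** (1 / lam)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up prices in RMB, for illustration only.
actual_prices = [30000.0, 52000.0, 81000.0]
predicted_prices = [28000.0, 55000.0, 80000.0]

LAMBDA = 0.5  # assumed value, not the lambda used in the project

# Round trip: transform, then invert, to confirm the two functions agree.
restored = [inverse_box_cox(box_cox(p, LAMBDA), LAMBDA) for p in actual_prices]

mae = mean_absolute_error(actual_prices, predicted_prices)
print(f"MAE: {mae:.1f} RMB")
```

If a model is trained on a transformed target, predictions need to be mapped back with the inverse transform before the MAE is computed, so that the error is expressed in RMB rather than in transformed units.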