20250701: Internship at Inspur Cloud Technology Company

Summary of Programme:

· Planned daily tasks in advance; held bi-weekly progress reviews with colleagues; incorporated user feedback to refine the output.
· Collaborated on proofreading and finalising the English version of the Inspur Enterprise Cloud brochure, ensuring the professionalism and accuracy of international promotional materials.
· Developed a multithreaded Python OCR processing system using the Baidu Intelligent Cloud API to convert PDF medical books into structured text data.
· Designed text post-processing algorithms to automatically identify chapter headers, extract disease names, and reconstruct semantic paragraphs; implemented automated formatting correction for common OCR issues, including mixed punctuation and hierarchical misalignment.
· Processed over 1,000 pages of professional content to generate high-quality data samples that meet AI training requirements.
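
The OCR and post-processing steps above can be illustrated with a minimal sketch. It is not the code written during the internship. It assumes the source books are in Chinese, that a chapter header looks like "第…章" or "Chapter N", and that "mixed punctuation" means ASCII marks appearing directly after Chinese characters. The names `ocr_pages`, `normalize_punctuation`, `is_chapter_header`, and the `recognize` callable are hypothetical; `recognize` stands in for whatever call is made to the Baidu Intelligent Cloud OCR service, which is not reproduced here.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Assumption: ASCII marks that OCR tends to mix into Chinese text,
# mapped to their full-width equivalents.
_ASCII_TO_FULLWIDTH = {",": "，", ";": "；", ":": "：", "?": "？", "!": "！"}

# Basic CJK Unified Ideographs range.
_CJK = r"\u4e00-\u9fff"

# Treat a mark as "mixed" only when it directly follows a CJK character,
# so English fragments such as "pH 7.4, normal" are left untouched.
_MIXED_PUNCT = re.compile(rf"(?<=[{_CJK}])([,;:?!])")

# Assumed header patterns: "第一章", "第12章", or "Chapter 3" at line start.
_CHAPTER = re.compile(r"^\s*(第[一二三四五六七八九十百零\d]+章|Chapter\s+\d+)")


def ocr_pages(
    pages: List[bytes],
    recognize: Callable[[bytes], str],
    max_workers: int = 4,
) -> List[str]:
    """Run a (hypothetical) OCR callable over page images in a thread pool.

    Threads suit this step because each call mostly waits on the network.
    Executor.map returns results in input order, so page order is preserved.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(recognize, pages))


def normalize_punctuation(text: str) -> str:
    """Replace ASCII punctuation that follows a CJK character."""
    return _MIXED_PUNCT.sub(lambda m: _ASCII_TO_FULLWIDTH[m.group(1)], text)


def is_chapter_header(line: str) -> bool:
    """Return True if the line matches the assumed chapter-header pattern."""
    return bool(_CHAPTER.match(line))


if __name__ == "__main__":
    # Stand-in recognizer: decodes bytes instead of calling a real OCR API.
    def fake_recognize(page: bytes) -> str:
        return page.decode("utf-8")

    raw_pages = ["第一章 总论".encode("utf-8"), "发热,咳嗽;乏力".encode("utf-8")]
    for text in ocr_pages(raw_pages, fake_recognize):
        cleaned = normalize_punctuation(text)
        print(is_chapter_header(cleaned), cleaned)
```

Passing the recognizer in as a parameter keeps the threading and clean-up logic testable without network access or API credentials.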

Screenshot for Document 1:

Screenshot for Document 2:

Material Downloads: