Summary of Programme:
· Cleaned and standardised product data using regular expressions and brand mapping tables, establishing a reliable data foundation for analysis.
· Applied the synthetic control method to match each test store with its best-fitting control stores, and used statistical significance testing to evaluate the impact of new shelf layouts on sales.
· Synthesised the analytical findings into strategic recommendation reports that helped the Category Manager make data-driven business decisions.
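The data-cleaning step in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code used in the programme: the brand list, the mapping table, the size pattern, and the function names `extract_size` and `standardise_brand` are all hypothetical, and the similarity cutoff of 0.8 is an arbitrary assumption.

```python
import re
from difflib import get_close_matches

# Hypothetical standard brand names and a hand-built "original -> standard"
# mapping table; both are illustrative assumptions, not real project data.
STANDARD_BRANDS = ["Coca-Cola", "Pepsi", "Nestle"]
BRAND_MAP = {"coke": "Coca-Cola", "coca cola": "Coca-Cola", "pepsi co": "Pepsi"}

# A number followed by a unit, e.g. "500ml", "1.5 L", "200 g".
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(ml|l|g|kg)\b", re.IGNORECASE)


def extract_size(name):
    """Return (amount, unit) parsed from a product name, or None if absent."""
    m = SIZE_RE.search(name)
    if m is None:
        return None
    return float(m.group(1)), m.group(2).lower()


def standardise_brand(raw, cutoff=0.8):
    """Map a raw brand string to its standard name, or None if unrecognised."""
    key = raw.strip().lower()
    # 1) Exact hit in the mapping table (abbreviations, known variants).
    if key in BRAND_MAP:
        return BRAND_MAP[key]
    # 2) Fall back to similarity matching to catch spelling errors.
    lookup = {b.lower(): b for b in STANDARD_BRANDS}
    close = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None
```

Values that come back as `None` are left for manual review rather than guessed, so the mapping table can be extended as new variants turn up.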
What I have learned:
· When text fields such as product names contain structured information (e.g., brand, specifications, model), use regular expressions and text parsing rules to extract key features such as brand and size accurately and in a consistent format.
· When the same business entity (such as a brand) appears in multiple variants such as abbreviations, full names, and misspellings, identify these variants through frequency analysis and similarity matching, then build an “original-standard” mapping table and apply it as a batch replacement to unify the data.
· When the actual effect of an intervention such as a marketing campaign or product update needs to be evaluated, use the synthetic control method to build a matched control group, and combine it with statistical significance testing to determine quantitatively whether the intervention was effective.
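The synthetic control idea in the last bullet can be sketched as below. This is an illustrative sketch under stated assumptions, not the method as implemented in the programme: it assumes sales are held in NumPy arrays with one row per period and one column per donor (control) store, it fits the donor weights by projected gradient descent, and it uses a placebo test as the significance check because the summary does not say which test was applied. All function names are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_weights(donors_pre, treated_pre, steps=2000):
    """Non-negative donor weights summing to 1 that minimise the
    pre-period squared error between the store and its synthetic control."""
    # Because the weights sum to 1, subtracting each period's donor mean from
    # both sides leaves the residual unchanged but improves conditioning.
    common = donors_pre.mean(axis=1, keepdims=True)
    d = donors_pre - common
    y = treated_pre - common[:, 0]
    n_donors = d.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)
    # Step size 1/L, with L the Lipschitz constant of the gradient.
    lr = 1.0 / (2.0 * np.linalg.norm(d, 2) ** 2 + 1e-12)
    for _ in range(steps):
        grad = 2.0 * d.T @ (d @ w - y)
        w = project_to_simplex(w - lr * grad)
    return w


def post_gap(series, donors, n_pre):
    """Mean post-period gap between a store and its synthetic control."""
    w = fit_weights(donors[:n_pre], series[:n_pre])
    return float(np.mean(series[n_pre:] - donors[n_pre:] @ w))


def placebo_p_value(treated, donors, n_pre):
    """Share of stores (treated store included) whose absolute post-period
    gap is at least as large as the treated store's."""
    observed = abs(post_gap(treated, donors, n_pre))
    gaps = [observed]
    for j in range(donors.shape[1]):
        # Pretend donor j received the new layout; fit it from the others.
        others = np.delete(donors, j, axis=1)
        gaps.append(abs(post_gap(donors[:, j], others, n_pre)))
    return observed, float(np.mean(np.array(gaps) >= observed))
```

Here `n_pre` is the number of periods before the change. With only a handful of donor stores the placebo p-value is coarse (it cannot fall below 1 divided by the number of stores), so in practice a larger donor pool or a different test may be more appropriate.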
Screenshot for Document 1:





Screenshot for Document 2:









Screenshot for Document 3:










