
Clustering Problems @ Apple

Designing and developing an application that makes machine learning outputs from a data set of seven million Apple support discussions easy to digest visually.

Apple Customer Support Communities

I spent the summer of 2015 in sunny Cupertino as an engineering intern at Apple. I worked closely with a machine learning scientist to understand the dataset, then designed and implemented a Swift application that used machine learning outputs to identify trending topics in a database of over a billion support discussion threads. My team managed the Apple Discussions site, where the public can search, pose questions, explore communities, and find solutions.


Role

Software Engineering Intern

Focus

Data visualization

Tools

Swift, Adobe Photoshop, MALLET

Company

Apple

Team

Discussions Customer Support

discussions.apple.com

Time

Summer 2015


A friend and I were Intern iContest finalists, one of only a few teams given the opportunity to present a new idea to Apple executives at the end of the summer. Our idea was a novel maps feature with an innovative approach to searching with clusters. It was quite fun to present our ideas and to see the features of the future for Apple!


Challenge 

Discussions.apple.com is a site where the public can search, pose questions, explore communities, and find solutions. The site has been up since 2000, and as of 2015 it had a corpus of over a billion discussions. Because of Apple’s massive growth, this data had not been used for anything but historical records. Apple, and particularly the community managers of the various support communities, needed a way to better understand their users in order to improve the customer experience.

Need - Optimize Taxonomy of Discussions sites

The categories and their respective possible connections.

How might we gain insight into the problems people have and what they are talking about on discussions.apple.com?


Process to Solution

I designed and developed a Swift iPad application for making sense of the vast amount of data on how users interact with the Apple Support Discussions site. The application makes managing the discussions site more efficient: community managers can see related topics or words, see which discussion threads are linked, and rename or reclassify topics.
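As a rough illustration of the kind of data model those three features imply, the sketch below links topics to their top words and discussion threads. It is written in Python for brevity rather than the Swift used in the app, and every name in it (`Topic`, `TopicIndex`, `related_topics`, `linked_threads`, `rename`) is hypothetical; it is not the application's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Topic:
    """One topic from a topic model (hypothetical structure)."""
    topic_id: int
    label: str                      # human-readable name, editable by a manager
    top_words: List[str]            # most characteristic words for the topic
    thread_ids: Set[str] = field(default_factory=set)  # threads assigned to it


class TopicIndex:
    """In-memory lookup over topics, their words, and their threads."""

    def __init__(self, topics: Iterable[Topic]) -> None:
        self.topics: Dict[int, Topic] = {t.topic_id: t for t in topics}

    def rename(self, topic_id: int, new_label: str) -> None:
        """Let a community manager give a topic a clearer name."""
        self.topics[topic_id].label = new_label

    def related_topics(self, topic_id: int, min_shared: int = 1) -> List[int]:
        """Rank other topics by how many top words they share with this one."""
        base = set(self.topics[topic_id].top_words)
        scored = []
        for other in self.topics.values():
            if other.topic_id == topic_id:
                continue
            shared = len(base & set(other.top_words))
            if shared >= min_shared:
                scored.append((shared, other.topic_id))
        # Most shared words first; ties broken by topic id for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [tid for _, tid in scored]

    def linked_threads(self, topic_id: int, other_id: int) -> Set[str]:
        """Threads that were assigned to both topics."""
        return self.topics[topic_id].thread_ids & self.topics[other_id].thread_ids
```

Ranking by shared top words is only one simple notion of "related"; a real system could just as well compare topic distributions per thread.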

Solution - Display Taxonomy Element Candidates from the Discussions data in a Digestible manner

One of my major challenges as an intern was to jump in and make the connections needed to understand the current workflow for managing the discussions site and how it could be improved. Below is the elaborate machine learning process we came up with; after some testing, it output the connections among topics, words, and documents in the form community managers could best use. The early functional prototype to the right demonstrates the basic functionality that was possible early on. We user tested this simple prototype with community managers to check that the application actually solved their challenges.

The first week of the summer was focused on learning the team's languages and technologies, including Xcode, Swift, Jive, and the MALLET machine learning toolkit.

The documents and images I can share are limited and go no further than the beginning stages of the functional prototype, because the later materials are owned by Apple.


Impact to Apple

The impact on Apple was tremendous. Not only had this data never been looked at in the aggregate, it had never been run through any machine learning algorithms. With machine learning applied to the data, community managers no longer have to read millions of messages by hand; instead, my application surfaces the needs of Apple customers for them.

  • Increase efficiency

  • Pull out users’ relevant needs in real time

    • (There was a two-week lag between events on the site and community managers learning about site trends)

  • Make informed decisions

  • Auto-tag relevant categories

  • Automatically fight spam by clustering it for deletion

  • See avenues of need for geo-expansion

  • Improve categories so they not only predict but also react to user needs
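To make the idea of spotting trends without a long lag concrete, here is a minimal sketch that compares how often each topic appears in a recent window of threads against a longer baseline. It is an illustration in Python, not the method used in the application; the function name `trending_topics`, the thresholds, and the smoothing are all assumptions.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def trending_topics(
    recent: Iterable[Hashable],
    baseline: Iterable[Hashable],
    min_count: int = 5,       # assumed threshold: ignore very rare topics
    min_ratio: float = 2.0,   # assumed threshold: share must at least double
) -> List[Tuple[Hashable, float]]:
    """Return (topic, ratio) pairs for topics over-represented in `recent`.

    `recent` and `baseline` each hold one topic label per thread.
    """
    recent_counts = Counter(recent)
    baseline_counts = Counter(baseline)
    recent_total = sum(recent_counts.values())
    baseline_total = sum(baseline_counts.values())
    if recent_total == 0:
        return []

    results = []
    for topic, count in recent_counts.items():
        if count < min_count:
            continue
        recent_share = count / recent_total
        # Add-one smoothing so a topic unseen in the baseline
        # does not cause a division by zero.
        baseline_share = (baseline_counts[topic] + 1) / (baseline_total + 1)
        ratio = recent_share / baseline_share
        if ratio >= min_ratio:
            results.append((topic, ratio))

    # Strongest trend first; ties broken by label for stable output.
    results.sort(key=lambda pair: (-pair[1], str(pair[0])))
    return results
```

Running a comparison like this on a short recent window, such as the last day or week of threads, is one way trends could reach community managers sooner than a two-week lag allows.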


For privacy, I have omitted and obfuscated confidential information in this case study. The information in this case study is my own and does not necessarily reflect the views of Apple.