Problem
Azure at Microsoft has developed several diagnostic tools that help developers troubleshoot their App Service apps hosted in the cloud. App crashes usually happen because of a deployment error or an unexpected API change, and they can affect thousands of users. In the past, Azure could detect app crashes for users, but it couldn’t provide the information IT admins needed to troubleshoot them. To address that problem, we designed a new developer tool, the “crash monitoring tool”, which actively monitors app crashes, collects crash data and call stacks, and provides insights that help IT admins troubleshoot crashes.
My role
UX designer
Who I worked with
UX researcher
Engineers
PM
Tools I used
Figma
Design Process
Persona
To learn more about the users, I reached out to researchers and read previous research reports on how IT administrators use Azure to troubleshoot and what their current pain points are. Building on that, I created a persona to guide my later design work.
Scenarios & User flow
Joe is an IT admin at company X who monitors the health of apps hosted in Azure. He notices a web app down event. Joe starts troubleshooting the issue in App Service, and the checks show that the app crashed in the last 24 hours. He wants to find out what caused the crashes and fix the problem.
Ideation
I started with three design questions covering the main user flow and sketched as many drafts as possible to diverge.
Design Question 1: How to guide users to enable and configure the tool?
Design Question 2: How to show collected data in “Analyze”?
Design Question 3: How to let users assign a storage account or create a new one?
Wireframe
Usability Testing & Major Iterations
I ran one round of usability testing on the early low-fi prototype with three internal users. Here are the major findings and iterations.
1. Users want more granularity in configuring the duration.
User quote: “I want to have more controls in the time duration.”
Iteration: Use “End time” instead of “Max hours”.
2. Users don’t know what stage the data collection is in. Users also want to see their configuration after monitoring starts.
User quote: “I don’t know whether the data monitoring session is done.”
Iteration: Add icons that clearly communicate the current stage of monitoring. Use subtext in the accordion to show configuration info.
3. Users want to know where to find past data after they start a new monitoring session.
User quote: “Once I click on the restart a monitor, where can I find the historical data?”
Iteration: Use an accordion and add a “view history” section to hold past data.
Hi-fi Deliverables
1. Configure the tool
Before data collection, users configure the tool by telling the system where, when, and how much data they want to collect.
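To make the "where, when, and how much" idea concrete, here is a minimal sketch of how such a configuration might be modeled and checked before monitoring starts. Everything in it is an assumption for illustration: the `MonitorConfig` name, its fields, and the validation messages are hypothetical and are not the tool's actual data model. The `end_time` field mirrors the "End time" iteration from usability testing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MonitorConfig:
    """Hypothetical configuration for one monitoring session."""

    storage_account: str  # where: storage account that receives crash data
    start_time: datetime  # when: monitoring begins
    end_time: datetime    # when: monitoring stops (instead of "Max hours")
    max_dumps: int        # how much: assumed cap on collected crash dumps

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means valid."""
        problems = []
        if not self.storage_account:
            problems.append("Select or create a storage account.")
        if self.end_time <= self.start_time:
            problems.append("End time must be after start time.")
        if self.max_dumps < 1:
            problems.append("Collect at least one crash dump.")
        return problems
```

Returning every problem at once, rather than stopping at the first, fits a form where users fix several fields before clicking “start monitor”.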
2. View collected data
After users click the “start monitor” button, the tool starts collecting data. If crashes happen, users see grouped insights on the detected crashes in real time. Users can expand each insight to see crash details and view call stack information to troubleshoot.
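One way to picture a "grouped insight" is as a bucket of crashes that share the same signature. The sketch below assumes a signature made of the exception type plus the top call stack frame. This grouping rule, the `CrashEvent` type, and the `group_crashes` function are all hypothetical and only illustrate the concept, not how the tool groups crashes internally.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashEvent:
    """Hypothetical record of a single detected crash."""

    timestamp: str               # ISO 8601 time of the crash
    exception_type: str          # e.g. "NullReferenceException"
    call_stack: tuple[str, ...]  # frames, innermost (top) frame first


def group_crashes(events):
    """Group crashes that share an exception type and top stack frame."""
    groups = defaultdict(list)
    for event in events:
        top_frame = event.call_stack[0] if event.call_stack else "<unknown>"
        groups[(event.exception_type, top_frame)].append(event)
    # Largest group first, so the most frequent crash leads the list.
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
```

Each returned pair holds a signature and its crashes, which maps onto the expandable insight: the signature is the collapsed row, and the crash list with call stacks is the expanded detail.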
3. Check data history
If users want to compare the current crash data with historical data to see if there is a recurring pattern, they can click “view history” to see a list of crash data from the past 15 days.
Impact
The tool was released in August 2020 and is actively used by thousands of Azure users.
Reflection
1. Receive feedback and always iterate.
I presented my work at internal design critique meetings throughout the design phase to get feedback on the patterns and design language we use. I reached out to the PM to find users who could validate my design decisions. I also welcomed developers’ feedback on technical constraints, and we worked out feasible solutions together. Receiving feedback and iterating helped the project move forward and ensured a great user experience.
2. Articulating design decisions is important.
Across many presentations, design critiques with other designers, and sync-up meetings with stakeholders, I found that how I communicate my design to others is critical. To communicate efficiently, I need to prepare well before each meeting, think ahead about potential pushback, have all my design explorations and iterations at hand, and be ready to listen and respond effectively to other people’s perspectives.
3. Involve engineers early to build technical understanding.
Since Azure is very technical, I proactively reached out to the developers to ask questions and learn the big picture before diving into the details. Their demonstrations and explanations of the technical terms were great, intuitive resources to learn from. They helped me put myself in their shoes, speak their language, and design better products for them.