
Domain-Aware UI/UX Reviewer Agent: Custom Tools with Retry and Fallback

Notebook by Mayank Laddha

In this notebook, we'll build an agent to analyze a web page using both desktop and mobile views. We define simple, optional tools to obtain readability scores, HTML structure, and performance-related data as needed.

Additionally, we'll extend the Tool class to implement retry and fallback mechanisms for tools.

Our stack:

- Haystack agentic framework: to build our agent
- Playwright: to take screenshots of the website
- Beautiful Soup: to scrape HTML text content
- Textstat: to calculate text readability

Setting up the Environment

First, we'll install all the necessary dependencies:

[ ]

Add credentials for the OpenAI API:

[2]

Enter OpenAI API key:··········
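The prompt above hides the key while it is typed. A minimal sketch of that pattern, assuming the key is stored in the `OPENAI_API_KEY` environment variable; `ensure_api_key` is a hypothetical helper name, not something from the notebook:

```python
import os
from getpass import getpass


def ensure_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, prompting only if it is missing."""
    if not os.environ.get(var_name):
        # getpass does not echo the typed characters back to the screen.
        os.environ[var_name] = getpass("Enter OpenAI API key:")
    return os.environ[var_name]
```

Calling `ensure_api_key()` once at the top of a session avoids re-prompting when the cell is run again.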

[3]

Defining Tools and Utility Functions

Let's define a function that captures desktop and mobile screenshots along with the HTML using Playwright. We use Playwright's asynchronous (async) API because the synchronous API can have issues in the Google Colab environment. We also define a helper function that returns the root URL of a page.
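As a rough sketch of what these two helpers might look like (the names `get_root_url` and `capture_page`, the viewport sizes, and the output file names are assumptions for illustration, not the notebook's actual code):

```python
from urllib.parse import urlparse


def get_root_url(url: str) -> str:
    """Return the scheme and host, e.g. https://example.com/a/b -> https://example.com."""
    parsed = urlparse(url)
    return f"{parsed.scheme}://{parsed.netloc}"


async def capture_page(url: str, prefix: str = "shot") -> str:
    """Save desktop and mobile screenshots and return the page HTML."""
    # Imported lazily so get_root_url works even without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()

        # Desktop view (viewport size is an arbitrary choice).
        desktop = await browser.new_context(viewport={"width": 1280, "height": 800})
        page = await desktop.new_page()
        await page.goto(url, wait_until="networkidle")
        await page.screenshot(path=f"{prefix}_desktop.png", full_page=True)
        html = await page.content()

        # Mobile view (phone-like viewport, also an arbitrary choice).
        mobile = await browser.new_context(
            viewport={"width": 390, "height": 844}, is_mobile=True
        )
        page = await mobile.new_page()
        await page.goto(url, wait_until="networkidle")
        await page.screenshot(path=f"{prefix}_mobile.png", full_page=True)

        await browser.close()
    return html
```

In a notebook, `html = await capture_page("https://example.com")` would run it directly, provided the Chromium browser has been installed for Playwright.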

[4]

Let's now define the tool functionalities:

- html_analyzer: gets the HTML structure
- get_readability_score: gets a readability score using textstat
- get_performance: intentionally forced to fail so we can see whether retry and fallback work
- get_performance_fallback: fallback functionality for the get_performance tool
- image_weight_analyzer: gets the image count along with the size of the largest image
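To give a feel for the kind of structural summary such a tool can return, here is a simplified sketch. It uses Python's standard `html.parser` instead of Beautiful Soup so it runs without extra dependencies, and it covers only some of the fields; `StructureCounter` and `analyze_html` are hypothetical names, and the notebook's real implementation may differ.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StructureCounter(HTMLParser):
    """Count tags and track the deepest nesting level seen."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1
        if tag not in VOID_TAGS:
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def analyze_html(html: str) -> dict:
    """Summarize sections, headings, media, and complexity of an HTML string."""
    parser = StructureCounter()
    parser.feed(html)
    c = parser.counts
    total = sum(c.values()) or 1  # avoid division by zero on empty input
    return {
        "sections": {t: c.get(t, 0) > 0 for t in ("header", "nav", "main", "footer")},
        "headings": {f"{h}_count": c.get(h, 0) for h in ("h1", "h2", "h3")},
        "media": {"images": c.get("img", 0), "videos": c.get("video", 0)},
        "complexity": {
            "max_dom_depth": parser.max_depth,
            "div_ratio": round(c.get("div", 0) / total, 2),
        },
    }
```

Returning a small dictionary like this keeps the tool result compact, which matters because the result is passed back to the model as text.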

[5]

Creating a Custom Tool with Retry and Fallback Mechanism

This CustomTool extends Haystack’s Tool class to add retry logic and a fallback mechanism.

- Tools can fail for many practical reasons (unstable networks, unreliable APIs).
- Fallback functions often do not share the exact same signature as the main tool.
- We want fallbacks to:
  - Reuse tool inputs when possible
  - Accept a subset of parameters or no parameters
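The requirements above can be sketched independently of Haystack as a plain function wrapper. This is an illustration of the idea, not the notebook's CustomTool: `with_retry_and_fallback`, `attempts`, and `delay` are hypothetical names, and the real class extends `Tool` instead of wrapping a function.

```python
import inspect
import logging
import time

logger = logging.getLogger(__name__)


def with_retry_and_fallback(func, fallback=None, attempts=2, delay=0.0):
    """Wrap func so it is retried, then replaced by fallback if it keeps failing."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def wrapper(**kwargs):
        last_error = None
        for attempt in range(1, attempts + 1):
            try:
                return func(**kwargs)
            except Exception as exc:  # any failure triggers a retry
                last_error = exc
                if attempt < attempts:
                    time.sleep(delay)

        if fallback is None:
            raise last_error

        logger.warning(
            "`%s` failed after %d attempts. Using fallback function. Error: %s",
            func.__name__, attempts, last_error,
        )
        params = inspect.signature(fallback).parameters
        # A fallback declaring **kwargs can take every original input.
        if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
            return fallback(**kwargs)
        # Otherwise pass only the inputs the fallback actually declares,
        # which may be a subset of the tool inputs or nothing at all.
        accepted = {k: v for k, v in kwargs.items() if k in params}
        return fallback(**accepted)

    return wrapper
```

Filtering the keyword arguments through `inspect.signature` is what lets a fallback accept a subset of the tool's parameters, or none, without raising a `TypeError`.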

[6]

Creating the UI/UX Reviewer Agent

Let's define our workflow:

- Get desktop and mobile screenshots along with the HTML using Playwright, through the function we defined earlier.
- Use the Tool class to convert the functions into tools. Note that we set a retry count and a fallback function for the CustomTool responsible for fetching performance-related data.
- Define the agent. We use gpt-4o-mini for testing.

We keep max_agent_steps = 5 but feel free to increase it.

As output, we'll receive feedback on:

- Content (how suitable and effective the content is for the given domain)
- Layout (layout quality, CTAs, and ease of navigation)
- Visuals and Performance
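A condensed sketch of how the workflow might be wired together, assuming Haystack 2.x is installed and `OPENAI_API_KEY` is set. `URL_PARAMETERS`, `build_agent`, and the system prompt text are illustrative assumptions, and this version uses the plain `Tool` class throughout, not the CustomTool with retry and fallback:

```python
# JSON schema shared by every tool here: each takes a single `url` string.
URL_PARAMETERS = {
    "type": "object",
    "properties": {
        "url": {"type": "string", "description": "URL of the page to review"}
    },
    "required": ["url"],
}


def build_agent(tool_functions: dict):
    """Build an agent from a mapping of tool name -> function taking `url`."""
    # Imported lazily so the schema above can be inspected without Haystack.
    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.components.generators.utils import print_streaming_chunk
    from haystack.tools import Tool

    tools = [
        Tool(
            name=name,
            description=fn.__doc__ or name,
            parameters=URL_PARAMETERS,
            function=fn,
        )
        for name, fn in tool_functions.items()
    ]
    return Agent(
        chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
        tools=tools,
        # Placeholder prompt; the notebook's actual prompt is not shown here.
        system_prompt="You are a UI/UX reviewer. Give feedback on content, "
                      "layout, visuals, and performance.",
        max_agent_steps=5,
        streaming_callback=print_streaming_chunk,
    )
```

Raising `max_agent_steps` gives the agent more room to call tools before it must answer, at the cost of extra model calls.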

[7]

Testing the Agent

We can await directly in Colab because it already has an active asyncio event loop running in the background, so there is no need to worry about the 'await only allowed within async function' error. This step will take some time ⏳
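Outside a notebook, for example in a plain Python script, no event loop is running yet, so the coroutine has to be started explicitly. A small sketch of the difference, where `review` is a hypothetical stand-in for the real agent call:

```python
import asyncio


async def review(url: str) -> str:
    """Stand-in for the real agent call; it only simulates awaiting work."""
    await asyncio.sleep(0)
    return f"reviewed {url}"


def run_review(url: str) -> str:
    """Run the coroutine from synchronous code, where no loop is running yet."""
    return asyncio.run(review(url))
```

In Colab or Jupyter, `await review(url)` works as-is, while `asyncio.run` would raise a `RuntimeError` there because a loop is already running.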

[9]

[TOOL CALL]
Tool: html_analyzer 
Arguments: {"url": "https://visitors.now"}

[TOOL CALL]
Tool: get_readability_score 
Arguments: {"url": "https://visitors.now"}

[TOOL CALL]
Tool: image_weight_analyzer 
Arguments: {"url": "https://visitors.now"}

[TOOL CALL]
Tool: performance 
Arguments: {"url": "https://visitors.now"}

[TOOL RESULT]
{"sections": {"header": true, "nav": true, "main": false, "footer": true}, "headings": {"h1_count": 1, "h2_count": 12, "h3_count": 18}, "ctas": {"buttons": 37, "links": 36, "primary_cta_present": true}, "forms": {"form_count": 0, "input_count": 0}, "media": {"images": 5, "videos": 0}, "complexity": {"max_dom_depth": 24, "div_ratio": 0.47}}

WARNING:__main__:
Tool `performance` failed after 2 attempts. Using fallback function. Error: HTTPSConnectionPool(host='www.invalid-url-for-testing.com', port=443): Max retries exceeded with url: /?url=https%3A%2F%2Fvisitors.now&category=performance (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x79ade0e40a70>: Failed to resolve 'www.invalid-url-for-testing.com' ([Errno -2] Name or service not known)"))

[TOOL RESULT]
Flesch-Kincaid Grade: 10.215740740740742
(This is a grade formula in that a score of 9.3 means that a ninth grader would be able to read the document.)

[TOOL RESULT]
{"image_count": 5, "largest_image_pixels(w * h)": 5184000}

[TOOL RESULT]
{"request_duration_ms": 138.15099999999998, "html_kb": 159.4765625}

[ASSISTANT]
# UI/UX Analysis of Visitors.now

## Domain Overview
Based on the URL and content, it appears that this website is focused on visitor analytics or tracking. This domain typically requires clear, concise information and effective data visualization to help users understand their visitor metrics.

## Content
### Positive Feedback
- **Clarity**: The content is generally clear and informative, which is essential for users looking to understand analytics.
- **Headings**: The use of multiple headings (1 H1, 12 H2s, and 18 H3s) helps in organizing information, making it easier for users to scan through.

### Negative Feedback
- **Readability**: The Flesch-Kincaid Grade level is around 10, which might be a bit high for some users. Simplifying the language could make it more accessible.
- **Content Density**: There’s a lot of information presented, which could overwhelm users. Consider breaking it down into smaller, digestible sections.

## Layout
### Positive Feedback
- **CTAs**: There are numerous CTAs (37 buttons and 36 links), which provide ample opportunities for user engagement.
- **Navigation**: The presence of a header and navigation elements suggests a structured layout, aiding in user navigation.

### Negative Feedback
- **Overwhelming CTAs**: While having many CTAs is good, it can also lead to decision fatigue. Prioritize key actions and reduce the number of less important links.
- **Visual Hierarchy**: The layout could benefit from a clearer visual hierarchy to guide users through the content more intuitively.

## Visuals
### Positive Feedback
- **Images**: The website includes 5 images, which can enhance understanding and engagement.
- **Responsive Design**: The mobile view appears to be well-structured, maintaining accessibility across devices.

### Negative Feedback
- **Visual Quality**: Ensure that images are optimized for faster loading times without sacrificing quality. The largest image size is quite large (5184000 pixels), which could slow down performance.
- **Consistency**: The design elements should be consistent throughout the site to create a cohesive experience.

## Performance
### Positive Feedback
- **Request Duration**: The request duration is relatively quick (138 ms), which is good for user experience.

### Negative Feedback
- **HTML Size**: The HTML size is about 159 KB, which is manageable but could be optimized further for faster loading, especially on mobile devices.

## Actionable Improvements
1. **Simplify Language**: Lower the readability grade by using simpler terms and shorter sentences.
2. **Prioritize CTAs**: Reduce the number of CTAs and focus on the most important actions to avoid overwhelming users.
3. **Enhance Visual Hierarchy**: Use size, color, and spacing to create a clearer visual hierarchy, guiding users through the content.
4. **Optimize Images**: Compress images to improve loading times without losing quality.
5. **Test Performance**: Regularly monitor performance metrics to ensure the site remains fast and responsive.

By addressing these areas, the website can enhance user experience, making it more effective for its intended audience.

By enabling streaming with streaming_callback=print_streaming_chunk, we can observe tool calls, tool outputs, and the agent's final response in real time.

Let's check that the screenshots were captured correctly.

[10]

Rendering the Feedback

Render the agent's feedback as Markdown.

[ ]

Inspecting the Tool Calls and Results

[11]

[[ToolCall(tool_name='html_analyzer',
           arguments={'url': 'https://visitors.now'},
           id='call_K0TwgDTAHvVhnBrQmqDc0z2w',
           extra=None),
  ToolCall(tool_name='get_readability_score',
           arguments={'url': 'https://visitors.now'},
           id='call_WOpxYRxtj0GwbzVc8q0zFNK9',
           extra=None),
  ToolCall(tool_name='image_weight_analyzer',
           arguments={'url': 'https://visitors.now'},
           id='call_UtppthsQewEYC99o0XdH26gn',
           extra=None),
  ToolCall(tool_name='performance',
           arguments={'url': 'https://visitors.now'},
           id='call_cN22KmIGSfWhdHRTAREmx98B',
           extra=None)]]

[12]

[[ToolCallResult(result='{"sections": {"header": true, "nav": true, "main": '
                        'false, "footer": true}, "headings": {"h1_count": 1, '
                        '"h2_count": 12, "h3_count": 18}, "ctas": {"buttons": '
                        '37, "links": 36, "primary_cta_present": true}, '
                        '"forms": {"form_count": 0, "input_count": 0}, '
                        '"media": {"images": 5, "videos": 0}, "complexity": '
                        '{"max_dom_depth": 24, "div_ratio": 0.47}}',
                 origin=ToolCall(tool_name='html_analyzer',
                                 arguments={'url': 'https://visitors.now'},
                                 id='call_K0TwgDTAHvVhnBrQmqDc0z2w',
                                 extra=None),
                 error=False)],
 [ToolCallResult(result='Flesch-Kincaid Grade: 10.215740740740742\n'
                        '(This is a grade formula in that a score of 9.3 means '
                        'that a ninth grader would be able to read the '
                        'document.)',
                 origin=ToolCall(tool_name='get_readability_score',
                                 arguments={'url': 'https://visitors.now'},
                                 id='call_WOpxYRxtj0GwbzVc8q0zFNK9',
                                 extra=None),
                 error=False)],
 [ToolCallResult(result='{"image_count": 5, "largest_image_pixels(w * h)": '
                        '5184000}',
                 origin=ToolCall(tool_name='image_weight_analyzer',
                                 arguments={'url': 'https://visitors.now'},
                                 id='call_UtppthsQewEYC99o0XdH26gn',
                                 extra=None),
                 error=False)],
 [ToolCallResult(result='{"request_duration_ms": 138.15099999999998, '
                        '"html_kb": 159.4765625}',
                 origin=ToolCall(tool_name='performance',
                                 arguments={'url': 'https://visitors.now'},
                                 id='call_cN22KmIGSfWhdHRTAREmx98B',
                                 extra=None),
                 error=False)]]

The last ToolCallResult shows that the agent originally called the performance tool. Because we forced that tool to fail, its input was passed to the fallback function get_performance_fallback, which returned the "request_duration_ms" and "html_kb" values instead.
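The two fields in that result could be produced by timing a single fetch and measuring the response size. The sketch below is a guess at the general shape, not the notebook's fallback: `measure_fetch` is a hypothetical name, and the fetch function is passed in so the example runs without network access. The reported `html_kb` value is consistent with a byte count divided by 1024, but that is an inference.

```python
import time
from typing import Callable


def measure_fetch(fetch: Callable[[str], bytes], url: str) -> dict:
    """Time one fetch of `url` and report the body size in kilobytes."""
    start = time.perf_counter()
    body = fetch(url)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {"request_duration_ms": elapsed_ms, "html_kb": len(body) / 1024}
```

In real use, `fetch` could be something like `lambda u: requests.get(u, timeout=10).content`, assuming the `requests` library is available.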

That's it! 🎉 If you want to test this with a UI, check out the Domain Aware UI/UX Reviewer Agent demo