Tester's Digest

A weekly source of software testing news


ISSUE #70 - August 26, 2018

Application of machine learning / artificial intelligence techniques to the field of software testing is a hot area these days. Let’s try to look past the hype to specific practical uses of ML/AI in testing, and consider the promise and the limits of this approach.

Topic: AI For Testing

Deep theoretical analysis of how ML could be used to build testing tools that learn. The first post covers supervised/unsupervised vs. reinforcement learning and its reward model. The second goes deeper into the Markov decision process as applied to a “gridworld” of features and bugs. The author illustrates (with a Pac-Man example and working Python code) the limitations and reasonable expectations of AI-based testing. Worth a read!
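To make the gridworld idea concrete, here is a minimal value-iteration sketch in the same spirit: an agent learns how valuable each cell (feature) is based on how quickly it leads to reward (a bug found). The grid layout, rewards, and discount factor are all made up for illustration; this is not the post's actual code.

```python
# Minimal value-iteration sketch of a "gridworld" of features and bugs.
# Layout and rewards are hypothetical, chosen only for illustration.
GRID_W, GRID_H = 4, 3
BUG = (3, 2)       # reaching this cell "finds a bug": reward +1
WALL = (1, 1)      # an unreachable feature
GAMMA = 0.9        # discount factor: nearer bugs are worth more

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def step(state, action):
    """Deterministic transition: move, or stay put at walls/edges."""
    x, y = state[0] + action[0], state[1] + action[1]
    if not (0 <= x < GRID_W and 0 <= y < GRID_H) or (x, y) == WALL:
        return state
    return (x, y)

def value_iteration(sweeps=50):
    V = {(x, y): 0.0 for x in range(GRID_W) for y in range(GRID_H)}
    for _ in range(sweeps):
        for s in V:
            if s in (BUG, WALL):   # terminal / unreachable: value stays 0
                continue
            V[s] = max(
                (1.0 if step(s, a) == BUG else 0.0) + GAMMA * V[step(s, a)]
                for a in ACTIONS
            )
    return V

V = value_iteration()
```

After convergence, a cell adjacent to the bug has value 1.0, and value decays by the discount factor with each extra step away, which is exactly the "reward model" the posts discuss.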



Examples of practical applications of AI: visual testing by Applitools, ML-based test failure analysis at Dell, and “spidering” tools like Mabl that auto-generate test cases after learning your app, then run those tests by diffing their results against the learned expected output (Testim works similarly).
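The core "learn the expected output, then diff" loop behind such tools can be sketched in a few lines. The `learn_baseline`/`check` functions and the sample output here are hypothetical, not Mabl's or Testim's actual API:

```python
# Toy sketch of the "record a baseline, then diff later runs" idea.
# Function names and sample output are made up for illustration.
import difflib

def learn_baseline(render):
    """First run: record the app's output as the expected baseline."""
    return render()

def check(render, baseline):
    """Later runs: diff the current output against the learned baseline."""
    current = render()
    diff = list(difflib.unified_diff(
        baseline.splitlines(), current.splitlines(), lineterm=""))
    return (len(diff) == 0, diff)

baseline = learn_baseline(lambda: "Welcome\nLog in\n")
ok, diff = check(lambda: "Welcome\nSign in\n", baseline)
```

The interesting (and hard) part the real tools add is ML on top of this diff: deciding which differences are genuine regressions and which are benign frontend changes.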



Diffblue is an AI-based test case generator that combines reinforcement learning with solver-based search techniques so the ML system can go beyond its training set of example tests, generalizing to produce new unit tests from a given codebase (Java only, for now). Another AI-based tool, Microsoft’s Security Risk Detection, combines ML with constraint solving to increase test coverage through “whitebox fuzzing” of inputs.
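A toy fuzzing loop shows why constraint solving matters here. The target program and mutation strategy below are fabricated for illustration: random byte mutation easily keeps hitting the shallow branches but, as the test coverage shows, never reaches the “magic” branch guarded by a specific prefix. That gap is precisely what whitebox fuzzers close by solving path constraints from execution traces instead of mutating blindly.

```python
# Toy coverage-guided fuzzing loop (greatly simplified; hypothetical
# target program, not Microsoft's tool or its constraint solver).
import random

def program_under_test(data):
    """Hypothetical target: returns the set of branches an input reaches."""
    branches = set()
    if len(data) > 3:
        branches.add("long")
        if data.startswith(b"BUG"):
            branches.add("magic")   # hard-to-reach branch: needs 3 exact bytes
    else:
        branches.add("short")
    return branches

def fuzz(seed=b"AAAA", rounds=2000):
    rng = random.Random(0)
    corpus, seen = [seed], set()
    for _ in range(rounds):
        parent = rng.choice(corpus)
        i = rng.randrange(len(parent))
        child = parent[:i] + bytes([rng.randrange(256)]) + parent[i + 1:]
        cov = program_under_test(child)
        if cov - seen:              # keep inputs that reach new branches
            seen |= cov
            corpus.append(child)
    return seen

coverage = fuzz()
```

A constraint solver, in contrast, would read the `startswith(b"BUG")` condition off the execution trace and synthesize a satisfying input in one step.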


Newcomers like Test.AI promise AI-powered testing through the GUI that automatically identifies test cases for web page elements and user workflows, and can tolerate frontend changes. That’s a beautiful vision; let’s see if the well-funded startup can get there. Their older post (under the Appdiff name) explains how they trained an ML model to recognize app state labels from 300k labeled screenshots.
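In miniature, "recognize app states from labeled screenshots" is a supervised classification problem. The sketch below uses made-up three-number feature vectors and a nearest-neighbor classifier purely to show the shape of the problem; the real pipeline uses deep learning on 300k actual screenshots, not this toy:

```python
# Toy "app state" classifier (fabricated features and labels; the real
# system trains a deep model on hundreds of thousands of screenshots).
import math

# Pretend each screenshot has been reduced to a small feature vector,
# e.g. (button count, text density, image area fraction).
TRAINING = [
    ((1.0, 0.9, 0.1), "login"),
    ((0.9, 0.8, 0.2), "login"),
    ((5.0, 0.2, 0.7), "product_grid"),
    ((4.5, 0.3, 0.8), "product_grid"),
]

def classify(features):
    """1-nearest-neighbor over the labeled examples."""
    _, label = min(TRAINING, key=lambda ex: math.dist(ex[0], features))
    return label

state = classify((0.8, 0.85, 0.15))
```

Once the tool can name the state a screenshot represents, tolerating frontend changes follows naturally: a moved button still lands in the same state, so the test still knows where it is.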



A gaming company uses AI to proactively detect bugs as developers write code.


Netflix uses the data science technique of predictive modeling to locate content that’s more likely to be buggy (poor recording, audio/video mismatch, subtitle snafus) so it can be prioritized for testing by humans.
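The general shape of such a prioritization model can be sketched as a tiny logistic regression: learn weights from titles whose defect status is known, then score new titles and send the riskiest ones to human QC first. The features, labels, and weights below are entirely fabricated; Netflix’s actual models and features are not public.

```python
# Toy predictive model for test prioritization (all data fabricated).
# Features per title: (normalized encode count, vendor risk score).
import math

DATA = [  # (features, had_defect)
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.2), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.5, epochs=500):
    """Fit logistic-regression weights by stochastic gradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y  # log-loss gradient
            w[0] -= lr * err * x1
            w[1] -= lr * err * x2
            b -= lr * err
    return w, b

w, b = train(DATA)

def defect_risk(features):
    """Score a new title; high scores get routed to human QC first."""
    return sigmoid(w[0] * features[0] + w[1] * features[1] + b)
```

The payoff is the ranking, not the absolute probability: with limited human testers, reviewing titles in descending `defect_risk` order catches more defects per hour than reviewing at random.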



My beloved crowdsourced testing service, RainforestQA, hasn’t written a recent public post about the details of the ML powering their framework; they have, however, written about the deficiencies in the QA process on the Death Star.


If you received this email directly, then you’re already signed up; thanks! If this newsletter issue was forwarded to you and you’d like to get one weekly, you can subscribe at http://testersdigest.mehras.net

If you come across content worth sharing, please send me a link at testersdigest@mehras.net