Machines are learning to see. In computer vision, AI is making leaps in image recognition and powering advances such as the cashier-less retail experience. On Episode 21 of AI at Work, we spoke with Matt Scott - CTO and co-founder at Malong Technologies - about product AI applications, the evolution of computer vision, an innovative and highly effective new approach to supervised learning, and why staying at the forefront of AI means cross-company collaboration.
Founded in 2014, Malong Technologies is a computer vision company that focuses on retail and industrial areas where product recognition can come into play. Matt defines product AI as “a core technology for recognizing products without barcodes in videos or images.” Smart retail, he says, “takes this type of technology into the offline shopping experience to enable unmanned shopping or to help prevent theft by double-checking the checkout experience.”
In addition to smart retail, Malong Technologies applies this technology to smart cabinets - in essence AI refrigerators - which charge a person, via the popular Chinese WeChat platform, for whatever is taken out of them. The company also applies product recognition to quality and defect detection in industrial use cases.
The evolution of computer vision
Matt Scott has watched the field of computer vision evolve over the past 15 years. We asked him - what’s changed? Mainly, the techniques involved in feature recognition have evolved. Features, as Matt defines them, are essentially what “the computer is looking at to describe an image so that you can apply additional machine learning techniques to.”
Prior to 2012, traditional computer vision techniques were based on hand-crafted features: a human being would interpret an image and represent it numerically as a feature vector, building “rules and if/else rules to pick out something interesting or differentiating in the image,” Matt explains. Algorithms could then be applied to those features.
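To make the idea concrete, here is a minimal sketch of a hand-crafted pipeline, assuming a tiny grayscale image stored as a list of rows. The two features (`mean_brightness`, `edge_strength`), the thresholds, and the labels are hypothetical and chosen only for illustration; they do not come from Malong or from any particular library.

```python
# Hypothetical hand-crafted features for a tiny grayscale image (values 0-255).

def mean_brightness(image):
    """Average pixel value over the whole image."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def edge_strength(image):
    """Average absolute difference between horizontally adjacent pixels."""
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def extract_features(image):
    """Turn an image into a feature vector whose contents a human chose."""
    return [mean_brightness(image), edge_strength(image)]


def classify(image, edge_threshold=50.0, bright_threshold=128.0):
    """Hand-written if/else rules applied to the feature vector."""
    brightness, edges = extract_features(image)
    if edges > edge_threshold:
        return "striped"
    elif brightness > bright_threshold:
        return "plain light"
    else:
        return "plain dark"


striped = [[0, 255, 0, 255]] * 4   # alternating dark and light columns
dark = [[10, 12, 11, 10]] * 4      # nearly uniform dark patch
print(classify(striped))  # striped
print(classify(dark))     # plain dark
```

Every number in this sketch was picked by a person, which is the limitation of the approach: each new kind of image needs new features and new rules.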
He describes a revolution that came in 2012, when it was decided “let's not handcraft these features. Let's use a machine to learn the features. Then, at the same time, let's also build models on top of those features. Build end-to-end networks where we're extracting features and learning from those features.” In essence, with modern techniques, “instead of writing code to operate on images, we create a machine that writes its own code.”
This type of deep learning has been revolutionary for performance. Matt notes that there is often a lot of hype around AI, but that there is also something “very real here, which is the performance of these models on benchmarks. We were able to see before 2012, when we were using traditional techniques, and now, after 2012, where we're using deep learning techniques, a massive difference in the benchmarks. We've, essentially, approached human-level performance on a number of computer vision benchmarks and, in some cases, that they even exceeded human-level performance, reaching superhuman-level performance.”
Weakly supervised learning: a leap forward in scalable algorithm training with real world data
A significant challenge in the field of computer vision is that the datasets used to train AI algorithms are notoriously difficult to put together and the resulting trained algorithms are not always able to work on data that’s noisy and raw.
For instance, the ImageNet dataset, which is standard for computer vision, took 2 years and 50,000 people to label, curate, and balance. Such datasets are not necessarily representative of the real-world computer vision challenges that many companies want to solve. They may be good for benchmarking algorithms, but that level of manual effort does not scale to every application that needs one.
Using deep learning, Malong Technologies found a way to train algorithms on representative, real-world data, an approach they’ve dubbed weakly supervised learning. Matt defines weakly supervised learning as “the ability for the machine to learn from noisy data in such a way that it's as performant as learning from clean data. Using data naturally, as it is chaotic and random and sometimes wrong or imbalanced. There's a whole set of techniques in this field of learning strategy.”
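As a toy illustration of why learning from noisy labels is possible at all, the sketch below trains a nearest-centroid classifier on labels where one in five is deliberately wrong. This is not Malong's method, and it is far simpler than any deep learning technique; the data, the 20% noise rate, and the function names are all assumptions made for the example. It only shows that averaging over many examples can tolerate a minority of bad labels.

```python
# Illustrative only: a nearest-centroid classifier trained on noisy labels.

def train_centroids(points, labels):
    """Average the points that carry each label (wrong labels included)."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


# Hypothetical 1-D feature values: class 0 clusters near 0, class 1 near 10.
points = [i * 0.1 for i in range(20)] + [10 + i * 0.1 for i in range(20)]
clean_labels = [0] * 20 + [1] * 20

# Simulate label noise by flipping every fifth label (20% are wrong).
noisy_labels = [1 - y if i % 5 == 0 else y for i, y in enumerate(clean_labels)]

# Train on the noisy labels, then score against the clean ones.
centroids = train_centroids(points, noisy_labels)
predictions = [predict(centroids, x) for x in points]
accuracy = sum(p == y for p, y in zip(predictions, clean_labels)) / len(points)
print(accuracy)  # 1.0
```

The wrong labels pull each centroid toward the other cluster, but not far enough to change any prediction. Real image data is much less forgiving, which is why dedicated techniques are needed there.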
Becoming a leader in AI means sharing your knowledge
Scott emphasizes that AI is different from past technical revolutions, where advances in technology were jealously guarded and siloed. Speaking about Malong’s leap forward with weakly supervised learning, he says that “we really want deep learning to work and land for everyone. We do encourage other companies to start taking up the weakly supervised learning approach. We did not just release a peer-reviewed paper. We also released some code, and even some models, as well, to help people get started.”
The argument is that to differentiate yourself in the field of AI, you have to be at the forefront of cutting-edge technology, and to be at the forefront you need to be in constant collaboration with others. His company’s philosophy: “putting our stuff out there to get the community involved can help us, and then help everyone else. We believe our differentiator is that we are continuously at the forefront, at the state of the art. That's where we have our differentiation.”
From cashier-less stores to automated image-based quality control in production, computer vision is having an impact on our world, and it has made huge leaps in the past 15 years. As AI continues to take off, Scott shares some final advice for those who want to bring AI into their companies: set your expectations properly, recognizing that machine learning rarely achieves 100% accuracy; understand the technology and identify a concrete use case in your business; and work with experts to execute.